NASA Astrophysics Data System (ADS)
Yu, Leiming; Nina-Paravecino, Fanny; Kaeli, David; Fang, Qianqian
2018-01-01
We present a highly scalable Monte Carlo (MC) three-dimensional photon transport simulation platform designed for heterogeneous computing systems. Through the development of a massively parallel MC algorithm using the Open Computing Language framework, this research extends our existing graphics processing unit (GPU)-accelerated MC technique to a highly scalable vendor-independent heterogeneous computing environment, achieving significantly improved performance and software portability. A number of parallel computing techniques are investigated to achieve portable performance over a wide range of computing hardware. Furthermore, multiple thread-level and device-level load-balancing strategies are developed to obtain efficient simulations using multiple central processing units and GPUs.
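A rough illustration of the device-level load balancing described in this abstract (one possible static strategy, not necessarily the authors' implementation): the photon budget is split across heterogeneous devices in proportion to a measured per-device throughput, which is assumed to come from a short warm-up run.

```cpp
// Hypothetical sketch: static device-level load balancing for a heterogeneous MC run.
// throughput[i] is assumed to be photons/second measured for device i in a warm-up run;
// the total photon budget is split in proportion to it (assumes at least one device).
#include <cstdint>
#include <numeric>
#include <vector>

std::vector<std::uint64_t> split_photons(std::uint64_t total_photons,
                                         const std::vector<double>& throughput) {
    const double sum = std::accumulate(throughput.begin(), throughput.end(), 0.0);
    std::vector<std::uint64_t> share(throughput.size());
    std::uint64_t assigned = 0;
    for (std::size_t i = 0; i < throughput.size(); ++i) {
        share[i] = static_cast<std::uint64_t>(total_photons * (throughput[i] / sum));
        assigned += share[i];
    }
    share.back() += total_photons - assigned;   // hand any rounding remainder to one device
    return share;
}
```

Each device would then launch its own kernel over its share of photons, with finer thread-level balancing handled inside the kernel.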
Integration of an intelligent systems behavior simulator and a scalable soldier-machine interface
NASA Astrophysics Data System (ADS)
Johnson, Tony; Manteuffel, Chris; Brewster, Benjamin; Tierney, Terry
2007-04-01
As the Army's Future Combat Systems (FCS) introduce emerging technologies and new force structures to the battlefield, soldiers will increasingly face new challenges in workload management. The next generation warfighter will be responsible for effectively managing robotic assets in addition to performing other missions. Studies of future battlefield operational scenarios involving the use of automation, including the specification of existing and proposed technologies, will provide significant insight into potential problem areas regarding soldier workload. The US Army Tank Automotive Research, Development, and Engineering Center (TARDEC) is currently executing an Army technology objective program to analyze and evaluate the effect of automated technologies and their associated control devices with respect to soldier workload. The Human-Robotic Interface (HRI) Intelligent Systems Behavior Simulator (ISBS) is a human performance measurement simulation system that allows modelers to develop constructive simulations of military scenarios with various deployments of interface technologies in order to evaluate operator effectiveness. One such interface is TARDEC's Scalable Soldier-Machine Interface (SMI). The scalable SMI provides a configurable machine interface application that is capable of adapting to several hardware platforms by recognizing the physical space limitations of the display device. This paper describes the integration of the ISBS and Scalable SMI applications, which will ultimately benefit both systems. The ISBS will be able to use the Scalable SMI to visualize the behaviors of virtual soldiers performing HRI tasks, such as route planning, and the scalable SMI will benefit from stimuli provided by the ISBS simulation environment. The paper describes the background of each system and details of the system integration approach.
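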
MADNESS: A Multiresolution, Adaptive Numerical Environment for Scientific Simulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Harrison, Robert J.; Beylkin, Gregory; Bischoff, Florian A.
2016-01-01
MADNESS (multiresolution adaptive numerical environment for scientific simulation) is a high-level software environment for solving integral and differential equations in many dimensions that uses adaptive and fast harmonic analysis methods with guaranteed precision based on multiresolution analysis and separated representations. Underpinning the numerical capabilities is a powerful petascale parallel programming environment that aims to increase both programmer productivity and code scalability. This paper describes the features and capabilities of MADNESS and briefly discusses some current applications in chemistry and several areas of physics.
Generalized Operations Simulation Environment for Aircraft Maintenance Training
2004-04-01
The Generalized Operations Simulation Environment (GOSE) project is a collaborative effort between AETC and AFRL to develop common, cost-effective, generalized VR training ... maintenance training domain, since it provided an opportunity to build on the VEST architecture. Development of GOSE involves re-engineering VEST as a scalable ... modular, immersive VR training system composed of PC-based hardware and software. GOSE initiatives include: (a) formalize training needs across ...
MADNESS: A Multiresolution, Adaptive Numerical Environment for Scientific Simulation
Harrison, Robert J.; Beylkin, Gregory; Bischoff, Florian A.; ...
2016-01-01
We present MADNESS (multiresolution adaptive numerical environment for scientific simulation), a high-level software environment for solving integral and differential equations in many dimensions that uses adaptive and fast harmonic analysis methods with guaranteed precision, based on multiresolution analysis and separated representations. Underpinning the numerical capabilities is a powerful petascale parallel programming environment that aims to increase both programmer productivity and code scalability. This paper describes the features and capabilities of MADNESS and briefly discusses some current applications in chemistry and several areas of physics.
An efficient and scalable deformable model for virtual reality-based medical applications.
Choi, Kup-Sze; Sun, Hanqiu; Heng, Pheng-Ann
2004-09-01
Modeling of tissue deformation is of great importance to virtual reality (VR)-based medical simulations. Considerable effort has been dedicated to the development of interactively deformable virtual tissues. In this paper, an efficient and scalable deformable model is presented for virtual-reality-based medical applications. It considers deformation as a localized force transmittal process which is governed by algorithms based on breadth-first search (BFS). The computational speed is scalable to facilitate real-time interaction by adjusting the penetration depth. Simulated annealing (SA) algorithms are developed to optimize the model parameters by using the reference data generated with the linear static finite element method (FEM). The mechanical behavior and timing performance of the model have been evaluated. The model has been applied to simulate the typical behavior of living tissues and anisotropic materials. Integration with a haptic device has also been achieved on a generic personal computer (PC) platform. The proposed technique provides a feasible solution for VR-based medical simulations and has the potential for multi-user collaborative work in virtual environment.
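A minimal sketch of the BFS-bounded force propagation idea described above (my reading of the abstract; the attenuation law and data layout are assumptions): the search stops once the user-selected penetration depth is reached, which is what makes the computational cost adjustable.

```cpp
// Hypothetical sketch: breadth-first propagation of a contact force through a tissue
// mesh, truncated at a user-selected penetration depth so the cost stays bounded.
#include <queue>
#include <utility>
#include <vector>

struct Node {
    std::vector<int> neighbours;   // adjacency in the tissue mesh
    double displacement = 0.0;
};

void propagate(std::vector<Node>& mesh, int contact, double force, int max_depth) {
    std::vector<bool> visited(mesh.size(), false);
    std::queue<std::pair<int, int>> frontier;      // (node index, BFS depth)
    frontier.push({contact, 0});
    visited[contact] = true;
    while (!frontier.empty()) {
        auto [idx, depth] = frontier.front();
        frontier.pop();
        mesh[idx].displacement += force / (depth + 1);   // placeholder attenuation law
        if (depth == max_depth) continue;                // penetration depth bounds the work
        for (int nb : mesh[idx].neighbours)
            if (!visited[nb]) { visited[nb] = true; frontier.push({nb, depth + 1}); }
    }
}
```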
Novel high-fidelity realistic explosion damage simulation for urban environments
NASA Astrophysics Data System (ADS)
Liu, Xiaoqing; Yadegar, Jacob; Zhu, Youding; Raju, Chaitanya; Bhagavathula, Jaya
2010-04-01
Realistic building damage simulation has a significant impact on modern modeling and simulation systems, especially in the diverse panoply of military and civil applications where these systems are widely used for personnel training, critical mission planning, disaster management, etc. Realistic building damage simulation should incorporate accurate physics-based explosion models, rubble generation, rubble flyout, and interactions between flying rubble and surrounding entities. However, none of the existing building damage simulation systems realizes the level of realism required for effective military applications. In this paper, we present a novel physics-based, high-fidelity and runtime-efficient explosion simulation system to realistically simulate destruction to buildings. In the proposed system, a family of novel blast models is applied to accurately and realistically simulate explosions based on static and/or dynamic detonation conditions. The system also accounts for rubble pile formation and applies a generic and scalable multi-component object representation to describe scene entities, together with a highly scalable agent-subsumption architecture and scheduler to schedule clusters of sequential and parallel events. The proposed system utilizes a highly efficient and scalable tetrahedral decomposition approach to realistically simulate rubble formation. Experimental results demonstrate that the proposed system can realistically simulate rubble generation, rubble flyout, and their primary and secondary impacts on surrounding objects, including buildings, constructions, vehicles and pedestrians, in clusters of sequential and parallel damage events.
Jeong, Seol Young; Jo, Hyeong Gon; Kang, Soon Ju
2015-01-01
Indoor location-based services (iLBS) are extremely dynamic and changeable, and include numerous resources and mobile devices. In particular, the network infrastructure requires support for high scalability in the indoor environment, and various resource lookups are requested concurrently and frequently from several locations based on the dynamic network environment. A traditional map-based centralized approach for iLBSs has several disadvantages: it requires global knowledge to maintain a complete geographic indoor map; the central server is a single point of failure; it can also cause low scalability and traffic congestion; and it is hard to adapt to a change of service area in real time. This paper proposes a self-organizing and fully distributed platform for iLBSs. The proposed self-organizing distributed platform provides a dynamic reconfiguration of locality accuracy and service coverage by expanding and contracting dynamically. In order to verify the suggested platform, scalability performance according to the number of inserted or deleted nodes composing the dynamic infrastructure was evaluated through a simulation similar to the real environment. PMID:26016908
Drawert, Brian; Trogdon, Michael; Toor, Salman; Petzold, Linda; Hellander, Andreas
2016-01-01
Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools and a complex software stack, as well as large and scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments.
NASA Technical Reports Server (NTRS)
West, Jeff; Yang, H. Q.
2014-01-01
There are many instances involving liquid/gas interfaces and their dynamics in the design of liquid-engine-powered rockets such as the Space Launch System (SLS). Some examples of these applications are: propellant tank draining and slosh, subcritical-condition injector analysis for gas generators, preburners and thrust chambers, water deluge mitigation for launch-induced environments, and even solid rocket motor liquid slag dynamics. Commercially available CFD programs simulating gas/liquid interfaces using the Volume of Fluid approach are currently limited in their parallel scalability. In 2010, for instance, an internal NASA/MSFC review of three commercial tools revealed that parallel scalability was seriously compromised at 8 CPUs and no additional speedup was possible after 32 CPUs. Other non-interface CFD applications at the time were demonstrating useful parallel scalability up to 4,096 processors or more. Based on this review, NASA/MSFC initiated an effort to implement a Volume of Fluid capability within the unstructured-mesh, pressure-based CFD program Loci-STREAM. After verification was achieved by comparing results to the commercial CFD program CFD-Ace+, and validation by direct comparison with data, Loci-STREAM-VoF is now the production CFD tool for propellant slosh force and slosh damping rate simulations at NASA/MSFC. On these applications, good parallel scalability has been demonstrated for problem sizes of tens of millions of cells and thousands of CPU cores. Ongoing efforts are focused on the application of Loci-STREAM-VoF to predict the transient flow patterns of water on the SLS Mobile Launch Platform in order to support the phasing of water for launch environment mitigation so that detrimental effects on the vehicle are avoided.
Automation of multi-agent control for complex dynamic systems in heterogeneous computational network
NASA Astrophysics Data System (ADS)
Oparin, Gennady; Feoktistov, Alexander; Bogdanova, Vera; Sidorov, Ivan
2017-01-01
The rapid progress of high-performance computing entails new challenges related to solving large scientific problems in various subject domains in a heterogeneous distributed computing environment (e.g., a network, Grid system, or Cloud infrastructure). Specialists in the field of parallel and distributed computing pay special attention to the scalability of applications for problem solving. Effective management of a scalable application in a heterogeneous distributed computing environment is still a non-trivial issue, particularly for control systems that operate in networks. We propose a new approach to multi-agent management of scalable applications in a heterogeneous computational network. The fundamentals of our approach are the integrated use of conceptual programming, simulation modeling, network monitoring, multi-agent management, and service-oriented programming. We developed a special framework to automate the problem solving. Advantages of the proposed approach are demonstrated on the example of parametric synthesis of a static linear regulator for complex dynamic systems. Benefits of the scalable application for solving this problem include automation of multi-agent control of the systems in parallel, at various degrees of detail.
Shahriari, Mohammadali; Biglarbegian, Mohammad
2018-01-01
This paper presents a new conflict resolution methodology for multiple mobile robots that ensures their motion liveness, especially in cluttered and dynamic environments. Our method constructs a mathematical formulation in the form of an optimization problem, minimizing the overall travel times of the robots subject to resolving all the conflicts in their motion. This optimization problem can be solved easily by coordinating only the robots' speeds. To overcome the computational cost of executing the algorithm in very cluttered environments, we develop an innovative method that clusters the environment into independent subproblems that can be solved using parallel programming techniques. We demonstrate the scalability of our approach through extensive simulations. Simulation results showed that our proposed method is capable of resolving the conflicts of 100 robots in less than 1.23 s in a cluttered environment that has 4357 intersections in the paths of the robots. We also developed an experimental testbed and demonstrated that our approach can be implemented in real time. We finally compared our approach with other existing methods in the literature, both quantitatively and qualitatively. This comparison shows that, while our approach is mathematically sound, it is more computationally efficient, scalable to a very large number of robots, and guarantees the live and smooth motion of robots.
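The clustering step described above could look roughly like the union-find sketch below (an illustrative decomposition, not the authors' code): robots whose planned paths intersect end up in the same cluster, and each cluster's speed-coordination problem can then be solved independently and in parallel.

```cpp
// Hypothetical sketch: partition robots into independent conflict clusters with
// union-find; robots whose paths share an intersection land in the same cluster.
#include <numeric>
#include <utility>
#include <vector>

struct DisjointSet {
    std::vector<int> parent;
    explicit DisjointSet(int n) : parent(n) { std::iota(parent.begin(), parent.end(), 0); }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
    void unite(int a, int b) { parent[find(a)] = find(b); }
};

std::vector<int> cluster_robots(int n_robots,
                                const std::vector<std::pair<int, int>>& path_intersections) {
    DisjointSet ds(n_robots);
    for (auto [a, b] : path_intersections) ds.unite(a, b);
    std::vector<int> cluster_of(n_robots);
    for (int r = 0; r < n_robots; ++r) cluster_of[r] = ds.find(r);
    return cluster_of;   // each distinct id marks an independent subproblem
}
```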
NASA Astrophysics Data System (ADS)
Jamroz, Benjamin F.; Klöfkorn, Robert
2016-08-01
The scalability of computational applications on current and next-generation supercomputers is increasingly limited by the cost of inter-process communication. We implement non-blocking asynchronous communication in the High-Order Methods Modeling Environment for the time integration of the hydrostatic fluid equations using both the spectral-element and discontinuous Galerkin methods. This allows the overlap of computation with communication, effectively hiding some of the costs of communication. A novel detail of our approach is that it allows some data movement to be performed during the asynchronous communication even in the absence of other computations. This method produces significant performance and scalability gains in large-scale simulations.
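A minimal sketch of the overlap pattern (a generic MPI halo exchange for illustration, not the HOMME code itself): post the non-blocking sends and receives, do the work that needs no remote data, then wait and finish the boundary work.

```cpp
// Hypothetical sketch of computation/communication overlap with non-blocking MPI.
#include <mpi.h>
#include <vector>

void compute_interior() { /* element work that needs no remote data */ }
void compute_boundary(const std::vector<double>&) { /* work that uses the received halo */ }

void timestep(std::vector<double>& halo_send, std::vector<double>& halo_recv,
              int left, int right, MPI_Comm comm) {
    MPI_Request reqs[2];
    MPI_Irecv(halo_recv.data(), static_cast<int>(halo_recv.size()), MPI_DOUBLE,
              left, 0, comm, &reqs[0]);
    MPI_Isend(halo_send.data(), static_cast<int>(halo_send.size()), MPI_DOUBLE,
              right, 0, comm, &reqs[1]);

    compute_interior();                          // overlapped with the messages in flight

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);   // communication must complete here
    compute_boundary(halo_recv);                 // finish the work that needed remote data
}
```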
NASA Astrophysics Data System (ADS)
Prasad, Guru; Jayaram, Sanjay; Ward, Jami; Gupta, Pankaj
2004-09-01
In this paper, Aximetric proposes a decentralized Command and Control (C2) architecture for a distributed control of a cluster of on-board health monitoring and software-enabled control systems called SimBOX.
Drawert, Brian; Trogdon, Michael; Toor, Salman; Petzold, Linda; Hellander, Andreas
2017-01-01
Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools and a complex software stack, as well as large and scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments. PMID:28190948
Scalable Algorithms for Parallel Discrete Event Simulation Systems in Multicore Environments
2013-05-01
...consolidated at the sender side. At the receiver side, the messages are deconsolidated and delivered to the appropriate thread. This approach bears some...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jamroz, Benjamin F.; Klofkorn, Robert
The scalability of computational applications on current and next-generation supercomputers is increasingly limited by the cost of inter-process communication. We implement non-blocking asynchronous communication in the High-Order Methods Modeling Environment for the time integration of the hydrostatic fluid equations using both the spectral-element and discontinuous Galerkin methods. This allows the overlap of computation with communication, effectively hiding some of the costs of communication. A novel detail of our approach is that it allows some data movement to be performed during the asynchronous communication even in the absence of other computations. This method produces significant performance and scalability gains in large-scale simulations.
Jamroz, Benjamin F.; Klofkorn, Robert
2016-08-26
The scalability of computational applications on current and next-generation supercomputers is increasingly limited by the cost of inter-process communication. We implement non-blocking asynchronous communication in the High-Order Methods Modeling Environment for the time integration of the hydrostatic fluid equations using both the spectral-element and discontinuous Galerkin methods. This allows the overlap of computation with communication, effectively hiding some of the costs of communication. A novel detail of our approach is that it allows some data movement to be performed during the asynchronous communication even in the absence of other computations. This method produces significant performance and scalability gains in large-scale simulations.
HRLSim: a high performance spiking neural network simulator for GPGPU clusters.
Minkovich, Kirill; Thibeault, Corey M; O'Brien, Michael John; Nogin, Aleksey; Cho, Youngkwan; Srinivasa, Narayan
2014-02-01
Modeling of large-scale spiking neural models is an important tool in the quest to understand brain function and subsequently create real-world applications. This paper describes a spiking neural network simulator environment called HRL Spiking Simulator (HRLSim). This simulator is suitable for implementation on a cluster of general purpose graphical processing units (GPGPUs). Novel aspects of HRLSim are described and an analysis of its performance is provided for various configurations of the cluster. With the advent of inexpensive GPGPU cards and compute power, HRLSim offers an affordable and scalable tool for design, real-time simulation, and analysis of large-scale spiking neural networks.
A Systems Approach to Scalable Transportation Network Modeling
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perumalla, Kalyan S
2006-01-01
Emerging needs in transportation network modeling and simulation are raising new challenges with respect to scalability of network size and vehicular traffic intensity, speed of simulation for simulation-based optimization, and fidelity of vehicular behavior for accurate capture of event phenomena. Parallel execution is warranted to sustain the required detail, size and speed. However, few parallel simulators exist for such applications, partly due to the challenges underlying their development. Moreover, many simulators are based on time-stepped models, which can be computationally inefficient for the purposes of modeling evacuation traffic. Here an approach is presented to designing a simulator with memory and speed efficiency as the goals from the outset, and, specifically, scalability via parallel execution. The design makes use of discrete event modeling techniques as well as parallel simulation methods. Our simulator, called SCATTER, is being developed, incorporating such design considerations. Preliminary performance results are presented on benchmark road networks, showing scalability to one million vehicles simulated on one processor.
Superlinearly scalable noise robustness of redundant coupled dynamical systems.
Kohar, Vivek; Kia, Behnam; Lindner, John F; Ditto, William L
2016-03-01
We illustrate through theory and numerical simulations that redundant coupled dynamical systems can be extremely robust against local noise in comparison to uncoupled dynamical systems evolving in the same noisy environment. Previous studies have shown that the noise robustness of redundant coupled dynamical systems is linearly scalable and deviations due to noise can be minimized by increasing the number of coupled units. Here, we demonstrate that the noise robustness can actually be scaled superlinearly if some conditions are met and very high noise robustness can be realized with very few coupled units. We discuss these conditions and show that this superlinear scalability depends on the nonlinearity of the individual dynamical units. The phenomenon is demonstrated in discrete as well as continuous dynamical systems. This superlinear scalability not only provides us an opportunity to exploit the nonlinearity of physical systems without being bogged down by noise but may also help us in understanding the functional role of coupled redundancy found in many biological systems. Moreover, engineers can exploit superlinear noise suppression by starting a coupled system near (not necessarily at) the appropriate initial condition.
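A toy illustration of coupled redundancy (illustrative coupling and parameters, not the paper's models): N logistic-map units are driven by the mean of their states, so independent local noise partly averages out in the collective variable. The superlinear regime discussed in the paper depends on conditions tied to the units' nonlinearity and is not captured by this simple sketch.

```cpp
// Hypothetical sketch: N redundant logistic-map units coupled through their mean state;
// independent local noise on each unit partly cancels in the collective average.
#include <random>
#include <vector>

double coupled_step(std::vector<double>& x, double r, double noise_sd, std::mt19937& rng) {
    std::normal_distribution<double> noise(0.0, noise_sd);
    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= static_cast<double>(x.size());
    for (double& v : x)
        v = r * mean * (1.0 - mean) + noise(rng);   // each unit evolves the shared mean plus its own noise
    return mean;                                    // the lower-noise collective state
}
```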
AN-CASE NET-CENTRIC modeling and simulation
NASA Astrophysics Data System (ADS)
Baskinger, Patricia J.; Chruscicki, Mary Carol; Turck, Kurt
2009-05-01
The objective of mission training exercises is to immerse the trainees into an environment that enables them to train like they would fight. The integration of modeling and simulation environments that can seamlessly leverage Live systems and Virtual or Constructive models (LVC) as they are available offers a flexible and cost-effective solution to extending the "war-gaming" environment to a realistic mission experience while evolving the development of the net-centric enterprise. From concept to full production, the impact of new capabilities on the infrastructure and concept of operations can be assessed in the context of the enterprise, while also exposing them to the warfighter. Training is extended to tomorrow's tools, processes, and Tactics, Techniques and Procedures (TTPs). This paper addresses the challenges of a net-centric modeling and simulation environment that is capable of representing a net-centric enterprise. An overview of the Air Force Research Laboratory's (AFRL) Airborne Networking Component Architecture Simulation Environment (AN-CASE) is provided, as well as a discussion of how it is being used to assess technologies for the purpose of experimenting with new infrastructure mechanisms that enhance the scalability and reliability of the distributed mission operations environment.
Minimizing communication cost among distributed controllers in software defined networks
NASA Astrophysics Data System (ADS)
Arlimatti, Shivaleela; Elbreiki, Walid; Hassan, Suhaidi; Habbal, Adib; Elshaikh, Mohamed
2016-08-01
Software Defined Networking (SDN) is a new paradigm to increase the flexibility of today's network by promising a programmable network. The fundamental idea behind this new architecture is to simplify network complexity by decoupling the control plane and data plane of the network devices, and by making the control plane centralized. Recently, controllers have been distributed to solve the problem of a single point of failure, and to increase scalability and flexibility during workload distribution. Even though controllers are flexible and scalable enough to accommodate a larger number of network switches, the intercommunication cost between distributed controllers is still a challenging issue in the Software Defined Network environment. This paper aims to fill the gap by proposing a new mechanism that minimizes intercommunication cost with a graph partitioning algorithm, an NP-hard problem. The methodology proposed in this paper swaps network elements between controller domains to minimize communication cost by calculating the communication gain. The swapping of elements minimizes inter- and intra-domain communication cost among network domains. We validate our work with the OMNeT++ simulation environment tool. Simulation results show that the proposed mechanism minimizes the inter-domain communication cost among controllers compared to traditional distributed controllers.
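The communication-gain idea can be sketched as a Kernighan-Lin-style move gain (an illustration under assumed data structures, not the authors' exact formulation): the gain of moving one switch is the traffic it exchanges with the target domain minus the traffic it exchanges with its current domain.

```cpp
// Hypothetical sketch: gain of migrating switch `sw` to `target_domain`.
// weight[i][j] is assumed to be the control traffic between switches i and j;
// domain[i] is the controller domain currently owning switch i.
#include <vector>

double move_gain(int sw, int target_domain,
                 const std::vector<std::vector<double>>& weight,
                 const std::vector<int>& domain) {
    double internal = 0.0, external = 0.0;
    for (std::size_t j = 0; j < weight[sw].size(); ++j) {
        if (static_cast<int>(j) == sw) continue;
        if (domain[j] == domain[sw])    internal += weight[sw][j];  // intra-domain today
        if (domain[j] == target_domain) external += weight[sw][j];  // would become intra-domain
    }
    return external - internal;   // positive gain means the move reduces inter-controller traffic
}
```

Moves (or swaps of pairs of switches) with positive gain would be applied iteratively until no further improvement is found.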
Scalability of grid- and subbasin-based land surface modeling approaches for hydrologic simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tesfa, Teklu K.; Ruby Leung, L.; Huang, Maoyi
2014-03-27
This paper investigates the relative merits of grid- and subbasin-based land surface modeling approaches for hydrologic simulations, with a focus on their scalability (i.e., abilities to perform consistently across a range of spatial resolutions) in simulating runoff generation. Simulations produced by the grid- and subbasin-based configurations of the Community Land Model (CLM) are compared at four spatial resolutions (0.125°, 0.25°, 0.5° and 1°) over the topographically diverse region of the U.S. Pacific Northwest. Using the 0.125° resolution simulation as the “reference”, statistical skill metrics are calculated and compared across simulations at 0.25°, 0.5° and 1° spatial resolutions of each modeling approach at basin and topographic region levels. Results suggest a significant scalability advantage for the subbasin-based approach compared to the grid-based approach for runoff generation. Basin-level annual average relative errors of surface runoff at 0.25°, 0.5°, and 1° compared to 0.125° are 3%, 4%, and 6% for the subbasin-based configuration and 4%, 7%, and 11% for the grid-based configuration, respectively. The scalability advantages of the subbasin-based approach are more pronounced during winter/spring and over mountainous regions. The source of runoff scalability is found to be related to the scalability of major meteorological and land surface parameters of runoff generation. More specifically, the subbasin-based approach is more consistent across spatial scales than the grid-based approach in snowfall/rainfall partitioning, which is related to air temperature and surface elevation. Scalability of a topographic parameter used in the runoff parameterization also contributes to improved scalability of the rain-driven saturated surface runoff component, particularly during winter. Hence this study demonstrates the importance of spatial structure for multi-scale modeling of hydrological processes, with implications for surface heat fluxes in coupled land-atmosphere modeling.
A Study on Fast Gates for Large-Scale Quantum Simulation with Trapped Ions
Taylor, Richard L.; Bentley, Christopher D. B.; Pedernales, Julen S.; Lamata, Lucas; Solano, Enrique; Carvalho, André R. R.; Hope, Joseph J.
2017-01-01
Large-scale digital quantum simulations require thousands of fundamental entangling gates to construct the simulated dynamics. Despite success in a variety of small-scale simulations, quantum information processing platforms have hitherto failed to demonstrate the combination of precise control and scalability required to systematically outmatch classical simulators. We analyse how fast gates could enable trapped-ion quantum processors to achieve the requisite scalability to outperform classical computers without error correction. We analyze the performance of a large-scale digital simulator, and find that fidelity of around 70% is realizable for π-pulse infidelities below 10⁻⁵ in traps subject to realistic rates of heating and dephasing. This scalability relies on fast gates: entangling gates faster than the trap period. PMID:28401945
A Study on Fast Gates for Large-Scale Quantum Simulation with Trapped Ions.
Taylor, Richard L; Bentley, Christopher D B; Pedernales, Julen S; Lamata, Lucas; Solano, Enrique; Carvalho, André R R; Hope, Joseph J
2017-04-12
Large-scale digital quantum simulations require thousands of fundamental entangling gates to construct the simulated dynamics. Despite success in a variety of small-scale simulations, quantum information processing platforms have hitherto failed to demonstrate the combination of precise control and scalability required to systematically outmatch classical simulators. We analyse how fast gates could enable trapped-ion quantum processors to achieve the requisite scalability to outperform classical computers without error correction. We analyze the performance of a large-scale digital simulator, and find that fidelity of around 70% is realizable for π-pulse infidelities below 10⁻⁵ in traps subject to realistic rates of heating and dephasing. This scalability relies on fast gates: entangling gates faster than the trap period.
: A Scalable and Transparent System for Simulating MPI Programs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perumalla, Kalyan S
2010-01-01
The system described here is a scalable, transparent system for experimenting with the execution of parallel programs on simulated computing platforms. The level of simulated detail can be varied for application behavior as well as for machine characteristics. Unique features of the system are repeatability of execution, scalability to millions of simulated (virtual) MPI ranks, scalability to hundreds of thousands of host (real) MPI ranks, portability of the system to a variety of host supercomputing platforms, and the ability to experiment with scientific applications whose source code is available. The set of source-code interfaces supported by the system is being expanded to support a wider set of applications, and MPI-based scientific computing benchmarks are being ported. In proof-of-concept experiments, the system has been successfully exercised to spawn and sustain very large-scale executions of an MPI test program given in source-code form. Low slowdowns are observed, due to its use of a purely discrete event style of execution, and due to the scalability and efficiency of the underlying parallel discrete event simulation engine, sik. In the largest runs, the system has been executed on up to 216,000 cores of a Cray XT5 supercomputer, successfully simulating over 27 million virtual MPI ranks, each virtual rank containing its own thread context, and all ranks fully synchronized by virtual time.
NASA Astrophysics Data System (ADS)
Pattke, Marco; Martin, Manuel; Voit, Michael
2017-05-01
Tracking people with cameras in public areas is common today. However, with an increasing number of cameras it becomes harder and harder to review the data manually. Especially in safety-critical areas, automatic image exploitation could help to solve this problem. Setting up such a system can, however, be difficult because of its increased complexity. Sensor placement is critical to ensure that people are detected and tracked reliably. We try to solve this problem using a simulation framework that is able to simulate different camera setups in the desired environment, including animated characters. We combine this framework with our self-developed distributed and scalable system for people tracking to test its effectiveness, and we can show the results of the tracking system in real time in the simulated environment.
Large Scale Simulation Platform for NODES Validation Study
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sotorrio, P.; Qin, Y.; Min, L.
2017-04-27
This report summarizes the Large Scale (LS) simulation platform created for the Eaton NODES project. The simulation environment consists of both a wholesale market simulator and a distribution simulator and includes the CAISO wholesale market model and a PG&E footprint of 25-75 feeders to validate the scalability under a scenario of 33% RPS in California with an additional 17% of DERs coming from distribution and customers. The simulator can generate hourly unit commitment, 5-minute economic dispatch, and 4-second AGC regulation signals. The simulator is also capable of simulating greater than 10k individual controllable devices. Simulated DERs include water heaters, EVs, residential and light commercial HVAC/buildings, and residential-level battery storage. Feeder-level voltage regulators and capacitor banks are also simulated for feeder-level real and reactive power management and Volt/Var control.
NASA Astrophysics Data System (ADS)
Prasad, Guru; Jayaram, Sanjay; Ward, Jami; Gupta, Pankaj
2004-08-01
In this paper, Aximetric proposes a decentralized Command and Control (C2) architecture for a distributed control of a cluster of on-board health monitoring and software enabled control systems called SimBOX that will use some of the real-time infrastructure (RTI) functionality from the current military real-time simulation architecture. The uniqueness of the approach is to provide a "plug and play environment" for various system components that run at various data rates (Hz) and the ability to replicate or transfer C2 operations to various subsystems in a scalable manner. This is possible by providing a communication bus called "Distributed Shared Data Bus" and a distributed computing environment used to scale the control needs by providing a self-contained computing, data logging and control function module that can be rapidly reconfigured to perform different functions. This kind of software-enabled control is very much needed to meet the needs of future aerospace command and control functions.
A scalable parallel black oil simulator on distributed memory parallel computers
NASA Astrophysics Data System (ADS)
Wang, Kun; Liu, Hui; Chen, Zhangxin
2015-11-01
This paper presents our work on developing a parallel black oil simulator for distributed memory computers based on our in-house parallel platform. The parallel simulator is designed to overcome the performance issues of common simulators that are implemented for personal computers and workstations. The finite difference method is applied to discretize the black oil model. In addition, some advanced techniques are employed to strengthen the robustness and parallel scalability of the simulator, including an inexact Newton method, matrix decoupling methods, and algebraic multigrid methods. A new multi-stage preconditioner is proposed to accelerate the solution of linear systems from the Newton methods. Numerical experiments show that our simulator is scalable and efficient, and is capable of simulating extremely large-scale black oil problems with tens of millions of grid blocks using thousands of MPI processes on parallel computers.
Scalability enhancement of AODV using local link repairing
NASA Astrophysics Data System (ADS)
Jain, Jyoti; Gupta, Roopam; Bandhopadhyay, T. K.
2014-09-01
Dynamic change in the topology of an ad hoc network makes it difficult to design an efficient routing protocol. Scalability of an ad hoc network is also one of the important criteria of research in this field. Most research works on ad hoc networks focus on routing and medium access protocols and produce simulation results for limited-size networks. Ad hoc on-demand distance vector (AODV) is one of the best reactive routing protocols. In this article, modified routing protocols based on local link repairing of AODV are proposed. A method of finding alternate routes to the next-to-next node is proposed in case of link failure. These protocols are beacon-less, meaning the periodic hello message is removed from the basic AODV to improve scalability. A few control packet formats have been changed to accommodate the suggested modifications. The proposed protocols are simulated to investigate scalability performance and compared with the basic AODV protocol. From the simulation results, it is clear that the scalability performance of the routing protocol is improved because of the local link repairing method. We have tested the protocols for different terrain areas with approximately constant node densities and different traffic loads.
Almazyad, Abdulaziz S.; Seddiq, Yasser M.; Alotaibi, Ahmed M.; Al-Nasheri, Ahmed Y.; BenSaleh, Mohammed S.; Obeid, Abdulfattah M.; Qasim, Syed Manzoor
2014-01-01
Anomalies such as leakage and bursts in water pipelines have severe consequences for the environment and the economy. To ensure the reliability of water pipelines, they must be monitored effectively. Wireless Sensor Networks (WSNs) have emerged as an effective technology for monitoring critical infrastructure such as water, oil and gas pipelines. In this paper, we present a scalable design and simulation of a water pipeline leakage monitoring system using Radio Frequency IDentification (RFID) and WSN technology. The proposed design targets long-distance aboveground water pipelines that have special considerations for maintenance, energy consumption and cost. The design is based on deploying a group of mobile wireless sensor nodes inside the pipeline and allowing them to work cooperatively according to a prescheduled order. Under this mechanism, only one node is active at a time, while the other nodes are sleeping. The node whose turn is next wakes up according to one of three wakeup techniques: location-based, time-based and interrupt-driven. In this paper, mathematical models are derived for each technique to estimate the corresponding energy consumption and memory size requirements. The proposed equations are analyzed and the results are validated using simulation. PMID:24561404
Almazyad, Abdulaziz S; Seddiq, Yasser M; Alotaibi, Ahmed M; Al-Nasheri, Ahmed Y; BenSaleh, Mohammed S; Obeid, Abdulfattah M; Qasim, Syed Manzoor
2014-02-20
Anomalies such as leakage and bursts in water pipelines have severe consequences for the environment and the economy. To ensure the reliability of water pipelines, they must be monitored effectively. Wireless Sensor Networks (WSNs) have emerged as an effective technology for monitoring critical infrastructure such as water, oil and gas pipelines. In this paper, we present a scalable design and simulation of a water pipeline leakage monitoring system using Radio Frequency IDentification (RFID) and WSN technology. The proposed design targets long-distance aboveground water pipelines that have special considerations for maintenance, energy consumption and cost. The design is based on deploying a group of mobile wireless sensor nodes inside the pipeline and allowing them to work cooperatively according to a prescheduled order. Under this mechanism, only one node is active at a time, while the other nodes are sleeping. The node whose turn is next wakes up according to one of three wakeup techniques: location-based, time-based and interrupt-driven. In this paper, mathematical models are derived for each technique to estimate the corresponding energy consumption and memory size requirements. The proposed equations are analyzed and the results are validated using simulation.
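A minimal sketch of the time-based wakeup technique mentioned above (the round-robin slot assignment and parameters are assumptions for illustration; the paper derives its own energy and memory models): node k sleeps until its next prescheduled slot in the rotation and is the only active node during that slot.

```cpp
// Hypothetical sketch: next wakeup time for node k of n nodes, assuming fixed slots of
// `slot_s` seconds assigned round-robin starting at time 0.
#include <cmath>

double next_wakeup(int k, int n, double slot_s, double now) {
    const double cycle  = n * slot_s;     // one full rotation over all nodes
    const double offset = k * slot_s;     // node k's position inside the rotation
    double t = offset + std::ceil((now - offset) / cycle) * cycle;
    return (t > now) ? t : t + cycle;     // strictly in the future
}
```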
Self-organizing network services with evolutionary adaptation.
Nakano, Tadashi; Suda, Tatsuya
2005-09-01
This paper proposes a novel framework for developing adaptive and scalable network services. In the proposed framework, a network service is implemented as a group of autonomous agents that interact in the network environment. Agents in the proposed framework are autonomous and capable of simple behaviors (e.g., replication, migration, and death). In this paper, an evolutionary adaptation mechanism is designed using genetic algorithms (GAs) for agents to evolve their behaviors and improve their fitness values (e.g., response time to a service request) to the environment. The proposed framework is evaluated through simulations, and the simulation results demonstrate the ability of autonomous agents to adapt to the network environment. The proposed framework may be suitable for disseminating network services in dynamic and large-scale networks where a large number of data and services need to be replicated, moved, and deleted in a decentralized manner.
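A rough sketch of one evolutionary step for such service agents (the thresholds, behaviour parameters, and mutation width are illustrative assumptions, and crossover is omitted): fit agents replicate with mutated behaviour parameters, unfit agents die.

```cpp
// Hypothetical sketch: fitness-driven replication and death of service agents,
// with Gaussian mutation of the behavioural parameters ("genes").
#include <random>
#include <vector>

struct Agent {
    double replication_rate;   // behavioural genes
    double migration_rate;
    double fitness;            // e.g. derived from response time to service requests
};

void evolve(std::vector<Agent>& population, std::mt19937& rng) {
    std::normal_distribution<double> mutate(0.0, 0.05);
    std::vector<Agent> next;
    for (const Agent& a : population) {
        if (a.fitness < 0.2) continue;            // death: the agent is removed
        next.push_back(a);                        // survival
        if (a.fitness > 0.8) {                    // replication of fit agents
            Agent child = a;
            child.replication_rate += mutate(rng);
            child.migration_rate   += mutate(rng);
            next.push_back(child);
        }
    }
    population.swap(next);
}
```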
NASA Astrophysics Data System (ADS)
Yan, Beichuan; Regueiro, Richard A.
2018-02-01
A three-dimensional (3D) DEM code for simulating complex-shaped granular particles is parallelized using the message-passing interface (MPI). The concepts of link-block, ghost/border layer, and migration layer are put forward for the design of the parallel algorithm, and a theoretical function for the 3D DEM scalability and memory usage is derived. Many performance-critical implementation details are managed optimally to achieve high performance and scalability, such as: minimizing communication overhead, maintaining dynamic load balance, handling particle migrations across block borders, transmitting C++ dynamic objects of particles between MPI processes efficiently, and eliminating redundant contact information between adjacent MPI processes. The code executes on multiple US Department of Defense (DoD) supercomputers and is tested on up to 2048 compute nodes for simulating 10 million three-axis ellipsoidal particles. Performance analyses of the code, including speedup, efficiency, scalability, and granularity across five orders of magnitude of simulation scale (number of particles), are provided, and they demonstrate high speedup and excellent scalability. It is also discovered that communication time is a decreasing function of the number of compute nodes in strong scaling measurements. The code's capability of simulating a large number of complex-shaped particles on modern supercomputers will be of value in both laboratory studies on micromechanical properties of granular materials and many realistic engineering applications involving granular materials.
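The ghost/border-layer bookkeeping mentioned above could be sketched as follows (a one-face illustration with an assumed particle layout, not the code's actual link-block scheme): particles within one ghost width of a block face are selected so they can be packed and sent to the neighbouring MPI process.

```cpp
// Hypothetical sketch: pick the particles inside the +x border layer of this block so
// they can be shipped to the +x neighbour as ghost particles.
#include <vector>

struct Particle { double x, y, z; };

std::vector<int> border_particles(const std::vector<Particle>& local,
                                  double block_xmax, double ghost_width) {
    std::vector<int> ids;
    for (std::size_t i = 0; i < local.size(); ++i)
        if (local[i].x > block_xmax - ghost_width)   // within one ghost width of the +x face
            ids.push_back(static_cast<int>(i));
    return ids;   // indices to pack into the MPI message for the +x neighbour
}
```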
Providing scalable system software for high-end simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Greenberg, D.
1997-12-31
Detailed, full-system, complex physics simulations have been shown to be feasible on systems containing thousands of processors. In order to manage these computer systems it has been necessary to create scalable system services. In this talk Sandia's research on scalable systems will be described. The key concepts of low overhead data movement through portals and of flexible services through multi-partition architectures will be illustrated in detail. The talk will conclude with a discussion of how these techniques can be applied outside of the standard monolithic MPP system.
An MPI-based MoSST core dynamics model
NASA Astrophysics Data System (ADS)
Jiang, Weiyuan; Kuang, Weijia
2008-09-01
Distributed systems are among the main cost-effective and expandable platforms for high-end scientific computing. Therefore scalable numerical models are important for effective use of such systems. In this paper, we present an MPI-based numerical core dynamics model for simulation of geodynamo and planetary dynamos, and for simulation of core-mantle interactions. The model is developed based on MPI libraries. Two algorithms are used for node-node communication: a "master-slave" architecture and a "divide-and-conquer" architecture. The former is easy to implement but not scalable in communication. The latter is scalable in both computation and communication. The model scalability is tested on Linux PC clusters with up to 128 nodes. This model is also benchmarked with a published numerical dynamo model solution.
BactoGeNIE: A large-scale comparative genome visualization for big displays
Aurisano, Jillian; Reda, Khairi; Johnson, Andrew; ...
2015-08-13
The volume of complete bacterial genome sequence data available to comparative genomics researchers is rapidly increasing. However, visualizations in comparative genomics--which aim to enable analysis tasks across collections of genomes--suffer from visual scalability issues. While large, multi-tiled and high-resolution displays have the potential to address scalability issues, new approaches are needed to take advantage of such environments, in order to enable the effective visual analysis of large genomics datasets. In this paper, we present Bacterial Gene Neighborhood Investigation Environment, or BactoGeNIE, a novel and visually scalable design for comparative gene neighborhood analysis on large display environments. We evaluate BactoGeNIE through a case study on close to 700 draft Escherichia coli genomes, and present lessons learned from our design process. In conclusion, BactoGeNIE accommodates comparative tasks over substantially larger collections of neighborhoods than existing tools and explicitly addresses visual scalability. Given current trends in data generation, scalable designs of this type may inform visualization design for large-scale comparative research problems in genomics.
BactoGeNIE: a large-scale comparative genome visualization for big displays
2015-01-01
Background: The volume of complete bacterial genome sequence data available to comparative genomics researchers is rapidly increasing. However, visualizations in comparative genomics--which aim to enable analysis tasks across collections of genomes--suffer from visual scalability issues. While large, multi-tiled and high-resolution displays have the potential to address scalability issues, new approaches are needed to take advantage of such environments, in order to enable the effective visual analysis of large genomics datasets. Results: In this paper, we present Bacterial Gene Neighborhood Investigation Environment, or BactoGeNIE, a novel and visually scalable design for comparative gene neighborhood analysis on large display environments. We evaluate BactoGeNIE through a case study on close to 700 draft Escherichia coli genomes, and present lessons learned from our design process. Conclusions: BactoGeNIE accommodates comparative tasks over substantially larger collections of neighborhoods than existing tools and explicitly addresses visual scalability. Given current trends in data generation, scalable designs of this type may inform visualization design for large-scale comparative research problems in genomics. PMID:26329021
BactoGeNIE: A large-scale comparative genome visualization for big displays
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aurisano, Jillian; Reda, Khairi; Johnson, Andrew
The volume of complete bacterial genome sequence data available to comparative genomics researchers is rapidly increasing. However, visualizations in comparative genomics--which aim to enable analysis tasks across collections of genomes--suffer from visual scalability issues. While large, multi-tiled and high-resolution displays have the potential to address scalability issues, new approaches are needed to take advantage of such environments, in order to enable the effective visual analysis of large genomics datasets. In this paper, we present Bacterial Gene Neighborhood Investigation Environment, or BactoGeNIE, a novel and visually scalable design for comparative gene neighborhood analysis on large display environments. We evaluate BactoGeNIE through a case study on close to 700 draft Escherichia coli genomes, and present lessons learned from our design process. In conclusion, BactoGeNIE accommodates comparative tasks over substantially larger collections of neighborhoods than existing tools and explicitly addresses visual scalability. Given current trends in data generation, scalable designs of this type may inform visualization design for large-scale comparative research problems in genomics.
A Cloud-Based Simulation Architecture for Pandemic Influenza Simulation
Eriksson, Henrik; Raciti, Massimiliano; Basile, Maurizio; Cunsolo, Alessandro; Fröberg, Anders; Leifler, Ola; Ekberg, Joakim; Timpka, Toomas
2011-01-01
High-fidelity simulations of pandemic outbreaks are resource consuming. Cluster-based solutions have been suggested for executing such complex computations. We present a cloud-based simulation architecture that utilizes computing resources both locally available and dynamically rented online. The approach uses the Condor framework for job distribution and management of the Amazon Elastic Compute Cloud (EC2) as well as local resources. The architecture has a web-based user interface that allows users to monitor and control simulation execution. In a benchmark test, the best cost-adjusted performance was recorded for the EC2 H-CPU Medium instance, while a field trial showed that the job configuration had significant influence on the execution time and that the network capacity of the master node could become a bottleneck. We conclude that it is possible to develop a scalable simulation environment that uses cloud-based solutions, while providing an easy-to-use graphical user interface. PMID:22195089
Large-Scale Networked Virtual Environments: Architecture and Applications
ERIC Educational Resources Information Center
Lamotte, Wim; Quax, Peter; Flerackers, Eddy
2008-01-01
Purpose: Scalability is an important research topic in the context of networked virtual environments (NVEs). This paper aims to describe the ALVIC (Architecture for Large-scale Virtual Interactive Communities) approach to NVE scalability. Design/methodology/approach: The setup and results from two case studies are shown: a 3-D learning environment…
Digital quantum simulators in a scalable architecture of hybrid spin-photon qubits
Chiesa, Alessandro; Santini, Paolo; Gerace, Dario; Raftery, James; Houck, Andrew A.; Carretta, Stefano
2015-01-01
Resolving quantum many-body problems represents one of the greatest challenges in physics and physical chemistry, due to the prohibitively large computational resources that would be required by using classical computers. A solution has been foreseen by directly simulating the time evolution through sequences of quantum gates applied to arrays of qubits, i.e. by implementing a digital quantum simulator. Superconducting circuits and resonators are emerging as an extremely promising platform for quantum computation architectures, but a digital quantum simulator proposal that is straightforwardly scalable, universal, and realizable with state-of-the-art technology is presently lacking. Here we propose a viable scheme to implement a universal quantum simulator with hybrid spin-photon qubits in an array of superconducting resonators, which is intrinsically scalable and allows for local control. As representative examples we consider the transverse-field Ising model, a spin-1 Hamiltonian, and the two-dimensional Hubbard model and we numerically simulate the scheme by including the main sources of decoherence. PMID:26563516
Grindon, Christina; Harris, Sarah; Evans, Tom; Novik, Keir; Coveney, Peter; Laughton, Charles
2004-07-15
Molecular modelling played a central role in the discovery of the structure of DNA by Watson and Crick. Today, such modelling is done on computers: the more powerful these computers are, the more detailed and extensive can be the study of the dynamics of such biological macromolecules. To fully harness the power of modern massively parallel computers, however, we need to develop and deploy algorithms which can exploit the structure of such hardware. The Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) is a scalable molecular dynamics code including long-range Coulomb interactions, which has been specifically designed to function efficiently on parallel platforms. Here we describe the implementation of the AMBER98 force field in LAMMPS and its validation for molecular dynamics investigations of DNA structure and flexibility against the benchmark of results obtained with the long-established code AMBER6 (Assisted Model Building with Energy Refinement, version 6). Extended molecular dynamics simulations on the hydrated DNA dodecamer d(CTTTTGCAAAAG)(2), which has previously been the subject of extensive dynamical analysis using AMBER6, show that it is possible to obtain excellent agreement in terms of static, dynamic and thermodynamic parameters between AMBER6 and LAMMPS. In comparison with AMBER6, LAMMPS shows greatly improved scalability in massively parallel environments, opening up the possibility of efficient simulations of order-of-magnitude larger systems and/or for order-of-magnitude greater simulation times.
Innovative HPC architectures for the study of planetary plasma environments
NASA Astrophysics Data System (ADS)
Amaya, Jorge; Wolf, Anna; Lembège, Bertrand; Zitz, Anke; Alvarez, Damian; Lapenta, Giovanni
2016-04-01
DEEP-ER is a European Commission funded project that develops a new type of High Performance Computer architecture. The revolutionary system is currently used by KU Leuven to study the effects of the solar wind on the global environments of the Earth and Mercury. The new architecture combines the versatility of Intel Xeon computing nodes with the power of the upcoming Intel Xeon Phi accelerators. Contrary to classical heterogeneous HPC architectures, where it is customary to find CPUs and accelerators in the same computing nodes, in the DEEP-ER system the CPU nodes are grouped together (the Cluster) independently from the accelerator nodes (the Booster). The system is equipped with a state-of-the-art interconnection network, a highly scalable and fast I/O, and a fail-recovery resiliency system. The final objective of the project is to introduce a scalable system that can be used to create the next generation of exascale supercomputers. The code iPic3D from KU Leuven is being adapted to this new architecture. This particle-in-cell code can now perform the computation of the electromagnetic fields on the Cluster while the particles are moved on the Booster side. Using fast and scalable Xeon Phi accelerators in the Booster, we can introduce many more particles per cell in the simulation than is possible in the current generation of HPC systems, allowing fully kinetic plasmas to be calculated with very low interpolation noise. The system will be used to perform fully kinetic, low-noise, 3D simulations of the interaction of the solar wind with the magnetospheres of the Earth and Mercury. Preliminary simulations have been performed in other HPC centers in order to compare the results on different systems. In this presentation we show the complexity of the plasma flow around the planets, including the development of hydrodynamic instabilities at the flanks, the presence of the collisionless shock, the magnetosheath, the magnetopause, reconnection zones, the formation of the plasma sheet and the magnetotail, and the variation of ion/electron plasma flows when crossing these frontiers. The simulations also give access to detailed information about the particle dynamics and their velocity distributions at locations that can be used for comparison with satellite data.
Space Situational Awareness Data Processing Scalability Utilizing Google Cloud Services
NASA Astrophysics Data System (ADS)
Greenly, D.; Duncan, M.; Wysack, J.; Flores, F.
Space Situational Awareness (SSA) is a fundamental and critical component of current space operations. The term SSA encompasses the awareness, understanding and predictability of all objects in space. As the population of orbital space objects and debris increases, the number of collision avoidance maneuvers grows, prompting the need for accurate and timely processing. The SSA mission continually evolves toward near real-time assessment and analysis, demanding higher processing capabilities. By conventional methods, meeting these demands requires the integration of new hardware to keep pace with the growing complexity of maneuver planning algorithms. SpaceNav has implemented a highly scalable architecture that tracks satellites and debris by utilizing powerful virtual machines on the Google Cloud Platform. SpaceNav algorithms for processing conjunction data messages (CDMs) outpace conventional means. A robust processing environment for tracking data, collision avoidance maneuvers and various other aspects of SSA can be created and deleted on demand. The migration of SpaceNav tools and algorithms into the Google Cloud Platform, and the trials and tribulations involved, will be discussed. Information will be shared on how and why certain cloud products were used, as well as the integration techniques that were implemented. Key items to be presented are: 1. Scientific algorithms and SpaceNav tools integrated into a scalable architecture: a) Maneuver Planning, b) Parallel Processing, c) Monte Carlo Simulations, d) Optimization Algorithms, e) SW Application Development/Integration into the Google Cloud Platform; 2. Compute Engine Processing: a) Application Engine Automated Processing, b) Performance Testing and Performance Scalability, c) Cloud MySQL Databases and Database Scalability, d) Cloud Data Storage, e) Redundancy and Availability.
LLNL Mercury Project Trinity Open Science Final Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brantley, Patrick; Dawson, Shawn; McKinley, Scott
2016-04-20
The Mercury Monte Carlo particle transport code developed at Lawrence Livermore National Laboratory (LLNL) is used to simulate the transport of radiation through urban environments. These challenging calculations include complicated geometries and require significant computational resources to complete. As a result, a question arises as to the level of convergence of the calculations with Monte Carlo simulation particle count. In the Trinity Open Science calculations, one main focus was to investigate convergence of the relevant simulation quantities with Monte Carlo particle count to assess the current simulation methodology. Both for this application space and with more general applicability, we also investigated the impact of code algorithms on parallel scaling on the Trinity machine, as well as the utilization of the Trinity DataWarp burst buffer technology in Mercury via the LLNL Scalable Checkpoint/Restart (SCR) library.
CAM-SE: A scalable spectral element dynamical core for the Community Atmosphere Model.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dennis, John; Edwards, Jim; Evans, Kate J
2012-01-01
The Community Atmosphere Model (CAM) version 5 includes a spectral element dynamical core option from NCAR's High-Order Method Modeling Environment. It is a continuous Galerkin spectral finite element method designed for fully unstructured quadrilateral meshes. The current configurations in CAM are based on the cubed-sphere grid. The main motivation for including a spectral element dynamical core is to improve the scalability of CAM by allowing quasi-uniform grids for the sphere that do not require polar filters. In addition, the approach provides other state-of-the-art capabilities such as improved conservation properties. Spectral elements are used for the horizontal discretization, while most other aspects of the dynamical core are a hybrid of well-tested techniques from CAM's finite volume and global spectral dynamical core options. Here we first give an overview of the spectral element dynamical core as used in CAM. We then give scalability and performance results from CAM running with three different dynamical core options within the Community Earth System Model, using a pre-industrial time-slice configuration. We focus on high resolution simulations of 1/4 degree, 1/8 degree, and T340 spectral truncation.
NASA Astrophysics Data System (ADS)
Tolba, Khaled Ibrahim; Morgenthal, Guido
2018-01-01
This paper presents an analysis of the scalability and efficiency of a simulation framework based on the vortex particle method. The code is applied to the numerical aerodynamic analysis of line-like structures. The numerical code runs on multicore CPU and GPU architectures using the OpenCL framework. The focus of this paper is the analysis of the parallel efficiency and scalability of the method being applied to an engineering test case, specifically the aeroelastic response of a long-span bridge girder at the construction stage. The target is to assess the optimal configuration and the required computer architecture, such that it becomes feasible to efficiently utilise the method within the computational resources available for a regular engineering office. The simulations and the scalability analysis are performed on a regular gaming-type computer.
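The kernel whose parallel behaviour such studies typically measure is the all-pairs induced-velocity evaluation of the vortex particle method. The sketch below shows a generic 2D version of that O(N^2) kernel; the smoothing parameter and the data are illustrative assumptions and this is not the paper's OpenCL implementation.

```python
# Generic O(N^2) induced-velocity kernel of a 2D vortex particle method.
# Each particle j with circulation gamma_j induces a rotational velocity at
# every evaluation point; this all-pairs loop is the part that maps naturally
# onto multicore CPUs and GPUs. Values and smoothing radius are illustrative.
import numpy as np

def induced_velocity(targets, particles, gamma, eps=1e-3):
    """targets: (M, 2), particles: (N, 2), gamma: (N,) circulations."""
    dx = targets[:, None, 0] - particles[None, :, 0]   # (M, N)
    dy = targets[:, None, 1] - particles[None, :, 1]
    r2 = dx * dx + dy * dy + eps * eps                 # regularized distance^2
    coef = gamma[None, :] / (2.0 * np.pi * r2)
    u = np.sum(-coef * dy, axis=1)
    v = np.sum(coef * dx, axis=1)
    return np.stack([u, v], axis=1)

rng = np.random.default_rng(0)
particles = rng.uniform(-1, 1, size=(2000, 2))
gamma = rng.normal(0, 1e-3, size=2000)
vel = induced_velocity(particles, particles, gamma)    # self-induced field
print(vel.shape)
```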
NASA Technical Reports Server (NTRS)
Kikuchi, Hideaki; Kalia, Rajiv K.; Nakano, Aiichiro; Vashishta, Priya; Shimojo, Fuyuki; Saini, Subhash
2003-01-01
Scalability of a low-cost, Intel Xeon-based, multi-Teraflop Linux cluster is tested for two high-end scientific applications: Classical atomistic simulation based on the molecular dynamics method and quantum mechanical calculation based on the density functional theory. These scalable parallel applications use space-time multiresolution algorithms and feature computational-space decomposition, wavelet-based adaptive load balancing, and spacefilling-curve-based data compression for scalable I/O. Comparative performance tests are performed on a 1,024-processor Linux cluster and a conventional higher-end parallel supercomputer, 1,184-processor IBM SP4. The results show that the performance of the Linux cluster is comparable to that of the SP4. We also study various effects, such as the sharing of memory and L2 cache among processors, on the performance.
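The abstract mentions space-filling-curve-based ordering as part of the scalable I/O scheme. The sketch below shows a generic Morton (Z-order) key for 3D cell indices, which keeps spatially nearby data nearby in the output stream; it is only an illustration of the idea, not the authors' compression method.

```python
# Generic Morton (Z-order) indexing sketch: interleaving the bits of (i, j, k)
# cell indices yields a key that keeps spatially close cells close on disk,
# which helps both streaming I/O and compression. Not the paper's exact scheme.
def part1by2(n):
    """Spread the low 10 bits of n so there are two zero bits between each."""
    n &= 0x3FF
    n = (n | (n << 16)) & 0x030000FF
    n = (n | (n << 8)) & 0x0300F00F
    n = (n | (n << 4)) & 0x030C30C3
    n = (n | (n << 2)) & 0x09249249
    return n

def morton3d(i, j, k):
    """Interleave three 10-bit indices into one 30-bit Morton key."""
    return part1by2(i) | (part1by2(j) << 1) | (part1by2(k) << 2)

# Sort cell records by Morton key before writing, so neighbouring cells are
# stored contiguously.
cells = [(i, j, k) for i in range(4) for j in range(4) for k in range(4)]
cells.sort(key=lambda c: morton3d(*c))
print(cells[:4])
```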
Dynamic full-scalability conversion in scalable video coding
NASA Astrophysics Data System (ADS)
Lee, Dong Su; Bae, Tae Meon; Thang, Truong Cong; Ro, Yong Man
2007-02-01
SVC (Scalable Video Coding) is being standardized to provide outstanding coding efficiency together with scalability functions. SVC can support spatial, temporal, and SNR scalability, and these scalabilities are useful for providing smooth video streaming even in a time-varying network such as a mobile environment. However, current SVC is insufficient to support dynamic video conversion with scalability, so the adaptation of the bitrate to a fluctuating network condition is limited. In this paper, we propose dynamic full-scalability conversion methods for QoS-adaptive video streaming in SVC. To accomplish dynamic full-scalability conversion, we develop the corresponding bitstream extraction, encoding, and decoding schemes. At the encoder, we insert IDR NAL units periodically to solve the problems of spatial scalability conversion. At the extractor, we analyze the SVC bitstream to obtain the information that enables dynamic extraction; real-time extraction is achieved by using this information. Finally, we develop the decoder so that it can manage the changing scalability. Experimental results verify dynamic full-scalability conversion and show that it is necessary for time-varying network conditions.
Conceptual Architecture for Obtaining Cyber Situational Awareness
2014-06-01
Scalable Cloning on Large-Scale GPU Platforms with Application to Time-Stepped Simulations on Grids
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoginath, Srikanth B.; Perumalla, Kalyan S.
Cloning is a technique to efficiently simulate a tree of multiple what-if scenarios that are unraveled during the course of a base simulation. However, cloned execution is highly challenging to realize on large, distributed memory computing platforms, due to the dynamic nature of the computational load across clones, and due to the complex dependencies spanning the clone tree. In this paper, we present the conceptual simulation framework, algorithmic foundations, and runtime interface of CloneX, a new system we designed for scalable simulation cloning. It efficiently and dynamically creates whole logical copies of a dynamic tree of simulations across a large parallel system without full physical duplication of computation and memory. The performance of a prototype implementation executed on up to 1,024 graphics processing units of a supercomputing system has been evaluated with three benchmarks—heat diffusion, forest fire, and disease propagation models—delivering a speedup of over two orders of magnitude compared to replicated runs. Finally, the results demonstrate a significantly faster and scalable way to execute many what-if scenario ensembles of large simulations via cloning using the CloneX interface.
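A toy copy-on-write sketch of the cloning idea follows: child what-if scenarios share their parent's unmodified state and materialize only their own deltas, so a whole tree of clones avoids full physical duplication. The class and default values are hypothetical and this does not represent the CloneX runtime or its GPU implementation.

```python
# Toy copy-on-write "clone tree" sketch (names hypothetical). Each clone stores
# only the cells it overwrites and reads everything else from its ancestors,
# so a tree of what-if scenarios shares memory instead of duplicating it.
class Clone:
    def __init__(self, parent=None):
        self.parent = parent
        self.delta = {}          # cells this clone has overwritten
        self.children = []

    def read(self, cell):
        node = self
        while node is not None:
            if cell in node.delta:
                return node.delta[cell]
            node = node.parent
        return 0.0               # default initial value

    def write(self, cell, value):
        self.delta[cell] = value

    def spawn(self):
        child = Clone(parent=self)
        self.children.append(child)
        return child

# Base simulation plus two what-if branches that diverge at one cell.
base = Clone()
base.write((3, 4), 1.0)
hot = base.spawn(); hot.write((3, 4), 5.0)   # scenario A perturbs the cell
cold = base.spawn()                          # scenario B keeps sharing it
assert hot.read((3, 4)) == 5.0 and cold.read((3, 4)) == 1.0
```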
Scalable Cloning on Large-Scale GPU Platforms with Application to Time-Stepped Simulations on Grids
Yoginath, Srikanth B.; Perumalla, Kalyan S.
2018-01-31
Cloning is a technique to efficiently simulate a tree of multiple what-if scenarios that are unraveled during the course of a base simulation. However, cloned execution is highly challenging to realize on large, distributed memory computing platforms, due to the dynamic nature of the computational load across clones, and due to the complex dependencies spanning the clone tree. In this paper, we present the conceptual simulation framework, algorithmic foundations, and runtime interface of CloneX, a new system we designed for scalable simulation cloning. It efficiently and dynamically creates whole logical copies of a dynamic tree of simulations across a large parallel system without full physical duplication of computation and memory. The performance of a prototype implementation executed on up to 1,024 graphics processing units of a supercomputing system has been evaluated with three benchmarks—heat diffusion, forest fire, and disease propagation models—delivering a speedup of over two orders of magnitude compared to replicated runs. Finally, the results demonstrate a significantly faster and scalable way to execute many what-if scenario ensembles of large simulations via cloning using the CloneX interface.
Scalable free energy calculation of proteins via multiscale essential sampling
NASA Astrophysics Data System (ADS)
Moritsugu, Kei; Terada, Tohru; Kidera, Akinori
2010-12-01
A multiscale simulation method, "multiscale essential sampling (MSES)," is proposed for calculating the free energy surface of proteins in a sizable dimensional space with good scalability. In MSES, the configurational sampling of a full-dimensional model is enhanced by coupling with the accelerated dynamics of the essential degrees of freedom. Applying the Hamiltonian exchange method to MSES can remove the biasing potential from the coupling term, deriving the free energy surface of the essential degrees of freedom. The form of the coupling term ensures good scalability in the Hamiltonian exchange. As a test application, the free energy surface of the folding process of a miniprotein, chignolin, was calculated in a continuum solvent model. The results agreed with the free energy surface derived from the multicanonical simulation. Significantly improved scalability with the MSES method was clearly shown in the free energy calculation of chignolin in explicit solvent, which was achieved without increasing the number of replicas in the Hamiltonian exchange.
Development of a Robust and Efficient Parallel Solver for Unsteady Turbomachinery Flows
NASA Technical Reports Server (NTRS)
West, Jeff; Wright, Jeffrey; Thakur, Siddharth; Luke, Ed; Grinstead, Nathan
2012-01-01
The traditional design and analysis practice for advanced propulsion systems relies heavily on expensive full-scale prototype development and testing. Over the past decade, use of high-fidelity analysis and design tools such as CFD early in the product development cycle has been identified as one way to alleviate testing costs and to develop these devices better, faster and cheaper. In the design of advanced propulsion systems, CFD plays a major role in defining the required performance over the entire flight regime, as well as in testing the sensitivity of the design to the different modes of operation. Increased emphasis is being placed on developing and applying CFD models to simulate the flow field environments and performance of advanced propulsion systems. This necessitates the development of next generation computational tools which can be used effectively and reliably in a design environment. The turbomachinery simulation capability presented here is being developed in a computational tool called Loci-STREAM [1]. It integrates proven numerical methods for generalized grids and state-of-the-art physical models in a novel rule-based programming framework called Loci [2] which allows: (a) seamless integration of multidisciplinary physics in a unified manner, and (b) automatic handling of massively parallel computing. The objective is to be able to routinely simulate problems involving complex geometries requiring large unstructured grids and complex multidisciplinary physics. An immediate application of interest is simulation of unsteady flows in rocket turbopumps, particularly in cryogenic liquid rocket engines. The key components of the overall methodology presented in this paper are the following: (a) high fidelity unsteady simulation capability based on Detached Eddy Simulation (DES) in conjunction with second-order temporal discretization, (b) compliance with Geometric Conservation Law (GCL) in order to maintain conservative property on moving meshes for second-order time-stepping scheme, (c) a novel cloud-of-points interpolation method (based on a fast parallel kd-tree search algorithm) for interfaces between turbomachinery components in relative motion which is demonstrated to be highly scalable, and (d) demonstrated accuracy and parallel scalability on large grids (approx 250 million cells) in full turbomachinery geometries.
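Item (c) above describes a cloud-of-points interpolation across sliding turbomachinery interfaces based on a fast kd-tree search. The sketch below illustrates the general idea with an inverse-distance-weighted transfer from donor to receiver points using SciPy's kd-tree; the weighting, point counts, and field are illustrative assumptions and this is not the Loci-STREAM implementation or its parallel search.

```python
# Hedged sketch of a cloud-of-points interface interpolation: values on donor
# points (one component) are transferred to receiver points (the neighbouring
# component in relative motion) by inverse-distance weighting of the k nearest
# donors found with a kd-tree.
import numpy as np
from scipy.spatial import cKDTree

def cloud_interpolate(donor_xyz, donor_vals, receiver_xyz, k=4, eps=1e-12):
    tree = cKDTree(donor_xyz)
    dist, idx = tree.query(receiver_xyz, k=k)   # (M, k) distances and indices
    w = 1.0 / (dist + eps)                      # inverse-distance weights
    w /= w.sum(axis=1, keepdims=True)
    return np.sum(w * donor_vals[idx], axis=1)

rng = np.random.default_rng(0)
donors = rng.uniform(0, 1, size=(5000, 3))
values = np.sin(donors[:, 0] * np.pi)           # a smooth donor field
receivers = rng.uniform(0, 1, size=(1000, 3))
print(cloud_interpolate(donors, values, receivers)[:5])
```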
NAVO MSRC Navigator. Fall 2006
2006-01-01
... a Linux Opteron cluster was recently determined through a series of simulations that employed both fixed and adaptive meshes. The fixed-mesh scalability ... approximately eight in the total number of cells in the 3-D simulation. The fixed-mesh and AMR scalability results on the Linux Opteron cluster are ...
Shadow-Bitcoin: Scalable Simulation via Direct Execution of Multi-Threaded Applications
2015-08-10
Andrew Miller, University of Maryland, amiller@cs.umd.edu; Rob... a Shadow plug-in that directly executes the Bitcoin reference client software. To demonstrate the usefulness of this tool, we present novel denial-of-service attacks against the Bitcoin software that exploit low-level implementation artifacts in the Bitcoin reference client; our deterministic...
NASA Technical Reports Server (NTRS)
Morgan, Philip E.
2004-01-01
This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications." The discussion of Scalable High Performance Computing reports on three objectives: validate, assess the scalability of, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhance an electromagnetics code (CHARGE) so that it can effectively model antenna problems; apply lessons learned from the high-order/spectral solution of swirling 3D jets to the electromagnetics project; transition a high-order fluids code, FDL3DI, to solve Maxwell's equations using compact differencing; develop and demonstrate improved radiation-absorbing boundary conditions for high-order CEM; and extend the high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.
Highly efficient simulation environment for HDTV video decoder in VLSI design
NASA Astrophysics Data System (ADS)
Mao, Xun; Wang, Wei; Gong, Huimin; He, Yan L.; Lou, Jian; Yu, Lu; Yao, Qingdong; Pirsch, Peter
2002-01-01
With the increasing complexity of VLSI, especially SoC (System-on-Chip) designs such as an MPEG-2 video decoder with HDTV scalability, simulation and verification of the full design, even at the behavioral level in HDL, often prove to be very slow and costly, and it is difficult to perform full verification until late in the design process. These tasks therefore become a bottleneck in the HDTV video decoder design procedure and strongly influence its time-to-market. In this paper, the architecture of the hardware/software interface of an HDTV video decoder is studied, and a Hardware-Software Mixed Simulation (HSMS) platform is proposed to check and correct errors in the early design stage, based on the MPEG-2 video decoding algorithm. The application of HSMS to the target system is achieved by employing several introduced approaches. These approaches speed up the simulation and verification task without decreasing performance.
NASA Technical Reports Server (NTRS)
Jedlovec, Gary; Srikishen, Jayanthi; Edwards, Rita; Cross, David; Welch, Jon; Smith, Matt
2013-01-01
The use of collaborative scientific visualization systems for the analysis, visualization, and sharing of "big data" available from new high resolution remote sensing satellite sensors or four-dimensional numerical model simulations is propelling the wider adoption of ultra-resolution tiled display walls interconnected by high speed networks. These systems require a globally connected and well-integrated operating environment that provides persistent visualization and collaboration services. This abstract and subsequent presentation describe a new collaborative visualization system installed for NASA's Short-term Prediction Research and Transition (SPoRT) program at Marshall Space Flight Center and its use for Earth science applications. The system consists of a 3 x 4 array of 1920 x 1080 pixel thin bezel video monitors mounted on a wall in a scientific collaboration lab. The monitors are physically and virtually integrated into a 14' x 7' video display. The display of scientific data on the video wall is controlled by a single Alienware Aurora PC with a 2nd Generation Intel Core 4.1 GHz processor, 32 GB memory, and an AMD Fire Pro W600 video card with 6 mini display port connections. Six mini display-to-dual DVI cables are used to connect the 12 individual video monitors. The open source Scalable Adaptive Graphics Environment (SAGE) windowing and media control framework, running on top of the Ubuntu 12 Linux operating system, allows several users to simultaneously control the display and storage of high resolution still and moving graphics in a variety of formats, on tiled display walls of any size. The Ubuntu operating system supports the open source Scalable Adaptive Graphics Environment (SAGE) software which provides a common environment, or framework, enabling its users to access, display and share a variety of data-intensive information. This information can be digital-cinema animations, high-resolution images, high-definition video-teleconferences, presentation slides, documents, spreadsheets or laptop screens. SAGE is cross-platform, community-driven, open-source visualization and collaboration middleware that utilizes shared national and international cyberinfrastructure for the advancement of scientific research and education.
NASA Astrophysics Data System (ADS)
Jedlovec, G.; Srikishen, J.; Edwards, R.; Cross, D.; Welch, J. D.; Smith, M. R.
2013-12-01
The use of collaborative scientific visualization systems for the analysis, visualization, and sharing of 'big data' available from new high resolution remote sensing satellite sensors or four-dimensional numerical model simulations is propelling the wider adoption of ultra-resolution tiled display walls interconnected by high speed networks. These systems require a globally connected and well-integrated operating environment that provides persistent visualization and collaboration services. This abstract and subsequent presentation describe a new collaborative visualization system installed for NASA's Short-term Prediction Research and Transition (SPoRT) program at Marshall Space Flight Center and its use for Earth science applications. The system consists of a 3 x 4 array of 1920 x 1080 pixel thin bezel video monitors mounted on a wall in a scientific collaboration lab. The monitors are physically and virtually integrated into a 14' x 7' video display. The display of scientific data on the video wall is controlled by a single Alienware Aurora PC with a 2nd Generation Intel Core 4.1 GHz processor, 32 GB memory, and an AMD Fire Pro W600 video card with 6 mini display port connections. Six mini display-to-dual DVI cables are used to connect the 12 individual video monitors. The open source Scalable Adaptive Graphics Environment (SAGE) windowing and media control framework, running on top of the Ubuntu 12 Linux operating system, allows several users to simultaneously control the display and storage of high resolution still and moving graphics in a variety of formats, on tiled display walls of any size. The Ubuntu operating system supports the open source Scalable Adaptive Graphics Environment (SAGE) software which provides a common environment, or framework, enabling its users to access, display and share a variety of data-intensive information. This information can be digital-cinema animations, high-resolution images, high-definition video-teleconferences, presentation slides, documents, spreadsheets or laptop screens. SAGE is cross-platform, community-driven, open-source visualization and collaboration middleware that utilizes shared national and international cyberinfrastructure for the advancement of scientific research and education.
NASA Astrophysics Data System (ADS)
Byun, Hye Suk; El-Naggar, Mohamed Y.; Kalia, Rajiv K.; Nakano, Aiichiro; Vashishta, Priya
2017-10-01
Kinetic Monte Carlo (KMC) simulations are used to study long-time dynamics of a wide variety of systems. Unfortunately, the conventional KMC algorithm is not scalable to larger systems, since its time scale is inversely proportional to the simulated system size. A promising approach to resolving this issue is the synchronous parallel KMC (SPKMC) algorithm, which makes the time scale size-independent. This paper introduces a formal derivation of the SPKMC algorithm based on local transition-state and time-dependent Hartree approximations, as well as its scalable parallel implementation based on a dual linked-list cell method. The resulting algorithm has achieved a weak-scaling parallel efficiency of 0.935 on 1024 Intel Xeon processors for simulating biological electron transfer dynamics in a 4.2 billion-heme system, as well as decent strong-scaling parallel efficiency. The parallel code has been used to simulate a lattice of cytochrome complexes on a bacterial-membrane nanowire, and it is broadly applicable to other problems such as computational synthesis of new materials.
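For context, the conventional serial KMC step is sketched below: one event is selected with probability proportional to its rate and time advances by an exponential increment with the total rate. Because the total rate grows with system size, the time advanced per step shrinks, which is the scaling limitation SPKMC removes; the rates below are hypothetical and this is not the SPKMC algorithm itself.

```python
# Conventional rejection-free KMC step (illustrative). The time advanced per
# step is dt = -ln(u) / sum(rates); since the total rate grows with system
# size, serial KMC advances less physical time per step on larger systems.
import math, random

def kmc_step(rates, rng=random):
    """Pick one event with probability proportional to its rate and return
    (event_index, time_increment)."""
    total = sum(rates)
    r = rng.random() * total
    acc = 0.0
    for i, rate in enumerate(rates):
        acc += rate
        if r < acc:
            break
    dt = -math.log(rng.random()) / total
    return i, dt

rates = [0.5, 1.2, 0.1, 2.0]     # hypothetical event rates
event, dt = kmc_step(rates)
print(f"fired event {event}, advanced time by {dt:.3f}")
```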
Developing a scalable modeling architecture for studying survivability technologies
NASA Astrophysics Data System (ADS)
Mohammad, Syed; Bounker, Paul; Mason, James; Brister, Jason; Shady, Dan; Tucker, David
2006-05-01
To facilitate interoperability of models in a scalable environment, and provide a relevant virtual environment in which Survivability technologies can be evaluated, the US Army Research Development and Engineering Command (RDECOM) Modeling Architecture for Technology Research and Experimentation (MATREX) Science and Technology Objective (STO) program has initiated the Survivability Thread, which will seek to address some of the many technical and programmatic challenges associated with the effort. In coordination with different Thread customers, such as the Survivability branches of various Army labs, a collaborative group has been formed to define the requirements for the simulation environment that would in turn provide them a value-added tool for assessing models and gauging system-level performance relevant to Future Combat Systems (FCS) and the Survivability requirements of other burgeoning programs. An initial set of customer requirements has been generated in coordination with the RDECOM Survivability IPT lead, through the Survivability Technology Area at RDECOM Tank-automotive Research Development and Engineering Center (TARDEC, Warren, MI). The results of this project are aimed at a culminating experiment and demonstration scheduled for September 2006, which will include a multitude of components from within RDECOM and provide the framework for future experiments to support Survivability research. This paper details the components with which the MATREX Survivability Thread was created and executed, and provides insight into the capabilities currently demanded by the Survivability faculty within RDECOM.
High Fidelity Simulations of Large-Scale Wireless Networks (Plus-Up)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Onunkwo, Uzoma
Sandia has built a strong reputation in scalable network simulation and emulation for cyber security studies to protect our nation's critical information infrastructures. Georgia Tech has a preeminent reputation in academia for excellence in scalable discrete event simulations, with a strong emphasis on simulating cyber networks. Many of the experts in this field, such as Dr. Richard Fujimoto, Dr. George Riley, and Dr. Chris Carothers, have strong affiliations with Georgia Tech. The collaborative relationship that we intend to immediately pursue is in high fidelity simulations of practical large-scale wireless networks using the ns-3 simulator, via Dr. George Riley. This project will have mutual benefits in bolstering both institutions' expertise and reputation in the field of scalable simulation for cyber-security studies. This project promises to address high fidelity simulations of large-scale wireless networks. This proposed collaboration is directly in line with Georgia Tech's goals for developing and expanding the Communications Systems Center, the Georgia Tech Broadband Institute, and the Georgia Tech Information Security Center, along with its yearly Emerging Cyber Threats Report. At Sandia, this work benefits the defense systems and assessment area, with promise for large-scale assessment of the cyber security needs and vulnerabilities of our nation's critical cyber infrastructures exposed to wireless communications.
Scalable Domain Decomposed Monte Carlo Particle Transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Brien, Matthew Joseph
2013-12-05
In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation.
PROTO-PLASM: parallel language for adaptive and scalable modelling of biosystems.
Bajaj, Chandrajit; DiCarlo, Antonio; Paoluzzi, Alberto
2008-09-13
This paper discusses the design goals and the first developments of PROTO-PLASM, a novel computational environment to produce libraries of executable, combinable and customizable computer models of natural and synthetic biosystems, aiming to provide a supporting framework for predictive understanding of structure and behaviour through multiscale geometric modelling and multiphysics simulations. Admittedly, the PROTO-PLASM platform is still in its infancy. Its computational framework--language, model library, integrated development environment and parallel engine--intends to provide patient-specific computational modelling and simulation of organs and biosystem, exploiting novel functionalities resulting from the symbolic combination of parametrized models of parts at various scales. PROTO-PLASM may define the model equations, but it is currently focused on the symbolic description of model geometry and on the parallel support of simulations. Conversely, CellML and SBML could be viewed as defining the behavioural functions (the model equations) to be used within a PROTO-PLASM program. Here we exemplify the basic functionalities of PROTO-PLASM, by constructing a schematic heart model. We also discuss multiscale issues with reference to the geometric and physical modelling of neuromuscular junctions.
Proto-Plasm: parallel language for adaptive and scalable modelling of biosystems
Bajaj, Chandrajit; DiCarlo, Antonio; Paoluzzi, Alberto
2008-01-01
This paper discusses the design goals and the first developments of Proto-Plasm, a novel computational environment to produce libraries of executable, combinable and customizable computer models of natural and synthetic biosystems, aiming to provide a supporting framework for predictive understanding of structure and behaviour through multiscale geometric modelling and multiphysics simulations. Admittedly, the Proto-Plasm platform is still in its infancy. Its computational framework—language, model library, integrated development environment and parallel engine—intends to provide patient-specific computational modelling and simulation of organs and biosystem, exploiting novel functionalities resulting from the symbolic combination of parametrized models of parts at various scales. Proto-Plasm may define the model equations, but it is currently focused on the symbolic description of model geometry and on the parallel support of simulations. Conversely, CellML and SBML could be viewed as defining the behavioural functions (the model equations) to be used within a Proto-Plasm program. Here we exemplify the basic functionalities of Proto-Plasm, by constructing a schematic heart model. We also discuss multiscale issues with reference to the geometric and physical modelling of neuromuscular junctions. PMID:18559320
Generating a Reduced Gravity Environment on Earth
NASA Technical Reports Server (NTRS)
Dungan, Larry K.; Cunningham, Tom; Poncia, Dina
2010-01-01
Since the 1950s several reduced gravity simulators have been designed and utilized in preparing humans for spaceflight and in reduced gravity system development. The Active Response Gravity Offload System (ARGOS) is the newest and most realistic gravity offload simulator. ARGOS provides three degrees of motion within the test area and is scalable for full building deployment. The inertia of the overhead system is eliminated by an active motor and control system. This presentation will discuss what ARGOS is, how it functions, and the unique challenges of interfacing to the human. Test data and video for human and robotic systems will be presented. A major variable in the human machine interaction is the interface of ARGOS to the human. These challenges along with design solutions will be discussed.
Parametric investigation of scalable tactile sensors
NASA Astrophysics Data System (ADS)
Saadatzi, Mohammad Nasser; Yang, Zhong; Baptist, Joshua R.; Sahasrabuddhe, Ritvij R.; Wijayasinghe, Indika B.; Popa, Dan O.
2017-05-01
In the near future, robots and humans will share the same environment and perform tasks cooperatively. For intuitive, safe, and reliable physical human-robot interaction (pHRI), sensorized robot skins for tactile measurements of contact are necessary. In a previous study, we presented skins consisting of strain gauge arrays encased in silicone encapsulants. Although these structures could measure normal forces applied directly onto the sensing elements, they also exhibited blind spots and response asymmetry to certain loading patterns. This study presents a parametric investigation of a piezoresistive polymeric strain gauge that exhibits a symmetric omniaxial response thanks to its novel star-shaped structure. This strain gauge relies on the use of gold micro-patterned star-shaped structures with a thin layer of PEDOT:PSS, a flexible polymer with piezoresistive properties. In this paper, the sensor is first modeled and comprehensively analyzed in the finite-element simulation environment COMSOL. Simulations include stress-strain loading for a variety of structure parameters such as gauge lengths, widths, and spacing, as well as multiple load locations relative to the gauge. Subsequently, sensors with optimized configurations obtained through simulations were fabricated using cleanroom photolithographic and spin-coating processes, and then experimentally tested. The results show trend-wise agreement between experiments and simulations.
NASA Astrophysics Data System (ADS)
Boyle, P.; Chen, D.; Christ, N.; Clark, M.; Cohen, S.; Cristian, C.; Dong, Z.; Gara, A.; Joo, B.; Jung, C.; Kim, C.; Levkova, L.; Liao, X.; Liu, G.; Li, S.; Lin, H.; Mawhinney, R.; Ohta, S.; Petrov, K.; Wettig, T.; Yamaguchi, A.
2005-03-01
The QCDOC project has developed a supercomputer optimised for the needs of Lattice QCD simulations. It provides a very competitive price-to-sustained-performance ratio of around 1 USD per sustained Megaflop/s in combination with outstanding scalability. Thus very large systems delivering over 5 TFlop/s of performance on the evolution of a single lattice are possible. Large prototypes have been built and are functioning correctly. The software environment raises the state of the art in such custom supercomputers. It is based on a lean custom node operating system that eliminates many unnecessary overheads that plague other systems. Despite the custom nature, the operating system implements a standards-compliant UNIX-like programming environment, easing the porting of software from other systems. The SciDAC QMP interface adds internode communication in a fashion that provides a uniform cross-platform programming environment.
Rocket Engine Nozzle Side Load Transient Analysis Methodology: A Practical Approach
NASA Technical Reports Server (NTRS)
Shi, John J.
2005-01-01
During the development stage, in order to design and size the rocket engine components and to reduce risk, the local dynamic environments as well as the dynamic interface loads must be defined. There are two kinds of dynamic environment: shock transients, and steady-state random and sinusoidal vibration environments. Usually, the steady-state random and sinusoidal vibration environments are scalable, but the shock environments are not. In other words, only random vibration environments can be defined for a new engine based on similarities. The methodology covered in this paper provides a way to predict the shock environments and the dynamic loads for new engine systems and new engine components in the early stages of new engine development or engine nozzle modifications.
SCTP as scalable video coding transport
NASA Astrophysics Data System (ADS)
Ortiz, Jordi; Graciá, Eduardo Martínez; Skarmeta, Antonio F.
2013-12-01
This study presents an evaluation of the Stream Control Transmission Protocol (SCTP) for the transport of the scalable video codec (SVC), proposed by MPEG as an extension to H.264/AVC. The two technologies fit together well. On the one hand, SVC permits the bitstream to be easily split into substreams carrying different video layers, each of different importance for the reconstruction of the complete video sequence at the receiver end. On the other hand, SCTP includes features, such as the multi-streaming and multi-homing capabilities, that permit the SVC layers to be transported robustly and efficiently. Several transmission strategies supported by baseline SCTP and its concurrent multipath transfer (CMT) extension are compared with the classical solutions based on the Transmission Control Protocol (TCP) and the Real-time Transport Protocol (RTP). Using ns-2 simulations, it is shown that CMT-SCTP outperforms TCP and RTP in error-prone networking environments. The comparison is established according to several performance measurements, including delay, throughput, packet loss, and peak signal-to-noise ratio of the received video.
NASA Technical Reports Server (NTRS)
Campbell, David; Wysong, Ingrid; Kaplan, Carolyn; Mott, David; Wadsworth, Dean; VanGilder, Douglas
2000-01-01
An AFRL/NRL team has recently been selected to develop a scalable, parallel, reacting, multidimensional (SUPREM) Direct Simulation Monte Carlo (DSMC) code for the DoD user community under the High Performance Computing Modernization Office (HPCMO) Common High Performance Computing Software Support Initiative (CHSSI). This paper will introduce the JANNAF Exhaust Plume community to this three-year development effort and present the overall goals, schedule, and current status of this new code.
Argonne Simulation Framework for Intelligent Transportation Systems
DOT National Transportation Integrated Search
1996-01-01
A simulation framework has been developed which defines a high-level architecture for a large-scale, comprehensive, scalable simulation of an Intelligent Transportation System (ITS). The simulator is designed to run on parallel computers and distribu...
Advanced technologies for scalable ATLAS conditions database access on the grid
NASA Astrophysics Data System (ADS)
Basset, R.; Canali, L.; Dimitrov, G.; Girone, M.; Hawkings, R.; Nevski, P.; Valassi, A.; Vaniachine, A.; Viegas, F.; Walker, R.; Wong, A.
2010-04-01
During massive data reprocessing operations an ATLAS Conditions Database application must support concurrent access from numerous ATLAS data processing jobs running on the Grid. By simulating realistic workflow, ATLAS database scalability tests provided feedback for Conditions DB software optimization and allowed precise determination of required distributed database resources. In distributed data processing one must take into account the chaotic nature of Grid computing characterized by peak loads, which can be much higher than average access rates. To validate database performance at peak loads, we tested database scalability at very high concurrent jobs rates. This has been achieved through coordinated database stress tests performed in series of ATLAS reprocessing exercises at the Tier-1 sites. The goal of database stress tests is to detect scalability limits of the hardware deployed at the Tier-1 sites, so that the server overload conditions can be safely avoided in a production environment. Our analysis of server performance under stress tests indicates that Conditions DB data access is limited by the disk I/O throughput. An unacceptable side-effect of the disk I/O saturation is a degradation of the WLCG 3D Services that update Conditions DB data at all ten ATLAS Tier-1 sites using the technology of Oracle Streams. To avoid such bottlenecks we prototyped and tested a novel approach for database peak load avoidance in Grid computing. Our approach is based upon the proven idea of pilot job submission on the Grid: instead of the actual query, an ATLAS utility library sends to the database server a pilot query first.
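A generic sketch of the pilot-query idea described above follows: a cheap probe is issued before the expensive conditions query, and the client backs off when the server appears saturated. All function names, thresholds, and backoff values are hypothetical and are not the ATLAS utility library's API.

```python
# Generic "pilot query" throttling sketch (names and thresholds hypothetical).
# A lightweight probe estimates server load first; the expensive conditions
# query is only sent when the server is not saturated, which helps avoid the
# disk-I/O overload seen at peak Grid job rates.
import time, random

def pilot_query():
    """Stand-in for a lightweight probe; returns an estimated load in [0, 1]."""
    return random.random()

def run_conditions_query():
    """Stand-in for the real, expensive conditions-database query."""
    return "payload"

def guarded_query(max_load=0.8, base_backoff=1.0, attempts=5):
    for attempt in range(attempts):
        if pilot_query() < max_load:
            return run_conditions_query()
        # Exponential backoff with jitter keeps many Grid jobs from retrying in lockstep.
        time.sleep(base_backoff * (2 ** attempt) * random.uniform(0.5, 1.5))
    raise RuntimeError("database remained overloaded")

print(guarded_query())
```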
RF Wave Simulation Using the MFEM Open Source FEM Package
NASA Astrophysics Data System (ADS)
Stillerman, J.; Shiraiwa, S.; Bonoli, P. T.; Wright, J. C.; Green, D. L.; Kolev, T.
2016-10-01
A new plasma wave simulation environment based on the finite element method is presented. MFEM, a scalable open-source FEM library, is used as the basis for this capability. MFEM allows for assembling an FEM matrix of arbitrarily high order in a parallel computing environment. A 3D frequency domain RF physics layer was implemented using a python wrapper for MFEM and a cold collisional plasma model was ported. This physics layer allows for defining the plasma RF wave simulation model without user knowledge of the FEM weak-form formulation. A graphical user interface is built on πScope, a python-based scientific workbench, such that a user can build a model definition file interactively. Benchmark cases have been ported to this new environment, with results being consistent with those obtained using COMSOL multiphysics, GENRAY, and TORIC/TORLH spectral solvers. This work is a first step in bringing to bear the sophisticated computational tool suite that MFEM provides (e.g., adaptive mesh refinement, solver suite, element types) to the linear plasma-wave interaction problem, and within more complicated integrated workflows, such as coupling with core spectral solver, or incorporating additional physics such as an RF sheath potential model or kinetic effects. USDoE Awards DE-FC02-99ER54512, DE-FC02-01ER54648.
The NEST Dry-Run Mode: Efficient Dynamic Analysis of Neuronal Network Simulation Code.
Kunkel, Susanne; Schenck, Wolfram
2017-01-01
NEST is a simulator for spiking neuronal networks that commits to a general purpose approach: It allows for high flexibility in the design of network models, and its applications range from small-scale simulations on laptops to brain-scale simulations on supercomputers. Hence, developers need to test their code for various use cases and ensure that changes to code do not impair scalability. However, running a full set of benchmarks on a supercomputer takes up precious compute-time resources and can entail long queuing times. Here, we present the NEST dry-run mode, which enables comprehensive dynamic code analysis without requiring access to high-performance computing facilities. A dry-run simulation is carried out by a single process, which performs all simulation steps except communication as if it was part of a parallel environment with many processes. We show that measurements of memory usage and runtime of neuronal network simulations closely match the corresponding dry-run data. Furthermore, we demonstrate the successful application of the dry-run mode in the areas of profiling and performance modeling.
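A minimal sketch of the dry-run idea follows: a single process executes all per-rank steps of an M-process run while the communication phase is stubbed out, so memory usage and runtime of one virtual process can be profiled without an HPC allocation. The function names, network sizes, and update rule are illustrative assumptions, not NEST's implementation.

```python
# Minimal dry-run sketch (not NEST's implementation). A single process behaves
# as rank `rank` of a `num_procs`-process run: it builds only its share of the
# network and executes every simulation step, but communication is stubbed.
import time, tracemalloc

def exchange_spikes(buffers, dry_run):
    if dry_run:
        return buffers        # skip real MPI exchange; keep buffer sizes realistic
    raise NotImplementedError("a real run would perform MPI communication here")

def simulate(rank, num_procs, num_neurons, steps, dry_run=True):
    local = [0.0] * (num_neurons // num_procs)     # this rank's neurons only
    for _ in range(steps):
        local = [v + 0.1 for v in local]           # local update phase
        outgoing = [len(local)] * num_procs        # per-target spike counts
        exchange_spikes(outgoing, dry_run)         # communication phase (stubbed)
    return local

tracemalloc.start()
t0 = time.perf_counter()
simulate(rank=0, num_procs=1024, num_neurons=1_000_000, steps=100)
print("runtime", time.perf_counter() - t0,
      "peak memory", tracemalloc.get_traced_memory()[1])
```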
The NEST Dry-Run Mode: Efficient Dynamic Analysis of Neuronal Network Simulation Code
Kunkel, Susanne; Schenck, Wolfram
2017-01-01
NEST is a simulator for spiking neuronal networks that commits to a general purpose approach: It allows for high flexibility in the design of network models, and its applications range from small-scale simulations on laptops to brain-scale simulations on supercomputers. Hence, developers need to test their code for various use cases and ensure that changes to code do not impair scalability. However, running a full set of benchmarks on a supercomputer takes up precious compute-time resources and can entail long queuing times. Here, we present the NEST dry-run mode, which enables comprehensive dynamic code analysis without requiring access to high-performance computing facilities. A dry-run simulation is carried out by a single process, which performs all simulation steps except communication as if it was part of a parallel environment with many processes. We show that measurements of memory usage and runtime of neuronal network simulations closely match the corresponding dry-run data. Furthermore, we demonstrate the successful application of the dry-run mode in the areas of profiling and performance modeling. PMID:28701946
NASA Astrophysics Data System (ADS)
West, Ruth G.; Margolis, Todd; Prudhomme, Andrew; Schulze, Jürgen P.; Mostafavi, Iman; Lewis, J. P.; Gossmann, Joachim; Singh, Rajvikram
2014-02-01
Scalable Metadata Environments (MDEs) are an artistic approach for designing immersive environments for large scale data exploration in which users interact with data by forming multiscale patterns that they alternately disrupt and reform. Developed and prototyped as part of an art-science research collaboration, we define an MDE as a 4D virtual environment structured by quantitative and qualitative metadata describing multidimensional data collections. Entire data sets (e.g., tens of millions of records) can be visualized and sonified at multiple scales and at different levels of detail so they can be explored interactively in real-time within MDEs. They are designed to reflect similarities and differences in the underlying data or metadata such that patterns can be visually/aurally sorted in an exploratory fashion by an observer who is not familiar with the details of the mapping from data to visual, auditory or dynamic attributes. While many approaches for visual and auditory data mining exist, MDEs are distinct in that they utilize qualitative and quantitative data and metadata to construct multiple interrelated conceptual coordinate systems. These "regions" function as conceptual lattices for scalable auditory and visual representations within virtual environments computationally driven by multi-GPU CUDA-enabled fluid dynamics systems.
Adaptive UEP and Packet Size Assignment for Scalable Video Transmission over Burst-Error Channels
NASA Astrophysics Data System (ADS)
Lee, Chen-Wei; Yang, Chu-Sing; Su, Yih-Ching
2006-12-01
This work proposes an adaptive unequal error protection (UEP) and packet size assignment scheme for scalable video transmission over a burst-error channel. An analytic model is developed to evaluate the impact of channel bit error rate on the quality of streaming scalable video. A video transmission scheme, which combines the adaptive assignment of packet size with unequal error protection to increase the end-to-end video quality, is proposed. Several distinct scalable video transmission schemes over burst-error channel have been compared, and the simulation results reveal that the proposed transmission schemes can react to varying channel conditions with less and smoother quality degradation.
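A small numeric illustration of why packet size matters follows: with independent bit errors at rate p, a packet of L bits is lost with probability 1 - (1 - p)^L, so shorter packets and stronger protection can be assigned to the layers that matter most. This is a generic calculation under an independent-error assumption, not the paper's burst-error channel model or its UEP assignment algorithm.

```python
# Generic illustration (not the paper's burst-error model): with independent
# bit errors at rate p, a packet of L bits survives with (1 - p)**L, so packet
# size strongly affects the loss rate seen by each scalable-video layer. A UEP
# scheme would give the base layer shorter packets and/or stronger FEC.
def packet_loss_probability(bit_error_rate, packet_bits):
    return 1.0 - (1.0 - bit_error_rate) ** packet_bits

p = 1e-4
for layer, bits in [("base layer", 2000), ("enhancement 1", 6000), ("enhancement 2", 12000)]:
    print(f"{layer:14s} {bits:6d} bits  loss ~ {packet_loss_probability(p, bits):.3f}")
```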
High-Order/Low-Order methods for ocean modeling
Newman, Christopher; Womeldorff, Geoff; Chacón, Luis; ...
2015-06-01
In this study, we examine a High Order/Low Order (HOLO) approach for a z-level ocean model and show that the traditional semi-implicit and split-explicit methods, as well as a recent preconditioning strategy, can easily be cast in the framework of HOLO methods. The HOLO formulation admits an implicit-explicit method that is algorithmically scalable and second-order accurate, allowing timesteps much larger than the barotropic time scale. We show how HOLO approaches, in particular the implicit-explicit method, can provide a solid route for ocean simulation to heterogeneous computing and exascale environments.
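The sketch below shows a generic implicit-explicit (IMEX) Euler step for a scalar model problem with a stiff linear term (treated implicitly) and a slow nonlinear term (treated explicitly), illustrating why such splittings tolerate time steps far larger than the fast time scale. The equation, coefficients, and step size are illustrative assumptions and this is not the HOLO ocean formulation.

```python
# Generic IMEX (implicit-explicit) Euler sketch, not the HOLO ocean model:
# du/dt = a*u + f(u), with the stiff linear term a*u treated implicitly and
# the slow term f(u) explicitly. Implicit treatment of the fast term is what
# lets the step size exceed the fast-wave (barotropic-like) stability limit.
import math

def imex_euler(u, dt, a, f):
    # (u_new - u)/dt = a*u_new + f(u)  =>  u_new = (u + dt*f(u)) / (1 - dt*a)
    return (u + dt * f(u)) / (1.0 - dt * a)

a = -1000.0                      # fast/stiff rate
f = lambda u: math.sin(u)        # slow forcing
u, dt = 1.0, 0.1                 # dt far above the explicit limit ~ 2/|a|
for _ in range(50):
    u = imex_euler(u, dt, a, f)
print(u)                         # remains bounded despite the large step
```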
Cloud-Based Orchestration of a Model-Based Power and Data Analysis Toolchain
NASA Technical Reports Server (NTRS)
Post, Ethan; Cole, Bjorn; Dinkel, Kevin; Kim, Hongman; Lee, Erich; Nairouz, Bassem
2016-01-01
The proposed Europa Mission concept contains many engineering and scientific instruments that consume varying amounts of power and produce varying amounts of data throughout the mission. System-level power and data usage must be well understood and analyzed to verify design requirements. Numerous cross-disciplinary tools and analysis models are used to simulate the system-level spacecraft power and data behavior. This paper addresses the problem of orchestrating a consistent set of models, tools, and data in a unified analysis toolchain when ownership is distributed among numerous domain experts. An analysis and simulation environment was developed as a way to manage the complexity of the power and data analysis toolchain and to reduce the simulation turnaround time. A system model data repository is used as the trusted store of high-level inputs and results while other remote servers are used for archival of larger data sets and for analysis tool execution. Simulation data passes through numerous domain-specific analysis tools and end-to-end simulation execution is enabled through a web-based tool. The use of a cloud-based service facilitates coordination among distributed developers and enables scalable computation and storage needs, and ensures a consistent execution environment. Configuration management is emphasized to maintain traceability between current and historical simulation runs and their corresponding versions of models, tools and data.
Scalability Assessments for the Malicious Activity Simulation Tool (MAST)
2012-09-01
... the scalability characteristics of MAST. Specifically, we show that an exponential increase in clients using the MAST software does not impact network and system resources significantly. Additionally, we ...
Hybrid-optimization strategy for the communication of large-scale Kinetic Monte Carlo simulation
NASA Astrophysics Data System (ADS)
Wu, Baodong; Li, Shigang; Zhang, Yunquan; Nie, Ningming
2017-02-01
The parallel Kinetic Monte Carlo (KMC) algorithm based on domain decomposition has been widely used in large-scale physical simulations. However, the communication overhead of the parallel KMC algorithm is significant and severely degrades the overall performance and scalability. In this paper, we present a hybrid optimization strategy to reduce the communication overhead of parallel KMC simulations. We first propose a communication aggregation algorithm to reduce the total number of messages and eliminate communication redundancy. Then, we utilize shared memory to reduce the memory copy overhead of intra-node communication. Finally, we optimize the communication scheduling using neighborhood collective operations. We demonstrate the scalability and high performance of our hybrid optimization strategy by both theoretical and experimental analysis. The results show that the optimized KMC algorithm exhibits better performance and scalability than the well-known open-source library SPPARKS. On a 32-node Xeon E5-2680 cluster (640 cores in total), the optimized algorithm reduces the communication time by 24.8% compared with SPPARKS.
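A toy illustration of the first optimization, message aggregation, follows: instead of sending one message per boundary event, events destined for the same neighbor rank are packed into a single buffer and flushed once per communication phase. The class and names are hypothetical; the real implementation additionally uses MPI shared memory and neighborhood collectives.

```python
# Toy message-aggregation sketch (names hypothetical). Rather than one send per
# boundary event, all events for the same neighbour rank are packed into one
# buffer and flushed once per communication phase, cutting the message count.
from collections import defaultdict

class AggregatedSender:
    def __init__(self, send_fn):
        self.send_fn = send_fn                 # e.g. a wrapper around an MPI send
        self.buffers = defaultdict(list)

    def post_event(self, neighbor_rank, event):
        self.buffers[neighbor_rank].append(event)   # no network traffic yet

    def flush(self):
        for rank, events in self.buffers.items():
            self.send_fn(rank, events)              # one message per neighbor
        self.buffers.clear()

sent = []
sender = AggregatedSender(lambda rank, payload: sent.append((rank, payload)))
for site in range(1000):
    sender.post_event(neighbor_rank=site % 4, event=("flip", site))
sender.flush()
print(len(sent), "messages instead of 1000")   # 4 aggregated messages
```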
Rohmer, Kai; Jendersie, Johannes; Grosch, Thorsten
2017-11-01
Augmented Reality offers many applications today, especially on mobile devices. Due to the lack of mobile hardware for illumination measurements, photorealistic rendering with consistent appearance of virtual objects is still an area of active research. In this paper, we present a full two-stage pipeline for environment acquisition and augmentation of live camera images using a mobile device with a depth sensor. We show how to directly work on a recorded 3D point cloud of the real environment containing high dynamic range color values. For unknown and automatically changing camera settings, a color compensation method is introduced. Based on this, we show photorealistic augmentations using variants of differential light simulation techniques. The presented methods are tailored for mobile devices and run at interactive frame rates. However, our methods are scalable to trade performance for quality and can produce quality renderings on desktop hardware.
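Since the abstract mentions variants of differential light simulation, a minimal NumPy sketch of the classic differential-rendering composite is given below: the camera image is corrected by the difference between renders of the reconstructed local scene with and without the virtual object, and the object's own pixels come from the augmented render. This is a generic formulation under simple assumptions, not the authors' mobile pipeline or color compensation method.

```python
# Generic differential-rendering composite (not the paper's exact pipeline).
# Two light simulations of the reconstructed local scene are compared, with and
# without the virtual object; their difference captures shadows and bounce
# light and is added to the camera image.
import numpy as np

def differential_composite(camera, render_with, render_without, object_mask):
    """All images are float arrays in [0, 1]; object_mask is 1 where the
    virtual object covers the pixel."""
    delta = render_with - render_without            # shadows / interreflections
    out = np.clip(camera + delta, 0.0, 1.0)
    return np.where(object_mask[..., None] > 0, render_with, out)

h, w = 4, 4
camera = np.full((h, w, 3), 0.5)
render_without = np.full((h, w, 3), 0.4)
render_with = render_without.copy()
render_with[1, 1] = 0.8        # the virtual object itself
render_with[2, 1] = 0.2        # its shadow on the real surface below it
mask = np.zeros((h, w)); mask[1, 1] = 1
result = differential_composite(camera, render_with, render_without, mask)
print(result[1, 1], result[2, 1])   # object colour; camera darkened by shadow
```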
A scalable infrastructure for CMS data analysis based on OpenStack Cloud and Gluster file system
NASA Astrophysics Data System (ADS)
Toor, S.; Osmani, L.; Eerola, P.; Kraemer, O.; Lindén, T.; Tarkoma, S.; White, J.
2014-06-01
The challenge of providing a resilient and scalable computational and data management solution for massive scale research environments requires continuous exploration of new technologies and techniques. In this project the aim has been to design a scalable and resilient infrastructure for CERN HEP data analysis. The infrastructure is based on OpenStack components for structuring a private Cloud with the Gluster File System. We integrate the state-of-the-art Cloud technologies with the traditional Grid middleware infrastructure. Our test results show that the adopted approach provides a scalable and resilient solution for managing resources without compromising on performance and high availability.
Implementation of quantum key distribution network simulation module in the network simulator NS-3
NASA Astrophysics Data System (ADS)
Mehic, Miralem; Maurhart, Oliver; Rass, Stefan; Voznak, Miroslav
2017-10-01
As research in quantum key distribution (QKD) technology grows and becomes more complex, highly accurate and scalable simulation technologies become important for assessing the practical feasibility of theoretical achievements and for foreseeing difficulties in their practical implementation. Because a QKD link requires both optical and Internet connections between the network nodes, deploying a complete testbed containing multiple network hosts and links to validate and verify a given network algorithm or protocol would be very costly. Network simulators in these circumstances save vast amounts of money and time in accomplishing such a task. The simulation environment offers the creation of complex network topologies, a high degree of control, and repeatable experiments, which in turn allows researchers to conduct experiments and confirm their results. In this paper, we describe the design of the QKD network simulation module, which was developed in version 3 of the network simulator (NS-3). The module supports simulation of a QKD network in an overlay mode or in a single TCP/IP mode; it can therefore also be used to simulate other network technologies independently of QKD.
Scalable isosurface visualization of massive datasets on commodity off-the-shelf clusters
Bajaj, Chandrajit
2009-01-01
Tomographic imaging and computer simulations are increasingly yielding massive datasets. Interactive and exploratory visualizations have rapidly become indispensable tools to study large volumetric imaging and simulation data. Our scalable isosurface visualization framework on commodity off-the-shelf clusters is an end-to-end parallel and progressive platform, from initial data access to the final display. Interactive browsing of extracted isosurfaces is made possible by using parallel isosurface extraction and rendering in conjunction with a new specialized piece of image compositing hardware called Metabuffer. In this paper, we focus on the back-end scalability by introducing a fully parallel and out-of-core isosurface extraction algorithm. It achieves scalability by using both parallel and out-of-core processing and parallel disks. It statically partitions the volume data to parallel disks with a balanced workload spectrum, and builds I/O-optimal external interval trees to minimize the number of I/O operations of loading large data from disk. We also describe an isosurface compression scheme that is efficient for progressive extraction, transmission and storage of isosurfaces. PMID:19756231
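The interval idea behind out-of-core extraction is sketched below: each on-disk block stores its scalar [min, max], and only blocks whose interval contains the isovalue are loaded and handed to the triangulation stage. For brevity the sketch uses a flat scan over the block index, whereas the paper builds I/O-optimal external interval trees over the same intervals; block size and data are illustrative assumptions.

```python
# Sketch of interval-based block culling for out-of-core isosurface extraction.
# Each disk-resident block keeps its scalar [min, max]; only blocks whose range
# straddles the isovalue need to be read and triangulated.
import numpy as np

def build_block_index(volume, block=16):
    index = []
    nz, ny, nx = volume.shape
    for z in range(0, nz, block):
        for y in range(0, ny, block):
            for x in range(0, nx, block):
                sub = volume[z:z+block, y:y+block, x:x+block]
                index.append(((z, y, x), float(sub.min()), float(sub.max())))
    return index

def blocks_intersecting(index, isovalue):
    return [origin for origin, lo, hi in index if lo <= isovalue <= hi]

# Radial test field: only blocks crossing the r = 20 shell must be loaded.
z, y, x = np.mgrid[0:64, 0:64, 0:64]
volume = np.sqrt((x - 32.0) ** 2 + (y - 32.0) ** 2 + (z - 32.0) ** 2)
index = build_block_index(volume)
active = blocks_intersecting(index, isovalue=20.0)
print(f"{len(active)} of {len(index)} blocks need to be loaded")
```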
Evolution of Collective Behaviors for a Real Swarm of Aquatic Surface Robots.
Duarte, Miguel; Costa, Vasco; Gomes, Jorge; Rodrigues, Tiago; Silva, Fernando; Oliveira, Sancho Moura; Christensen, Anders Lyhne
2016-01-01
Swarm robotics is a promising approach for the coordination of large numbers of robots. While previous studies have shown that evolutionary robotics techniques can be applied to obtain robust and efficient self-organized behaviors for robot swarms, most studies have been conducted in simulation, and the few that have been conducted on real robots have been confined to laboratory environments. In this paper, we demonstrate for the first time a swarm robotics system with evolved control successfully operating in a real and uncontrolled environment. We evolve neural network-based controllers in simulation for canonical swarm robotics tasks, namely homing, dispersion, clustering, and monitoring. We then assess the performance of the controllers on a real swarm of up to ten aquatic surface robots. Our results show that the evolved controllers transfer successfully to real robots and achieve a performance similar to the performance obtained in simulation. We validate that the evolved controllers display key properties of swarm intelligence-based control, namely scalability, flexibility, and robustness on the real swarm. We conclude with a proof-of-concept experiment in which the swarm performs a complete environmental monitoring task by combining multiple evolved controllers.
Functionality limit of classical simulated annealing
NASA Astrophysics Data System (ADS)
Hasegawa, M.
2015-09-01
By analyzing the system dynamics in the landscape paradigm, the optimization function of classical simulated annealing is reviewed on random traveling salesman problems. The properly functioning region of the algorithm is determined experimentally in the size-time plane, and the influence of its boundary on the scalability test is examined in the standard framework of this method. From both results, an empirical choice of temperature length is plausibly explained as a minimum requirement for the algorithm to maintain its scalability within its functionality limit. The study exemplifies the applicability of computational physics analysis to optimization algorithm research.
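For reference, a bare-bones simulated annealing loop for a random TSP instance looks like the following sketch. This is a textbook formulation with a generic geometric cooling schedule and 2-opt move, not the paper's specific annealing protocol.

```python
import math, random

random.seed(1)
N = 100                                   # random TSP instance
cities = [(random.random(), random.random()) for _ in range(N)]

def tour_length(tour):
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % N]]) for i in range(N))

tour = list(range(N))
best = tour_length(tour)
T, alpha = 1.0, 0.9995                    # initial temperature and geometric cooling factor
for step in range(20_000):
    i, j = sorted(random.sample(range(N), 2))
    candidate = tour[:i] + tour[i:j][::-1] + tour[j:]     # 2-opt segment reversal
    delta = tour_length(candidate) - tour_length(tour)
    if delta < 0 or random.random() < math.exp(-delta / T):
        tour = candidate
    best = min(best, tour_length(tour))
    T *= alpha
print("best tour length:", round(best, 3))
```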
DOE Office of Scientific and Technical Information (OSTI.GOV)
Curry, Matthew L.; Ferreira, Kurt Brian; Pedretti, Kevin Thomas Tauke
2012-03-01
This report documents thirteen of Sandia's contributions to the Computational Systems and Software Environment (CSSE) within the Advanced Simulation and Computing (ASC) program between fiscal years 2009 and 2012, and describes their impact on ASC applications. Most contributions are implemented in lower software levels, allowing for application improvement without source code changes. Improvements are identified in areas such as reduced run time, characterization of power usage, and Input/Output (I/O). Other experiments are more forward-looking, demonstrating potential bottlenecks using mini-application versions of the legacy codes and simulating their network activity on exascale-class hardware. The purpose of this report is to prove that the team has completed milestone 4467, Demonstration of a Legacy Application's Path to Exascale. Cielo is expected to be the last capability system on which existing ASC codes can run without significant modifications. This assertion will be tested to determine where the breaking point is for an existing highly scalable application. The goal is to stretch the performance boundaries of the application by applying recent CSSE R&D in areas such as resilience, power, I/O, visualization services, SMARTMAP, lightweight kernels (LWKs), virtualization, simulation, and feedback loops. Dedicated system time reservations and/or CCC allocations will be used to quantify the impact of system-level changes to extend the life and performance of the ASC code base. Finally, a simulation of anticipated exascale-class hardware will be performed using SST to supplement the calculations. Determine where the breaking point is for an existing highly scalable application: Chapter 15 presented the CSSE work that sought to identify the breaking point in two ASC legacy applications, Charon and CTH. Their mini-app versions were also employed to complete the task. There is no single breaking point, as more than one issue was found with the two codes. The results were that applications can expect to encounter performance issues related to the computing environment, system software, and algorithms. Careful profiling of runtime performance will be needed to identify the source of an issue, in strong combination with knowledge of system software and application source code.
NASA Technical Reports Server (NTRS)
Liever, Peter A.; West, Jeffrey S.
2016-01-01
A hybrid Computational Fluid Dynamics and Computational Aero-Acoustics (CFD/CAA) modeling framework has been developed for launch vehicle liftoff acoustic environment predictions. The framework couples the existing highly-scalable NASA production CFD code, Loci/CHEM, with a high-order accurate discontinuous Galerkin solver developed in the same production framework, Loci/THRUST, to accurately resolve and propagate acoustic physics across the entire launch environment. Time-accurate, Hybrid RANS/LES CFD modeling is applied for predicting the acoustic generation physics at the plume source, and a high-order accurate unstructured discontinuous Galerkin (DG) method is employed to propagate acoustic waves away from the source across large distances using high-order accurate schemes. The DG solver is capable of solving 2nd, 3rd, and 4th order Euler solutions for non-linear, conservative acoustic field propagation. Initial application testing and validation has been carried out against high resolution acoustic data from the Ares Scale Model Acoustic Test (ASMAT) series to evaluate the capabilities and production readiness of the CFD/CAA system to resolve the observed spectrum of acoustic frequency content. This paper presents results from this validation and outlines efforts to mature and improve the computational simulation framework.
Vortex Filaments in Grids for Scalable, Fine Smoke Simulation.
Meng, Zhang; Weixin, Si; Yinling, Qian; Hanqiu, Sun; Jing, Qin; Heng, Pheng-Ann
2015-01-01
Vortex modeling can produce attractive visual effects of dynamic fluids, which are widely applicable to dynamic media, computer games, special effects, and virtual reality systems. However, it is challenging to efficiently simulate intensive and finely detailed fluids such as smoke with rapidly increasing numbers of vortex filaments and smoke particles. The authors propose a novel vortex-filaments-in-grids scheme in which uniform grids dynamically bridge the vortex filaments and smoke particles for scalable, fine smoke simulation with macroscopic vortex structures. Using the vortex model, their approach supports a trade-off between simulation speed and the scale of details. After computing the whole velocity field, external control can easily be exerted on the embedded grid to guide the vortex-based smoke motion. The experimental results demonstrate the efficiency of the proposed scheme for visually plausible smoke simulation with macroscopic vortex structures.
Lagardère, Louis; Jolly, Luc-Henri; Lipparini, Filippo; Aviat, Félix; Stamm, Benjamin; Jing, Zhifeng F; Harger, Matthew; Torabifard, Hedieh; Cisneros, G Andrés; Schnieders, Michael J; Gresh, Nohad; Maday, Yvon; Ren, Pengyu Y; Ponder, Jay W; Piquemal, Jean-Philip
2018-01-28
We present Tinker-HP, a massively parallel MPI package dedicated to classical molecular dynamics (MD) and to multiscale simulations, using advanced polarizable force fields (PFF) encompassing distributed multipole electrostatics. Tinker-HP is an evolution of the popular Tinker package that conserves its simplicity of use and its reference double-precision implementation for CPUs. Grounded in interdisciplinary efforts with applied mathematics, Tinker-HP allows long polarizable MD simulations on large systems of up to millions of atoms. We detail in the paper the newly developed extension of massively parallel 3D spatial decomposition to point dipole polarizable models, as well as their coupling to efficient Krylov iterative and non-iterative polarization solvers. The design of the code allows the use of various computer systems ranging from laboratory workstations to modern petascale supercomputers with thousands of cores. Tinker-HP therefore provides the first high-performance scalable CPU computing environment for the development of next-generation point dipole PFFs and for production simulations. Strategies linking Tinker-HP to quantum mechanics (QM) in the framework of multiscale polarizable self-consistent QM/MD simulations are also provided. The possibilities, performance, and scalability of the software are demonstrated via benchmark calculations using the polarizable AMOEBA force field on systems ranging from large water boxes of increasing size and ionic liquids to (very) large biosystems encompassing several proteins as well as the complete satellite tobacco mosaic virus and ribosome structures. For small systems, Tinker-HP appears to be competitive with the Tinker-OpenMM GPU implementation of Tinker. As the system size grows, Tinker-HP remains operational thanks to its access to distributed memory, and takes advantage of its new algorithms, enabling stable long-timescale polarizable simulations. Overall, a several-thousand-fold acceleration over a single-core computation is observed for the largest systems. The extension of the present CPU implementation of Tinker-HP to other computational platforms is discussed.
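A 3D spatial decomposition of the kind extended here can be illustrated, in very reduced form, by mapping each atom's coordinates to the rank owning the corresponding cell of a Cartesian process grid. This is a schematic sketch; the grid dimensions, box, and function names are illustrative, not Tinker-HP's actual data layout.

```python
import numpy as np

def owner_rank(positions, box, proc_grid):
    """Map atom positions to ranks of a px x py x pz Cartesian process grid.

    positions : (N, 3) array of coordinates inside a periodic box
    box       : (3,) box lengths
    proc_grid : (px, py, pz) number of domains along each axis
    """
    frac = (positions % box) / box                      # fractional coordinates in [0, 1)
    cell = np.floor(frac * proc_grid).astype(int)       # 3D domain index per atom
    cell = np.minimum(cell, np.array(proc_grid) - 1)    # guard against rounding at the edge
    px, py, pz = proc_grid
    return cell[:, 0] * py * pz + cell[:, 1] * pz + cell[:, 2]   # flatten to a rank id

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pos = rng.uniform(0.0, 50.0, size=(100_000, 3))     # hypothetical 100k-atom box
    ranks = owner_rank(pos, box=np.array([50.0] * 3), proc_grid=(4, 4, 2))
    print("atoms per rank (min/max):", np.bincount(ranks).min(), np.bincount(ranks).max())
```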
Taming Wild Horses: The Need for Virtual Time-based Scheduling of VMs in Network Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoginath, Srikanth B; Perumalla, Kalyan S; Henz, Brian J
2012-01-01
The next generation of scalable network simulators employs virtual machines (VMs) to act as high-fidelity models of traffic producer/consumer nodes in simulated networks. However, network simulations can be inaccurate if VMs are not scheduled according to virtual time, especially when many VMs are hosted per simulator core in a multi-core simulator environment. Since VMs are by default free-running, at the outset it is not clear if, and to what extent, their untamed execution affects the results of simulated scenarios. Here, we provide the first quantitative basis for establishing the need for generalized virtual-time scheduling of VMs in network simulators, based on actual prototyped implementations. To exercise breadth, our system is tested with multiple disparate applications: (a) a set of message-passing parallel programs, (b) a computer worm propagation phenomenon, and (c) a mobile ad-hoc wireless network simulation. We define and use error metrics and benchmarks in scaled tests to empirically report the poor match of traditional, fairness-based VM scheduling to VM-based network simulation, and also clearly show the better performance of our simulation-specific scheduler, with up to 64 VMs hosted on a 12-core simulator node.
Folding Proteins at 500 ns/hour with Work Queue.
Abdul-Wahid, Badi'; Yu, Li; Rajan, Dinesh; Feng, Haoyun; Darve, Eric; Thain, Douglas; Izaguirre, Jesús A
2012-10-01
Molecular modeling is a field that traditionally has large computational costs. Until recently, most simulation techniques relied on long trajectories, which inherently have poor scalability. A new class of methods has been proposed that requires only a large number of short calculations and minimal communication between computer nodes. We considered one of the more accurate variants, called Accelerated Weighted Ensemble Dynamics (AWE), for which distributed computing can be made efficient. We implemented AWE using the Work Queue framework for task management and applied it to an all-atom protein model (the Fip35 WW domain). We can run with excellent scalability by simultaneously utilizing heterogeneous resources from multiple computing platforms such as clouds (Amazon EC2, Microsoft Azure), dedicated clusters, and grids, on multiple architectures (CPU/GPU, 32/64-bit), and in a dynamic environment in which processes are regularly added to or removed from the pool. This has allowed us to achieve an aggregate sampling rate of over 500 ns/hour. As a comparison, a single process typically achieves 0.1 ns/hour.
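The master/worker pattern underlying this approach can be sketched with Python's standard library. The sketch below is a schematic analogue using concurrent.futures rather than the actual Work Queue framework; the short_md_segment task, walker count, and state labels are hypothetical placeholders for the MD segments and weighted-ensemble resampling.

```python
from concurrent.futures import ProcessPoolExecutor
import random

def short_md_segment(walker_id, seed):
    """Stand-in for a short MD trajectory segment; returns a fake end-state label."""
    random.seed(seed)
    return walker_id, random.choice(["folded", "unfolded", "intermediate"])

def resample(results):
    """Stand-in for the weighted-ensemble resampling step: regroup walkers by state."""
    by_state = {}
    for walker_id, state in results:
        by_state.setdefault(state, []).append(walker_id)
    return by_state

if __name__ == "__main__":
    n_walkers = 32
    with ProcessPoolExecutor() as pool:
        for iteration in range(3):                       # AWE-style outer iterations
            futures = [pool.submit(short_md_segment, w, iteration * n_walkers + w)
                       for w in range(n_walkers)]
            states = resample([f.result() for f in futures])
            print(iteration, {k: len(v) for k, v in states.items()})
```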
Lee, Chankyun; Cao, Xiaoyuan; Yoshikane, Noboru; Tsuritani, Takehiro; Rhee, June-Koo Kevin
2015-10-19
The feasibility of software-defined optical networking (SDON) for practical applications critically depends on the scalability of centralized control performance. In this paper, highly scalable routing and wavelength assignment (RWA) algorithms are investigated on an OpenFlow-based SDON testbed for a proof-of-concept demonstration. Efficient RWA algorithms are proposed to achieve high network capacity with reduced computation cost, which is a significant attribute of a scalable, centrally controlled SDON. The proposed heuristic RWA algorithms differ in the order in which requests are processed and in the procedures for routing table updates. Combined with a shortest-path-based routing algorithm, a hottest-request-first processing policy that considers demand intensity and end-to-end distance information offers both the highest network throughput and acceptable computation scalability. We further investigate the trade-off between network throughput and computation complexity in the routing table update procedure through a simulation study.
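A minimal version of such a heuristic, shortest-path routing with first-fit wavelength assignment and requests served in a "hottest-first" order, might look like the following sketch. It is illustrative only: the ordering key, wavelength count, and toy topology are assumptions, not the paper's exact algorithms.

```python
import networkx as nx

N_WAVELENGTHS = 8

def rwa(graph, requests):
    """Serve requests hottest-first; assign the first wavelength free on every hop."""
    used = {tuple(sorted(e)): set() for e in graph.edges}     # wavelengths in use per link
    served = 0
    # "Hottest" first: here simply by demand intensity (distance could be a tiebreaker).
    for src, dst, demand in sorted(requests, key=lambda r: -r[2]):
        path = nx.shortest_path(graph, src, dst, weight="weight")
        hops = [tuple(sorted((path[i], path[i + 1]))) for i in range(len(path) - 1)]
        for wl in range(N_WAVELENGTHS):                       # first-fit wavelength search
            if all(wl not in used[h] for h in hops):
                for h in hops:
                    used[h].add(wl)
                served += 1
                break
    return served

if __name__ == "__main__":
    g = nx.cycle_graph(6)                                     # toy 6-node ring topology
    nx.set_edge_attributes(g, 1, "weight")
    reqs = [(0, 3, 5), (1, 4, 3), (2, 5, 8), (0, 2, 1)]
    print("requests served:", rwa(g, reqs))
```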
Development of a device to simulate tooth mobility.
Erdelt, Kurt-Jürgen; Lamper, Timea
2010-10-01
The testing of new materials under simulated oral conditions is essential in medicine. For the simulation of fracture strength, different simulation devices are used in test set-ups. The results of these in vitro tests differ because there is no standardization of tooth mobility across simulation devices. The aim of this study is to develop a simulation device that depicts the tooth mobility curve as accurately as possible and creates reproducible and scalable mobility curves. With the aid of published literature and the help of dentists, average forms of the tooth classes were generated. Based on these tooth data, different abutment tooth shapes and different simulation devices were designed with a CAD system and produced with a rapid prototyping system. Then, for all simulation devices, displacement curves were created with a universal testing machine and compared with the tooth mobility curve. With this new information, an improved, adapted simulation device was constructed. A simulation device was developed that is able to simulate the mobility curve of natural teeth with high accuracy and whose mobility is reproducible and scalable.
Low-complexity transcoding algorithm from H.264/AVC to SVC using data mining
NASA Astrophysics Data System (ADS)
Garrido-Cantos, Rosario; De Cock, Jan; Martínez, Jose Luis; Van Leuven, Sebastian; Cuenca, Pedro; Garrido, Antonio
2013-12-01
Nowadays, networks and terminals with diverse bandwidth characteristics and capabilities coexist. To ensure a good quality of experience, this diverse environment demands adaptability of the video stream. In general, video content is compressed to save storage capacity and to reduce the bandwidth required for its transmission. If these compressed video streams were encoded using scalable video coding schemes, they would be able to adapt to those heterogeneous networks and a wide range of terminals. Since the majority of multimedia content is compressed using H.264/AVC, it cannot benefit from that scalability. This paper proposes a low-complexity algorithm to convert an H.264/AVC bitstream without scalability into scalable bitstreams with temporal scalability, in the baseline and main profiles, by accelerating the mode decision task of the scalable video coding encoding stage using machine learning tools. The results show that when our technique is applied, the complexity is reduced by 87% while coding efficiency is maintained.
Solar wind interaction with Venus and Mars in a parallel hybrid code
NASA Astrophysics Data System (ADS)
Jarvinen, Riku; Sandroos, Arto
2013-04-01
We discuss the development and applications of a new parallel hybrid simulation, in which ions are treated as particles and electrons as a charge-neutralizing fluid, for the interaction of the solar wind with Venus and Mars. The new simulation code under construction is based on the algorithm of the sequential global planetary hybrid model developed at the Finnish Meteorological Institute (FMI) and on the Corsair parallel simulation platform, also developed at the FMI. The FMI's sequential hybrid model has been used for studies of the plasma interactions of several unmagnetized and weakly magnetized celestial bodies for more than a decade. In particular, the model has been used to interpret in situ particle and magnetic field observations from the plasma environments of Mars, Venus, and Titan. Further, Corsair is an open-source MPI (Message Passing Interface) particle and mesh simulation platform, aimed mainly at simulations of diffusive shock acceleration in the solar corona and interplanetary space, but now also being extended for global planetary hybrid simulations. In this presentation, we discuss the challenges and strategies of parallelizing a legacy simulation code, as well as possible applications and prospects of a scalable parallel hybrid model for the solar wind interactions of Venus and Mars.
Efficient Prediction Structures for H.264 Multi View Coding Using Temporal Scalability
NASA Astrophysics Data System (ADS)
Guruvareddiar, Palanivel; Joseph, Biju K.
2014-03-01
Prediction structures with "disposable view components based" hierarchical coding have been proven efficient for H.264 multi-view coding. Though these prediction structures, along with QP cascading schemes, provide superior compression efficiency compared to the traditional IBBP coding scheme, the temporal scalability requirements of the bit stream cannot be met to the fullest. On the other hand, a fully scalable bit stream, obtained by "temporal identifier based" hierarchical coding, provides a number of advantages including bit rate adaptation and improved error resilience, but lacks compression efficiency compared to the former scheme. In this paper, we propose to combine the two approaches such that a fully scalable bit stream can be realized with minimal reduction in compression efficiency compared to state-of-the-art "disposable view components based" hierarchical coding. Simulation results show that the proposed method enables full temporal scalability with a maximum BD-PSNR reduction of only 0.34 dB. A novel method is also proposed for the identification of the temporal identifier for legacy H.264/AVC base layer packets. Simulation results also show that this enables a scenario where the enhancement views can be extracted at a lower frame rate (1/2 or 1/4 of the base view), with an average extraction time per view component of only 0.38 ms.
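In a dyadic hierarchical GOP, the temporal identifier of a frame can be derived from its display index, for instance as in the following sketch. This is a generic dyadic-hierarchy rule used for illustration; it is not the paper's identification method for legacy base-layer packets.

```python
def temporal_id(frame_index, gop_size=8):
    """Temporal layer of a frame in a dyadic hierarchical GOP of size gop_size.

    Key frames (index a multiple of gop_size) get tid 0; each halving of the
    distance between already-coded frames adds one temporal layer, so dropping
    the highest tid halves the frame rate, dropping the top two quarters it, etc.
    """
    pos = frame_index % gop_size
    if pos == 0:
        return 0
    tid, step = 1, gop_size // 2
    while pos % step != 0:
        step //= 2
        tid += 1
    return tid

if __name__ == "__main__":
    print([temporal_id(i) for i in range(9)])   # [0, 3, 2, 3, 1, 3, 2, 3, 0]
```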
Quality Scalability Aware Watermarking for Visual Content.
Bhowmik, Deepayan; Abhayaratne, Charith
2016-11-01
Scalable coding-based content adaptation poses serious challenges to traditional watermarking algorithms, which do not consider the scalable coding structure and hence cannot guarantee correct watermark extraction in the media consumption chain. In this paper, we propose a novel concept of scalable blind watermarking that ensures more robust watermark extraction at various compression ratios while not affecting the visual quality of the host media. The proposed algorithm generates a scalable and robust watermarked image code-stream that allows the user to constrain embedding distortion for target content adaptations. The watermarked image code-stream consists of hierarchically nested joint distortion-robustness coding atoms. The code-stream is generated by a new wavelet-domain blind watermarking algorithm guided by a quantization-based binary tree. The code-stream can be truncated at any distortion-robustness atom to generate the watermarked image with the desired distortion-robustness requirements. A blind extractor is capable of extracting watermark data from the watermarked images. The algorithm is further extended to incorporate a bit-plane discarding-based quantization model used in scalable coding-based content adaptation, e.g., JPEG2000. This improves the robustness against the quality scalability of JPEG2000 compression. The simulation results verify the feasibility of the proposed concept, its applications, and its improved robustness against quality-scalable content adaptation. Our proposed algorithm also outperforms existing methods, showing a 35% improvement. In terms of robustness to quality-scalable video content adaptation using Motion JPEG2000 and wavelet-based scalable video coding, the proposed method shows major improvement for video watermarking.
NASA Astrophysics Data System (ADS)
Yan, Hui; Wang, K. G.; Jones, Jim E.
2016-06-01
A parallel algorithm for large-scale three-dimensional phase-field simulations of phase coarsening is developed and implemented on high-performance architectures. From the large-scale simulations, a new kinetics of phase coarsening in the region of ultrahigh volume fraction is found. The parallel implementation is capable of harnessing the greater computing power available from high-performance architectures. The parallelized code enables an increase in the three-dimensional simulation system size up to a 512³ grid cube. Through the parallelized code, practical runtimes can be achieved for three-dimensional large-scale simulations, and the statistical significance of the results from these high-resolution parallel simulations is greatly improved over that obtainable from serial simulations. A detailed performance analysis of speed-up and scalability is presented, showing good scalability that improves with increasing problem size. In addition, a model for the prediction of runtime is developed, which shows good agreement with actual run times from numerical tests.
Scalable Methods for Eulerian-Lagrangian Simulation Applied to Compressible Multiphase Flows
NASA Astrophysics Data System (ADS)
Zwick, David; Hackl, Jason; Balachandar, S.
2017-11-01
Multiphase flows can be found in countless areas of physics and engineering. Many of these flows can be classified as dispersed two-phase flows, meaning that there are solid particles dispersed in a continuous fluid phase. A common technique for simulating such flow is the Eulerian-Lagrangian method. While useful, this method can suffer from scaling issues on larger problem sizes that are typical of many realistic geometries. Here we present scalable techniques for Eulerian-Lagrangian simulations and apply it to the simulation of a particle bed subjected to expansion waves in a shock tube. The results show that the methods presented here are viable for simulation of larger problems on modern supercomputers. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1315138. This work was supported in part by the U.S. Department of Energy under Contract No. DE-NA0002378.
Cheung, Kit; Schultz, Simon R; Luk, Wayne
2015-01-01
NeuroFlow is a scalable spiking neural network simulation platform for off-the-shelf high performance computing systems using customizable hardware processors such as Field-Programmable Gate Arrays (FPGAs). Unlike multi-core processors and application-specific integrated circuits, the processor architecture of NeuroFlow can be redesigned and reconfigured to suit a particular simulation to deliver optimized performance, such as the degree of parallelism to employ. The compilation process supports using PyNN, a simulator-independent neural network description language, to configure the processor. NeuroFlow supports a number of commonly used current or conductance based neuronal models such as integrate-and-fire and Izhikevich models, and the spike-timing-dependent plasticity (STDP) rule for learning. A 6-FPGA system can simulate a network of up to ~600,000 neurons and can achieve a real-time performance of 400,000 neurons. Using one FPGA, NeuroFlow delivers a speedup of up to 33.6 times the speed of an 8-core processor, or 2.83 times the speed of GPU-based platforms. With high flexibility and throughput, NeuroFlow provides a viable environment for large-scale neural network simulation.
LoRa Scalability: A Simulation Model Based on Interference Measurements
Haxhibeqiri, Jetmir; Van den Abeele, Floris; Moerman, Ingrid; Hoebeke, Jeroen
2017-01-01
LoRa is a long-range, low-power, low-bit-rate, single-hop wireless communication technology. It is intended to be used in Internet of Things (IoT) applications involving battery-powered devices with low throughput requirements. A LoRaWAN network consists of multiple end nodes that communicate with one or more gateways. These gateways act as a transparent bridge towards a common network server. The number of end devices and their throughput requirements will have an impact on the performance of the LoRaWAN network. This study investigates the scalability, in terms of the number of end devices per gateway, of single-gateway LoRaWAN deployments. First, we determine the intra-technology interference behavior with two physical end nodes, by checking the impact of an interfering node on a transmitting node. Measurements show that even under concurrent transmission, one of the packets can be received under certain conditions. Based on these measurements, we create a simulation model for assessing the scalability of a single-gateway LoRaWAN network. We show that when the number of nodes increases up to 1000 per gateway, the losses will be up to 32%. In such a case, pure Aloha would have around 90% losses. However, when the duty cycle of the application layer becomes lower than the allowed radio duty cycle of 1%, losses will be even lower. We also show network scalability simulation results for some IoT use cases based on real data. PMID:28545239
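The qualitative trend reported here, losses growing with the number of nodes per gateway, can be reproduced with a pure-Aloha-style Monte Carlo sketch like the one below. The airtime, reporting period, and simulation length are hypothetical, and the sketch models neither LoRa capture effects nor the paper's measured interference behavior, so the absolute loss figures will differ.

```python
import random

def aloha_loss(n_nodes, airtime_s=0.07, period_s=600.0, sim_time_s=36_000.0, seed=0):
    """Fraction of packets lost to overlapping transmissions (pure Aloha, no capture)."""
    random.seed(seed)
    tx = []                                   # (start, end, node) for every transmission
    for node in range(n_nodes):
        t = random.uniform(0, period_s)
        while t < sim_time_s:
            tx.append((t, t + airtime_s, node))
            t += random.expovariate(1.0 / period_s)   # roughly one packet per period
    tx.sort()
    lost = set()
    for i in range(1, len(tx)):
        if tx[i][0] < tx[i - 1][1]:           # starts before the previous transmission ends
            lost.add(i)
            lost.add(i - 1)
    return len(lost) / len(tx)

if __name__ == "__main__":
    for n in (100, 500, 1000):
        print(n, "nodes ->", round(100 * aloha_loss(n), 1), "% of packets collided")
```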
Efficient Data Gathering in 3D Linear Underwater Wireless Sensor Networks Using Sink Mobility
Akbar, Mariam; Javaid, Nadeem; Khan, Ayesha Hussain; Imran, Muhammad; Shoaib, Muhammad; Vasilakos, Athanasios
2016-01-01
Due to the unpleasant and unpredictable underwater environment, designing an energy-efficient routing protocol for underwater wireless sensor networks (UWSNs) demands more accuracy and extra computations. In the proposed scheme, we introduce a mobile sink (MS), i.e., an autonomous underwater vehicle (AUV), and also courier nodes (CNs), to minimize the energy consumption of nodes. MS and CNs stop at specific stops for data gathering; later on, CNs forward the received data to the MS for further transmission. By the mobility of CNs and MS, the overall energy consumption of nodes is minimized. We perform simulations to investigate the performance of the proposed scheme and compare it to preexisting techniques. Simulation results are compared in terms of network lifetime, throughput, path loss, transmission loss and packet drop ratio. The results show that the proposed technique performs better in terms of network lifetime, throughput, path loss and scalability. PMID:27007373
A Petascale Non-Hydrostatic Atmospheric Dynamical Core in the HOMME Framework
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tufo, Henry
The High-Order Method Modeling Environment (HOMME) is a framework for building scalable, conservative atmospheric models for climate simulation and general atmospheric-modeling applications. Its spatial discretizations are based on spectral-element (SE) and discontinuous Galerkin (DG) methods. These are local methods employing high-order accurate spectral basis functions that have been shown to perform well on massively parallel supercomputers at any resolution and to scale particularly well at high resolutions. HOMME provides the framework upon which the CAM-SE community atmosphere model dynamical core is constructed. In its current incarnation, CAM-SE employs the hydrostatic primitive equations (PE) of motion, which limits its resolution to simulations coarser than about 0.1° per grid cell. The primary objective of this project is to remove this resolution limitation by providing HOMME with the capabilities needed to build nonhydrostatic models that solve the compressible Euler/Navier-Stokes equations.
NASA Technical Reports Server (NTRS)
Liever, Peter A.; West, Jeffrey S.; Harris, Robert E.
2016-01-01
A hybrid Computational Fluid Dynamics and Computational Aero-Acoustics (CFD/CAA) modeling framework has been developed for launch vehicle liftoff acoustic environment predictions. The framework couples the existing highly-scalable NASA production CFD code, Loci/CHEM, with a high-order accurate Discontinuous Galerkin solver developed in the same production framework, Loci/THRUST, to accurately resolve and propagate acoustic physics across the entire launch environment. Time-accurate, Hybrid RANS/LES CFD modeling is applied for predicting the acoustic generation physics at the plume source, and a high-order accurate unstructured mesh Discontinuous Galerkin (DG) method is employed to propagate acoustic waves away from the source across large distances using high-order accurate schemes. The DG solver is capable of solving 2nd, 3rd, and 4th order Euler solutions for non-linear, conservative acoustic field propagation. Initial application testing and validation has been carried out against high resolution acoustic data from the Ares Scale Model Acoustic Test (ASMAT) series to evaluate the capabilities and production readiness of the CFD/CAA system to resolve the observed spectrum of acoustic frequency content. This paper presents results from this validation and outlines efforts to mature and improve the computational simulation framework.
Enabling Efficient Climate Science Workflows in High Performance Computing Environments
NASA Astrophysics Data System (ADS)
Krishnan, H.; Byna, S.; Wehner, M. F.; Gu, J.; O'Brien, T. A.; Loring, B.; Stone, D. A.; Collins, W.; Prabhat, M.; Liu, Y.; Johnson, J. N.; Paciorek, C. J.
2015-12-01
A typical climate science workflow often involves a combination of data acquisition, modeling, simulation, analysis, visualization, publishing, and storage of results. Each of these tasks provides a myriad of challenges when running in a high-performance computing environment such as Hopper or Edison at NERSC. Hurdles such as data transfer and management, job scheduling, parallel analysis routines, and publication require a lot of forethought and planning to ensure that proper quality control mechanisms are in place. These steps require effectively utilizing a combination of well-tested and newly developed functionality to move data, perform analysis, apply statistical routines, and, finally, serve results and tools to the greater scientific community. As part of the CAlibrated and Systematic Characterization, Attribution and Detection of Extremes (CASCADE) project, we highlight a stack of tools our team utilizes and has developed to ensure that large-scale simulation and analysis work is commonplace. These tools provide operations that assist in everything from the generation/procurement of data (HTAR/Globus) to automating the publication of results to portals like the Earth System Grid Federation (ESGF), while executing everything in between in a scalable environment in a task-parallel way (MPI). We highlight the use and benefit of these tools by showing several climate science analysis use cases to which they have been applied.
Scalable nuclear density functional theory with Sky3D
NASA Astrophysics Data System (ADS)
Afibuzzaman, Md; Schuetrumpf, Bastian; Aktulga, Hasan Metin
2018-02-01
In nuclear astrophysics, quantum simulations of large inhomogeneous dense systems as they appear in the crusts of neutron stars present big challenges. The number of particles in a simulation with periodic boundary conditions is strongly limited due to the immense computational cost of the quantum methods. In this paper, we describe techniques for an efficient and scalable parallel implementation of Sky3D, a nuclear density functional theory solver that operates on an equidistant grid. Presented techniques allow Sky3D to achieve good scaling and high performance on a large number of cores, as demonstrated through detailed performance analysis on a Cray XC40 supercomputer.
Scalable tuning of building models to hourly data
Garrett, Aaron; New, Joshua Ryan
2015-03-31
Energy models of existing buildings are unreliable unless calibrated so that they correlate well with actual energy usage. Manual tuning requires a skilled professional, is prohibitively expensive for small projects, imperfect, non-repeatable, non-transferable, and not scalable to the dozens of sensor channels that smart meters, smart appliances, and cheap, ubiquitous sensors are beginning to make available today. A scalable, automated methodology is needed to quickly and intelligently calibrate building energy models to all available data, increase the usefulness of those models, and facilitate speed-and-scale penetration of simulation-based capabilities into the marketplace for actualized energy savings. The "Autotune" project is a novel, model-agnostic methodology which leverages supercomputing, large simulation ensembles, and big data mining with multiple machine learning algorithms to allow automatic calibration of simulations that match measured experimental data, in a way that is deployable on commodity hardware. This paper shares several methodologies employed to reduce the combinatorial complexity to a computationally tractable search problem for hundreds of input parameters. Furthermore, accuracy metrics are provided which quantify model error against measured data for either monthly or hourly electrical usage from a highly instrumented, emulated-occupancy research home.
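A stripped-down version of this calibration idea, searching a parameter space for the simulation inputs that minimize error against measured hourly data, can be sketched as follows. The toy "simulator", parameter bounds, and RMSE metric are placeholders for the building model and accuracy metrics used by Autotune, not the project's actual machinery.

```python
import random, math

def building_sim(params, hours=24):
    """Toy stand-in for a building energy simulation: hourly electrical usage (kWh)."""
    base, hvac_gain = params["base_load"], params["hvac_gain"]
    return [base + hvac_gain * (1 + math.sin(2 * math.pi * h / 24)) for h in range(hours)]

def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def calibrate(measured, n_samples=5000, seed=0):
    """Random-search calibration: keep the parameter set with the lowest hourly RMSE."""
    random.seed(seed)
    best, best_err = None, float("inf")
    for _ in range(n_samples):
        candidate = {"base_load": random.uniform(0.1, 2.0),
                     "hvac_gain": random.uniform(0.0, 3.0)}
        err = rmse(building_sim(candidate), measured)
        if err < best_err:
            best, best_err = candidate, err
    return best, best_err

if __name__ == "__main__":
    truth = {"base_load": 0.8, "hvac_gain": 1.5}
    measured = [v + random.gauss(0, 0.05) for v in building_sim(truth)]
    print(calibrate(measured))
```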
ls1 mardyn: The Massively Parallel Molecular Dynamics Code for Large Systems.
Niethammer, Christoph; Becker, Stefan; Bernreuther, Martin; Buchholz, Martin; Eckhardt, Wolfgang; Heinecke, Alexander; Werth, Stephan; Bungartz, Hans-Joachim; Glass, Colin W; Hasse, Hans; Vrabec, Jadran; Horsch, Martin
2014-10-14
The molecular dynamics simulation code ls1 mardyn is presented. It is a highly scalable code, optimized for massively parallel execution on supercomputing architectures and currently holds the world record for the largest molecular simulation with over four trillion particles. It enables the application of pair potentials to length and time scales that were previously out of scope for molecular dynamics simulation. With an efficient dynamic load balancing scheme, it delivers high scalability even for challenging heterogeneous configurations. Presently, multicenter rigid potential models based on Lennard-Jones sites, point charges, and higher-order polarities are supported. Due to its modular design, ls1 mardyn can be extended to new physical models, methods, and algorithms, allowing future users to tailor it to suit their respective needs. Possible applications include scenarios with complex geometries, such as fluids at interfaces, as well as nonequilibrium molecular dynamics simulation of heat and mass transfer.
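For context, the Lennard-Jones site-site interaction at the core of such pair-potential models has the familiar 12-6 form. The sketch below is the generic textbook formula in reduced units, not ls1 mardyn code.

```python
def lj_energy_force(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones pair energy and scalar force magnitude at separation r."""
    sr6 = (sigma / r) ** 6
    energy = 4.0 * epsilon * (sr6 ** 2 - sr6)
    force = 24.0 * epsilon * (2.0 * sr6 ** 2 - sr6) / r   # F = -dU/dr
    return energy, force

if __name__ == "__main__":
    for r in (0.95, 2 ** (1 / 6), 1.5, 2.5):               # force vanishes at r = 2^(1/6) sigma
        u, f = lj_energy_force(r)
        print(f"r = {r:.3f}  U = {u:+.4f}  F = {f:+.4f}")
```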
NASA Technical Reports Server (NTRS)
Carr, Gregory A.; Iannello, Christopher J.; Chen, Yuan; Hunter, Don J.; DelCastillo, Linda; Bradley, Arthur T.; Stell, Christopher; Mojarradi, Mohammad M.
2013-01-01
This paper presents a concept for a modular and scalable High Temperature Boost (HTB) Power Processing Unit (PPU) capable of operating at temperatures beyond the standard military temperature range. The various extreme-environment technologies are also described as the fundamental technology path to this concept. The proposed HTB PPU is intended for power processing in the area of space solar electric propulsion, where reduction of in-space mass and volume is desired, and sometimes even critical, to achieving the goals of future space flight missions. The concept of the HTB PPU can also be applied to other extreme-environment applications, such as geothermal and petroleum deep-well drilling, where higher-temperature operation is required.
On noise and the performance benefit of nonblocking collectives
DOE Office of Scientific and Technical Information (OSTI.GOV)
Widener, Patrick M.; Levy, Scott; Ferreira, Kurt B.
Relaxed synchronization offers the potential of maintaining application scalability by allowing many processes to make independent progress when some processes suffer delays. Yet, the benefits of this approach in important parallel workloads have not been investigated in detail. In this paper, we use a validated simulation approach to explore the noise mitigation effects of idealized nonblocking collectives in workloads where these collectives are a major contributor to total execution time. In conclusion, although nonblocking collectives are unlikely to provide significant noise mitigation to applications in the low-OS-noise environments expected in next-generation HPC systems, we show that they can potentially improve application runtime with respect to other noise types.
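The overlap that makes nonblocking collectives attractive can be illustrated with mpi4py, as in the small sketch below. It assumes an MPI-3 installation exposing nonblocking collectives through mpi4py; the "independent work" line is a placeholder for computation that does not depend on the collective's result.

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
local = np.random.rand(1_000_000)
partial = np.array([local.sum()])
total = np.empty(1)

# Start the collective, then keep doing work that does not depend on its result,
# so a delayed ("noisy") rank does not immediately stall every other rank.
req = comm.Iallreduce(partial, total, op=MPI.SUM)
independent_work = np.sort(local)      # placeholder computation overlapping the collective
req.Wait()

if comm.rank == 0:
    print("global sum:", total[0])
```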
Parallel processing implementation for the coupled transport of photons and electrons using OpenMP
NASA Astrophysics Data System (ADS)
Doerner, Edgardo
2016-05-01
In this work, the use of OpenMP to implement parallel processing of the Monte Carlo (MC) simulation of the coupled transport of photons and electrons is presented. This implementation was carried out using a modified EGSnrc platform which enables the use of the Microsoft Visual Studio 2013 (VS2013) environment, together with the development tools available in Intel Parallel Studio XE 2015 (XE2015). The performance study of this new implementation was carried out on a desktop PC with a multi-core CPU, taking as a reference the performance of the original platform. The results were satisfactory, both in terms of scalability and parallelization efficiency.
NASA Astrophysics Data System (ADS)
Vincenti, Henri; Vay, Jean-Luc
2018-07-01
The advent of massively parallel supercomputers, with their distributed-memory technology using many processing units, has favored the development of highly scalable local low-order solvers at the expense of harder-to-scale global very high-order spectral methods. Indeed, FFT-based methods, which were very popular on shared-memory computers, have been largely replaced by finite-difference (FD) methods for the solution of many problems, including plasma simulations with electromagnetic Particle-In-Cell (PIC) methods. For some problems, such as the modeling of so-called "plasma mirrors" for the generation of high-energy particles and ultrashort radiation, we have shown that the inaccuracies of standard FD-based PIC methods prevent modeling at sufficient accuracy on present supercomputers. We demonstrate here that a new method, based on the use of local FFTs, enables ultrahigh-order accuracy with unprecedented scalability, and thus for the first time the accurate modeling of plasma mirrors in 3D.
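The accuracy gap between spectral and finite-difference operators that motivates this work can be seen in a few lines of numpy. This is a generic 1D illustration of FFT-based differentiation versus a second-order stencil, not the PIC solver itself.

```python
import numpy as np

n, L = 64, 2 * np.pi
x = np.arange(n) * L / n
f = np.sin(3 * x)                          # smooth periodic test field
exact = 3 * np.cos(3 * x)

# Spectral derivative: multiply by ik in Fourier space (machine precision for smooth fields).
k = np.fft.fftfreq(n, d=L / n) * 2 * np.pi
spectral = np.real(np.fft.ifft(1j * k * np.fft.fft(f)))

# Second-order centered finite difference on the same periodic grid.
fd = (np.roll(f, -1) - np.roll(f, 1)) / (2 * L / n)

print("spectral max error:", np.abs(spectral - exact).max())
print("2nd-order FD max error:", np.abs(fd - exact).max())
```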
A General-purpose Framework for Parallel Processing of Large-scale LiDAR Data
NASA Astrophysics Data System (ADS)
Li, Z.; Hodgson, M.; Li, W.
2016-12-01
Light detection and ranging (LiDAR) technologies have proven efficient at quickly obtaining very detailed Earth surface data over large spatial extents. Such data are important for scientific discoveries in Earth and ecological sciences and for natural disaster and environmental applications. However, handling LiDAR data poses grand geoprocessing challenges due to both data intensity and computational intensity. Previous studies achieved notable success in parallel processing of LiDAR data to address these challenges. However, these studies either relied on high-performance computers and specialized hardware (GPUs) or focused mostly on finding customized solutions for specific algorithms. We developed a general-purpose scalable framework coupled with a sophisticated data decomposition and parallelization strategy to efficiently handle big LiDAR data. Specifically, 1) a tile-based spatial index is proposed to manage big LiDAR data in the scalable and fault-tolerant Hadoop distributed file system, 2) two spatial decomposition techniques are developed to enable efficient parallelization of different types of LiDAR processing tasks, and 3) by coupling existing LiDAR processing tools with Hadoop, this framework is able to conduct a variety of LiDAR data processing tasks in parallel in a highly scalable distributed computing environment. The performance and scalability of the framework are evaluated with a series of experiments conducted on a real LiDAR dataset using a proof-of-concept prototype system. The results show that the proposed framework 1) is able to handle massive LiDAR data more efficiently than standalone tools, and 2) provides almost linear scalability in terms of either increased workload (data volume) or increased computing nodes with both spatial decomposition strategies. We believe that the proposed framework provides valuable references for developing a collaborative cyberinfrastructure for processing big Earth science data in a highly scalable environment.
Hemani, H; Warrier, M; Sakthivel, N; Chaturvedi, S
2014-05-01
Molecular dynamics (MD) simulations are used in the study of void nucleation and growth in crystals that are subjected to tensile deformation. These simulations are typically run for several hundred thousand time steps, depending on the problem. We output the atom positions at a required frequency for post-processing to determine the void nucleation, growth, and coalescence due to tensile deformation. The simulation volume is broken up into voxels of size equal to the unit cell size of the crystal. In this paper, we present an algorithm to identify the empty unit cells (voids), their connections (void size), and their dynamic changes (growth and coalescence of voids) for MD simulations of large atomic systems (multi-million atoms). We discuss the parallel algorithms that were implemented and their relative applicability in terms of speedup and scalability. We also present results on the scalability of our algorithm when it is incorporated into the MD software LAMMPS. Copyright © 2014 Elsevier Inc. All rights reserved.
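The serial core of such a void-finding step, flagging empty voxels and grouping them into connected voids, can be sketched with numpy and scipy as below. This is an illustrative reduced version: the voxel size, synthetic void, and 6-connectivity rule are assumptions rather than the paper's exact scheme.

```python
import numpy as np
from scipy import ndimage

def find_voids(positions, box, cell):
    """Flag voxels containing no atoms and group them into connected voids.

    positions : (N, 3) atom coordinates; box : (3,) box lengths; cell : voxel edge length.
    Returns (number_of_voids, sizes_in_voxels).
    """
    shape = tuple(int(np.ceil(b / cell)) for b in box)
    counts = np.zeros(shape, dtype=int)
    idx = np.minimum((positions // cell).astype(int), np.array(shape) - 1)
    np.add.at(counts, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)   # atoms per voxel
    empty = counts == 0
    labels, n_voids = ndimage.label(empty)                    # 6-connected empty regions
    sizes = np.bincount(labels.ravel())[1:]                   # voxels per void (skip label 0)
    return n_voids, sizes

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    atoms = rng.uniform(0, 40.0, size=(5000, 3))
    atoms = atoms[~np.all((atoms > 15) & (atoms < 25), axis=1)]   # carve out a synthetic void
    print(find_voids(atoms, box=(40.0, 40.0, 40.0), cell=2.0))
```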
A Numerical Study of Scalable Cardiac Electro-Mechanical Solvers on HPC Architectures
Colli Franzone, Piero; Pavarino, Luca F.; Scacchi, Simone
2018-01-01
We introduce and study some scalable domain decomposition preconditioners for cardiac electro-mechanical 3D simulations on parallel HPC (High Performance Computing) architectures. The electro-mechanical model of the cardiac tissue is composed of four coupled sub-models: (1) the static finite elasticity equations for the transversely isotropic deformation of the cardiac tissue; (2) the active tension model describing the dynamics of the intracellular calcium, cross-bridge binding and myofilament tension; (3) the anisotropic Bidomain model describing the evolution of the intra- and extra-cellular potentials in the deforming cardiac tissue; and (4) the ionic membrane model describing the dynamics of ionic currents, gating variables, ionic concentrations and stretch-activated channels. This strongly coupled electro-mechanical model is discretized in time with a splitting semi-implicit technique and in space with isoparametric finite elements. The resulting scalable parallel solver is based on Multilevel Additive Schwarz preconditioners for the solution of the Bidomain system and on BDDC preconditioned Newton-Krylov solvers for the non-linear finite elasticity system. The results of several 3D parallel simulations show the scalability of both linear and non-linear solvers and their application to the study of both physiological excitation-contraction cardiac dynamics and re-entrant waves in the presence of different mechano-electrical feedbacks. PMID:29674971
Fully implicit adaptive mesh refinement MHD algorithm
NASA Astrophysics Data System (ADS)
Philip, Bobby
2005-10-01
In the macroscopic simulation of plasmas, the numerical modeler is faced with the challenge of dealing with multiple time and length scales. The former results in stiffness due to the presence of very fast waves. The latter requires one to resolve the localized features that the system develops. Traditional approaches based on explicit time integration techniques and fixed meshes are not suitable for this challenge, as such approaches prevent the modeler from using realistic plasma parameters to keep the computation feasible. We propose here a novel approach, based on implicit methods and structured adaptive mesh refinement (SAMR). Our emphasis is on both accuracy and scalability with the number of degrees of freedom. To our knowledge, a scalable, fully implicit AMR algorithm has not been accomplished before for MHD. As a proof-of-principle, we focus on the reduced resistive MHD model as a basic MHD model paradigm, which is truly multiscale. The approach taken here is to adapt mature physics-based technology [L. Chacón et al., J. Comput. Phys. 178 (1), 15-36 (2002)] to AMR grids, and to employ AMR-aware multilevel techniques (such as fast adaptive composite (FAC) algorithms) for scalability. We will demonstrate that the concept is indeed feasible, featuring optimal scalability under grid refinement. Results of fully implicit, dynamically adaptive AMR simulations will be presented for a variety of problems.
A Testbed to Evaluate the FIWARE-Based IoT Platform in the Domain of Precision Agriculture.
Martínez, Ramón; Pastor, Juan Ángel; Álvarez, Bárbara; Iborra, Andrés
2016-11-23
Wireless sensor networks (WSNs) represent one of the most promising technologies for precision farming. Over the next few years, a significant increase in the use of such systems on commercial farms is expected. WSNs present a number of problems regarding scalability, interoperability, communications, connectivity with databases, and data processing. Different Internet of Things middleware platforms are appearing to overcome these challenges. This paper checks whether one of these middleware platforms, FIWARE, is suitable for the development of agricultural applications. To the authors' knowledge, there are no works that show how to use FIWARE in precision agriculture and study its appropriateness, scalability, and efficiency for this kind of application. To do this, a testbed has been designed and implemented to simulate different deployments and load conditions. The testbed is a typical FIWARE application, complete, yet simple and comprehensible enough to show the main features and components of FIWARE, as well as the complexity of using this technology. Although the testbed has been deployed in a laboratory environment, its design is based on the analysis of an Internet of Things use case scenario in the domain of precision agriculture.
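A FIWARE-based testbed of this kind typically pushes sensor readings to the Orion Context Broker over its NGSI v2 REST API. The sketch below shows such an update; the broker URL, entity id, and attribute names are assumptions for illustration, not the paper's testbed configuration.

```python
import requests

ORION = "http://localhost:1026"        # assumed local Orion Context Broker endpoint

def create_soil_sensor(entity_id="SoilSensor:001", moisture=32.5, temperature=21.0):
    """Create a context entity for a field sensor via the NGSI v2 API."""
    entity = {
        "id": entity_id,
        "type": "SoilSensor",
        "moisture": {"value": moisture, "type": "Number"},
        "temperature": {"value": temperature, "type": "Number"},
    }
    r = requests.post(f"{ORION}/v2/entities", json=entity)
    r.raise_for_status()               # expect 201 Created on success

def update_moisture(entity_id, moisture):
    """Overwrite the moisture attribute of an existing entity."""
    payload = {"moisture": {"value": moisture, "type": "Number"}}
    r = requests.patch(f"{ORION}/v2/entities/{entity_id}/attrs", json=payload)
    r.raise_for_status()

if __name__ == "__main__":
    create_soil_sensor()
    update_moisture("SoilSensor:001", 30.1)
```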
Recent Advances in Solar Sail Propulsion at NASA
NASA Technical Reports Server (NTRS)
Johnson, Les; Young, Roy M.; Montgomery, Edward E., IV
2006-01-01
Supporting NASA's Science Mission Directorate, the In-Space Propulsion Technology Program is developing solar sail propulsion for use in robotic science and exploration of the solar system. Solar sail propulsion will provide longer on-station operation, increased scientific payload mass fraction, and access to previously inaccessible orbits for multiple potential science missions. Two different 20-meter solar sail systems were produced and successfully completed functional vacuum testing last year in NASA Glenn's Space Power Facility at Plum Brook Station, Ohio. The sails were designed and developed by ATK Space Systems and L'Garde, respectively. These sail systems consist of a central structure with four deployable booms that support the sails. These sail designs are robust enough for deployment in a one-atmosphere, one-gravity environment and are scalable to much larger solar sails, perhaps as much as 150 meters on a side. In addition, computational modeling and analytical simulations have been performed to assess the scalability of the technology to the large sizes (>150 meters) required for first-generation solar sail missions. Life and space environmental effects testing of sail and component materials is also nearly complete. This paper will summarize recent technology advancements in solar sails and their successful ambient and vacuum testing.
Lagardère, Louis; Jolly, Luc-Henri; Lipparini, Filippo; Aviat, Félix; Stamm, Benjamin; Jing, Zhifeng F.; Harger, Matthew; Torabifard, Hedieh; Cisneros, G. Andrés; Schnieders, Michael J.; Gresh, Nohad; Maday, Yvon; Ren, Pengyu Y.; Ponder, Jay W.
2017-01-01
We present Tinker-HP, a massively MPI parallel package dedicated to classical molecular dynamics (MD) and to multiscale simulations, using advanced polarizable force fields (PFF) encompassing distributed multipole electrostatics. Tinker-HP is an evolution of the popular Tinker package that conserves its simplicity of use and its reference double precision implementation for CPUs. Grounded in interdisciplinary efforts with applied mathematics, Tinker-HP allows for long polarizable MD simulations on large systems up to millions of atoms. We detail in the paper the newly developed extension of massively parallel 3D spatial decomposition to point dipole polarizable models as well as their coupling to efficient Krylov iterative and non-iterative polarization solvers. The design of the code allows the use of various computer systems ranging from laboratory workstations to modern petascale supercomputers with thousands of cores. Tinker-HP therefore offers the first high-performance scalable CPU computing environment for the development of next generation point dipole PFFs and for production simulations. Strategies linking Tinker-HP to Quantum Mechanics (QM) in the framework of multiscale polarizable self-consistent QM/MD simulations are also provided. The possibilities, performance and scalability of the software are demonstrated via benchmark calculations using the polarizable AMOEBA force field on systems ranging from large water boxes of increasing size and ionic liquids to (very) large biosystems encompassing several proteins as well as the complete satellite tobacco mosaic virus and ribosome structures. For small systems, Tinker-HP appears to be competitive with the Tinker-OpenMM GPU implementation of Tinker. As the system size grows, Tinker-HP remains operational thanks to its access to distributed memory and takes advantage of its new algorithms enabling stable long-timescale polarizable simulations. Overall, a several thousand-fold acceleration over a single-core computation is observed for the largest systems. The extension of the present CPU implementation of Tinker-HP to other computational platforms is discussed. PMID:29732110
Validation of a Scalable Solar Sailcraft
NASA Technical Reports Server (NTRS)
Murphy, D. M.
2006-01-01
The NASA In-Space Propulsion (ISP) program sponsored intensive solar sail technology and systems design, development, and hardware demonstration activities over the past 3 years. Efforts to validate a scalable solar sail system by functional demonstration in relevant environments, together with test-analysis correlation activities, have recently been successfully completed. A review of the program is presented, with descriptions of the design, results of testing, and analytical model validation of component and assembly functional, strength, stiffness, shape, and dynamic behavior. The scaled performance of the validated system is projected to demonstrate applicability to flight demonstration and important NASA roadmap missions.
CloudMan as a platform for tool, data, and analysis distribution.
Afgan, Enis; Chapman, Brad; Taylor, James
2012-11-27
Cloud computing provides an infrastructure that facilitates large scale computational analysis in a scalable, democratized fashion. However, in this context it is difficult to ensure sharing of an analysis environment and associated data in a scalable and precisely reproducible way. CloudMan (usecloudman.org) enables individual researchers to easily deploy, customize, and share their entire cloud analysis environment, including data, tools, and configurations. With the enabled customization and sharing of instances, CloudMan can be used as a platform for collaboration. The presented solution improves accessibility of cloud resources, tools, and data to the level of an individual researcher and contributes toward reproducibility and transparency of research solutions.
Using Swarming Agents for Scalable Security in Large Network Environments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crouse, Michael; White, Jacob L.; Fulp, Errin W.
2011-09-23
The difficulty of securing computer infrastructures increases as they grow in size and complexity. Network-based security solutions such as IDS and firewalls cannot scale because of exponentially increasing computational costs inherent in detecting the rapidly growing number of threat signatures. Host-based solutions like virus scanners and IDS suffer similar issues, and these are compounded when enterprises try to monitor them in a centralized manner. Swarm-based autonomous agent systems like digital ants and artificial immune systems can provide a scalable security solution for large network environments. The digital ants approach offers a biologically inspired design where each ant in the virtual colony can detect atoms of evidence that may help identify a possible threat. By assembling the atomic evidence from different ant types, the colony may detect the threat. This decentralized approach can require, on average, fewer computational resources than traditional centralized solutions; however, there are limits to its scalability. This paper describes how dividing a large infrastructure into smaller managed enclaves allows the digital ant framework to effectively operate in larger environments. Experimental results show that using smaller enclaves allows for more consistent distribution of agents and results in faster response times.
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
2012-01-01
The simulation was performed on 64K cores of Intrepid, running at 0.25 simulated-years-per-day and taking 25 million core-hours. This is the first simulation using both the CAM5 physics and the highly scalable spectral element dynamical core. The animation of Total Precipitable Water clearly shows hurricanes developing in the Atlantic and Pacific.
Improved Satellite-based Crop Yield Mapping by Spatially Explicit Parameterization of Crop Phenology
NASA Astrophysics Data System (ADS)
Jin, Z.; Azzari, G.; Lobell, D. B.
2016-12-01
Field-scale mapping of crop yields with satellite data often relies on the use of crop simulation models. However, these approaches can be hampered by inaccuracies in the simulation of crop phenology. Here we present and test an approach that uses dense time series of Landsat 7 and 8 acquisitions to calibrate parameters related to crop phenology simulation, such as leaf number and leaf appearance rates. These parameters are then mapped across the Midwestern United States for maize and soybean, and for two different simulation models. We then implement our recently developed Scalable satellite-based Crop Yield Mapper (SCYM) with simulations reflecting the improved phenology parameterizations, and compare the results to prior estimates based on default phenology routines. Our preliminary results show that the proposed method can effectively alleviate the underestimation of early-season LAI by the default Agricultural Production Systems sIMulator (APSIM), and that spatially explicit parameterization of the phenology model substantially improves SCYM performance in capturing the spatiotemporal variation in maize and soybean yield. The scheme presented in our study thus preserves the scalability of SCYM while significantly reducing its uncertainty.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahn, Gail-Joon
The project seeks an innovative framework to enable users to access and selectively share resources in distributed environments, enhancing the scalability of information sharing. We have investigated secure sharing & assurance approaches for ad-hoc collaboration, focused on Grids, Clouds, and ad-hoc network environments.
A Cluster-Based Framework for the Security of Medical Sensor Environments
NASA Astrophysics Data System (ADS)
Klaoudatou, Eleni; Konstantinou, Elisavet; Kambourakis, Georgios; Gritzalis, Stefanos
The adoption of Wireless Sensor Networks (WSNs) in the healthcare sector poses many security issues, mainly because medical information is considered particularly sensitive. The security mechanisms employed are expected to be more efficient in terms of energy consumption and scalability in order to cope with the constrained capabilities of WSNs and patients' mobility. Towards this goal, cluster-based medical WSNs can substantially improve efficiency and scalability. In this context, we have proposed a general framework for cluster-based medical environments on which security mechanisms can rely. This framework fully covers the varying needs of both in-hospital environments and environments formed ad hoc for medical emergencies. In this paper, we further elaborate on the security of our proposed solution. We specifically focus on key establishment mechanisms and investigate the group key agreement protocols that can best fit in our framework.
Nebot, Patricio; Torres-Sospedra, Joaquín; Martínez, Rafael J
2011-01-01
The control architecture is one of the most important parts of agricultural robotics and other robotic systems. Furthermore, its importance increases when the system involves a group of heterogeneous robots that should cooperate to achieve a global goal. A new control architecture is introduced in this paper for groups of robots in charge of maintenance tasks in agricultural environments. Some important features such as scalability, code reuse, hardware abstraction and data distribution have been considered in the design of the new architecture. Furthermore, coordination and cooperation among the different elements in the system is supported by the proposed control system. The new architecture addresses these concepts by integrating a network-oriented device server (Player), the Java Agent Development Framework (JADE), and the High Level Architecture (HLA). HLA can be considered the most important part because it not only provides data distribution and implicit communication among the parts of the system but also allows simultaneous operation with simulated and real entities, thus enabling the use of hybrid systems in the development of applications.
NASA Technical Reports Server (NTRS)
Schnase, John L.; Tamkin, Glenn S.; Ripley, W. David III; Stong, Savannah; Gill, Roger; Duffy, Daniel Q.
2012-01-01
Scientific data services are becoming an important part of the NASA Center for Climate Simulation's mission. Our technological response to this expanding role is built around the concept of a Virtual Climate Data Server (vCDS), repetitive provisioning, image-based deployment and distribution, and virtualization-as-a-service. The vCDS is an iRODS-based data server specialized to the needs of a particular data-centric application. We use RPM scripts to build vCDS images in our local computing environment, our local Virtual Machine Environment, NASA's Nebula Cloud Services, and Amazon's Elastic Compute Cloud. Once provisioned into one or more of these virtualized resource classes, vCDSs can use iRODS's federation capabilities to create an integrated ecosystem of managed collections that is scalable and adaptable to changing resource requirements. This approach enables platform- or software-as-a-service deployment of vCDS and allows the NCCS to offer virtualization-as-a-service: a capacity to respond in an agile way to new customer requests for data services.
Scalable load balancing for massively parallel distributed Monte Carlo particle transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Brien, M. J.; Brantley, P. S.; Joy, K. I.
2013-07-01
In order to run computer simulations efficiently on massively parallel computers with hundreds of thousands or millions of processors, care must be taken that the calculation is load balanced across the processors. Examining the workload of every processor leads to an unscalable algorithm, with run time at least as large as O(N), where N is the number of processors. We present a scalable load balancing algorithm, with run time O(log(N)), that involves iterated processor-pair-wise balancing steps, ultimately leading to a globally balanced workload. We demonstrate scalability of the algorithm up to 2 million processors on the Sequoia supercomputer at Lawrence Livermore National Laboratory.
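The pairwise balancing idea can be illustrated with a toy serial sketch in Python. The hypercube-style pairing, the scalar load representation, and the function name are illustrative assumptions, not the production algorithm described in the abstract:

```python
import numpy as np

def pairwise_balance(loads, rounds=None):
    """Iterated processor-pair-wise balancing: in each round, processors are
    paired (here by a simple XOR/hypercube pattern) and each pair averages its
    workload. A toy serial sketch of the idea, assuming a scalar load per rank."""
    loads = np.array(loads, dtype=float)        # copy so the caller's data is untouched
    n = len(loads)
    rounds = rounds if rounds is not None else int(np.ceil(np.log2(max(n, 2))))
    for k in range(rounds):
        stride = 1 << k
        for i in range(n):
            j = i ^ stride                      # hypercube-style partner
            if j < n and i < j:
                avg = 0.5 * (loads[i] + loads[j])
                loads[i] = loads[j] = avg       # each pair balances its workload
    # after O(log N) rounds the workload is (approximately) globally balanced
    return loads
```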
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trujillo, Angelina Michelle
Strategy, planning, and acquisition: very large scale computing platforms come and go, and planning for immensely scalable machines often precedes actual procurement by three years; procurement can take another year or more. Integration: after acquisition, machines must be integrated into the computing environments at LANL and connected to scalable storage via large-scale storage networking, assuring correct and secure operations. Management and utilization: ongoing operations, maintenance, and troubleshooting of the hardware and systems software at massive scale are required.
NASA Astrophysics Data System (ADS)
Bucay, Igal; Helal, Ahmed; Dunsky, David; Leviyev, Alex; Mallavarapu, Akhila; Sreenivasan, S. V.; Raizen, Mark
2017-04-01
Ionization of atoms and molecules is an important process in many applications such as mass spectrometry. Ionization is typically accomplished by electron bombardment, which, while scalable to large volumes, is also very inefficient due to the small cross section of electron-atom collisions. Photoionization methods can be highly efficient, but are not scalable due to the small ionization volume. Electric field ionization is accomplished using ultra-sharp conducting tips biased to a few kilovolts, but suffers from a low ionization volume and tip fabrication limitations. We report on our progress towards an efficient, robust, and scalable method of atomic and molecular ionization using ordered arrays of sharp, gold-doped silicon nanowires. As demonstrated in earlier work, the presence of the gold greatly enhances the ionization probability, which was attributed to an increase in available acceptor surface states. We present here a novel process used to fabricate the nanowire array, results of simulations aimed at optimizing the configuration of the array, and our progress towards demonstrating efficient and scalable ionization.
NASA Technical Reports Server (NTRS)
Robinson, John E., III; Lee, Alan; Lai, Chok Fung
2017-01-01
This paper describes the Shadow-Mode Assessment Using Realistic Technologies for the National Airspace System (SMART-NAS) Test Bed. The SMART-NAS Test Bed is an air traffic simulation platform being developed by the National Aeronautics and Space Administration (NASA). The SMART-NAS Test Bed's core purpose is to conduct high-fidelity, real-time, human-in-the-loop and automation-in-the-loop simulations of current and proposed future air traffic concepts for the United States' Next Generation Air Transportation System, called NextGen. The setup, configuration, coordination, and execution of real-time, human-in-the-loop air traffic management simulations are complex, tedious, time intensive, and expensive. The SMART-NAS Test Bed framework is an alternative to the current approach and will provide services throughout the simulation workflow pipeline to help alleviate these shortcomings. The principal concepts to be simulated include advanced gate-to-gate, trajectory-based operations, widespread integration of novel aircraft such as unmanned vehicles, and real-time safety assurance technologies to enable autonomous operations. To make this possible, the SMART-NAS Test Bed will utilize Web-based technologies, cloud resources, and real-time, scalable communication middleware. This paper describes the SMART-NAS Test Bed's vision, purpose, and concept of use, and the potential benefits, key capabilities, high-level requirements, architecture, software design, and usage.
A simulator tool set for evaluating HEVC/SHVC streaming
NASA Astrophysics Data System (ADS)
Al Hadhrami, Tawfik; Nightingale, James; Wang, Qi; Grecos, Christos; Kehtarnavaz, Nasser
2015-02-01
Video streaming and other multimedia applications account for an ever increasing proportion of all network traffic. The recent adoption of High Efficiency Video Coding (HEVC) as the H.265 standard provides many opportunities for new and improved multimedia services and applications in the consumer domain. Since the delivery of version one of H.265, the Joint Collaborative Team on Video Coding has been working towards standardisation of a scalable extension (SHVC) to the H.265 standard and a series of range extensions and new profiles. As these enhancements are added to the standard, the range of potential applications and research opportunities will expand. For example, the use of video is also growing rapidly in other sectors such as safety, security, defence and health, with real-time high quality video transmission playing an important role in areas like critical infrastructure monitoring and disaster management, each of which may benefit from the application of enhanced HEVC/H.265 and SHVC capabilities. The majority of existing research into HEVC/H.265 transmission has focussed on the consumer domain, addressing issues such as broadcast transmission and delivery to mobile devices, with the lack of freely available tools widely cited as an obstacle to conducting this type of research. In this paper we present a toolset which facilitates the transmission and evaluation of HEVC/H.265 and SHVC encoded video on the popular open source NCTUns simulator. Our toolset provides researchers with a modular, easy to use platform for evaluating video transmission and adaptation proposals on large scale wired, wireless and hybrid architectures. The toolset consists of pre-processing, transmission, SHVC adaptation and post-processing tools to gather and analyse statistics. It has been implemented using HM15 and SHM5, the latest versions of the HEVC and SHVC reference software implementations, to ensure that currently adopted proposals for scalable and range extensions to the standard can be investigated. We demonstrate the effectiveness and usability of our toolset by evaluating SHVC streaming and adaptation to meet terminal constraints and network conditions in a range of wired, wireless, and large scale wireless mesh network scenarios, each of which is designed to simulate a realistic environment. Our results are compared to those for H.264/SVC, the scalable extension to the existing H.264/AVC advanced video coding standard.
High fidelity wireless network evaluation for heterogeneous cognitive radio networks
NASA Astrophysics Data System (ADS)
Ding, Lei; Sagduyu, Yalin; Yackoski, Justin; Azimi-Sadjadi, Babak; Li, Jason; Levy, Renato; Melodia, Tammaso
2012-06-01
We present a high fidelity cognitive radio (CR) network emulation platform for wireless system tests, measurements, and validation. This versatile platform provides the configurable functionalities to control and repeat realistic physical channel effects in integrated space, air, and ground networks. We combine the advantages of a scalable simulation environment with reliable hardware performance for high fidelity and repeatable evaluation of heterogeneous CR networks. This approach extends CR design, previously limited to the device (software-defined-radio) or lower-level protocol (dynamic spectrum access) level, to end-to-end cognitive networking, and facilitates low-cost deployment, development, and experimentation of new wireless network protocols and applications on frequency-agile programmable radios. Going beyond the channel emulator paradigm for point-to-point communications, we can support simultaneous transmissions by network-level emulation that allows realistic physical-layer interactions between diverse user classes, including secondary users, primary users, and adversarial jammers in CR networks. In particular, we can replay field tests in a lab environment with real radios perceiving and learning the dynamic environment, thereby adapting for end-to-end goals over distributed spectrum coordination channels that replace the common control channel as a single point of failure. CR networks offer several dimensions of tunable actions including channel, power, rate, and route selection. The proposed network evaluation platform is fully programmable and can reliably evaluate the necessary cross-layer design solutions with configurable optimization space by leveraging the hardware experiments to represent the realistic effects of physical channel, topology, mobility, and jamming on spectrum agility, situational awareness, and network resiliency. We also provide the flexibility to scale up the test environment by introducing virtual radios and establishing seamless signal-level interactions with real radios. This holistic wireless evaluation approach supports a large-scale, heterogeneous, and dynamic CR network architecture and allows developing cross-layer network protocols under high fidelity, repeatable, and scalable wireless test scenarios suitable for heterogeneous space, air, and ground networks.
26th Space Simulation Conference Proceedings. Environmental Testing: The Path Forward
NASA Technical Reports Server (NTRS)
Packard, Edward A.
2010-01-01
Topics covered include: A Multifunctional Space Environment Simulation Facility for Accelerated Spacecraft Materials Testing; Exposure of Spacecraft Surface Coatings in a Simulated GEO Radiation Environment; Gravity-Offloading System for Large-Displacement Ground Testing of Spacecraft Mechanisms; Microscopic Shutters Controlled by cRIO in Sounding Rocket; Application of a Physics-Based Stabilization Criterion to Flight System Thermal Testing; Upgrade of a Thermal Vacuum Chamber for 20 Kelvin Operations; A New Approach to Improve the Uniformity of Solar Simulator; A Perfect Space Simulation Storm; A Planetary Environmental Simulator/Test Facility; Collimation Mirror Segment Refurbishment inside ESA's Large Space; Space Simulation of the CBERS 3 and 4 Satellite Thermal Model in the New Brazilian 6x8m Thermal Vacuum Chamber; The Certification of Environmental Chambers for Testing Flight Hardware; Space Systems Environmental Test Facility Database (SSETFD), Website Development Status; Wallops Flight Facility: Current and Future Test Capabilities for Suborbital and Orbital Projects; Force Limited Vibration Testing of JWST NIRSpec Instrument Using Strain Gages; Investigation of Acoustic Field Uniformity in Direct Field Acoustic Testing; Recent Developments in Direct Field Acoustic Testing; Assembly, Integration and Test Centre in Malaysia: Integration between Building Construction Works and Equipment Installation; Complex Ground Support Equipment for Satellite Thermal Vacuum Test; Effect of Charging Electron Exposure on 1064nm Transmission through Bare Sapphire Optics and SiO2 over HfO2 AR-Coated Sapphire Optics; Environmental Testing Activities and Capabilities for Turkish Space Industry; Integrated Circuit Reliability Simulation in Space Environments; Micrometeoroid Impacts and Optical Scatter in Space Environment; Overcoming Unintended Consequences of Ambient Pressure Thermal Cycling Environmental Tests; Performance and Functionality Improvements to Next Generation Thermal Vacuum Control System; Robotic Lunar Lander Development Project: Three-Dimensional Dynamic Stability Testing and Analysis; Thermal Physical Properties of Thermal Coatings for Spacecraft in Wide Range of Environmental Conditions: Experimental and Theoretical Study; Molecular Contamination Generated in Thermal Vacuum Chambers; Preventing Cross Contamination of Hardware in Thermal Vacuum Chambers; Towards Validation of Particulate Transport Code; Updated Trends in Materials' Outgassing Technology; Electrical Power and Data Acquisition Setup for the CBERS 3 and 4 Satellite TBT; Method of Obtaining High Resolution Intrinsic Wire Boom Damping Parameters for Multi-Body Dynamics Simulations; and Thermal Vacuum Testing with Scalable Software Developed In-House.
NASA Astrophysics Data System (ADS)
Park, Soomyung; Joo, Seong-Soon; Yae, Byung-Ho; Lee, Jong-Hyun
2002-07-01
In this paper, we present the Optical Cross-Connect (OXC) management control system architecture, which offers scalability, robust maintenance, and a distributed management environment in the optical transport network. The OXC system we are developing is divided into hardware and internal and external software. It is made up of the OXC subsystem, comprising the Optical Transport Network (OTN) sublayer hardware and the optical switch control system; the signaling control protocol subsystem, performing User-to-Network Interface (UNI) and Network-to-Network Interface (NNI) signaling control; the Operation Administration Maintenance & Provisioning (OAM&P) subsystem; and the network management subsystem. The OXC management control system can support flexible expansion of the optical transport network, provide connectivity to heterogeneous external network elements, allow features to be added or deleted without interrupting OAM&P services, be operated remotely, provide a global view and detailed information for network planners and operators, and offer a Common Object Request Broker Architecture (CORBA)-based open system architecture that makes it easy to add and delete intelligent service networking functions in the future. To meet these requirements, we adopt an object-oriented development method throughout system analysis, design, and implementation to build an OXC management control system with scalability, maintainability, and a distributed management environment. As a consequence, the componentization of the OXC operation and management functions of each subsystem supports robust maintenance and increases code reusability. The component-based OXC management control system architecture will thus be inherently flexible and scalable.
Control and design of multiple unmanned air vehicles for persistent surveillance
NASA Astrophysics Data System (ADS)
Nigam, Nikhil
Control of multiple autonomous aircraft for search and exploration is a topic of current research interest for applications such as weather monitoring, geographical surveys, search and rescue, tactical reconnaissance, and extra-terrestrial exploration, and the need to distribute sensing is driven by considerations of efficiency, reliability, cost and scalability. Hence, this problem has been extensively studied in the fields of controls and artificial intelligence. The task of persistent surveillance is different from a coverage/exploration problem, in that all areas need to be continuously searched, minimizing the time between visitations to each region in the target space. This distinction does not allow a straightforward application of most exploration techniques to the problem, although ideas from these methods can still be used. The use of aerial vehicles is motivated by their ability to cover larger spaces and their relative insensitivity to terrain. However, the dynamics of Unmanned Air Vehicles (UAVs) adds complexity to the control problem. Most of the work in the literature decouples the vehicle dynamics and control policies, but their interaction is particularly interesting for a surveillance mission. Stochastic environments and UAV failures further enrich the problem by requiring the control policies to be robust, and this aspect is particularly important for hardware implementations. For a persistent mission, it becomes imperative to consider the range/endurance constraints of the vehicles. The coupling of the control policy with the endurance constraints of the vehicles is an aspect that has not been sufficiently explored. Design of UAVs for desirable mission performance is also an issue of considerable significance. The use of a single monolithic optimization for such a problem has practical limitations, and decomposition-based design is a potential alternative. In this research, high-level control policies are devised that are scalable, reliable, efficient, and robust to changes in the environment. Most of the existing techniques that carry performance guarantees are not scalable or robust to changes. The scalable techniques are often heuristic in nature, resulting in a lack of reliability and performance. Our policies are tested in a multi-UAV simulation environment developed for this problem, and shown to be near-optimal in spite of being completely reactive in nature. We explicitly account for the coupling between aircraft dynamics and control policies as well, and suggest modifications to improve performance under dynamic constraints. A smart refueling policy is also developed to account for limited endurance, and large performance benefits are observed. The method is based on the solution of a linear program that can be efficiently solved online in a distributed setting, unlike previous work. The Vehicle Swarm Technology Laboratory (VSTL), a hardware testbed developed at Boeing Research and Technology for evaluating swarms of UAVs, is described next and used to test the control strategy in a real-world scenario. The simplicity and robustness of the strategy allows easy implementation and near replication of the performance observed in simulation. Finally, an architecture for system-of-systems design based on Collaborative Optimization (CO) is presented. Earlier work coupling operations and design has used frameworks that make certain assumptions not valid for this problem.
The efficacy of our approach is illustrated through preliminary design results, and extension to more realistic settings is also demonstrated.
The Prodiguer Messaging Platform
NASA Astrophysics Data System (ADS)
Denvil, S.; Greenslade, M. A.; Carenton, N.; Levavasseur, G.; Raciazek, J.
2015-12-01
CONVERGENCE is a French multi-partner national project designed to gather HPC and informatics expertise to innovate in the context of running French global climate models with differing grids and at differing resolutions. Efficient and reliable execution of these models and the management and dissemination of model output are some of the complexities that CONVERGENCE aims to resolve. At any one moment in time, researchers affiliated with the Institut Pierre Simon Laplace (IPSL) climate modeling group are running hundreds of global climate simulations. These simulations execute upon a heterogeneous set of French High Performance Computing (HPC) environments. The IPSL's simulation execution runtime libIGCM (library for IPSL Global Climate Modeling group) has recently been enhanced so as to support hitherto impossible real-time use cases such as simulation monitoring, data publication, metrics collection, simulation control, and visualizations. At the core of this enhancement is Prodiguer: an AMQP (Advanced Message Queuing Protocol) based, event-driven, asynchronous distributed messaging platform. libIGCM now dispatches copious amounts of information, in the form of messages, to the platform for remote processing by Prodiguer software agents at IPSL servers in Paris. Such processing takes several forms: persisting message content to databases; launching rollback jobs upon simulation failure; notifying downstream applications; and automating visualization pipelines. We will describe and/or demonstrate the platform's technical implementation, inherent ease of scalability, inherent adaptiveness with respect to supervising simulations, and web portal receiving simulation notifications in real time.
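As a rough illustration of the kind of AMQP publishing such a client performs, the following Python sketch uses the pika client library; the exchange name, routing key, and message fields are hypothetical placeholders, not the Prodiguer message schema:

```python
import json
import pika  # widely used Python AMQP client

# Connection, exchange, and routing-key names below are illustrative only.
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="prodiguer-demo", exchange_type="topic", durable=True)

message = {
    "simulation_id": "example-0001",      # hypothetical identifier
    "event": "monitoring.heartbeat",
    "payload": {"timestep": 42},
}
channel.basic_publish(
    exchange="prodiguer-demo",
    routing_key="simulation.monitoring",  # consumers subscribe by topic pattern
    body=json.dumps(message),
)
connection.close()
```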
The Simulation of Real-time Scalable Coherent Interface
NASA Technical Reports Server (NTRS)
Li, Qiang; Grant, Terry; Grover, Radhika S.
1997-01-01
Scalable Coherent Interface (SCI, IEEE/ANSI Std 1596-1992) (SCI1, SCI2) is a high-performance interconnect for shared-memory multiprocessor systems. In this project we investigate an SCI real-time protocol (RTSCI1) using directed flow control symbols. We studied the issues of efficient generation of control symbols and created a simulation model of the protocol on a ring-based SCI system. This report presents the results of the study. The project has been implemented using SES/Workbench. The details that follow encompass aspects of both SCI and the flow control protocol, as well as the effect of realistic client/server processing delay. The report is organized as follows. Section 2 provides a description of the simulation model. Section 3 describes the protocol implementation details. The next three sections of the report elaborate on the workload, results, and conclusions. Appended to the report is a description of the tool, SES/Workbench, used in our simulation, and internal details of our implementation of the protocol.
A Scalable O(N) Algorithm for Large-Scale Parallel First-Principles Molecular Dynamics Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Osei-Kuffuor, Daniel; Fattebert, Jean-Luc
2014-01-01
Traditional algorithms for first-principles molecular dynamics (FPMD) simulations only gain a modest capability increase from current petascale computers, due to their O(N^3) complexity and their heavy use of global communications. To address this issue, we are developing a truly scalable O(N) complexity FPMD algorithm, based on density functional theory (DFT), which avoids global communications. The computational model uses a general nonorthogonal orbital formulation for the DFT energy functional, which requires knowledge of selected elements of the inverse of the associated overlap matrix. We present a scalable algorithm for approximately computing selected entries of the inverse of the overlap matrix, based on an approximate inverse technique, by inverting local blocks corresponding to principal submatrices of the global overlap matrix. The new FPMD algorithm exploits sparsity and uses nearest neighbor communication to provide a computational scheme capable of extreme scalability. Accuracy is controlled by the mesh spacing of the finite difference discretization, the size of the localization regions in which the electronic orbitals are confined, and a cutoff beyond which the entries of the overlap matrix can be omitted when computing selected entries of its inverse. We demonstrate the algorithm's excellent parallel scaling for up to O(100K) atoms on O(100K) processors, with a wall-clock time of O(1) minute per molecular dynamics time step.
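A minimal serial sketch of the selected-inverse idea, assuming dense NumPy storage and user-supplied localization neighborhoods; both are simplifications of the distributed, sparse production code:

```python
import numpy as np

def selected_inverse(S, neighborhoods):
    """Approximate selected entries of S^-1 by inverting, for each orbital i,
    the principal submatrix of S restricted to a local neighborhood of i.
    `neighborhoods` maps i -> list of orbital indices (including i itself)."""
    entries = {}
    for i, idx in neighborhoods.items():
        idx = list(idx)
        block = S[np.ix_(idx, idx)]        # local principal submatrix around orbital i
        block_inv = np.linalg.inv(block)   # exact inverse of the small block only
        row = idx.index(i)
        for col, j in enumerate(idx):
            entries[(i, j)] = block_inv[row, col]
    return entries
```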
NASA Astrophysics Data System (ADS)
Konduri, Aditya
Many natural and engineering systems are governed by nonlinear partial differential equations (PDEs) which result in multiscale phenomena, e.g. turbulent flows. Numerical simulations of these problems are computationally very expensive and demand extreme levels of parallelism. At realistic conditions, simulations are being carried out on massively parallel computers with hundreds of thousands of processing elements (PEs). It has been observed that communication between PEs as well as their synchronization at these extreme scales take up a significant portion of the total simulation time and result in poor scalability of codes. This issue is likely to pose a bottleneck in scalability of codes on future Exascale systems. In this work, we propose an asynchronous computing algorithm based on widely used finite difference methods to solve PDEs in which synchronization between PEs due to communication is relaxed at a mathematical level. We show that while stability is conserved when schemes are used asynchronously, accuracy is greatly degraded. Since message arrivals at PEs are random processes, so is the behavior of the error. We propose a new statistical framework in which we show that average errors always drop to first order regardless of the original scheme. We propose new asynchrony-tolerant schemes that maintain accuracy when synchronization is relaxed. The quality of the solution is shown to depend not only on the physical phenomena and numerical schemes, but also on the characteristics of the computing machine. A novel algorithm using remote memory access communications has been developed to demonstrate excellent scalability of the method for large-scale computing. Finally, we present a path to extend this method in solving complex multi-scale problems on Exascale machines.
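A toy illustration of an asynchronous finite-difference update: one explicit step of the 1D heat equation on a local subdomain that accepts possibly stale ghost values from delayed neighbor messages. The equation, discretization, and names are illustrative assumptions, not the asynchrony-tolerant schemes developed in the work:

```python
import numpy as np

def async_heat_step(u, u_ghost_left, u_ghost_right, alpha, dx, dt):
    """Explicit FTCS step of u_t = alpha * u_xx on a local subdomain.
    The ghost values may come from a delayed (stale) neighbor message rather
    than being perfectly synchronized, which is the essence of relaxing
    synchronization at the mathematical level."""
    padded = np.concatenate(([u_ghost_left], u, [u_ghost_right]))
    lap = (padded[2:] - 2.0 * padded[1:-1] + padded[:-2]) / dx**2
    return u + alpha * dt * lap
```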
MicROS-drt: supporting real-time and scalable data distribution in distributed robotic systems.
Ding, Bo; Wang, Huaimin; Fan, Zedong; Zhang, Pengfei; Liu, Hui
A primary requirement in distributed robotic software systems is the dissemination of data to all interested collaborative entities in a timely and scalable manner. However, providing such a service in a highly dynamic and resource-limited robotic environment is a challenging task, and existing robot software infrastructure has limitations in this aspect. This paper presents a novel robot software infrastructure, micROS-drt, which supports real-time and scalable data distribution. The solution is based on a loosely coupled data publish-subscribe model with the ability to support various time-related constraints. To realize this model, a mature data distribution standard, the Data Distribution Service for Real-Time Systems (DDS), is adopted as the foundation of the transport layer of this software infrastructure. By elaborately adapting and encapsulating the capability of the underlying DDS middleware, micROS-drt can meet the requirement of real-time and scalable data distribution in distributed robotic systems. Evaluation results in terms of scalability, latency jitter and transport priority, as well as an experiment on real robots, validate the effectiveness of this work.
Molecular nanomagnets with switchable coupling for quantum simulation
Chiesa, Alessandro; Whitehead, George F. S.; Carretta, Stefano; ...
2014-12-11
Molecular nanomagnets are attractive candidate qubits because of their wide inter- and intra-molecular tunability. Uniform magnetic pulses could be exploited to implement one- and two-qubit gates in presence of a properly engineered pattern of interactions, but the synthesis of suitable and potentially scalable supramolecular complexes has proven a very hard task. Indeed, no quantum algorithms have ever been implemented, not even a proof-of-principle two-qubit gate. In this paper we show that the magnetic couplings in two supramolecular {Cr7Ni}-Ni-{Cr7Ni} assemblies can be chemically engineered to fit the above requisites for conditional gates with no need of local control. Microscopic parameters are determined by a recently developed many-body ab-initio approach and used to simulate quantum gates. We find that these systems are optimal for proof-of-principle two-qubit experiments and can be exploited as building blocks of scalable architectures for quantum simulation.
Information Security Analysis Using Game Theory and Simulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schlicher, Bob G; Abercrombie, Robert K
Information security analysis can be performed using game theory implemented in dynamic simulations of Agent Based Models (ABMs). Such simulations can be verified with the results from game theory analysis and further used to explore larger scale, real world scenarios involving multiple attackers, defenders, and information assets. Our approach addresses imperfect information and scalability, which allows us to also address previous limitations of current stochastic game models. Such models only consider perfect information, assuming that the defender is always able to detect attacks; assume that the state transition probabilities are fixed before the game; assume that the players' actions are always synchronous; and most are not scalable with the size and complexity of the systems under consideration. Our use of ABMs yields results of selected experiments that demonstrate our proposed approach and provides a quantitative measure for realistic information systems and their related security scenarios.
Evaluating Discovery Services Architectures in the Context of the Internet of Things
NASA Astrophysics Data System (ADS)
Polytarchos, Elias; Eliakis, Stelios; Bochtis, Dimitris; Pramatari, Katerina
As the "Internet of Things" is expected to grow rapidly in the following years, the need to develop and deploy efficient and scalable Discovery Services in this context is very important for its success. Thus, the ability to evaluate and compare the performance of different Discovery Services architectures is vital if we want to allege that a given design is better at meeting requirements of a specific application. The purpose of this chapter is to provide a paradigm for the evaluation of different Discovery Services for the Internet of Things in terms of efficiency, scalability and performance through the use of simulations. The methodology presented uses the application of Discovery Services to a supply chain with the Service Lookup Service Discovery Service using OMNeT++, an open source network simulation suite. Then, we delve into the simulation design and the details of our findings.
GPU-accelerated Red Blood Cells Simulations with Transport Dissipative Particle Dynamics.
Blumers, Ansel L; Tang, Yu-Hang; Li, Zhen; Li, Xuejin; Karniadakis, George E
2017-08-01
Mesoscopic numerical simulations provide a unique approach for the quantification of the chemical influences on red blood cell functionalities. The transport Dissipative Particle Dynamics (tDPD) method can lead to such effective multiscale simulations due to its ability to simultaneously capture mesoscopic advection, diffusion, and reaction. In this paper, we present a GPU-accelerated red blood cell simulation package based on a tDPD adaptation of our red blood cell model, which can correctly recover the cell membrane viscosity, elasticity, bending stiffness, and cross-membrane chemical transport. The package essentially processes all computational workloads in parallel on the GPU, and it incorporates multi-stream scheduling and non-blocking MPI communications to improve inter-node scalability. Our code is validated for accuracy and compared against the CPU counterpart for speed. Strong scaling and weak scaling are also presented to characterize scalability. We observe a speedup of 10.1 on one GPU over all 16 cores within a single node, and a weak scaling efficiency of 91% across 256 nodes. The program enables quick-turnaround and high-throughput numerical simulations for investigating chemical-driven red blood cell phenomena and disorders.
NASA Astrophysics Data System (ADS)
Crichton, Daniel; Mahabal, Ashish; Anton, Kristen; Cinquini, Luca; Colbert, Maureen; Djorgovski, S. George; Kincaid, Heather; Kelly, Sean; Liu, David
2017-05-01
We describe here the Early Detection Research Network (EDRN) for Cancer's knowledge environment. It is an open source platform built by NASA's Jet Propulsion Laboratory with contributions from the California Institute of Technology and the Geisel School of Medicine at Dartmouth. It uses tools like Apache OODT, Plone, and Solr, and borrows heavily from JPL's Planetary Data System's ontological infrastructure. It has accumulated data on hundreds of thousands of biospecimens and serves over 1300 registered users across the National Cancer Institute (NCI). The scalable computing infrastructure is built such that we are able to reach out to other agencies, provide homogeneous access, and provide seamless analytics support and bioinformatics tools through community engagement.
CloudMan as a platform for tool, data, and analysis distribution
2012-01-01
Background: Cloud computing provides an infrastructure that facilitates large scale computational analysis in a scalable, democratized fashion. However, in this context it is difficult to ensure sharing of an analysis environment and associated data in a scalable and precisely reproducible way. Results: CloudMan (usecloudman.org) enables individual researchers to easily deploy, customize, and share their entire cloud analysis environment, including data, tools, and configurations. Conclusions: With the enabled customization and sharing of instances, CloudMan can be used as a platform for collaboration. The presented solution improves accessibility of cloud resources, tools, and data to the level of an individual researcher and contributes toward reproducibility and transparency of research solutions. PMID:23181507
Democratizing Authority in the Built Environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andersen, Michael P; Kolb, John; Chen, Kaifei
Operating systems and applications in the built environment have relied upon central authorization and management mechanisms which restrict their scalability, especially with respect to administrative overhead. We propose a new set of primitives encompassing syndication, security, and service execution that unifies the management of applications and services across the built environment, while enabling participants to individually delegate privilege across multiple administrative domains with no loss of security or manageability. We show how to leverage a decentralized authorization syndication platform to extend the design of building operating systems beyond the single administrative domain of a building. The authorization system leveraged is based on blockchain smart contracts to permit decentralized and democratized delegation of authorization without central trust. Upon this, a publish/subscribe syndication tier and a containerized service execution environment are constructed. Combined, these mechanisms solve problems of delegation, federation, device protection and service execution that arise throughout the built environment. We leverage a high-fidelity city-scale emulation to verify the scalability of the authorization tier, and briefly describe a prototypical democratized operating system for the built environment using this foundation.
Scalable Integrated Multi-Mission Support System Simulator Release 3.0
NASA Technical Reports Server (NTRS)
Kim, John; Velamuri, Sarma; Casey, Taylor; Bemann, Travis
2012-01-01
The Scalable Integrated Multi-mission Support System (SIMSS) is a tool that performs a variety of test activities related to spacecraft simulations and ground segment checks. SIMSS is a distributed, component-based, plug-and-play client-server system useful for performing real-time monitoring and communications testing. SIMSS runs on one or more workstations and is designed to be user-configurable or to use predefined configurations for routine operations. SIMSS consists of more than 100 modules that can be configured to create, receive, process, and/or transmit data. The SIMSS/GMSEC innovation is intended to provide missions with a low-cost solution for implementing their ground systems, as well as significantly reducing a mission's integration time and risk.
High-performance multiprocessor architecture for a 3-D lattice gas model
NASA Technical Reports Server (NTRS)
Lee, F.; Flynn, M.; Morf, M.
1991-01-01
The lattice gas method has recently emerged as a promising discrete particle simulation method in areas such as fluid dynamics. We present a very high-performance scalable multiprocessor architecture, called ALGE, proposed for the simulation of a realistic 3-D lattice gas model, Henon's 24-bit FCHC isometric model. Each of these VLSI processors is as powerful as a CRAY-2 for this application. ALGE is scalable in the sense that it achieves linear speedup for both fixed and increasing problem sizes with more processors. The core computation of a lattice gas model consists of many repetitions of two alternating phases: particle collision and propagation. Functional decomposition by symmetry group and virtual move are the respective keys to efficient implementation of collision and propagation.
A scalable neural chip with synaptic electronics using CMOS integrated memristors.
Cruz-Albrecht, Jose M; Derosier, Timothy; Srinivasa, Narayan
2013-09-27
The design and simulation of a scalable neural chip with synaptic electronics using nanoscale memristors fully integrated with complementary metal-oxide-semiconductor (CMOS) is presented. The circuit consists of integrate-and-fire neurons and synapses with spike-timing dependent plasticity (STDP). The synaptic conductance values can be stored in memristors with eight levels, and the topology of connections between neurons is reconfigurable. The circuit has been designed using a 90 nm CMOS process with via connections to on-chip post-processed memristor arrays. The design has about 16 million CMOS transistors and 73 728 integrated memristors. We provide circuit level simulations of the entire chip performing neuronal and synaptic computations that result in biologically realistic functional behavior.
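For illustration only, a pair-based STDP rule quantized to eight conductance levels might look like the following Python sketch; the time constants, amplitudes, and function name are placeholders, not the chip's circuit parameters:

```python
import numpy as np

N_LEVELS = 8  # discrete conductance levels, mirroring the eight memristor levels

def stdp_update(level, dt_spike, a_plus=1.0, a_minus=1.0, tau=20e-3):
    """Pair-based STDP quantized to N_LEVELS. dt_spike = t_post - t_pre in
    seconds; parameter values are illustrative."""
    if dt_spike > 0:                       # pre before post: potentiation
        delta = a_plus * np.exp(-dt_spike / tau)
    else:                                  # post before pre: depression
        delta = -a_minus * np.exp(dt_spike / tau)
    # round to the nearest available level and clip to the hardware range
    return int(np.clip(round(level + delta), 0, N_LEVELS - 1))
```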
Leveraging social networks for understanding the evolution of epidemics
2011-01-01
Background: To understand how infectious agents disseminate throughout a population it is essential to capture the social model in a realistic manner. This paper presents a novel approach to modeling the propagation of the influenza virus throughout a realistic interconnection network based on actual individual interactions which we extract from online social networks. The advantage is that these networks can be extracted from existing sources which faithfully record interactions between people in their natural environment. We additionally allow modeling the characteristics of each individual as well as customizing his daily interaction patterns by making them time-dependent. Our purpose is to understand how the infection spreads depending on the structure of the contact network and the individuals who introduce the infection in the population. This would help public health authorities to respond more efficiently to epidemics. Results: We implement a scalable, fully distributed simulator and validate the epidemic model by comparing the simulation results against the data in the 2004-2005 New York State Department of Health Report (NYSDOH), with similar temporal distribution results for the number of infected individuals. We analyze the impact of different types of connection models on the virus propagation. Lastly, we analyze and compare the effects of adopting several different vaccination policies, some of them based on individual characteristics (such as age) while others target the super-connectors in the social model. Conclusions: This paper presents an approach to modeling the propagation of the influenza virus via a realistic social model based on actual individual interactions extracted from online social networks. We implemented a scalable, fully distributed simulator and we analyzed both the dissemination of the infection and the effect of different vaccination policies on the progress of the epidemics. The epidemic values predicted by our simulator match real data from NYSDOH. Our results show that our simulator can be a useful tool in understanding the differences in the evolution of an epidemic within populations with different characteristics and can provide guidance with regard to which, and how many, individuals should be vaccinated to slow down the virus propagation and reduce the number of infections. PMID:22784620
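A toy, single-process SIR spread over a contact graph conveys the basic mechanism that the distributed simulator scales up; the transmission probability, recovery time, and function names are illustrative assumptions, and networkx is used only for the graph structure:

```python
import random
import networkx as nx

def simulate_sir(G, patient_zero, p_transmit=0.05, t_recover=5, steps=100):
    """Discrete-time SIR spread over a contact graph G (toy sketch only)."""
    state = {n: "S" for n in G}           # Susceptible, Infected, or Recovered
    state[patient_zero] = "I"
    infected_since = {patient_zero: 0}
    infected_counts = []
    for t in range(steps):
        newly_infected = []
        for node, s in state.items():
            if s != "I":
                continue
            for nbr in G.neighbors(node):  # attempt transmission along contacts
                if state[nbr] == "S" and random.random() < p_transmit:
                    newly_infected.append(nbr)
            if t - infected_since[node] >= t_recover:
                state[node] = "R"
        for n in newly_infected:
            if state[n] == "S":
                state[n] = "I"
                infected_since[n] = t
        infected_counts.append(sum(1 for s in state.values() if s == "I"))
    return infected_counts

# Example usage on a synthetic small-world contact network:
# history = simulate_sir(nx.watts_strogatz_graph(1000, 6, 0.1), patient_zero=0)
```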
NASA Astrophysics Data System (ADS)
Hu, Haoyue; Eberhard, Peter
2017-10-01
Process simulations of conduction mode laser welding are performed using the meshless Lagrangian smoothed particle hydrodynamics (SPH) method. The solid phase is modeled based on the governing equations in thermoelasticity. For the liquid phase, surface tension effects are taken into account to simulate the melt flow in the weld pool, including the Marangoni force caused by a temperature-dependent surface tension gradient. A non-isothermal solid-liquid phase transition with the release or absorption of additional energy known as the latent heat of fusion is considered. The major heat transfer through conduction is modeled, whereas heat convection and radiation are neglected. The energy input from the laser beam is modeled as a Gaussian heat source acting on the initial material surface. The developed model is implemented in Pasimodo. Numerical results obtained with the model are presented for laser spot welding and seam welding of aluminum and iron. The change of process parameters like welding speed and laser power, and their effects on weld dimensions are investigated. Furthermore, simulations may be useful to obtain the threshold for deep penetration welding and to assess the overall welding quality. A scalability and performance analysis of the implemented SPH algorithm in Pasimodo is run in a shared memory environment. The analysis reveals the potential of large welding simulations on multi-core machines.
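The Gaussian surface heat source can be sketched with a commonly used axisymmetric form; the normalization and the parameters (laser power, absorptivity, beam radius) are illustrative assumptions rather than the paper's exact model:

```python
import numpy as np

def gaussian_surface_flux(x, y, x0, y0, power, absorptivity, r0):
    """Axisymmetric Gaussian surface heat flux (W/m^2), a common choice for
    conduction-mode laser welding models: q = 2*A*P/(pi*r0^2) * exp(-2 r^2 / r0^2),
    where (x0, y0) is the beam center and r0 the beam radius. Illustrative only."""
    r2 = (x - x0) ** 2 + (y - y0) ** 2
    return 2.0 * absorptivity * power / (np.pi * r0 ** 2) * np.exp(-2.0 * r2 / r0 ** 2)
```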
Engineering PFLOTRAN for Scalable Performance on Cray XT and IBM BlueGene Architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mills, Richard T; Sripathi, Vamsi K; Mahinthakumar, Gnanamanika
We describe PFLOTRAN - a code for simulation of coupled hydro-thermal-chemical processes in variably saturated, non-isothermal, porous media - and the approaches we have employed to obtain scalable performance on some of the largest scale supercomputers in the world. We present detailed analyses of I/O and solver performance on Jaguar, the Cray XT5 at Oak Ridge National Laboratory, and Intrepid, the IBM BlueGene/P at Argonne National Laboratory, that have guided our choice of algorithms.
A Mobile IPv6 based Distributed Mobility Management Mechanism of Mobile Internet
NASA Astrophysics Data System (ADS)
Yan, Shi; Jiayin, Cheng; Shanzhi, Chen
A flatter architecture is one of the trends of the mobile Internet. Traditional centralized mobility management mechanisms face challenges such as scalability and UE reachability. A MIPv6-based distributed mobility management mechanism is proposed in this paper. Some important network entities and signaling procedures are defined. UE reachability is also considered in this paper through an extension to DNS servers. Simulation results show that the proposed approach can overcome the scalability problem of the centralized scheme.
In situ visualization for large-scale combustion simulations.
Yu, Hongfeng; Wang, Chaoli; Grout, Ray W; Chen, Jacqueline H; Ma, Kwan-Liu
2010-01-01
As scientific supercomputing moves toward petascale and exascale levels, in situ visualization stands out as a scalable way for scientists to view the data their simulations generate. This full picture is crucial particularly for capturing and understanding highly intermittent transient phenomena, such as ignition and extinction events in turbulent combustion.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aderholdt, Ferrol; Caldwell, Blake A.; Hicks, Susan Elaine
High performance computing environments are often used for a wide variety of workloads ranging from simulation, data transformation and analysis, and complex workflows to name just a few. These systems may process data at various security levels but in so doing are often enclaved at the highest security posture. This approach places significant restrictions on the users of the system even when processing data at a lower security level and exposes data at higher levels of confidentiality to a much broader population than otherwise necessary. The traditional approach of isolation, while effective in establishing security enclaves, poses significant challenges for the use of shared infrastructure in HPC environments. This report details the current state-of-the-art in reconfigurable network enclaving through Software Defined Networking (SDN) and Network Function Virtualization (NFV) and their applicability to secure enclaves in HPC environments. SDN and NFV methods are based on a solid foundation of system-wide virtualization. Their purpose is straightforward: the system administrator can deploy networks that are more amenable to customer needs and at the same time achieve increased scalability, making it easier to increase overall capacity as needed without negatively affecting functionality. The network administration of both the server system and the virtual sub-systems is simplified, allowing control of the infrastructure through well-defined APIs (Application Programming Interfaces). While SDN and NFV technologies offer significant promise in meeting these goals, they also provide the ability to address a significant component of the multi-tenant challenge in HPC environments, namely resource isolation. Traditional HPC systems are built upon scalable high-performance networking technologies designed to meet specific application requirements. Dynamic isolation of resources within these environments has remained difficult to achieve. SDN and NFV methodology provide us with relevant concepts and available open, standards-based APIs that isolate compute and storage resources within an otherwise common networking infrastructure. Additionally, the integration of the networking APIs within larger system frameworks such as OpenStack provides the tools necessary to establish isolated enclaves dynamically, allowing the benefits of HPC while providing a controlled security structure surrounding these systems.
Vacuum Deployment and Testing of a 4-Quadrant Scalable Inflatable Solar Sail System
NASA Technical Reports Server (NTRS)
Lichodziejewski, David; Derbes, Billy; Galena, Daisy; Friese, Dave
2005-01-01
Solar sails reflect photons streaming from the sun and transfer momentum to the sail. The thrust, though small, is continuous and acts for the life of the mission without the need for propellant. Recent advances in materials and ultra-low mass gossamer structures have enabled a host of useful missions utilizing solar sail propulsion. The team of L'Garde, Jet Propulsion Laboratories, Ball Aerospace, and Langley Research Center, under the direction of the NASA In-Space Propulsion office, has been developing a scalable solar sail configuration to address NASA's future space propulsion needs. The baseline design currently in development and testing was optimized around the 1 AU solar sentinel mission. Featuring inflatably deployed, sub-T(sub g) rigidized beam components, the 10,000 sq m sail and support structure weighs only 47.5 kg, including margin, yielding an areal density of 4.8 g/sq m. The striped sail architecture, net/membrane sail design, and L'Garde's conical boom deployment technique allow scalability without high mass penalties. This same structural concept can be scaled to meet and exceed the requirements of a number of other useful NASA missions. This paper discusses the interim accomplishments of phase 3 of a 3-phase NASA program to advance the technology readiness level (TRL) of the solar sail system from 3 toward 6 in 2005. Under earlier phases of the program, many test articles were fabricated and tested successfully. Most notably, an unprecedented 4-quadrant 10 m solar sail ground test article was fabricated, subjected to launch environment tests, and successfully deployed under simulated space conditions at NASA Plum Brook's 30 m vacuum facility. Phase 2 of the program saw much development and testing of this design, validating assumptions, mass estimates, and predicted mission scalability. Under Phase 3, a much larger 20 m square test article, including a subscale vane, has been fabricated and tested. A 20 m system ambient deployment has been successfully conducted after enduring Delta-2 launch environment testing. The program will culminate in a vacuum deployment of a 20 m subscale test article at NASA Glenn's Plum Brook 30 m vacuum test facility to bring the TRL as close to 6 as possible in 1 g. This focused program will pave the way for a flight experiment of this highly efficient space propulsion technology.
A De-centralized Scheduling and Load Balancing Algorithm for Heterogeneous Grid Environments
NASA Technical Reports Server (NTRS)
Arora, Manish; Das, Sajal K.; Biswas, Rupak
2002-01-01
In the past two decades, numerous scheduling and load balancing techniques have been proposed for locally distributed multiprocessor systems. However, they all suffer from significant deficiencies when extended to a Grid environment: some use a centralized approach that renders the algorithm unscalable, while others assume the overhead involved in searching for appropriate resources to be negligible. Furthermore, classical scheduling algorithms do not consider a Grid node to be N-resource rich and merely work towards maximizing the utilization of one of the resources. In this paper, we propose a new scheduling and load balancing algorithm for a generalized Grid model of N-resource nodes that not only takes into account the node and network heterogeneity, but also considers the overhead involved in coordinating among the nodes. Our algorithm is decentralized, scalable, and overlaps the node coordination time with that of the actual processing of ready jobs, thus saving valuable clock cycles needed for making decisions. The proposed algorithm is studied by conducting simulations using the Message Passing Interface (MPI) paradigm.
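The core idea of weighing all N resources of a node, together with the cost of coordinating with it, can be illustrated with a small sketch. The scoring function, weights, and node attributes below are hypothetical stand-ins, not the algorithm proposed in the paper.

```python
# Illustrative sketch (not the paper's algorithm): score Grid nodes by a
# composite of their N resource utilizations plus an estimated coordination
# overhead, and dispatch the next ready job to the least-loaded node.
from dataclasses import dataclass
from typing import List

@dataclass
class Node:
    name: str
    utilization: List[float]   # one entry per resource (CPU, memory, disk, ...)
    coord_latency_s: float     # estimated cost of coordinating with this node

def composite_load(node: Node, weights: List[float], overhead_weight: float) -> float:
    """Weighted sum over all N resources, penalized by coordination overhead."""
    usage = sum(w * u for w, u in zip(weights, node.utilization))
    return usage + overhead_weight * node.coord_latency_s

def pick_target(nodes: List[Node], weights: List[float], overhead_weight: float = 0.1) -> Node:
    # The next ready job goes to the node with the lowest composite load.
    return min(nodes, key=lambda n: composite_load(n, weights, overhead_weight))

if __name__ == "__main__":
    nodes = [Node("a", [0.9, 0.2, 0.4], 0.05), Node("b", [0.3, 0.6, 0.5], 0.20)]
    print(pick_target(nodes, weights=[0.5, 0.3, 0.2]).name)
```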
A De-Centralized Scheduling and Load Balancing Algorithm for Heterogeneous Grid Environments
NASA Technical Reports Server (NTRS)
Arora, Manish; Das, Sajal K.; Biswas, Rupak; Biegel, Bryan (Technical Monitor)
2002-01-01
In the past two decades, numerous scheduling and load balancing techniques have been proposed for locally distributed multiprocessor systems. However, they all suffer from significant deficiencies when extended to a Grid environment: some use a centralized approach that renders the algorithm unscalable, while others assume the overhead involved in searching for appropriate resources to be negligible. Furthermore, classical scheduling algorithms do not consider a Grid node to be N-resource rich and merely work towards maximizing the utilization of one of the resources. In this paper, we propose a new scheduling and load balancing algorithm for a generalized Grid model of N-resource nodes that not only takes into account node and network heterogeneity, but also considers the overhead involved in coordinating among the nodes. Our algorithm is de-centralized, scalable, and overlaps the node coordination time with that of the actual processing of ready jobs, thus saving valuable clock cycles needed for making decisions. The proposed algorithm is studied by conducting simulations using the Message Passing Interface (MPI) paradigm.
Tezaur, Irina K.; Tuminaro, Raymond S.; Perego, Mauro; ...
2015-01-01
We examine the scalability of the recently developed Albany/FELIX finite-element based code for the first-order Stokes momentum balance equations for ice flow. We focus our analysis on the performance of two possible preconditioners for the iterative solution of the sparse linear systems that arise from the discretization of the governing equations: (1) a preconditioner based on the incomplete LU (ILU) factorization, and (2) a recently-developed algebraic multigrid (AMG) preconditioner, constructed using the idea of semi-coarsening. A strong scalability study on a realistic, high resolution Greenland ice sheet problem reveals that, for a given number of processor cores, the AMG preconditioner results in faster linear solve times but the ILU preconditioner exhibits better scalability. In addition, a weak scalability study is performed on a realistic, moderate resolution Antarctic ice sheet problem, a substantial fraction of which contains floating ice shelves, making it fundamentally different from the Greenland ice sheet problem. We show that as the problem size increases, the performance of the ILU preconditioner deteriorates whereas the AMG preconditioner maintains scalability. This is because the linear systems are extremely ill-conditioned in the presence of floating ice shelves, and the ill-conditioning has a greater negative effect on the ILU preconditioner than on the AMG preconditioner.
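As a concrete illustration of the first strategy, the sketch below sets up an ILU-preconditioned GMRES solve with SciPy on a stand-in sparse system; it is not the Albany/FELIX solver stack, and the matrix, drop tolerance, and fill factor are illustrative assumptions only.

```python
# Minimal sketch of an ILU-preconditioned Krylov solve (not Albany/FELIX).
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 1000
# Hypothetical sparse system standing in for the discretized Stokes equations.
A = sp.diags([-1.0, 2.0001, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

ilu = spla.spilu(A, drop_tol=1e-5, fill_factor=10)      # incomplete LU factorization
M = spla.LinearOperator(A.shape, matvec=ilu.solve)       # wrap it as a preconditioner
x, info = spla.gmres(A, b, M=M)
print("converged" if info == 0 else f"gmres info={info}",
      "residual norm:", np.linalg.norm(A @ x - b))
```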
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kleese van Dam, Kerstin; Lansing, Carina S.; Elsethagen, Todd O.
2014-01-28
Modern workflow systems enable scientists to run ensemble simulations at unprecedented scales and levels of complexity, allowing them to study system sizes previously impossible to achieve due to the inherent resource requirements of the modeling work. However, as a result of these new capabilities, the science teams suddenly also face unprecedented data volumes that they are unable to analyze with their existing tools and methodologies in a timely fashion. In this paper we describe the ongoing development work to create an integrated data-intensive scientific workflow and analysis environment that offers researchers the ability to easily create and execute complex simulation studies and provides them with different scalable methods to analyze the resulting data volumes. The integration of simulation and analysis environments is hereby not only a question of ease of use, but supports fundamental functions in the correlated analysis of simulation input, execution details, and derived results for multi-variant, complex studies. To this end the team extended and integrated the existing capabilities of the Velo data management and analysis infrastructure, the MeDICi data-intensive workflow system, and RHIPE, the R-for-Hadoop version of the well-known statistics package, as well as developing a new visual analytics interface for result exploitation by multi-domain users. The capabilities of the new environment are demonstrated on a use case focused on the Pacific Northwest National Laboratory (PNNL) building energy team, showing how they were able to take their previously local-scale simulations to a nationwide level by utilizing data-intensive computing techniques not only for their modeling work, but also for the subsequent analysis of their modeling results. As part of the PNNL research initiative PRIMA (Platform for Regional Integrated Modeling and Analysis), the team performed an initial 3-year study of building energy demands for the US Eastern Interconnect domain, which they are now planning to extend to predict the demand for the complete century. The initial study raised their data demands from a few GB to 400 GB for the 3-year study, with tens of TB expected for the full century.
Fabrication of Scalable Indoor Light Energy Harvester and Study for Agricultural IoT Applications
NASA Astrophysics Data System (ADS)
Watanabe, M.; Nakamura, A.; Kunii, A.; Kusano, K.; Futagawa, M.
2015-12-01
A scalable indoor light energy harvester was fabricated by microelectromechanical system (MEMS) and printing hybrid technology and evaluated for agricultural IoT applications under different environmental input power density conditions, such as outdoor farming under the sun, greenhouse farming under scattered lighting, and a plant factory under LEDs. We fabricated and evaluated a dye-sensitized-type solar cell (DSC) as a low-cost and “scalable” optical harvester device. We developed a transparent conductive oxide (TCO)-less process with a honeycomb metal mesh substrate fabricated by MEMS technology. In terms of the electrical and optical properties, we achieved scalable harvester output power by cell area sizing. Second, we evaluated the dependence of the input power scalable characteristics on the input light intensity, spectrum distribution, and light inlet direction angle, because the harvested environmental input power is unstable. The TiO2 fabrication relied on nanoimprint technology, which was designed for optical optimization and fabrication, and we confirmed that the harvesters are robust to a variety of environments. Finally, we studied optical energy harvesting applications for agricultural IoT systems. These scalable indoor light harvesters could be used in many applications and situations in smart agriculture.
Iterative Integration of Visual Insights during Scalable Patent Search and Analysis.
Koch, S; Bosch, H; Giereth, M; Ertl, T
2011-05-01
Patents are of growing importance in current economic markets. Analyzing patent information has, therefore, become a common task for many interest groups. As a prerequisite for patent analysis, extensive search for relevant patent information is essential. Unfortunately, the complexity of patent material inhibits a straightforward retrieval of all relevant patent documents and leads to iterative, time-consuming approaches in practice. The sheer amount of patent data to be analyzed already poses challenges with respect to scalability. Further scalability issues arise concerning the diversity of users and the large variety of analysis tasks. With PatViz, a system for the interactive analysis of patent information has been developed that addresses scalability at various levels. PatViz provides a visual environment allowing for interactive reintegration of insights into subsequent search iterations, thereby bridging the gap between search and analytic processes. Because of its extensibility, we expect that the approach we have taken can be employed in different problem domains that require high-quality search results regarding their completeness.
Rocinante, a virtual collaborative visualizer
DOE Office of Scientific and Technical Information (OSTI.GOV)
McDonald, M.J.; Ice, L.G.
1996-12-31
With the goal of improving the ability of people around the world to share the development and use of intelligent systems, Sandia National Laboratories' Intelligent Systems and Robotics Center is developing new Virtual Collaborative Engineering (VCE) and Virtual Collaborative Control (VCC) technologies. A key area of VCE and VCC research is in shared visualization of virtual environments. This paper describes a Virtual Collaborative Visualizer (VCV), named Rocinante, that Sandia developed for VCE and VCC applications. Rocinante allows multiple participants to simultaneously view dynamic geometrically-defined environments. Each viewer can exclude extraneous detail or include additional information in the scene as desired. Shared information can be saved and later replayed in a stand-alone mode. Rocinante automatically scales visualization requirements with computer system capabilities. Models with 30,000 polygons and 4 Megabytes of texture display at 12 to 15 frames per second (fps) on an SGI Onyx and at 3 to 8 fps (without texture) on Indigo 2 Extreme computers. In its networked mode, Rocinante synchronizes its local geometric model with remote simulators and sensory systems by monitoring data transmitted through UDP packets. Rocinante's scalability and performance make it an ideal VCC tool. Users throughout the country can monitor robot motions and the thinking behind their motion planners and simulators.
Adamovich, Sergei; Fluet, Gerard G.; Merians, Alma S.; Mathai, Abraham; Qiu, Qinyin
2010-01-01
Current neuroscience has identified several constructs to increase the effectiveness of upper extremity rehabilitation. One is the use of progressive, skill acquisition-oriented training. Another approach emphasizes the use of bilateral activities. Building on these principles, this paper describes the design and feasibility testing of a robotic / virtual environment system designed to train the arm of persons who have had strokes. The system provides a variety of assistance modes, scalable workspaces and hand-robot interfaces allowing persons with strokes to train multiple joints in three dimensions. The simulations utilize assistance algorithms that adjust task difficulty both online and offline in relation to subject performance. Several distinctive haptic effects have been incorporated into the simulations. An adaptive master-slave relationship between the unimpaired and impaired arm encourages active movement of the subject's hemiparetic arm during a bimanual task. Adaptive anti-gravity support and damping stabilize the arm during virtual reaching and placement tasks. An adaptive virtual spring provides assistance to complete the movement if the subject is unable to complete the task in time. Finally, haptically rendered virtual objects help to shape the movement trajectory during a virtual placement task. A proof of concept study demonstrated this system to be safe, feasible and worthy of further study. PMID:19666345
Scalable Online Network Modeling and Simulation
Szymanski, Boleslaw; Kalyanaraman, Shivkumar; Sikdar, Biplab; Carothers, Christopher
2005-08-01
[Report documentation residue; the recoverable content describes the project's aims: evaluating performance over a wide range of parameter values (parameter sensitivity), understanding protocol stability and dynamics, and studying feature interactions.]
Polarizable Molecular Dynamics in a Polarizable Continuum Solvent
Lipparini, Filippo; Lagardère, Louis; Raynaud, Christophe; Stamm, Benjamin; Cancès, Eric; Mennucci, Benedetta; Schnieders, Michael; Ren, Pengyu; Maday, Yvon; Piquemal, Jean-Philip
2015-01-01
We present for the first time scalable polarizable molecular dynamics (MD) simulations within a polarizable continuum solvent with molecular shape cavities and exact solution of the mutual polarization. The key ingredients are a very efficient algorithm for solving the equations associated with the polarizable continuum, in particular, the domain decomposition Conductor-like Screening Model (ddCOSMO), a rigorous coupling of the continuum with the polarizable force field achieved through a robust variational formulation, and an effective strategy to solve the coupled equations. The coupling of ddCOSMO with non-variational force fields, including AMOEBA, is also addressed. The MD simulations are feasible, for real-life systems, on standard cluster nodes; a scalable parallel implementation allows for further speed-up in the context of a newly developed module in Tinker, named Tinker-HP. NVE simulations are stable and long-term energy conservation can be achieved. This paper is focused on the methodological developments, on the analysis of the algorithm, and on the stability of the simulations; a proof-of-concept application is also presented to attest to the possibilities of this newly developed technique. PMID:26516318
Communication Simulations for Power System Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fuller, Jason C.; Ciraci, Selim; Daily, Jeffrey A.
2013-05-29
New smart grid technologies and concepts, such as dynamic pricing, demand response, dynamic state estimation, and wide area monitoring, protection, and control, are expected to require considerable communication resources. As the cost of retrofit can be high, future power grids will require the integration of high-speed, secure connections with legacy communication systems, while still providing adequate system control and security. While considerable work has been performed to create co-simulators for the power domain with load models and market operations, limited work has been performed in integrating communications directly into a power domain solver. The simulation of communication and power systems will become more important as the two systems become more inter-related. This paper will discuss ongoing work at Pacific Northwest National Laboratory to create a flexible, high-speed power and communication system co-simulator for smart grid applications. The framework for the software will be described, including architecture considerations for modular, high performance computing and large-scale scalability (serialization, load balancing, partitioning, cross-platform support, etc.). The current simulator supports the ns-3 (telecommunications) and GridLAB-D (distribution systems) simulators. Ongoing and future work will be described, including planned future expansions for a traditional transmission solver. A test case using the co-simulator, utilizing a transactive demand response system created for the Olympic Peninsula and AEP gridSMART demonstrations, requiring two-way communication between distributed and centralized market devices, will be used to demonstrate the value and intended purpose of the co-simulation environment.
Stable scalable control of soliton propagation in broadband nonlinear optical waveguides
NASA Astrophysics Data System (ADS)
Peleg, Avner; Nguyen, Quan M.; Huynh, Toan T.
2017-02-01
We develop a method for achieving scalable transmission stabilization and switching of N colliding soliton sequences in optical waveguides with broadband delayed Raman response and narrowband nonlinear gain-loss. We show that dynamics of soliton amplitudes in N-sequence transmission is described by a generalized N-dimensional predator-prey model. Stability and bifurcation analysis for the predator-prey model are used to obtain simple conditions on the physical parameters for robust transmission stabilization as well as on-off and off-on switching of M out of N soliton sequences. Numerical simulations for single-waveguide transmission with a system of N coupled nonlinear Schrödinger equations with 2 ≤ N ≤ 4 show excellent agreement with the predator-prey model's predictions and stable propagation over significantly larger distances compared with other broadband nonlinear single-waveguide systems. Moreover, stable on-off and off-on switching of multiple soliton sequences and stable multiple transmission switching events are demonstrated by the simulations. We discuss the reasons for the robustness and scalability of transmission stabilization and switching in waveguides with broadband delayed Raman response and narrowband nonlinear gain-loss, and explain their advantages compared with other broadband nonlinear waveguides.
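A schematic of the reduced amplitude dynamics referred to above can be written as a generalized N-dimensional predator-prey (Lotka-Volterra) system; the symbols η_j, g_j, and c_jk below are illustrative stand-ins for the paper's Raman- and gain-loss-dependent quantities, which are not reproduced here.

```latex
% Schematic generalized predator-prey form for the N soliton amplitudes
% \eta_j(z); coefficients are placeholders, not the paper's expressions.
\frac{d\eta_j}{dz} \;=\; \eta_j\Bigl( g_j + \sum_{k\neq j} c_{jk}\,\eta_k \Bigr),
\qquad j = 1,\dots,N .
```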
NASA Technical Reports Server (NTRS)
Hanebutte, Ulf R.; Joslin, Ronald D.; Zubair, Mohammad
1994-01-01
The implementation and the performance of a parallel spatial direct numerical simulation (PSDNS) code are reported for the IBM SP1 supercomputer. The spatially evolving disturbances associated with laminar-to-turbulent transition in three-dimensional boundary-layer flows are computed with the PSDNS code. By remapping the distributed data structure during the course of the calculation, optimized serial library routines can be utilized that substantially increase the computational performance. Although the remapping incurs a high communication penalty, the parallel efficiency of the code remains above 40% for all performed calculations. By using appropriate compile options and optimized library routines, the serial code achieves 52-56 Mflops on a single node of the SP1 (45% of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a 'real world' simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP for the same simulation. The scalability information provides estimated computational costs that match the actual costs relative to changes in the number of grid points.
Scalable Performance Environments for Parallel Systems
NASA Technical Reports Server (NTRS)
Reed, Daniel A.; Olson, Robert D.; Aydt, Ruth A.; Madhyastha, Tara M.; Birkett, Thomas; Jensen, David W.; Nazief, Bobby A. A.; Totty, Brian K.
1991-01-01
As parallel systems expand in size and complexity, the absence of performance tools for these parallel systems exacerbates the already difficult problems of application program and system software performance tuning. Moreover, given the pace of technological change, we can no longer afford to develop ad hoc, one-of-a-kind performance instrumentation software; we need scalable, portable performance analysis tools. We describe an environment prototype based on the lessons learned from two previous generations of performance data analysis software. Our environment prototype contains a set of performance data transformation modules that can be interconnected in user-specified ways. It is the responsibility of the environment infrastructure to hide details of module interconnection and data sharing. The environment is written in C++ with the graphical displays based on X windows and the Motif toolkit. It allows users to interconnect and configure modules graphically to form an acyclic, directed data analysis graph. Performance trace data are represented in a self-documenting stream format that includes internal definitions of data types, sizes, and names. The environment prototype supports the use of head-mounted displays and sonic data presentation in addition to the traditional use of visual techniques.
Learning Analytics Platform, towards an Open Scalable Streaming Solution for Education
ERIC Educational Resources Information Center
Lewkow, Nicholas; Zimmerman, Neil; Riedesel, Mark; Essa, Alfred
2015-01-01
Next generation digital learning environments require delivering "just-in-time feedback" to learners and those who support them. Unlike traditional business intelligence environments, streaming data requires resilient infrastructure that can move data at scale from heterogeneous data sources, process the data quickly for use across…
Scalable, Finite Element Analysis of Electromagnetic Scattering and Radiation
NASA Technical Reports Server (NTRS)
Cwik, T.; Lou, J.; Katz, D.
1997-01-01
In this paper a method for simulating electromagnetic fields scattered from complex objects is reviewed; namely, an unstructured finite element code that does not use traditional mesh partitioning algorithms.
Simms, Andrew M; Toofanny, Rudesh D; Kehl, Catherine; Benson, Noah C; Daggett, Valerie
2008-06-01
Dynameomics is a project to investigate and catalog the native-state dynamics and thermal unfolding pathways of representatives of all protein folds using solvated molecular dynamics simulations, as described in the preceding paper. Here we introduce the design of the molecular dynamics data warehouse, a scalable, reliable repository that houses simulation data that vastly simplifies management and access. In the succeeding paper, we describe the development of a complementary multidimensional database. A single protein unfolding or native-state simulation can take weeks to months to complete, and produces gigabytes of coordinate and analysis data. Mining information from over 3000 completed simulations is complicated and time-consuming. Even the simplest queries involve writing intricate programs that must be built from low-level file system access primitives and include significant logic to correctly locate and parse data of interest. As a result, programs to answer questions that require data from hundreds of simulations are very difficult to write. Thus, organization and access to simulation data have been major obstacles to the discovery of new knowledge in the Dynameomics project. This repository is used internally and is the foundation of the Dynameomics portal site http://www.dynameomics.org. By organizing simulation data into a scalable, manageable and accessible form, we can begin to address substantial questions that move us closer to solving biomedical and bioengineering problems.
UTM Safely Enabling UAS Operations in Low-Altitude Airspace
NASA Technical Reports Server (NTRS)
Kopardekar, Parimal
2017-01-01
Conduct research, development, and testing to identify airspace operations requirements that enable large-scale visual and beyond-visual-line-of-sight UAS operations in low-altitude airspace, using a build-a-little, test-a-little strategy that progresses from remote to urban areas. Low density: no traffic management required, but an understanding of airspace constraints. Cooperative traffic management: understanding of airspace constraints and other operations. Manned and unmanned traffic management: scalable and heterogeneous operations. The UTM construct is consistent with the FAA's risk-based strategy. The UTM research platform is used for simulations and tests. UTM offers a path towards scalability.
UTM Safely Enabling UAS Operations in Low-Altitude Airspace
NASA Technical Reports Server (NTRS)
Kopardekar, Parimal H.
2016-01-01
Conduct research, development, and testing to identify airspace operations requirements that enable large-scale visual and beyond-visual-line-of-sight UAS operations in low-altitude airspace, using a build-a-little, test-a-little strategy that progresses from remote to urban areas. Low density: no traffic management required, but an understanding of airspace constraints. Cooperative traffic management: understanding of airspace constraints and other operations. Manned and unmanned traffic management: scalable and heterogeneous operations. The UTM construct is consistent with the FAA's risk-based strategy. The UTM research platform is used for simulations and tests. UTM offers a path towards scalability.
A channel differential EZW coding scheme for EEG data compression.
Dehkordi, Vahid R; Daou, Hoda; Labeau, Fabrice
2011-11-01
In this paper, a method is proposed to compress multichannel electroencephalographic (EEG) signals in a scalable fashion. Correlation between EEG channels is exploited through clustering using a k-means method. Representative channels for each of the clusters are encoded individually while other channels are encoded differentially, i.e., with respect to their respective cluster representatives. The compression is performed using the embedded zero-tree wavelet encoding adapted to 1-D signals. Simulations show that the scalable features of the scheme lead to a flexible quality/rate tradeoff, without requiring detailed EEG signal modeling.
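A minimal sketch of the channel-grouping and differential step described above, assuming hypothetical 32-channel data; the k-means grouping uses SciPy, the cluster representative is simply taken to be the first member of each cluster, and the EZW wavelet coding stage itself is omitted.

```python
# Sketch only: group correlated EEG channels, then code representatives
# directly and the remaining channels as differences (EZW stage not shown).
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(0)
eeg = rng.standard_normal((32, 4096))          # hypothetical 32-channel record

# Cluster channels by similarity of their normalized waveforms.
feats = (eeg - eeg.mean(axis=1, keepdims=True)) / eeg.std(axis=1, keepdims=True)
_, labels = kmeans2(feats, 4, minit="++")

streams = []
for c in np.unique(labels):
    members = np.where(labels == c)[0]
    rep = members[0]                           # representative channel: coded as-is
    streams.append(("intra", rep, eeg[rep]))
    for ch in members[1:]:                     # other channels: coded differentially
        streams.append(("diff", ch, eeg[ch] - eeg[rep]))

print(len(streams), "streams ready for wavelet/EZW encoding")
```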
Scalable printed electronics: an organic decoder addressing ferroelectric non-volatile memory.
Ng, Tse Nga; Schwartz, David E; Lavery, Leah L; Whiting, Gregory L; Russo, Beverly; Krusor, Brent; Veres, Janos; Bröms, Per; Herlogsson, Lars; Alam, Naveed; Hagel, Olle; Nilsson, Jakob; Karlsson, Christer
2012-01-01
Scalable circuits of organic logic and memory are realized using all-additive printing processes. A 3-bit organic complementary decoder is fabricated and used to read and write non-volatile, rewritable ferroelectric memory. The decoder-memory array is patterned by inkjet and gravure printing on flexible plastics. Simulation models for the organic transistors are developed, enabling circuit designs tolerant of the variations in printed devices. We explain the key design rules in fabrication of complex printed circuits and elucidate the performance requirements of materials and devices for reliable organic digital logic.
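For reference, the addressing function realized by a 3-bit decoder (each address asserts exactly one of eight word lines) can be written out directly; this sketch only captures the logic, not the printed organic implementation or its device physics.

```python
# Logical behavior of a 3-to-8 decoder: the 3-bit address selects one word line.
def decode3(a2: int, a1: int, a0: int) -> list[int]:
    index = (a2 << 2) | (a1 << 1) | a0
    return [1 if i == index else 0 for i in range(8)]

for addr in range(8):
    bits = (addr >> 2) & 1, (addr >> 1) & 1, addr & 1
    print(bits, decode3(*bits))
```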
OWL: A scalable Monte Carlo simulation suite for finite-temperature study of materials
NASA Astrophysics Data System (ADS)
Li, Ying Wai; Yuk, Simuck F.; Cooper, Valentino R.; Eisenbach, Markus; Odbadrakh, Khorgolkhuu
The OWL suite is a simulation package for performing large-scale Monte Carlo simulations. Its object-oriented, modular design enables it to interface with various external packages for energy evaluations. It is therefore applicable to study the finite-temperature properties for a wide range of systems: from simple classical spin models to materials where the energy is evaluated by ab initio methods. This scheme not only allows for the study of thermodynamic properties based on first-principles statistical mechanics, it also provides a means for massive, multi-level parallelism to fully exploit the capacity of modern heterogeneous computer architectures. We will demonstrate how improved strong and weak scaling is achieved by employing novel, parallel and scalable Monte Carlo algorithms, as well as the applications of OWL to a few selected frontier materials research problems. This research was supported by the Office of Science of the Department of Energy under contract DE-AC05-00OR22725.
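As an example of the "simple classical spin models" mentioned above, a plain Metropolis Monte Carlo sweep over a 2D Ising lattice is sketched below; this is a generic illustration, not OWL's interface or its parallel algorithms, and the lattice size and temperature are arbitrary.

```python
# Generic Metropolis Monte Carlo sweep for a 2D Ising model (illustration only).
import numpy as np

def metropolis_sweep(spins: np.ndarray, beta: float, rng: np.random.Generator) -> None:
    L = spins.shape[0]
    for _ in range(spins.size):
        i, j = rng.integers(L), rng.integers(L)
        nn = spins[(i+1) % L, j] + spins[(i-1) % L, j] + spins[i, (j+1) % L] + spins[i, (j-1) % L]
        dE = 2.0 * spins[i, j] * nn              # energy change if this spin flips
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1

rng = np.random.default_rng(1)
spins = rng.choice([-1, 1], size=(32, 32))
for _ in range(100):
    metropolis_sweep(spins, beta=0.44, rng=rng)
print("magnetization per spin:", spins.mean())
```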
Scalable Integrated Multi-Mission Support System (SIMSS) Simulator Release 2.0 for GMSEC
NASA Technical Reports Server (NTRS)
Kim, John; Velamuri, Sarma; Casey, Taylor; Bemann, Travis
2012-01-01
Scalable Integrated Multi-Mission Support System (SIMSS) Simulator Release 2.0 software is designed to perform a variety of test activities related to spacecraft simulations and ground segment checks. This innovation uses the existing SIMSS framework, which interfaces with the GMSEC (Goddard Mission Services Evolution Center) Application Programming Interface (API) Version 3.0 message middleware, and allows SIMSS to accept GMSEC standard messages via the GMSEC message bus service. SIMSS is a distributed, component-based, plug-and-play client-server system that is useful for performing real-time monitoring and communications testing. SIMSS runs on one or more workstations, and is designed to be user-configurable, or to use predefined configurations for routine operations. SIMSS consists of more than 100 modules that can be configured to create, receive, process, and/or transmit data. The SIMSS/GMSEC innovation is intended to provide missions with a low-cost solution for implementing their ground systems, as well as to significantly reduce a mission's integration time and risk.
AEGIS: a robust and scalable real-time public health surveillance system.
Reis, Ben Y; Kirby, Chaim; Hadden, Lucy E; Olson, Karen; McMurry, Andrew J; Daniel, James B; Mandl, Kenneth D
2007-01-01
In this report, we describe the Automated Epidemiological Geotemporal Integrated Surveillance system (AEGIS), developed for real-time population health monitoring in the state of Massachusetts. AEGIS provides public health personnel with automated near-real-time situational awareness of utilization patterns at participating healthcare institutions, supporting surveillance of bioterrorism and naturally occurring outbreaks. As real-time public health surveillance systems become integrated into regional and national surveillance initiatives, the challenges of scalability, robustness, and data security become increasingly prominent. A modular and fault tolerant design helps AEGIS achieve scalability and robustness, while a distributed storage model with local autonomy helps to minimize risk of unauthorized disclosure. The report includes a description of the evolution of the design over time in response to the challenges of a regional and national integration environment.
RF wave simulation for cold edge plasmas using the MFEM library
NASA Astrophysics Data System (ADS)
Shiraiwa, S.; Wright, J. C.; Bonoli, P. T.; Kolev, T.; Stowell, M.
2017-10-01
A newly developed generic electromagnetic (EM) simulation tool for modeling RF wave propagation in SOL plasmas is presented. The primary motivation of this development is to extend the domain partitioning approach for incorporating arbitrarily shaped SOL plasmas and antennas into the TORIC core ICRF solver, which was previously demonstrated in 2D geometry [S. Shiraiwa et al., "HISTORIC: extending core ICRF wave simulation to include realistic SOL plasmas," Nucl. Fusion, in press], to larger and more complicated simulations by including a 3D realistic antenna and integrating an RF rectified sheath potential model. Such an extension requires a scalable, high-fidelity 3D edge plasma wave simulation. We used the MFEM [
Parallel distributed, reciprocal Monte Carlo radiation in coupled, large eddy combustion simulations
NASA Astrophysics Data System (ADS)
Hunsaker, Isaac L.
Radiation is the dominant mode of heat transfer in high temperature combustion environments. Radiative heat transfer affects the gas and particle phases, including all the associated combustion chemistry. The radiative properties are in turn affected by the turbulent flow field. This bi-directional coupling of radiation-turbulence interactions poses a major challenge in creating parallel-capable, high-fidelity combustion simulations. In this work, a new model was developed in which reciprocal Monte Carlo radiation was coupled with a turbulent, large-eddy simulation combustion model. A technique wherein domain patches are stitched together was implemented to allow for scalable parallelism. The combustion model runs in parallel on a decomposed domain. The radiation model runs in parallel on a recomposed domain. The recomposed domain is stored on each processor after information sharing of the decomposed domain is handled via the message passing interface. Verification and validation testing of the new radiation model were favorable. Strong scaling analyses were performed on the Ember cluster and the Titan cluster for the CPU-radiation model and GPU-radiation model, respectively. The model demonstrated strong scaling to over 1,700 and 16,000 processing cores on Ember and Titan, respectively.
Osborn, Sarah; Zulian, Patrick; Benson, Thomas; ...
2018-01-30
This work describes a domain embedding technique between two nonmatching meshes used for generating realizations of spatially correlated random fields with applications to large-scale sampling-based uncertainty quantification. The goal is to apply the multilevel Monte Carlo (MLMC) method for the quantification of output uncertainties of PDEs with random input coefficients on general and unstructured computational domains. We propose a highly scalable, hierarchical sampling method to generate realizations of a Gaussian random field on a given unstructured mesh by solving a reaction–diffusion PDE with a stochastic right-hand side. The stochastic PDE is discretized using the mixed finite element method on an embedded domain with a structured mesh, and then, the solution is projected onto the unstructured mesh. This work describes implementation details on how to efficiently transfer data from the structured and unstructured meshes at coarse levels, assuming that this can be done efficiently on the finest level. We investigate the efficiency and parallel scalability of the technique for the scalable generation of Gaussian random fields in three dimensions. An application of the MLMC method is presented for quantifying uncertainties of subsurface flow problems. Here, we demonstrate the scalability of the sampling method with nonmatching mesh embedding, coupled with a parallel forward model problem solver, for large-scale 3D MLMC simulations with up to 1.9·10⁹ unknowns.
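For context, the multilevel Monte Carlo estimator referred to above is, in its generic form, the telescoping sum

```latex
% Generic MLMC telescoping estimator; the level-dependent sample counts N_\ell
% are chosen to balance sampling error against the cost of each level.
\mathbb{E}[Q_L] \;=\; \mathbb{E}[Q_0] + \sum_{\ell=1}^{L} \mathbb{E}\bigl[\,Q_\ell - Q_{\ell-1}\,\bigr],
\qquad
\widehat{Q}^{\mathrm{MLMC}} \;=\; \sum_{\ell=0}^{L} \frac{1}{N_\ell}\sum_{i=1}^{N_\ell}
\bigl(Q_\ell^{(i)} - Q_{\ell-1}^{(i)}\bigr),
\quad Q_{-1}\equiv 0 .
```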
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carothers, Christopher D.; Meredith, Jeremy S.; Blanco, Marc
Performance modeling of extreme-scale applications on accurate representations of potential architectures is critical for designing next generation supercomputing systems because it is impractical to construct prototype systems at scale with new network hardware in order to explore designs and policies. However, these simulations often rely on static application traces that can be difficult to work with because of their size and lack of flexibility to extend or scale up without rerunning the original application. To address this problem, we have created a new technique for generating scalable, flexible workloads from real applications, and we have implemented a prototype, called Durango, that combines a proven analytical performance modeling language, Aspen, with the massively parallel HPC network modeling capabilities of the CODES framework. Our models are compact, parameterized and representative of real applications with computation events. They are not resource intensive to create and are portable across simulator environments. We demonstrate the utility of Durango by simulating the LULESH application in the CODES simulation environment on several topologies and show that Durango is practical to use for simulation without loss of fidelity, as quantified by simulation metrics. During our validation of Durango's generated communication model of LULESH, we found that the original LULESH miniapp code had a latent bug where the MPI_Waitall operation was used incorrectly. This finding underscores the potential need for a tool such as Durango, beyond its benefits for flexible workload generation and modeling. Additionally, we demonstrate the efficacy of Durango's direct integration approach, which links Aspen into CODES as part of the running network simulation model. Here, Aspen generates the application-level computation timing events, which in turn drive the start of a network communication phase. Results show that Durango's performance scales well when executing both torus and dragonfly network models on up to 4K Blue Gene/Q nodes using 32K MPI ranks; Durango also avoids the overheads and complexities associated with extreme-scale trace files.
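Since the abstract highlights a latent misuse of MPI_Waitall found in the LULESH miniapp, the following mpi4py sketch shows the standard nonblocking exchange followed by a wait on all outstanding requests; it is a generic ring exchange, not LULESH's or Durango's code.

```python
# Generic nonblocking ring exchange: post Isend/Irecv, then Waitall before
# touching the receive buffer (the pattern whose misuse the abstract reports).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
right, left = (rank + 1) % size, (rank - 1) % size

sendbuf = np.full(4, rank, dtype="d")
recvbuf = np.empty(4, dtype="d")

requests = [comm.Isend(sendbuf, dest=right), comm.Irecv(recvbuf, source=left)]
MPI.Request.Waitall(requests)          # wait on *all* requests before using data
print(f"rank {rank} received from {left}: {recvbuf[0]}")
```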
Generic, scalable and decentralized fault detection for robot swarms.
Tarapore, Danesh; Christensen, Anders Lyhne; Timmis, Jon
2017-01-01
Robot swarms are large-scale multirobot systems with decentralized control, which means that each robot acts based only on local perception and on local coordination with neighboring robots. The decentralized approach to control confers a number of potential benefits. In particular, inherent scalability and robustness are often highlighted as key distinguishing features of robot swarms compared with systems that rely on traditional approaches to multirobot coordination. It has, however, been shown that swarm robotics systems are not always fault tolerant. To realize the robustness potential of robot swarms, it is thus essential to give systems the capacity to actively detect and accommodate faults. In this paper, we present a generic fault-detection system for robot swarms. We show how robots with limited and imperfect sensing capabilities are able to observe and classify the behavior of one another. In order to achieve this, the underlying classifier is an immune system-inspired algorithm that learns to distinguish between normal behavior and abnormal behavior online. Through a series of experiments, we systematically assess the performance of our approach in a detailed simulation environment. In particular, we analyze our system's capacity to correctly detect robots with faults, false positive rates, performance in a foraging task in which each robot exhibits a composite behavior, and performance under perturbations of the task environment. Results show that our generic fault-detection system is robust, that it is able to detect faults in a timely manner, and that it achieves a low false positive rate. The developed fault-detection system has the potential to enable long-term autonomy for robust multirobot systems, thus increasing the usefulness of robots for a diverse repertoire of upcoming applications in the area of distributed intelligent automation.
Scalable Domain Decomposed Monte Carlo Particle Transport
NASA Astrophysics Data System (ADS)
O'Brien, Matthew Joseph
In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation. The main algorithms we consider are:
• Domain decomposition of constructive solid geometry: enables extremely large calculations in which the background geometry is too large to fit in the memory of a single computational node.
• Load balancing: keeps the workload per processor as even as possible so the calculation runs efficiently.
• Global particle find: if particles are on the wrong processor, globally resolve their locations to the correct processor based on particle coordinate and background domain.
• Supporting algorithms: visualizing constructive solid geometry, sourcing particles, deciding when particle streaming communication is completed, and spatial redecomposition.
These algorithms are some of the most important parallel algorithms required for domain decomposed Monte Carlo particle transport. We demonstrate that our previous algorithms were not scalable, prove that our new algorithms are scalable, and run some of the algorithms on up to 2 million MPI processes on the Sequoia supercomputer.
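A schematic of the "global particle find" step, assuming a hypothetical one-dimensional slab decomposition and a pickled-object all-to-all exchange via mpi4py; the real algorithm operates on domain-decomposed constructive solid geometry rather than this toy layout.

```python
# Sketch: route stray particles to the rank owning the spatial slab that
# contains their x coordinate, using an all-to-all exchange.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

XMAX = 1.0                                   # global domain [0, XMAX), sliced evenly in x
def owner(x: float) -> int:
    return min(int(x / XMAX * size), size - 1)

rng = np.random.default_rng(rank)
particles = rng.random((100, 3))             # local particles that may belong elsewhere

# Bin particles by owning rank, exchange the bins, keep what we receive.
outgoing = [[] for _ in range(size)]
for p in particles:
    outgoing[owner(p[0])].append(p)
incoming = comm.alltoall(outgoing)           # one bucket received per source rank
local = [p for bucket in incoming for p in bucket]
print(f"rank {rank} now owns {len(local)} particles")
```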
Enabling parallel simulation of large-scale HPC network systems
Mubarak, Misbah; Carothers, Christopher D.; Ross, Robert B.; ...
2016-04-07
Here, with the increasing complexity of today's high-performance computing (HPC) architectures, simulation has become an indispensable tool for exploring the design space of HPC systems—in particular, networks. In order to make effective design decisions, simulations of these systems must possess the following properties: (1) have high accuracy and fidelity, (2) produce results in a timely manner, and (3) be able to analyze a broad range of network workloads. Most state-of-the-art HPC network simulation frameworks, however, are constrained in one or more of these areas. In this work, we present a simulation framework for modeling two important classes of networks used in today's IBM and Cray supercomputers: torus and dragonfly networks. We use the Co-Design of Multi-layer Exascale Storage Architecture (CODES) simulation framework to simulate these network topologies at a flit-level detail using the Rensselaer Optimistic Simulation System (ROSS) for parallel discrete-event simulation. Our simulation framework meets all the requirements of a practical network simulation and can assist network designers in design space exploration. First, it uses validated and detailed flit-level network models to provide an accurate and high-fidelity network simulation. Second, instead of relying on serial time-stepped or traditional conservative discrete-event simulations that limit simulation scalability and efficiency, we use the optimistic event-scheduling capability of ROSS to achieve efficient and scalable HPC network simulations on today's high-performance cluster systems. Third, our models give network designers a choice in simulating a broad range of network workloads, including HPC application workloads using detailed network traces, an ability that is rarely offered in parallel with high-fidelity network simulations.
Jeong, Seol Young; Jo, Hyeong Gon; Kang, Soon Ju
2014-03-21
A tracking service like asset management is essential in a dynamic hospital environment consisting of numerous mobile assets (e.g., wheelchairs or infusion pumps) that are continuously relocated throughout a hospital. The tracking service is accomplished based on the key technologies of an indoor location-based service (LBS), such as locating and monitoring multiple mobile targets inside a building in real time. An indoor LBS such as a tracking service entails numerous resource lookups being requested concurrently and frequently from several locations, as well as a network infrastructure requiring support for high scalability in indoor environments. A traditional centralized architecture needs to maintain a geographic map of the entire building or complex in its central server, which can cause low scalability and traffic congestion. This paper presents a self-organizing and fully distributed indoor mobile asset management (MAM) platform, and proposes a real-time architecture for multiple trackees (such as mobile assets) and trackers based on the proposed distributed platform. To verify the suggested platform, scalability performance with increasing numbers of concurrent lookups was evaluated in a real test bed. Tracking latency and traffic load ratio in the proposed tracking architecture were also evaluated.
NASA Astrophysics Data System (ADS)
Hegde, Ganesh; Povolotskyi, Michael; Kubis, Tillmann; Boykin, Timothy; Klimeck, Gerhard
2014-03-01
Semi-empirical Tight Binding (TB) is known to be a scalable and accurate atomistic representation for electron transport in realistically extended nano-scaled semiconductor devices that might contain millions of atoms. In this paper, an environment-aware and transferable TB model suitable for electronic structure and transport simulations in technologically relevant metals, metallic alloys, metal nanostructures, and metallic interface systems is described. Part I of this paper describes the development and validation of the new TB model. The new model incorporates intra-atomic diagonal and off-diagonal elements for implicit self-consistency and greater transferability across bonding environments. The dependence of the on-site energies on strain has been obtained by appealing to the Moments Theorem that links closed electron paths in the system to energy moments of angular-momentum-resolved local density of states obtained ab initio. The model matches self-consistent density functional theory electronic structure results for bulk face-centered cubic metals with and without strain, metallic alloys, metallic interfaces, and metallic nanostructures with high accuracy, and can be used in predictive electronic structure and transport problems in metallic systems at realistically extended length scales.
Declarative Knowledge Acquisition in Immersive Virtual Learning Environments
ERIC Educational Resources Information Center
Webster, Rustin
2016-01-01
The author investigated the interaction effect of immersive virtual reality (VR) in the classroom. The objective of the project was to develop and provide a low-cost, scalable, and portable VR system containing purposely designed and developed immersive virtual learning environments for the US Army. The purpose of the mixed design experiment was…
An FPGA-Based Massively Parallel Neuromorphic Cortex Simulator
Wang, Runchun M.; Thakur, Chetan S.; van Schaik, André
2018-01-01
This paper presents a massively parallel and scalable neuromorphic cortex simulator designed for simulating large and structurally connected spiking neural networks, such as complex models of various areas of the cortex. The main novelty of this work is the abstraction of a neuromorphic architecture into clusters represented by minicolumns and hypercolumns, analogously to the fundamental structural units observed in neurobiology. Without this approach, simulating large-scale fully connected networks needs prohibitively large memory to store look-up tables for point-to-point connections. Instead, we use a novel architecture, based on the structural connectivity in the neocortex, such that all the required parameters and connections can be stored in on-chip memory. The cortex simulator can be easily reconfigured for simulating different neural networks without any change in hardware structure by programming the memory. A hierarchical communication scheme allows one neuron to have a fan-out of up to 200 k neurons. As a proof-of-concept, an implementation on one Altera Stratix V FPGA was able to simulate 20 million to 2.6 billion leaky-integrate-and-fire (LIF) neurons in real time. We verified the system by emulating a simplified auditory cortex (with 100 million neurons). This cortex simulator achieved a low power dissipation of 1.62 μW per neuron. With the advent of commercially available FPGA boards, our system offers an accessible and scalable tool for the design, real-time simulation, and analysis of large-scale spiking neural networks. PMID:29692702
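For reference, a software version of the leaky integrate-and-fire update the hardware emulates is sketched below; the time constant, threshold, and input current are illustrative only, and the minicolumn/hypercolumn connectivity scheme is not modeled here.

```python
# Reference-style leaky integrate-and-fire (LIF) update (illustrative values).
import numpy as np

def lif_step(v, i_syn, dt=1e-3, tau=20e-3, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Advance membrane potentials v by one time step; return (v, spike mask)."""
    v = v + dt / tau * (v_rest - v + i_syn)    # leaky integration of synaptic input
    spiked = v >= v_thresh
    v = np.where(spiked, v_reset, v)           # reset neurons that fired
    return v, spiked

v = np.zeros(1000)
rng = np.random.default_rng(0)
for _ in range(200):
    v, spikes = lif_step(v, i_syn=rng.random(1000) * 2.5)
print("spikes in final step:", int(spikes.sum()))
```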
Scalable Unix tools on parallel processors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gropp, W.; Lusk, E.
1994-12-31
The introduction of parallel processors that run a separate copy of Unix on each processor has introduced new problems in managing the user's environment. This paper discusses some generalizations of common Unix commands for managing files (e.g., ls) and processes (e.g., ps) that are convenient and scalable. These basic tools, just like their Unix counterparts, are text-based. We also discuss a way to use these with a graphical user interface (GUI). Some notes on the implementation are provided. Prototypes of these commands are publicly available.
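A minimal sketch of the fan-out idea behind such tools, written here in Python with a thread pool and ssh; the host names, user filter and pool width are placeholders, and this illustrates the pattern rather than the tools described above:

    # Hypothetical "parallel ps": fan a command out to many nodes and merge results.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    HOSTS = ["node01", "node02", "node03"]   # placeholder host names

    def remote_ps(host, user_filter="root"):
        """Run ps on one host over ssh and prefix each line with the host name."""
        cmd = ["ssh", host, "ps", "-u", user_filter, "-o", "pid,etime,comm"]
        out = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
        return [f"{host}: {line}" for line in out.stdout.splitlines()[1:]]

    def parallel_ps(hosts, user_filter="root", width=32):
        # A bounded thread pool avoids serial ssh latency; a tree-based fan-out
        # would scale further on very large machines.
        with ThreadPoolExecutor(max_workers=width) as pool:
            for lines in pool.map(lambda h: remote_ps(h, user_filter), hosts):
                for line in lines:
                    print(line)

    if __name__ == "__main__":
        parallel_ps(HOSTS)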
Scalable and reusable emulator for evaluating the performance of SS7 networks
NASA Astrophysics Data System (ADS)
Lazar, Aurel A.; Tseng, Kent H.; Lim, Koon Seng; Choe, Winston
1994-04-01
A scalable and reusable emulator was designed and implemented for studying the behavior of SS7 networks. The emulator design was largely based on public domain software. It was developed on top of an environment supported by PVM, the Parallel Virtual Machine, and managed by OSIMIS, the OSI Management Information Service platform. The emulator runs on top of a commercially available ATM LAN interconnecting engineering workstations. As a case study for evaluating the emulator, the behavior of the Singapore National SS7 Network under fault and unbalanced loading conditions was investigated.
pcircle - A Suite of Scalable Parallel File System Tools
DOE Office of Scientific and Technical Information (OSTI.GOV)
WANG, FEIYI
2015-10-01
Most software for file systems is written for conventional local file systems; it is serialized and cannot take advantage of a large-scale parallel file system. The pcircle software builds on ubiquitous MPI in a cluster computing environment and the work-stealing pattern to provide a scalable, high-performance suite of file system tools. In particular, it implements parallel data copy and parallel data checksumming, with advanced features such as asynchronous progress reporting, checkpoint and restart, as well as integrity checking.
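The work-stealing pattern mentioned above can be illustrated with a small thread-based sketch; the real pcircle distributes work over MPI ranks, so the per-worker deques, stealing policy and directory accounting below are illustrative assumptions only:

    # Work-stealing parallel directory walk (toy version of the pattern).
    import os
    import random
    import threading
    from collections import deque

    def parallel_walk(root, n_workers=4):
        deques = [deque() for _ in range(n_workers)]
        locks = [threading.Lock() for _ in range(n_workers)]
        pending = [1]                       # directories queued but not yet expanded
        pending_lock = threading.Lock()
        totals = [0] * n_workers            # bytes seen per worker
        deques[0].append(root)

        def worker(me):
            while True:
                task = None
                with locks[me]:
                    if deques[me]:
                        task = deques[me].pop()            # own work: LIFO end
                if task is None:
                    victim = random.randrange(n_workers)   # steal from the FIFO end
                    with locks[victim]:
                        if deques[victim]:
                            task = deques[victim].popleft()
                if task is None:
                    with pending_lock:
                        if pending[0] == 0:
                            return                         # no work left anywhere
                    continue
                try:
                    for entry in os.scandir(task):
                        if entry.is_dir(follow_symlinks=False):
                            with pending_lock:
                                pending[0] += 1
                            with locks[me]:
                                deques[me].append(entry.path)
                        else:
                            totals[me] += entry.stat(follow_symlinks=False).st_size
                except OSError:
                    pass                                   # skip unreadable directories
                with pending_lock:
                    pending[0] -= 1

        threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return sum(totals)

    print(parallel_walk("/tmp"), "bytes")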
We will build a scalable clinical decision support (CDS) platform that helps clinicians and patients select cancer screening strategies that are best suited to each individual. This kind of CDS is important because increased evidence supports personalizing cancer screening decisions according to each individual's unique cancer risks. While a highly desired goal, individualizing screening at a population scale requires the implementation of patient-specific risk assessments for several types of cancer. This is quite challenging in today's overwhelmed primary care environment.
Scalable and Resilient Middleware to Handle Information Exchange during Environment Crisis
NASA Astrophysics Data System (ADS)
Tao, R.; Poslad, S.; Moßgraber, J.; Middleton, S.; Hammitzsch, M.
2012-04-01
The EU FP7 TRIDEC project focuses on enabling real-time, intelligent information management of collaborative, complex, critical decision processes for earth management. A key challenge is to promote a communication infrastructure to facilitate interoperable environment information services during environment events and crises such as tsunamis and drilling, during which increasing volumes and dimensionality of disparate information sources, including sensor-based and human-based ones, can result and need to be managed. Such a system needs to support: scalable, distributed messaging; asynchronous messaging; open messaging to handle changing clients, such as new and retired automated systems and human information sources coming online or going offline; flexible data filtering; and heterogeneous access networks (e.g., GSM, WLAN and LAN). In addition, the system needs to be resilient to ICT system failures, e.g. failure, degradation and overloads, during environment events. There are several system middleware choices for TRIDEC based upon a Service-Oriented Architecture (SOA), Event-Driven Architecture (EDA), Cloud Computing, and Enterprise Service Bus (ESB). In an SOA, everything is a service (e.g. data access, processing and exchange); clients can request on demand or subscribe to services registered by providers; more often interaction is synchronous. In an EDA system, events that represent significant changes in state can be processed simply, as streams, or more complexly. Cloud computing is a virtualized, interoperable and elastic resource allocation model. An ESB, a fundamental component for enterprise messaging, supports synchronous and asynchronous message exchange models and has inbuilt resilience against ICT failure. Our middleware proposal is an ESB-based hybrid architecture model: an SOA extension supports more synchronous workflows; EDA assists the ESB to handle more complex event processing; Cloud computing can be used to increase and decrease the ESB resources on demand. To reify this hybrid ESB-centric architecture, we will adopt two complementary approaches: an open-source one for scalability and resilience improvement, and a commercial one for ultra-fast messaging, with a bridge between the two to support interoperability. In TRIDEC, to manage such a hybrid messaging system, overlay and underlay management techniques will be adopted. The managers (both global and local) will collect, store and update status information (e.g. CPU utilization, free space, number of clients) and balance the usage, throughput, and delays to improve resilience and scalability. The expected resilience improvements include dynamic failover, self-healing, pre-emptive load balancing, and bottleneck prediction, while the expected improvements for scalability include capacity estimation, an HTTP bridge, and automatic configuration and reconfiguration (e.g. adding or deleting clients and servers).
Hierarchical optimization for neutron scattering problems
Bao, Feng; Archibald, Rick; Bansal, Dipanshu; ...
2016-03-14
In this study, we present a scalable optimization method for neutron scattering problems that determines confidence regions of simulation parameters in lattice dynamics models used to fit neutron scattering data for crystalline solids. The method uses physics-based hierarchical dimension reduction in both the computational simulation domain and the parameter space. We demonstrate for silicon that after a few iterations the method converges to parameter values (interatomic force constants) computed with density functional theory simulations.
Evaluation of 3D printed anatomically scalable transfemoral prosthetic knee.
Ramakrishnan, Tyagi; Schlafly, Millicent; Reed, Kyle B
2017-07-01
This case study compares a transfemoral amputee's gait while using the existing Ossur Total Knee 2000 and our novel 3D printed anatomically scalable transfemoral prosthetic knee. The anatomically scalable transfemoral prosthetic knee is 3D printed out of a carbon-fiber and nylon composite that has a gear-mesh coupling with a hard-stop weight-actuated locking mechanism aided by a cross-linked four-bar spring mechanism. This design can be scaled using anatomical dimensions of a human femur and tibia to have a unique fit for each user. The transfemoral amputee who was tested is high functioning and walked on the Computer Assisted Rehabilitation Environment (CAREN) at a self-selected pace. The motion capture and force data collected showed distinct differences in the gait dynamics. The data were used to compute the Combined Gait Asymmetry Metric (CGAM), whose scores revealed that gait on the Ossur Total Knee was more asymmetric overall than on the anatomically scalable transfemoral prosthetic knee. The anatomically scalable transfemoral prosthetic knee had higher peak knee flexion that caused a large step time asymmetry. This made walking on the anatomically scalable transfemoral prosthetic knee more strenuous due to the compensatory movements in adapting to the different dynamics. This can be overcome by tuning the cross-linked spring mechanism to better emulate the dynamics of the subject. The subject stated that the knee would be good for daily use and has the potential to be adapted as a running knee.
Schlecht, Ulrich; Liu, Zhimin; Blundell, Jamie R; St Onge, Robert P; Levy, Sasha F
2017-05-25
Several large-scale efforts have systematically catalogued protein-protein interactions (PPIs) of a cell in a single environment. However, little is known about how the protein interactome changes across environmental perturbations. Current technologies, which assay one PPI at a time, are too low throughput to make it practical to study protein interactome dynamics. Here, we develop a highly parallel protein-protein interaction sequencing (PPiSeq) platform that uses a novel double barcoding system in conjunction with the dihydrofolate reductase protein-fragment complementation assay in Saccharomyces cerevisiae. PPiSeq detects PPIs at a rate that is on par with current assays and, in contrast with current methods, quantitatively scores PPIs with enough accuracy and sensitivity to detect changes across environments. Both PPI scoring and the bulk of strain construction can be performed with cell pools, making the assay scalable and easily reproduced across environments. PPiSeq is therefore a powerful new tool for large-scale investigations of dynamic PPIs.
NASA Astrophysics Data System (ADS)
Nightingale, James; Wang, Qi; Grecos, Christos
2011-03-01
Users of the next generation wireless paradigm known as multihomed mobile networks expect satisfactory quality of service (QoS) when accessing streamed multimedia content. The recent H.264 Scalable Video Coding (SVC) extension to the Advanced Video Coding standard (AVC), offers the facility to adapt real-time video streams in response to the dynamic conditions of multiple network paths encountered in multihomed wireless mobile networks. Nevertheless, preexisting streaming algorithms were mainly proposed for AVC delivery over multipath wired networks and were evaluated by software simulation. This paper introduces a practical, hardware-based testbed upon which we implement and evaluate real-time H.264 SVC streaming algorithms in a realistic multihomed wireless mobile networks environment. We propose an optimised streaming algorithm with multi-fold technical contributions. Firstly, we extended the AVC packet prioritisation schemes to reflect the three-dimensional granularity of SVC. Secondly, we designed a mechanism for evaluating the effects of different streamer 'read ahead window' sizes on real-time performance. Thirdly, we took account of the previously unconsidered path switching and mobile networks tunnelling overheads encountered in real-world deployments. Finally, we implemented a path condition monitoring and reporting scheme to facilitate the intelligent path switching. The proposed system has been experimentally shown to offer a significant improvement in PSNR of the received stream compared with representative existing algorithms.
NASA Astrophysics Data System (ADS)
Culp, Robert D.; McQuerry, James P.
1991-07-01
The present conference on guidance and control encompasses advances in guidance, navigation, and control, storyboard displays, approaches to space-borne pointing control, international space programs, recent experiences with systems, and issues regarding navigation in the low-earth-orbit space environment. Specific issues addressed include a scalable architecture for an operational spaceborne autonavigation system, the mitigation of multipath error in GPS-based attitude determination, microgravity flight testing of a laboratory robot, and the application of neural networks. Other issues addressed include image navigation with second-generation Meteosat, Magellan star-scanner experiences, high-precision control systems for telescopes and interferometers, gravitational effects on low-earth orbiters, experimental verification of nanometer-level optical pathlengths, and a flight telerobotic servicer prototype simulator. (For individual items see A93-15577 to A93-15613)
Adjacent Vehicle Number-Triggered Adaptive Transmission for V2V Communications.
Wei, Yiqiao; Chen, Jingjun; Hwang, Seung-Hoon
2018-03-02
For vehicle-to-vehicle (V2V) communication, such issues as continuity and reliability still have to be solved. Specifically, it is necessary to consider a more scalable physical layer due to the high-speed mobility of vehicles and the complex channel environment. Adaptive transmission has been adopted in channel-dependent scheduling. However, it has been neglected with regard to physical topology changes in the vehicle network. In this paper, we propose a physical topology-triggered adaptive transmission scheme which adjusts the data rate between vehicles according to the number of connectable vehicles nearby. Also, we investigate the performance of the proposed method using computer simulations and compare it with conventional methods. The numerical results show that the proposed method can provide more continuous and reliable data transmission for V2V communications.
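As a toy illustration of the triggering idea, the sketch below maps a neighbour count to a transmission rate; the thresholds and rates are invented placeholders, not values from the paper:

    # Hypothetical mapping from neighbour density to data rate (values invented).
    MCS_TABLE = [(5, 27.0), (15, 12.0), (30, 6.0)]   # (max neighbours, rate in Mb/s)
    FALLBACK_RATE = 3.0                               # very dense topology

    def select_rate(num_adjacent_vehicles):
        """Pick the most aggressive rate whose density threshold is not exceeded."""
        for threshold, mbps in MCS_TABLE:
            if num_adjacent_vehicles <= threshold:
                return mbps
        return FALLBACK_RATE

    for n in (3, 18, 40, 80):
        print(n, "neighbours ->", select_rate(n), "Mb/s")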
Nanostructured sapphire optical fiber for sensing in harsh environments
NASA Astrophysics Data System (ADS)
Chen, Hui; Liu, Kai; Ma, Yiwei; Tian, Fei; Du, Henry
2017-05-01
We describe an innovative and scalable strategy of transforming a commercial unclad sapphire optical fiber into an all-alumina nanostructured sapphire optical fiber (NSOF) that overcomes decades-long challenges faced in the field of sapphire fiber optics. The strategy entails fiber coating with metallic Al followed by anodization to form an anodic aluminum oxide (AAO) cladding with a highly organized pore channel structure. We show that Ag nanoparticles entrapped in AAO show excellent structural and morphological stability and less susceptibility to oxidation for potential high-temperature surface-enhanced Raman scattering (SERS). We reveal, with the aid of numerical simulations, that the AAO cladding greatly increases the evanescent-field overlap both in power and extent and that lower porosity of AAO results in higher evanescent-field overlap. This work has opened the door to new sapphire fiber-based sensor designs and sensor architectures.
Improving Fidelity of Launch Vehicle Liftoff Acoustic Simulations
NASA Technical Reports Server (NTRS)
Liever, Peter; West, Jeff
2016-01-01
Launch vehicles experience high acoustic loads during ignition and liftoff affected by the interaction of rocket plume generated acoustic waves with launch pad structures. Application of highly parallelized Computational Fluid Dynamics (CFD) analysis tools optimized for the NAS computer systems, such as the Loci/CHEM program, now enables simulation of time-accurate, turbulent, multi-species plume formation and interaction with launch pad geometry and captures the generation of acoustic noise at the source regions in the plume shear layers and impingement regions. These CFD solvers are robust in capturing the acoustic fluctuations, but they are too dissipative to accurately resolve the propagation of the acoustic waves throughout the launch environment domain along the vehicle. A hybrid Computational Fluid Dynamics and Computational Aero-Acoustics (CFD/CAA) modeling framework has been developed to improve such liftoff acoustic environment predictions. The framework combines the existing highly-scalable NASA production CFD code, Loci/CHEM, with a high-order accurate discontinuous Galerkin (DG) solver, Loci/THRUST, developed in the same computational framework. Loci/THRUST employs a low dissipation, high-order, unstructured DG method to accurately propagate acoustic waves away from the source regions across large distances. The DG solver is currently capable of solving up to 4th order solutions for non-linear, conservative acoustic field propagation. Higher order boundary conditions are implemented to accurately model the reflection and refraction of acoustic waves on launch pad components. The DG solver accepts generalized unstructured meshes, enabling efficient application of common mesh generation tools for CHEM and THRUST simulations. The DG solution is coupled with the CFD solution at interface boundaries placed near the CFD acoustic source regions. Both simulations are executed simultaneously with coordinated boundary condition data exchange.
Scalable printed electronics: an organic decoder addressing ferroelectric non-volatile memory
Ng, Tse Nga; Schwartz, David E.; Lavery, Leah L.; Whiting, Gregory L.; Russo, Beverly; Krusor, Brent; Veres, Janos; Bröms, Per; Herlogsson, Lars; Alam, Naveed; Hagel, Olle; Nilsson, Jakob; Karlsson, Christer
2012-01-01
Scalable circuits of organic logic and memory are realized using all-additive printing processes. A 3-bit organic complementary decoder is fabricated and used to read and write non-volatile, rewritable ferroelectric memory. The decoder-memory array is patterned by inkjet and gravure printing on flexible plastics. Simulation models for the organic transistors are developed, enabling circuit designs tolerant of the variations in printed devices. We explain the key design rules in fabrication of complex printed circuits and elucidate the performance requirements of materials and devices for reliable organic digital logic. PMID:22900143
NASA Technical Reports Server (NTRS)
Kopardekar, Parimal H.
2017-01-01
Conduct research, development, and testing to identify airspace operations requirements that enable large-scale visual and beyond-visual-line-of-sight UAS operations in low-altitude airspace, using a build-a-little-test-a-little strategy that progresses from remote areas to urban areas. Low density: no traffic management required, but an understanding of airspace constraints is needed. Cooperative traffic management: understanding of airspace constraints and other operations. Manned and unmanned traffic management: scalable and heterogeneous operations. The UTM construct is consistent with the FAA's risk-based strategy. The UTM research platform is used for simulations and tests. UTM offers a path towards scalability.
Scalable quantum computation scheme based on quantum-actuated nuclear-spin decoherence-free qubits
NASA Astrophysics Data System (ADS)
Dong, Lihong; Rong, Xing; Geng, Jianpei; Shi, Fazhan; Li, Zhaokai; Duan, Changkui; Du, Jiangfeng
2017-11-01
We propose a novel theoretical scheme for quantum computation. Nuclear spin pairs are utilized to encode decoherence-free (DF) qubits. A nitrogen-vacancy center serves as a quantum actuator to initialize, read out, and control the DF qubits. The realization of CNOT gates between two DF qubits is also presented. Numerical simulations show high fidelities for all these processes. Additionally, we discuss the potential for scalability. Our scheme reduces the challenge of classical interfaces from controlling and observing complex quantum systems down to a simple quantum actuator. It also provides a novel way to handle complex quantum systems.
Discrete Event Modeling and Massively Parallel Execution of Epidemic Outbreak Phenomena
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perumalla, Kalyan S; Seal, Sudip K
2011-01-01
In complex phenomena such as epidemiological outbreaks, the intensity of inherent feedback effects and the significant role of transients in the dynamics make simulation the only effective method for proactive, reactive or post-facto analysis. The spatial scale, runtime speed, and behavioral detail needed in detailed simulations of epidemic outbreaks make it necessary to use large-scale parallel processing. Here, an optimistic parallel execution of a new discrete event formulation of a reaction-diffusion simulation model of epidemic propagation is presented to dramatically increase the fidelity and speed with which epidemiological simulations can be performed. Rollback support needed during optimistic parallel execution is achieved by combining reverse computation with a small amount of incremental state saving. Parallel speedup of over 5,500 and other runtime performance metrics of the system are observed with weak-scaling execution on a small (8,192-core) Blue Gene/P system, while scalability with a weak-scaling speedup of over 10,000 is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes exceeding several hundreds of millions of individuals in the largest cases are successfully exercised to verify model scalability.
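A toy sketch of rollback by reverse computation, the mechanism cited above: each forward event has an inverse handler, and only the non-invertible pieces (here, the random draws) are saved incrementally. All names are illustrative:

    # Reverse computation for rollback in an optimistic simulation (toy example).
    class Cell:
        def __init__(self):
            self.infected = 0
            self.last_rng_draws = []       # incremental state saving for RNG use

    def infect(cell, k, rng_draw):
        cell.infected += k                 # invertible update: no state copy needed
        cell.last_rng_draws.append(rng_draw)   # saved so rollback can undo it

    def reverse_infect(cell, k):
        cell.infected -= k                 # reverse computation of the forward event
        cell.last_rng_draws.pop()

    cell = Cell()
    processed = []
    for event in [(3, 0.42), (1, 0.11), (5, 0.93)]:   # speculative execution
        infect(cell, *event)
        processed.append(event)

    # A straggler event arrives in the past: roll back the last two events.
    for k, _ in reversed(processed[-2:]):
        reverse_infect(cell, k)
    print(cell.infected)   # back to the state after the first event: 3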
NASA Astrophysics Data System (ADS)
Ivkin, N.; Liu, Z.; Yang, L. F.; Kumar, S. S.; Lemson, G.; Neyrinck, M.; Szalay, A. S.; Braverman, V.; Budavari, T.
2018-04-01
Cosmological N-body simulations play a vital role in studying models for the evolution of the Universe. To compare to observations and make a scientific inference, statistical analysis on large simulation datasets, e.g., finding halos and obtaining multi-point correlation functions, is crucial. However, traditional in-memory methods for these tasks do not scale to the datasets that are forbiddingly large in modern simulations. Our prior paper (Liu et al., 2015) proposes memory-efficient streaming algorithms that can find the largest halos in a simulation with up to 10^9 particles on a small server or desktop. However, this approach fails when directly scaling to larger datasets. This paper presents a robust streaming tool that leverages state-of-the-art techniques on GPU boosting, sampling, and parallel I/O, to significantly improve performance and scalability. Our rigorous analysis of the sketch parameters improves the previous results from finding the centers of the 10^3 largest halos (Liu et al., 2015) to ~10^4-10^5, and reveals the trade-offs between memory, running time and number of halos. Our experiments show that our tool can scale to datasets with up to ~10^12 particles while using less than an hour of running time on a single GPU Nvidia GTX 1080.
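One way to see the streaming idea is a single pass that keeps only an approximate list of the most heavily occupied grid cells, i.e., candidate halo centres, in bounded memory; the Misra-Gries frequent-items summary below is a standard such sketch and is not the authors' GPU pipeline:

    # Single-pass heavy-hitter search over a particle stream (Misra-Gries summary).
    import random

    def cell_of(x, y, z, cell_size=1.0):
        return (int(x // cell_size), int(y // cell_size), int(z // cell_size))

    def misra_gries(stream, k=1000):
        """Track at most k-1 candidate cells; heavy cells survive the decrements."""
        counters = {}
        for item in stream:
            if item in counters:
                counters[item] += 1
            elif len(counters) < k - 1:
                counters[item] = 1
            else:
                for key in list(counters):
                    counters[key] -= 1
                    if counters[key] == 0:
                        del counters[key]
        return counters

    # Synthetic stream: a dense clump near the origin plus a uniform background.
    particles = [(random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1))
                 for _ in range(20000)]
    particles += [(random.uniform(-50, 50), random.uniform(-50, 50),
                   random.uniform(-50, 50)) for _ in range(20000)]
    random.shuffle(particles)

    summary = misra_gries((cell_of(*p) for p in particles), k=500)
    print(sorted(summary.items(), key=lambda kv: -kv[1])[:5])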
Fully implicit adaptive mesh refinement algorithm for reduced MHD
NASA Astrophysics Data System (ADS)
Philip, Bobby; Pernice, Michael; Chacon, Luis
2006-10-01
In the macroscopic simulation of plasmas, the numerical modeler is faced with the challenge of dealing with multiple time and length scales. Traditional approaches based on explicit time integration techniques and fixed meshes are not suitable for this challenge, as such approaches prevent the modeler from using realistic plasma parameters to keep the computation feasible. We propose here a novel approach, based on implicit methods and structured adaptive mesh refinement (SAMR). Our emphasis is on both accuracy and scalability with the number of degrees of freedom. As a proof-of-principle, we focus on the reduced resistive MHD model as a basic MHD model paradigm, which is truly multiscale. The approach taken here is to adapt mature physics-based technology to AMR grids, and employ AMR-aware multilevel techniques (such as fast adaptive composite grid (FAC) algorithms) for scalability. We demonstrate that the concept is indeed feasible, featuring near-optimal scalability under grid refinement. Results of fully-implicit, dynamically-adaptive AMR simulations in challenging dissipation regimes will be presented on a variety of problems that benefit from this capability, including tearing modes, the island coalescence instability, and the tilt mode instability. L. Chacón et al., J. Comput. Phys. 178 (1), 15-36 (2002); B. Philip, M. Pernice, and L. Chacón, Lecture Notes in Computational Science and Engineering, accepted (2006).
A generic, cost-effective, and scalable cell lineage analysis platform
Biezuner, Tamir; Spiro, Adam; Raz, Ofir; Amir, Shiran; Milo, Lilach; Adar, Rivka; Chapal-Ilani, Noa; Berman, Veronika; Fried, Yael; Ainbinder, Elena; Cohen, Galit; Barr, Haim M.; Halaban, Ruth; Shapiro, Ehud
2016-01-01
Advances in single-cell genomics enable commensurate improvements in methods for uncovering lineage relations among individual cells. Current sequencing-based methods for cell lineage analysis depend on low-resolution bulk analysis or rely on extensive single-cell sequencing, which is not scalable and could be biased by functional dependencies. Here we show an integrated biochemical-computational platform for generic single-cell lineage analysis that is retrospective, cost-effective, and scalable. It consists of a biochemical-computational pipeline that inputs individual cells, produces targeted single-cell sequencing data, and uses it to generate a lineage tree of the input cells. We validated the platform by applying it to cells sampled from an ex vivo grown tree and analyzed its feasibility landscape by computer simulations. We conclude that the platform may serve as a generic tool for lineage analysis and thus pave the way toward large-scale human cell lineage discovery. PMID:27558250
Scalable Parallel Distance Field Construction for Large-Scale Applications.
Yu, Hongfeng; Xie, Jinrong; Ma, Kwan-Liu; Kolla, Hemanth; Chen, Jacqueline H
2015-10-01
Computing distance fields is fundamental to many scientific and engineering applications. Distance fields can be used to direct analysis and reduce data. In this paper, we present a highly scalable method for computing 3D distance fields on massively parallel distributed-memory machines. A new distributed spatial data structure, named parallel distance tree, is introduced to manage the level sets of data and facilitate surface tracking over time, resulting in significantly reduced computation and communication costs for calculating the distance to the surface of interest from any spatial locations. Our method supports several data types and distance metrics from real-world applications. We demonstrate its efficiency and scalability on state-of-the-art supercomputers using both large-scale volume datasets and surface models. We also demonstrate in-situ distance field computation on dynamic turbulent flame surfaces for a petascale combustion simulation. Our work greatly extends the usability of distance fields for demanding applications.
H.264 Layered Coded Video over Wireless Networks: Channel Coding and Modulation Constraints
NASA Astrophysics Data System (ADS)
Ghandi, M. M.; Barmada, B.; Jones, E. V.; Ghanbari, M.
2006-12-01
This paper considers the prioritised transmission of H.264 layered coded video over wireless channels. For appropriate protection of video data, methods such as prioritised forward error correction coding (FEC) or hierarchical quadrature amplitude modulation (HQAM) can be employed, but each imposes system constraints. FEC provides good protection but at the price of a high overhead and complexity. HQAM is less complex and does not introduce any overhead, but permits only fixed data ratios between the priority layers. Such constraints are analysed and practical solutions are proposed for layered transmission of data-partitioned and SNR-scalable coded video where combinations of HQAM and FEC are used to exploit the advantages of both coding methods. Simulation results show that the flexibility of SNR scalability and absence of picture drift imply that SNR scalability as modelled is superior to data partitioning in such applications.
Coupled Physics Environment (CouPE) library - Design, Implementation, and Release
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mahadevan, Vijay S.
Over several years, high fidelity, validated mono-physics solvers with proven scalability on peta-scale architectures have been developed independently. Based on a unified component-based architecture, these existing codes can be coupled with a unified mesh-data backplane and a flexible coupling-strategy-based driver suite to produce a viable tool for analysts. In this report, we present details on the design decisions and developments on CouPE, an acronym that stands for Coupled Physics Environment, which orchestrates a coupled physics solver through the interfaces exposed by the MOAB array-based unstructured mesh, both of which are part of the SIGMA (Scalable Interfaces for Geometry and Mesh-Based Applications) toolkit. The SIGMA toolkit contains libraries that enable scalable geometry and unstructured mesh creation and handling in a memory and computationally efficient implementation. The CouPE version being prepared for a full open-source release along with updated documentation will contain several useful examples that will enable users to start developing their applications natively using the native MOAB mesh and couple their models to existing physics applications to analyze and solve real world problems of interest. An integrated multi-physics simulation capability for the design and analysis of current and future nuclear reactor models is also being investigated as part of the NEAMS RPL, to tightly couple neutron transport, thermal-hydraulics and structural mechanics physics under the SHARP framework. This report summarizes the efforts that have been invested in CouPE to bring together several existing physics applications, namely PROTEUS (neutron transport code), Nek5000 (computational fluid-dynamics code) and Diablo (structural mechanics code). The goal of the SHARP framework is to perform fully resolved coupled physics analysis of a reactor on heterogeneous geometry, in order to reduce the overall numerical uncertainty while leveraging available computational resources. The design of CouPE along with motivations that led to implementation choices are also discussed. The first release of the library will be different from the current version of the code that integrates the components in SHARP, and an explanation of the need for forking the source base will also be provided. Enhancements in the functionality and improved user guides will be available as part of the release. CouPE v0.1 is scheduled for an open-source release in December 2014 along with SIGMA v1.1 components that provide support for language-agnostic mesh loading, traversal and query interfaces along with scalable solution transfer of fields between different physics codes. The coupling methodology and software interfaces of the library are presented, along with verification studies on two representative fast sodium-cooled reactor demonstration problems to prove the usability of the CouPE library.
Use of the NetBeans Platform for NASA Robotic Conjunction Assessment Risk Analysis
NASA Technical Reports Server (NTRS)
Sabey, Nickolas J.
2014-01-01
The latest Java and JavaFX technologies are very attractive software platforms for customers involved in space mission operations such as those of NASA and the US Air Force. For NASA Robotic Conjunction Assessment Risk Analysis (CARA), the NetBeans platform provided an environment in which scalable software solutions could be developed quickly and efficiently. Both Java 8 and the NetBeans platform are in the process of simplifying CARA development in secure environments by providing a significant amount of capability in a single accredited package, where accreditation alone can account for 6-8 months for each library or software application. Capabilities either in use or being investigated by CARA include: 2D and 3D displays with JavaFX, parallelization with the new Streams API, and scalability through the NetBeans plugin architecture.
NASA Astrophysics Data System (ADS)
Fink, Wolfgang; George, Thomas; Tarbell, Mark A.
2007-04-01
Robotic reconnaissance operations are called for in extreme environments, not only those such as space, including planetary atmospheres, surfaces, and subsurfaces, but also in potentially hazardous or inaccessible operational areas on Earth, such as mine fields, battlefield environments, enemy occupied territories, terrorist infiltrated environments, or areas that have been exposed to biochemical agents or radiation. Real time reconnaissance enables the identification and characterization of transient events. A fundamentally new mission concept for tier-scalable reconnaissance of operational areas, originated by Fink et al., is aimed at replacing the engineering and safety constrained mission designs of the past. The tier-scalable paradigm integrates multi-tier (orbit, atmosphere, surface/subsurface) and multi-agent (satellite, UAV/blimp, surface/subsurface sensing platforms) hierarchical mission architectures, introducing not only mission redundancy and safety, but also enabling and optimizing intelligent, less constrained, and distributed reconnaissance in real time. Given the mass, size, and power constraints faced by such a multi-platform approach, this is an ideal application scenario for a diverse set of MEMS sensors. To support such mission architectures, a high degree of operational autonomy is required. Essential elements of such operational autonomy are: (1) automatic mapping of an operational area from different vantage points (including vehicle health monitoring); (2) automatic feature extraction and target/region-of-interest identification within the mapped operational area; and (3) automatic target prioritization for close-up examination. These requirements imply the optimal deployment of MEMS sensors and sensor platforms, sensor fusion, and sensor interoperability.
Research on Chinese characters display of airborne MFD based on GL studio
NASA Astrophysics Data System (ADS)
Wang, Zhile; Dong, Junyu; Hu, Wenting; Cui, Yipeng
2018-04-01
GL Studio cannot display Chinese characters when developing airborne multifunction displays (MFDs). This paper proposes a method of establishing a Chinese character font with GB2312 encoding, and builds the font table and the Chinese character display unit on top of GL Studio. We abstract the storage and display data model of Chinese characters, parse the GB encoding of the Chinese characters received by the MFD, locate the coordinates of those characters in the font table, and establish the dynamic control model and the dynamic display model of Chinese characters based on the display unit. In the GL Studio and VC++.NET environment, this model has been successfully applied to develop airborne MFDs in a variety of mission simulators. The method solves the problem that GL Studio cannot be used to develop MFD software for Chinese domestic aircraft, and it can also be applied to other professional airborne MFD development tools such as IDATA. Experiments have shown that this is a fast, effective, scalable, and reconfigurable method for developing both actual equipment and simulators.
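The core table lookup is easy to sketch: GB2312 arranges characters in a 94x94 code table, so the two encoded bytes give a row and column and hence a byte offset into a fixed-size bitmap font. The 16x16 (HZK16-style) font file below is an assumed example, not necessarily the font used in the paper:

    # Locate a Chinese character in a GB2312-ordered bitmap font table.
    GLYPH_BYTES = 16 * 16 // 8   # 32 bytes per character in a 16x16 bitmap font

    def gb2312_index(ch):
        """Return (row, col) of a character in the 94x94 GB2312 code table."""
        hi, lo = ch.encode("gb2312")
        return hi - 0xA1, lo - 0xA1            # both in 0..93

    def glyph_offset(ch):
        row, col = gb2312_index(ch)
        return (row * 94 + col) * GLYPH_BYTES  # byte offset into the font table

    print(gb2312_index("中"), glyph_offset("中"))
    # A display unit would seek to this offset in the font file, read 32 bytes,
    # and rasterise them as a 16x16 monochrome glyph, e.g.:
    # with open("hzk16.bin", "rb") as f:       # hypothetical font file
    #     f.seek(glyph_offset("中")); bitmap = f.read(GLYPH_BYTES)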
A scalable silicon photonic chip-scale optical switch for high performance computing systems.
Yu, Runxiang; Cheung, Stanley; Li, Yuliang; Okamoto, Katsunari; Proietti, Roberto; Yin, Yawei; Yoo, S J B
2013-12-30
This paper discusses the architecture and provides performance studies of a silicon photonic chip-scale optical switch for scalable interconnect network in high performance computing systems. The proposed switch exploits optical wavelength parallelism and wavelength routing characteristics of an Arrayed Waveguide Grating Router (AWGR) to allow contention resolution in the wavelength domain. Simulation results from a cycle-accurate network simulator indicate that, even with only two transmitter/receiver pairs per node, the switch exhibits lower end-to-end latency and higher throughput at high (>90%) input loads compared with electronic switches. On the device integration level, we propose to integrate all the components (ring modulators, photodetectors and AWGR) on a CMOS-compatible silicon photonic platform to ensure a compact, energy efficient and cost-effective device. We successfully demonstrate proof-of-concept routing functions on an 8 × 8 prototype fabricated using foundry services provided by OpSIS-IME.
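The contention resolution in the wavelength domain rests on the cyclic routing property usually assumed for an AWGR, sketched below; the (input + wavelength) mod N rule is the idealised textbook model rather than a description of the fabricated device:

    # Idealised cyclic wavelength routing of an N x N AWGR.
    N = 8   # 8 x 8 AWGR

    def awgr_output(input_port, wavelength_index, n=N):
        return (input_port + wavelength_index) % n

    def wavelength_for(input_port, output_port, n=N):
        """Wavelength an input must use to reach a given output contention-free."""
        return (output_port - input_port) % n

    for src in range(3):
        print(src, "->", [awgr_output(src, w) for w in range(N)])
    print("port 2 reaches port 5 on wavelength", wavelength_for(2, 5))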
ERIC Educational Resources Information Center
Lee, Chien-Sing
2007-01-01
Models represent a set of generic patterns to test hypotheses. This paper presents the CogMoLab student model in the context of an integrated learning environment. Three aspects are discussed: diagnostic and predictive modeling with respect to the issues of credit assignment and scalability and compositional modeling of the student profile in the…
Phipps, Eric T.; D'Elia, Marta; Edwards, Harold C.; ...
2017-04-18
In this study, quantifying simulation uncertainties is a critical component of rigorous predictive simulation. A key component of this is forward propagation of uncertainties in simulation input data to output quantities of interest. Typical approaches involve repeated sampling of the simulation over the uncertain input data, and can require numerous samples when accurately propagating uncertainties from large numbers of sources. Often simulation processes from sample to sample are similar and much of the data generated from each sample evaluation could be reused. We explore a new method for implementing sampling methods that simultaneously propagates groups of samples together in an embedded fashion, which we call embedded ensemble propagation. We show how this approach takes advantage of properties of modern computer architectures to improve performance by enabling reuse between samples, reducing memory bandwidth requirements, improving memory access patterns, improving opportunities for fine-grained parallelization, and reducing communication costs. We describe a software technique for implementing embedded ensemble propagation based on the use of C++ templates and describe its integration with various scientific computing libraries within Trilinos. We demonstrate improved performance, portability and scalability for the approach applied to the simulation of partial differential equations on a variety of CPU, GPU, and accelerator architectures, including up to 131,072 cores on a Cray XK7 (Titan).
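The ensemble idea can be mimicked outside C++ templates by carrying an extra sample axis through every array operation, as in the NumPy sketch below; the 1-D heat equation and all parameters are illustrative stand-ins for the PDEs discussed above:

    # Propagate a whole ensemble of samples through one solver loop at once.
    import numpy as np

    def propagate_ensemble(kappa, n=200, steps=500, dt=2e-5, dx=1e-2):
        """kappa: array of s diffusivities; returns s solutions advanced together."""
        s = len(kappa)
        u = np.zeros((s, n))
        u[:, n // 2] = 1.0                        # same initial spike for all samples
        k = np.asarray(kappa)[:, None]            # broadcast over the ensemble axis
        for _ in range(steps):
            lap = u[:, :-2] - 2 * u[:, 1:-1] + u[:, 2:]
            u[:, 1:-1] += dt / dx**2 * k * lap    # one fused update for all samples
        return u

    samples = np.random.uniform(0.5, 1.5, size=32)   # uncertain diffusivity
    u_all = propagate_ensemble(samples)
    print(u_all.shape, u_all.mean(axis=0)[95:105])   # ensemble mean near the spike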
Metabolic interactions and dynamics in microbial communities
NASA Astrophysics Data System (ADS)
Segrè, Daniel
Metabolism, in addition to being the engine of every living cell, plays a major role in the cell-cell and cell-environment relations that shape the dynamics and evolution of microbial communities, e.g. by mediating competition and cross-feeding interactions between different species. Despite the increasing availability of metagenomic sequencing data for numerous microbial ecosystems, fundamental aspects of these communities, such as the unculturability of many isolates, and the conditions necessary for taxonomic or functional stability, are still poorly understood. We are developing mechanistic computational approaches for studying the interactions between different organisms based on the knowledge of their entire metabolic networks. In particular, we have recently built an open source platform for the Computation of Microbial Ecosystems in Time and Space (COMETS), which combines metabolic models with convection-diffusion equations to simulate the spatio-temporal dynamics of metabolism in microbial communities. COMETS has been experimentally tested on small artificial communities, and is scalable to hundreds of species in complex environments. I will discuss recent developments and challenges towards the implementation of models for microbiomes and synthetic microbial communities.
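A rough sketch of the kind of coupling COMETS performs, at toy scale: nutrient diffuses on a grid while biomass grows where nutrient is available. Real COMETS solves genome-scale flux balance models in each grid cell; the Monod growth term and all parameters below are stand-ins for that step:

    # Toy spatio-temporal coupling of nutrient diffusion and microbial growth.
    import numpy as np

    def diffuse(field, d):
        lap = (np.roll(field, 1, 0) + np.roll(field, -1, 0) +
               np.roll(field, 1, 1) + np.roll(field, -1, 1) - 4 * field)
        return field + d * lap

    def step(biomass, nutrient, vmax, ks, yield_coeff, d_nutrient=0.2):
        growth = vmax * nutrient / (ks + nutrient) * biomass   # Monod growth (stand-in for FBA)
        growth = np.minimum(growth, nutrient * yield_coeff)    # cannot exceed local supply
        nutrient = diffuse(nutrient - growth / yield_coeff, d_nutrient)
        return biomass + growth, nutrient

    biomass = np.zeros((50, 50))
    biomass[10, 10] = biomass[40, 40] = 1e-3                   # two founding colonies
    nutrient = np.full((50, 50), 1.0)
    for _ in range(200):
        biomass, nutrient = step(biomass, nutrient, vmax=0.3, ks=0.5, yield_coeff=0.5)
    print("total biomass:", biomass.sum(), " mean nutrient:", nutrient.mean())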
CAVE2: a hybrid reality environment for immersive simulation and information analysis
NASA Astrophysics Data System (ADS)
Febretti, Alessandro; Nishimoto, Arthur; Thigpen, Terrance; Talandis, Jonas; Long, Lance; Pirtle, J. D.; Peterka, Tom; Verlo, Alan; Brown, Maxine; Plepys, Dana; Sandin, Dan; Renambot, Luc; Johnson, Andrew; Leigh, Jason
2013-03-01
Hybrid Reality Environments represent a new kind of visualization spaces that blur the line between virtual environments and high resolution tiled display walls. This paper outlines the design and implementation of the CAVE2 Hybrid Reality Environment. CAVE2 is the world's first near-seamless flat-panel-based, surround-screen immersive system. Unique to CAVE2 is that it will enable users to simultaneously view both 2D and 3D information, providing more flexibility for mixed media applications. CAVE2 is a cylindrical system of 24 feet in diameter and 8 feet tall, and consists of 72 near-seamless, off-axis-optimized passive stereo LCD panels, creating an approximately 320 degree panoramic environment for displaying information at 37 Megapixels (in stereoscopic 3D) or 74 Megapixels in 2D and at a horizontal visual acuity of 20/20. Custom LCD panels with shifted polarizers were built so the images in the top and bottom rows of LCDs are optimized for vertical off-center viewing, allowing viewers to come closer to the displays while minimizing ghosting. CAVE2 is designed to support multiple operating modes. In the Fully Immersive mode, the entire room can be dedicated to one virtual simulation. In 2D mode, the room can operate like a traditional tiled display wall enabling users to work with large numbers of documents at the same time. In the Hybrid mode, a mixture of both 2D and 3D applications can be simultaneously supported. The ability to treat immersive work spaces in this Hybrid way has never been achieved before, and leverages the special abilities of CAVE2 to enable researchers to seamlessly interact with large collections of 2D and 3D data. To realize this hybrid ability, we merged the Scalable Adaptive Graphics Environment (SAGE), a system for supporting 2D tiled displays, with Omegalib, a virtual reality middleware supporting OpenGL, OpenSceneGraph and Vtk applications.
NASA Astrophysics Data System (ADS)
Al Hadhrami, Tawfik; Nightingale, James M.; Wang, Qi; Grecos, Christos
2014-05-01
In emergency situations, the ability to remotely monitor unfolding events using high-quality video feeds will significantly improve the incident commander's understanding of the situation and thereby aids effective decision making. This paper presents a novel, adaptive video monitoring system for emergency situations where the normal communications network infrastructure has been severely impaired or is no longer operational. The proposed scheme, operating over a rapidly deployable wireless mesh network, supports real-time video feeds between first responders, forward operating bases and primary command and control centers. Video feeds captured on portable devices carried by first responders and by static visual sensors are encoded in H.264/SVC, the scalable extension to H.264/AVC, allowing efficient, standard-based temporal, spatial, and quality scalability of the video. A three-tier video delivery system is proposed, which balances the need to avoid overuse of mesh nodes with the operational requirements of the emergency management team. In the first tier, the video feeds are delivered at a low spatial and temporal resolution employing only the base layer of the H.264/SVC video stream. Routing in this mode is designed to employ all nodes across the entire mesh network. In the second tier, whenever operational considerations require that commanders or operators focus on a particular video feed, a `fidelity control' mechanism at the monitoring station sends control messages to the routing and scheduling agents in the mesh network, which increase the quality of the received picture using SNR scalability while conserving bandwidth by maintaining a low frame rate. In this mode, routing decisions are based on reliable packet delivery with the most reliable routes being used to deliver the base and lower enhancement layers; as fidelity is increased and more scalable layers are transmitted they will be assigned to routes in descending order of reliability. The third tier of video delivery transmits a high-quality video stream including all available scalable layers using the most reliable routes through the mesh network ensuring the highest possible video quality. The proposed scheme is implemented in a proven simulator, and the performance of the proposed system is numerically evaluated through extensive simulations. We further present an in-depth analysis of the proposed solutions and potential approaches towards supporting high-quality visual communications in such a demanding context.
NASA Astrophysics Data System (ADS)
Koziel, Slawomir; Bekasiewicz, Adrian
2016-10-01
Multi-objective optimization of antenna structures is a challenging task owing to the high computational cost of evaluating the design objectives as well as the large number of adjustable parameters. Design speed-up can be achieved by means of surrogate-based optimization techniques. In particular, a combination of variable-fidelity electromagnetic (EM) simulations, design space reduction techniques, response surface approximation models and design refinement methods permits identification of the Pareto-optimal set of designs within a reasonable timeframe. Here, a study concerning the scalability of surrogate-assisted multi-objective antenna design is carried out based on a set of benchmark problems, with the dimensionality of the design space ranging from six to 24 and a CPU cost of the EM antenna model from 10 to 20 min per simulation. Numerical results indicate that the computational overhead of the design process increases more or less quadratically with the number of adjustable geometric parameters of the antenna structure at hand, which is a promising result from the point of view of handling even more complex problems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pannala, S; D'Azevedo, E; Zacharia, T
The goal of the radiation modeling effort was to develop and implement a radiation algorithm that is fast and accurate for the underhood environment. As part of this CRADA, a net-radiation model was chosen to simulate radiative heat transfer in the underhood of a car. The assumptions (diffuse-gray and uniform radiative properties in each element) reduce the problem tremendously, and all the view factors for radiation thermal calculations can be calculated once and for all at the beginning of the simulation. The cost for online integration of heat exchanges due to radiation is found to be less than 15% of the baseline CHAD code and thus very manageable. The off-line view factor calculation is constructed to be very modular and has been completely integrated to read CHAD grid files, and the output from this code can be read into the latest version of CHAD. Further integration has to be performed to accomplish the same with STAR-CD. The main outcome of this effort is a highly scalable and portable simulation capability to model view factors for the underhood environment (e.g., a view factor calculation that took 14 hours on a single processor took only 14 minutes on 64 processors). The code has also been validated using a simple test case where analytical solutions are available. This simulation capability gives underhood designers in the automotive companies the ability to account for thermal radiation, which usually is critical in the underhood environment and also turns out to be one of the most computationally expensive components of underhood simulations. This report starts off with the original work plan as elucidated in the proposal in section B. This is followed by the technical work plan to accomplish the goals of the project in section C. In section D, background to the current work is provided with references to the previous efforts this project leverages. The results are discussed in section E. This report ends with conclusions and future scope of work in section F.
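With the view factors F_ij precomputed offline, the per-step radiative exchange between diffuse-gray elements reduces to a small matrix-vector operation, q_i = eps_i A_i sum_j F_ij (sigma T_j^4 - sigma T_i^4). The sketch below uses that simplified blackbody-exchange form (it ignores the multiple reflections that the full net-radiation method handles through radiosities), and the surfaces and property values are invented:

    # Radiative exchange from precomputed view factors (simplified, no reflections).
    import numpy as np

    SIGMA = 5.670e-8   # Stefan-Boltzmann constant, W m^-2 K^-4

    def radiative_heat(T, F, eps, area):
        """Net radiative power gained by each element (W)."""
        eb = SIGMA * T**4
        # q_i = eps_i * A_i * sum_j F_ij * (eb_j - eb_i)
        return eps * area * (F @ eb - F.sum(axis=1) * eb)

    # Three hypothetical underhood surfaces: exhaust manifold, shield, hood liner.
    T = np.array([900.0, 500.0, 350.0])            # K
    F = np.array([[0.0, 0.6, 0.4],
                  [0.5, 0.0, 0.5],
                  [0.3, 0.5, 0.2]])                # rows need not sum to 1 (openings)
    eps = np.array([0.85, 0.3, 0.9])
    area = np.array([0.05, 0.08, 0.30])            # m^2
    print(radiative_heat(T, F, eps, area))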
Scalable splitting algorithms for big-data interferometric imaging in the SKA era
NASA Astrophysics Data System (ADS)
Onose, Alexandru; Carrillo, Rafael E.; Repetti, Audrey; McEwen, Jason D.; Thiran, Jean-Philippe; Pesquet, Jean-Christophe; Wiaux, Yves
2016-11-01
In the context of next-generation radio telescopes, like the Square Kilometre Array (SKA), the efficient processing of large-scale data sets is extremely important. Convex optimization tasks under the compressive sensing framework have recently emerged and provide both enhanced image reconstruction quality and scalability to increasingly larger data sets. We focus herein mainly on scalability and propose two new convex optimization algorithmic structures able to solve the convex optimization tasks arising in radio-interferometric imaging. They rely on proximal splitting and forward-backward iterations and can be seen, by analogy, with the CLEAN major-minor cycle, as running sophisticated CLEAN-like iterations in parallel in multiple data, prior, and image spaces. Both methods support any convex regularization function, in particular, the well-studied ℓ1 priors promoting image sparsity in an adequate domain. Tailored for big-data, they employ parallel and distributed computations to achieve scalability, in terms of memory and computational requirements. One of them also exploits randomization, over data blocks at each iteration, offering further flexibility. We present simulation results showing the feasibility of the proposed methods as well as their advantages compared to state-of-the-art algorithmic solvers. Our MATLAB code is available online on GitHub.
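The forward-backward building block these algorithms parallelise can be written in a few lines for an l1-regularised least-squares problem; the dense random matrix below merely stands in for the interferometric measurement operator:

    # Forward-backward (ISTA-style) iteration with an l1 proximal step.
    import numpy as np

    def soft_threshold(x, t):
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    def forward_backward(A, y, lam, n_iter=300):
        step = 1.0 / np.linalg.norm(A, 2) ** 2              # 1 / Lipschitz constant
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            grad = A.T @ (A @ x - y)                        # forward (gradient) step
            x = soft_threshold(x - step * grad, step * lam) # backward (prox) step
        return x

    rng = np.random.default_rng(0)
    x_true = np.zeros(200)
    x_true[rng.choice(200, 10, replace=False)] = 1.0        # sparse "image"
    A = rng.standard_normal((80, 200)) / np.sqrt(80)         # stand-in operator
    y = A @ x_true + 0.01 * rng.standard_normal(80)
    x_hat = forward_backward(A, y, lam=0.02)
    print("support recovered:", np.count_nonzero(np.abs(x_hat) > 0.1))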
Protein Simulation Data in the Relational Model.
Simms, Andrew M; Daggett, Valerie
2012-10-01
High performance computing is leading to unprecedented volumes of data. Relational databases offer a robust and scalable model for storing and analyzing scientific data. However, these features do not come without a cost: significant design effort is required to build a functional and efficient repository. Modeling protein simulation data in a relational database presents several challenges: the data captured from individual simulations are large, multi-dimensional, and must integrate with both simulation software and external data sites. Here we present the dimensional design and relational implementation of a comprehensive data warehouse for storing and analyzing molecular dynamics simulations using SQL Server.
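A hedged sketch of what a dimensional (star-schema) layout for trajectory data might look like: one fact table of per-frame, per-residue measurements surrounded by small dimension tables. Table and column names are invented, and sqlite3 is used only so the example runs anywhere; the published warehouse targets SQL Server:

    # Star-schema sketch for molecular dynamics trajectory data.
    import sqlite3

    ddl = """
    CREATE TABLE dim_simulation (sim_id INTEGER PRIMARY KEY, protein TEXT,
                                 temperature_k REAL, force_field TEXT);
    CREATE TABLE dim_residue    (residue_id INTEGER PRIMARY KEY, sim_id INTEGER,
                                 chain TEXT, number INTEGER, amino_acid TEXT);
    CREATE TABLE fact_frame_residue (sim_id INTEGER, frame INTEGER,
                                 residue_id INTEGER, sasa REAL, rmsd REAL,
                                 phi REAL, psi REAL);
    CREATE INDEX ix_fact ON fact_frame_residue (sim_id, frame);
    """

    con = sqlite3.connect(":memory:")
    con.executescript(ddl)
    con.execute("INSERT INTO dim_simulation VALUES (1, '2F4K', 298.0, 'ff_placeholder')")
    con.execute("INSERT INTO dim_residue VALUES (10, 1, 'A', 10, 'TRP')")
    con.executemany("INSERT INTO fact_frame_residue VALUES (1, ?, 10, ?, ?, ?, ?)",
                    [(f, 120.0 - f, 0.01 * f, -60.0, -45.0) for f in range(5)])
    print(con.execute("""SELECT s.protein, AVG(f.rmsd)
                         FROM fact_frame_residue f
                         JOIN dim_simulation s USING (sim_id)
                         GROUP BY s.protein""").fetchall())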
NASA Technical Reports Server (NTRS)
Sanyal, Soumya; Jain, Amit; Das, Sajal K.; Biswas, Rupak
2003-01-01
In this paper, we propose a distributed approach for mapping a single large application to a heterogeneous grid environment. To minimize the execution time of the parallel application, we distribute the mapping overhead to the available nodes of the grid. This approach not only provides a fast mapping of tasks to resources but is also scalable. We adopt a hierarchical grid model and accomplish the job of mapping tasks to this topology using a scheduler tree. Results show that our three-phase algorithm provides high quality mappings, and is fast and scalable.
A scalable and practical one-pass clustering algorithm for recommender system
NASA Astrophysics Data System (ADS)
Khalid, Asra; Ghazanfar, Mustansar Ali; Azam, Awais; Alahmari, Saad Ali
2015-12-01
KMeans clustering-based recommendation algorithms have been proposed claiming to increase the scalability of recommender systems. One potential drawback of these algorithms is that they perform training offline and hence cannot accommodate incremental updates with the arrival of new data, making them unsuitable for dynamic environments. From this line of research, a new clustering algorithm called One-Pass is proposed, which is simple, fast, and accurate. We show empirically that the proposed algorithm outperforms K-Means in terms of recommendation and training time while maintaining a good level of accuracy.
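An illustrative single-scan clusterer in the spirit described above: each incoming vector joins the nearest centroid if it is close enough, otherwise it seeds a new cluster, and centroids are updated online so new data never force retraining. This is a generic sketch, not the authors' exact One-Pass algorithm:

    # Incremental, single-pass clustering with an online centroid update.
    import numpy as np

    class OnePassClusterer:
        def __init__(self, radius):
            self.radius = radius
            self.centroids, self.counts = [], []

        def add(self, x):
            x = np.asarray(x, dtype=float)
            if self.centroids:
                d = [np.linalg.norm(x - c) for c in self.centroids]
                j = int(np.argmin(d))
                if d[j] <= self.radius:
                    self.counts[j] += 1
                    self.centroids[j] += (x - self.centroids[j]) / self.counts[j]
                    return j
            self.centroids.append(x.copy())
            self.counts.append(1)
            return len(self.centroids) - 1

    rng = np.random.default_rng(1)
    users = np.vstack([rng.normal(0, 0.3, (50, 4)), rng.normal(3, 0.3, (50, 4))])
    clusterer = OnePassClusterer(radius=1.5)
    labels = [clusterer.add(u) for u in users]
    print(len(clusterer.centroids), "clusters found")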
Advanced Dynamically Adaptive Algorithms for Stochastic Simulations on Extreme Scales
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xiu, Dongbin
2017-03-03
The focus of the project is the development of mathematical methods and high-performance computational tools for stochastic simulations, with a particular emphasis on computations on extreme scales. The core of the project revolves around the design of highly efficient and scalable numerical algorithms that can adaptively and accurately, in high dimensional spaces, resolve stochastic problems with limited smoothness, even containing discontinuities.
Difficulties with True Interoperability in Modeling & Simulation
2011-12-01
…that develops a model or simulation has a specific purpose, set of requirements and limited funding. These programs cannot afford to coordinate with…implementation. The program offices should budget and plan for coordination across domain projects within a limited scope to improve interoperability with…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shin, J; Coss, D; McMurry, J
Purpose: To evaluate the efficiency of multithreaded Geant4 (Geant4-MT, version 10.0) for proton Monte Carlo dose calculations using a high performance computing facility. Methods: Geant4-MT was used to calculate 3D dose distributions in 1×1×1 mm^3 voxels in a water phantom and a patient's head with a 150 MeV proton beam covering approximately 5×5 cm^2 in the water phantom. Three timestamps were measured on the fly to separately analyze the required time for initialization (which cannot be parallelized), the processing time of individual threads, and the completion time. Scalability of averaged processing time per thread was calculated as a function of thread number (1, 100, 150, and 200) for both 1 M and 50 M histories. The total memory usage was recorded. Results: Simulations with 50 M histories were fastest with 100 threads, taking approximately 1.3 hours and 6 hours for the water phantom and the CT data, respectively, with better than 1.0% statistical uncertainty. The calculations show 1/N scalability in the event loops for both cases. The gains from parallel calculations started to decrease with 150 threads. The memory usage increases linearly with the number of threads. No critical failures were observed during the simulations. Conclusion: Multithreading in Geant4-MT decreased simulation time in proton dose distribution calculations by a factor of 64 and 54 at a near-optimal 100 threads for the water phantom and the patient's data, respectively. Further simulations will be done to determine the efficiency at the optimal thread number. Considering the trend of computer architecture development, utilizing Geant4-MT for radiotherapy simulations is an excellent cost-effective alternative to a distributed batch queuing system. However, because the scalability depends highly on simulation details, i.e., the ratio of the processing time of one event versus the waiting time to access the shared event queue, a performance evaluation as described is recommended.
Scalability and Validation of Big Data Bioinformatics Software.
Yang, Andrian; Troup, Michael; Ho, Joshua W K
2017-01-01
This review examines two important aspects that are central to modern big data bioinformatics analysis: software scalability and validity. We argue that not only are the issues of scalability and validation common to all big data bioinformatics analyses, but they can also be tackled by conceptually related methodological approaches, namely divide-and-conquer (scalability) and multiple executions (validation). Scalability is defined as the ability of a program to scale based on workload. It has always been an important consideration when developing bioinformatics algorithms and programs. Nonetheless, the surge in the volume and variety of biological and biomedical data has posed new challenges. We discuss how modern cloud computing and big data programming frameworks such as MapReduce and Spark are being used to effectively implement divide-and-conquer in a distributed computing environment. Validation of software is another important issue in big data bioinformatics that is often ignored. Software validation is the process of determining whether the program under test fulfils the task for which it was designed. Determining the correctness of the computational output of big data bioinformatics software is especially difficult due to the large input space and complex algorithms involved. We discuss how state-of-the-art software testing techniques that are based on the idea of multiple executions, such as metamorphic testing, can be used to implement an effective bioinformatics quality assurance strategy. We hope this review will raise awareness of these critical issues in bioinformatics.
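As a concrete illustration of the multiple-executions idea, a metamorphic test checks a relation between the outputs of two related runs instead of comparing against a known correct answer. The sketch below applies a simple metamorphic relation (GC content is invariant under reverse complementation) to a toy function; the function and relation are chosen for illustration and are not taken from the review.

```python
import random

def gc_content(seq):
    """Fraction of G or C bases (the 'program under test')."""
    return sum(b in "GC" for b in seq) / len(seq)

def reverse_complement(seq):
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[b] for b in reversed(seq))

def metamorphic_test(n_runs=100, length=1000, seed=1):
    """Metamorphic relation: GC content must be unchanged by reverse
    complementation, so two executions must agree even though the 'true'
    answer for a random input is never computed."""
    rng = random.Random(seed)
    for _ in range(n_runs):
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        assert abs(gc_content(seq) - gc_content(reverse_complement(seq))) < 1e-12

metamorphic_test()
print("metamorphic relation held on all random inputs")
```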
Jeong, Seol Young; Jo, Hyeong Gon; Kang, Soon Ju
2014-01-01
A tracking service like asset management is essential in a dynamic hospital environment consisting of numerous mobile assets (e.g., wheelchairs or infusion pumps) that are continuously relocated throughout a hospital. The tracking service is accomplished based on the key technologies of an indoor location-based service (LBS), such as locating and monitoring multiple mobile targets inside a building in real time. An indoor LBS such as a tracking service entails numerous resource lookups being requested concurrently and frequently from several locations, as well as a network infrastructure requiring support for high scalability in indoor environments. A traditional centralized architecture needs to maintain a geographic map of the entire building or complex in its central server, which can cause low scalability and traffic congestion. This paper presents a self-organizing and fully distributed indoor mobile asset management (MAM) platform, and proposes an architecture for multiple trackees (such as mobile assets) and trackers based on the proposed distributed platform in real time. In order to verify the suggested platform, scalability performance according to increases in the number of concurrent lookups was evaluated in a real test bed. Tracking latency and traffic load ratio in the proposed tracking architecture was also evaluated. PMID:24662407
Large-scale parallel genome assembler over cloud computing environment.
Das, Arghya Kusum; Koppa, Praveen Kumar; Goswami, Sayan; Platania, Richard; Park, Seung-Jong
2017-06-01
The size of high throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications have started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay-as-you-go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, developing scalable data-intensive bioinformatics applications using this model and understanding the hardware environment that these applications require for good performance both require further research. In this paper, we present a de Bruijn graph oriented Parallel Giraph-based Genome Assembler (GiGA), as well as the hardware platform required for its optimal performance. GiGA uses the power of Hadoop (MapReduce) and Giraph (large-scale graph analysis) to achieve high scalability over hundreds of compute nodes by collocating the computation and data. GiGA achieves significantly higher scalability with competitive assembly quality compared to contemporary parallel assemblers (e.g. ABySS and Contrail) over a traditional HPC cluster. Moreover, we show that the performance of GiGA is significantly improved by using an SSD-based private cloud infrastructure rather than a traditional HPC cluster. We observe that the performance of GiGA on 256 cores of this SSD-based cloud infrastructure closely matches that of 512 cores of a traditional HPC cluster.
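GiGA's distributed Giraph implementation is not shown in the abstract; the minimal serial sketch below only illustrates the underlying de Bruijn idea, in which each read contributes edges between consecutive (k-1)-mers. The reads and the value of k are placeholders for illustration.

```python
from collections import defaultdict

def de_bruijn_edges(reads, k):
    """Build a toy de Bruijn graph: each read contributes an edge from the
    prefix (k-1)-mer to the suffix (k-1)-mer of every k-mer it contains.
    Real assemblers such as GiGA distribute this step and add error
    correction; this shows only the core graph construction."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph

reads = ["ACGTACGT", "CGTACGTT"]
for node, successors in de_bruijn_edges(reads, k=4).items():
    print(node, "->", successors)
```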
Coilable Crystalline Fiber (CCF) Lasers and their Scalability
2014-03-01
Dubinskii, M. True Crystalline Fibers: Double-Clad LMA Design Concept of Tm:YAG-Core Fiber and Mode Simulation. Proc. SPIE 2012, 8237, 82373M. ... Beach, R. J.; Mitchell, S. C.; Meissner, H. E.; Meissner, O. R.; Krupke, W ...
Efficient Byzantine Fault Tolerance for Scalable Storage and Services
2009-07-01
most critical applications must survive in ever harsher environments. Less synchronous networking delivers packets unreliably and unpredictably, and ... synchronous environments to allowing asynchrony, and from tolerating crashes to tolerating some corruptions through ad-hoc consistency checks. Ad-hoc ... servers are responsive. To support this thesis statement, this dissertation takes the following steps. First, it develops a new cryptographic primitive
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karthik, Rajasekar
2014-01-01
In this paper, an architecture for building a Scalable And Mobile Environment for High-Performance Computing with spatial capabilities, called SAME4HPC, is described using cutting-edge technologies and standards such as Node.js, HTML5, ECMAScript 6, and PostgreSQL 9.4. Mobile devices are increasingly becoming powerful enough to run high-performance apps. At the same time, there exist a significant number of low-end and older devices that rely heavily on the server or the cloud infrastructure to do the heavy lifting. Our architecture aims to support both of these types of devices to provide high performance and a rich user experience. A cloud infrastructure consisting of OpenStack with Ubuntu, GeoServer, and high-performance JavaScript frameworks is among the key open-source and industry-standard practices adopted in this architecture.
Preliminary basic performance analysis of the Cedar multiprocessor memory system
NASA Technical Reports Server (NTRS)
Gallivan, K.; Jalby, W.; Turner, S.; Veidenbaum, A.; Wijshoff, H.
1991-01-01
Some preliminary basic results on the performance of the Cedar multiprocessor memory system are presented. Empirical results are presented and used to calibrate a memory system simulator which is then used to discuss the scalability of the system.
Innovation for integrated command environments
NASA Astrophysics Data System (ADS)
Perry, Amie A.; McKneely, Jennifer A.
2000-11-01
Command environments have rarely been able to easily accommodate rapid changes in technology and mission. Yet, command personnel, by their selection criteria, experience, and very nature, tend to be extremely adaptive and flexible, and able to learn new missions and address new challenges fairly easily. Instead, it is the hardware and software components of the systems that do not provide the needed flexibility and scalability for command personnel. How do we solve this problem? In order to even dream of keeping pace with a rapidly changing world, we must begin to think differently about the command environment and its systems. What is the correct definition of the integrated command environment system? What types of tasks must be performed in this environment, and how might they change in the next five to twenty-five years? How should the command environment be developed, maintained, and evolved to provide needed flexibility and scalability? This paper discusses the issues and concepts to be considered as new Integrated Command/Control Environments (ICEs) are designed following a human-centered process. A futuristic model, the Dream Integrated Command Environment (DICE), is described which demonstrates specific ICE innovations. The major paradigm shift required to be able to think differently about this problem is to center the DICE around the command personnel from its inception. Conference participants may not agree with every concept or idea presented, but will hopefully come away with a clear understanding that to radically improve future systems, designers must focus on the end users.
PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.
Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy
2015-05-01
We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate, slightly better than SATé trees, and with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.
PARALLEL HOP: A SCALABLE HALO FINDER FOR MASSIVE COSMOLOGICAL DATA SETS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Skory, Stephen; Turk, Matthew J.; Norman, Michael L.
2010-11-15
Modern N-body cosmological simulations contain billions (10^9) of dark matter particles. These simulations require hundreds to thousands of gigabytes of memory and employ hundreds to tens of thousands of processing cores on many compute nodes. In order to study the distribution of dark matter in a cosmological simulation, the dark matter halos must be identified using a halo finder, which establishes the halo membership of every particle in the simulation. The resources required for halo finding are similar to the requirements for the simulation itself. In particular, simulations have become too extensive to use commonly employed halo finders, such that the computational requirements to identify halos must now be spread across multiple nodes and cores. Here, we present a scalable parallel halo finding method called Parallel HOP for large-scale cosmological simulation data. Based on the halo finder HOP, it utilizes the message passing interface and domain decomposition to distribute the halo finding workload across multiple compute nodes, enabling analysis of much larger data sets than is possible with the strictly serial or previous parallel implementations of HOP. We provide a reference implementation of this method as a part of the toolkit yt, an analysis toolkit for adaptive mesh refinement data that includes complementary analysis modules. Additionally, we discuss a suite of benchmarks that demonstrate that this method scales well up to several hundred tasks and data sets in excess of 2000^3 particles. The Parallel HOP method and our implementation can be readily applied to any kind of N-body simulation data and are therefore widely applicable.
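Parallel HOP's MPI implementation is not reproduced here; the following serial sketch only illustrates a HOP-style grouping step under simplifying assumptions (a crude k-nearest-neighbor density estimate, no domain decomposition, no merging thresholds): each particle hops to the densest particle in its neighborhood until a density peak is reached, and particles sharing a peak form a group.

```python
import numpy as np
from scipy.spatial import cKDTree

def hop_groups(positions, n_neighbors=16):
    """Serial sketch of the HOP idea: estimate a local density per particle,
    let each particle hop to the densest particle in its neighborhood, and
    group particles that reach the same density peak. The real Parallel HOP
    adds MPI domain decomposition, boundary handling, and density thresholds."""
    tree = cKDTree(positions)
    dist, idx = tree.query(positions, k=n_neighbors)    # idx[:, 0] is the particle itself
    density = n_neighbors / (dist[:, -1] ** 3 + 1e-30)  # crude kNN density estimate
    # each particle points to the densest particle in its neighborhood (possibly itself)
    hop = idx[np.arange(len(positions)), np.argmax(density[idx], axis=1)]
    # follow hops (pointer doubling) until every particle reaches a density peak
    while True:
        nxt = hop[hop]
        if np.array_equal(nxt, hop):
            break
        hop = nxt
    return hop  # group label = index of the density peak reached

rng = np.random.default_rng(0)
pts = np.concatenate([rng.normal(0, 0.1, (500, 3)), rng.normal(2, 0.1, (500, 3))])
labels = hop_groups(pts)
print("groups found:", len(np.unique(labels)))
```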
Liebe, J D; Hübner, U
2013-01-01
Continuous improvements of IT performance in healthcare organisations require actionable performance indicators, regularly conducted, independent measurements, and meaningful and scalable reference groups. Existing IT-benchmarking initiatives have focussed on the development of reliable and valid indicators, but less on the question of how to implement an environment for conducting easily repeatable and scalable IT-benchmarks. This study aims at developing and trialling a procedure that meets the aforementioned requirements. We chose a well-established, regularly conducted (inter-)national IT survey of healthcare organisations (IT-Report Healthcare) as the environment and offered the participants of the 2011 survey (CIOs of hospitals) the opportunity to enter a benchmark. The 61 structural and functional performance indicators covered, among others, the implementation status and integration of IT systems and functions, global user satisfaction, and the resources of the IT department. Healthcare organisations were grouped by size and ownership. The benchmark results were made available electronically, and feedback on the use of these results was requested after several months. Fifty-nine hospitals participated in the benchmarking. Reference groups consisted of up to 141 members depending on the number of beds (size) and the ownership (public vs. private). A total of 122 charts showing single-indicator frequency views were sent to each participant. The evaluation showed that 94.1% of the CIOs who participated in the evaluation considered this benchmarking beneficial and reported that they would enter again. Based on the feedback of the participants, we developed two additional views that provide a more consolidated picture. The results demonstrate that establishing an independent, easily repeatable and scalable IT-benchmarking procedure is possible and was deemed desirable. Based on these encouraging results, a new benchmarking round which includes process indicators is currently being conducted.
Internet-Based Laboratory Immersion: When The Real Deal is Not Available
NASA Astrophysics Data System (ADS)
Meisner, Gerald; Hoffman, Harol
2004-11-01
Do you want all of your students to investigate equilibrium conditions in the physics lab, but don't have time for lab investigations? Do your under-prepared students need basic, careful and detailed remedial work to help them succeed? LAAPhysics provides an answer to these questions by means of robust online physics courseware based on: (1) a sound, research-based pedagogy; (2) a rich laboratory environment with skills and operational knowledge transferable to the 'wet lab'; and (3) a paradigm which is economically scalable. LAAPhysics provides both synchronous and asynchronous learning experiences for an introductory, algebra-based course for students (undergraduate, AP High School, seekers of a second degree), those seeking career changes, and pre-service and in-service teachers. We have developed a simulated physics laboratory comprised of virtual lab equipment and instruments, associated curriculum modules, and virtual guidance for real-time feedback, formative assessment, and collaborative learning.
Privacy enabling technology for video surveillance
NASA Astrophysics Data System (ADS)
Dufaux, Frédéric; Ouaret, Mourad; Abdeljaoued, Yousri; Navarro, Alfonso; Vergnenègre, Fabrice; Ebrahimi, Touradj
2006-05-01
In this paper, we address the problem of privacy in video surveillance. We propose an efficient solution based on transform-domain scrambling of regions of interest in a video sequence. More specifically, the sign of selected transform coefficients is flipped during encoding. In particular, we address the case of Motion JPEG 2000. Simulation results show that the technique can be successfully applied to conceal information in regions of interest in the scene while providing a good level of security. Furthermore, the scrambling is flexible and allows adjusting the amount of distortion introduced. This is achieved with a small impact on coding performance and a negligible increase in computational complexity. In the proposed video surveillance system, heterogeneous clients can remotely access the system through the Internet or a 2G/3G mobile phone network. Thanks to the inherently scalable Motion JPEG 2000 codestream, the server is able to adapt the resolution and bandwidth of the delivered video depending on the usage environment of the client.
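A minimal sketch of the sign-flipping idea follows. It operates on a 2D DCT of the region of interest rather than the Motion JPEG 2000 wavelet coefficients used in the paper, and the seeded pseudo-random selection of coefficients stands in for whatever key management the real system uses; it only demonstrates that flipping the same signs twice restores the original content.

```python
import numpy as np
from scipy.fft import dctn, idctn

def scramble_roi(image, roi, key, flip_fraction=0.5):
    """Conceal a region of interest by flipping the sign of a key-selected
    subset of its transform coefficients. Illustrative sketch only: uses a
    2D DCT of the ROI, not the Motion JPEG 2000 wavelet transform."""
    y0, y1, x0, x1 = roi
    coeffs = dctn(image[y0:y1, x0:x1].astype(float), norm="ortho")
    rng = np.random.default_rng(key)              # key makes the mask reproducible
    mask = rng.random(coeffs.shape) < flip_fraction
    coeffs[mask] *= -1                            # sign flip = scrambling
    out = image.astype(float)
    out[y0:y1, x0:x1] = idctn(coeffs, norm="ortho")
    return out

def descramble_roi(image, roi, key, flip_fraction=0.5):
    """Authorized decoding: the same key regenerates the mask, and flipping
    the signs again restores the original coefficients."""
    return scramble_roi(image, roi, key, flip_fraction)

img = np.random.default_rng(1).integers(0, 256, (64, 64)).astype(float)
roi = (16, 48, 16, 48)
scrambled = scramble_roi(img, roi, key=1234)
restored = descramble_roi(scrambled, roi, key=1234)
print(np.allclose(restored, img))  # True up to floating-point error
```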
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heo, Yeonsook; Augenbroe, Godfried; Graziano, Diane
2015-05-01
The increasing interest in retrofitting of existing buildings is motivated by the need to make a major contribution to enhancing building energy efficiency and reducing energy consumption and CO2 emission by the built environment. This paper examines the relevance of calibration in model-based analysis to support decision-making for energy and carbon efficiency retrofits of individual buildings and portfolios of buildings. The authors formulate a set of real retrofit decision-making situations and evaluate the role of calibration by using a case study that compares predictions and decisions from an uncalibrated model with those of a calibrated model. The case study illustrates both the mechanics and outcomes of a practical alternative to the expert- and time-intense application of dynamic energy simulation models for large-scale retrofit decision-making under uncertainty.
PetIGA: A framework for high-performance isogeometric analysis
Dalcin, Lisandro; Collier, Nathaniel; Vignal, Philippe; ...
2016-05-25
We present PetIGA, a code framework to approximate the solution of partial differential equations using isogeometric analysis. PetIGA can be used to assemble matrices and vectors which come from a Galerkin weak form, discretized with Non-Uniform Rational B-spline basis functions. We base our framework on PETSc, a high-performance library for the scalable solution of partial differential equations, which simplifies the development of large-scale scientific codes, provides a rich environment for prototyping, and separates parallelism from algorithm choice. We describe the implementation of PetIGA, and exemplify its use by solving a model nonlinear problem. To illustrate the robustness and flexibility of PetIGA, we solve some challenging nonlinear partial differential equations that include problems in both solid and fluid mechanics. Lastly, we show strong scaling results on up to 4096 cores, which confirm the suitability of PetIGA for large scale simulations.
LEGO® Bricks as Building Blocks for Centimeter-Scale Biological Environments: The Case of Plants
Lind, Kara R.; Sizmur, Tom; Benomar, Saida; Miller, Anthony; Cademartiri, Ludovico
2014-01-01
LEGO bricks are commercially available interlocking pieces of plastic that are conventionally used as toys. We describe their use to build engineered environments for cm-scale biological systems, in particular plant roots. Specifically, we take advantage of the unique modularity of these building blocks to create inexpensive, transparent, reconfigurable, and highly scalable environments for plant growth in which structural obstacles and chemical gradients can be precisely engineered to mimic soil. PMID:24963716
Fast, Automated, Scalable Generation of Textured 3D Models of Indoor Environments
2014-12-18
expensive travel and on-site visits. Different applications require models of different complexities, both with and without furniture geometry. The ... environment and to localize the system in the environment over time. The datasets shown in this paper were generated by a backpack-mounted system that uses 2D ... voxel is found to intersect the line segment from a scanner to a corresponding scan point. If a laser passes through a voxel, that voxel is considered
geoKepler Workflow Module for Computationally Scalable and Reproducible Geoprocessing and Modeling
NASA Astrophysics Data System (ADS)
Cowart, C.; Block, J.; Crawl, D.; Graham, J.; Gupta, A.; Nguyen, M.; de Callafon, R.; Smarr, L.; Altintas, I.
2015-12-01
The NSF-funded WIFIRE project has developed an open-source, online geospatial workflow platform for unifying geoprocessing tools and models for fire and other geospatially dependent modeling applications. It is a product of WIFIRE's objective to build an end-to-end cyberinfrastructure for real-time and data-driven simulation, prediction and visualization of wildfire behavior. geoKepler includes a set of reusable GIS components, or actors, for the Kepler Scientific Workflow System (https://kepler-project.org). Actors exist for reading and writing GIS data in formats such as Shapefile, GeoJSON, KML, and using OGC web services such as WFS. The actors also allow for calling geoprocessing tools in other packages such as GDAL and GRASS. Kepler integrates functions from multiple platforms and file formats into one framework, thus enabling optimal GIS interoperability, model coupling, and scalability. Products of the GIS actors can be fed directly to models such as FARSITE and WRF. Kepler's ability to schedule and scale processes using Hadoop and Spark also makes geoprocessing ultimately extensible and computationally scalable. The reusable workflows in geoKepler can be made to run automatically when alerted by real-time environmental conditions. Here, we show breakthroughs in the speed of creating complex data for hazard assessments with this platform. We also demonstrate geoKepler workflows that use Data Assimilation to ingest real-time weather data into wildfire simulations, and that use data mining techniques to gain insight into environmental conditions affecting fire behavior. Existing machine learning tools and libraries such as R and MLlib are being leveraged for this purpose in Kepler, as well as Kepler's Distributed Data Parallel (DDP) capability to provide a framework for scalable processing. geoKepler workflows can be executed via an iPython notebook as a part of a Jupyter hub at UC San Diego for sharing and reporting of the scientific analysis and results from various runs of geoKepler workflows. The communication between iPython and Kepler workflow executions is established through an iPython magic function for Kepler that we have implemented. In summary, geoKepler is an ecosystem that makes geospatial processing and analysis of any kind programmable, reusable, scalable and sharable.
Scalable Predictive Analysis in Critically Ill Patients Using a Visual Open Data Analysis Platform
Poucke, Sven Van; Zhang, Zhongheng; Schmitz, Martin; Vukicevic, Milan; Laenen, Margot Vander; Celi, Leo Anthony; Deyne, Cathy De
2016-01-01
With the accumulation of large amounts of health-related data, predictive analytics could stimulate the transformation of reactive medicine towards Predictive, Preventive and Personalized (PPPM) Medicine, ultimately affecting both cost and quality of care. However, the high dimensionality and high complexity of the data involved prevent data-driven methods from easy translation into clinically relevant models. Additionally, the application of cutting-edge predictive methods and data manipulation requires substantial programming skills, limiting their direct exploitation by medical domain experts. This leaves a gap between potential and actual data usage. In this study, the authors address this problem by focusing on open, visual environments suited to be applied by the medical community. Moreover, we review code-free applications of big data technologies. As a showcase, a framework was developed for the meaningful use of data from critical care patients by integrating the MIMIC-II database in a data mining environment (RapidMiner) supporting scalable predictive analytics using visual tools (RapidMiner's Radoop extension). Guided by the CRoss-Industry Standard Process for Data Mining (CRISP-DM), the ETL process (Extract, Transform, Load) was initiated by retrieving data from the MIMIC-II tables of interest. As a use case, the correlation of platelet count and ICU survival was quantitatively assessed. Using visual tools for ETL on Hadoop and predictive modeling in RapidMiner, we developed robust processes for automatic building, parameter optimization, and evaluation of various predictive models under different feature selection schemes. Because these processes can be easily adopted in other projects, this environment is attractive for scalable predictive analytics in health research. PMID:26731286
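The showcase use case, quantifying the relationship between platelet count and ICU survival, amounts to a correlation between a continuous and a binary variable (a point-biserial correlation). The sketch below computes it on synthetic stand-in data; the column names and values are assumptions, not the MIMIC-II schema, and the actual analysis in the paper was built visually in RapidMiner/Radoop.

```python
import numpy as np
import pandas as pd
from scipy import stats

# hypothetical stand-in for the MIMIC-II extract (platelet count per ICU stay,
# survival coded as 0/1); column names are illustrative, not the MIMIC-II schema
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "platelet_count": rng.normal(220, 80, 1000).clip(10, 600),
    "icu_survival": rng.integers(0, 2, 1000),
})

# point-biserial correlation between a binary and a continuous variable
r, p = stats.pointbiserialr(df["icu_survival"], df["platelet_count"])
print(f"point-biserial r = {r:.3f}, p = {p:.3g}")
```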
Scuba: scalable kernel-based gene prioritization.
Zampieri, Guido; Tran, Dinh Van; Donini, Michele; Navarin, Nicolò; Aiolli, Fabio; Sperduti, Alessandro; Valle, Giorgio
2018-01-25
The uncovering of genes linked to human diseases is a pressing challenge in molecular biology and precision medicine. This task is often hindered by the large number of candidate genes and by the heterogeneity of the available information. Computational methods for the prioritization of candidate genes can help to cope with these problems. In particular, kernel-based methods are a powerful resource for the integration of heterogeneous biological knowledge; however, their practical implementation is often precluded by their limited scalability. We propose Scuba, a scalable kernel-based method for gene prioritization. It implements a novel multiple kernel learning approach, based on a semi-supervised perspective and on the optimization of the margin distribution. Scuba is optimized to cope with strongly unbalanced settings where known disease genes are few and large-scale predictions are required. Importantly, it is able to efficiently deal both with a large amount of candidate genes and with an arbitrary number of data sources. As a direct consequence of scalability, Scuba also integrates a new efficient strategy to select optimal kernel parameters for each data source. We performed cross-validation experiments and simulated a realistic usage setting, showing that Scuba outperforms a wide range of state-of-the-art methods. Scuba achieves state-of-the-art performance and has enhanced scalability compared to existing kernel-based approaches for genomic data. This method can be useful to prioritize candidate genes, particularly when their number is large or when the input data are highly heterogeneous. The code is freely available at https://github.com/gzampieri/Scuba.
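Scuba's margin-distribution optimization is not reproduced here; the sketch below only illustrates the basic multiple-kernel ingredient, combining base kernels from heterogeneous data sources into one kernel and scoring candidates by similarity to known disease genes. The fixed uniform weights and the simple scoring rule are placeholders for what Scuba actually learns.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix on the rows of X."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

def combined_kernel(kernels, weights):
    """Weighted sum of base kernels from heterogeneous data sources.
    Scuba learns the weights; here they are fixed for illustration."""
    weights = np.asarray(weights, dtype=float)
    weights /= weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))

def prioritize(K, known_idx):
    """Score all genes by mean kernel similarity to the known disease genes,
    a simple stand-in for Scuba's semi-supervised scoring."""
    scores = K[:, known_idx].mean(axis=1)
    scores[known_idx] = -np.inf        # do not re-rank the seed genes
    return np.argsort(-scores)

rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=(200, 10)), rng.normal(size=(200, 30))  # two data sources
K = combined_kernel([rbf_kernel(X1, 0.1), rbf_kernel(X2, 0.03)], weights=[1, 1])
ranking = prioritize(K, known_idx=np.array([0, 1, 2]))
print("top candidates:", ranking[:5])
```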
Scalable High-order Methods for Multi-Scale Problems: Analysis, Algorithms and Application
2016-02-26
Karniadakis, "Resilient algorithms for reconstructing and simulating gappy flow fields in CFD", Fluid Dynamic Research, vol. 47, 051402, 2015. 2. Y. Yu, H ... simulation, domain decomposition, CFD, gappy data, estimation theory, and gap-tooth algorithm. ... The objective of this project was to develop a general CFD framework for multifidelity simulations to target multiscale problems but also resilience in ...
Development of a Scale-up Tool for Pervaporation Processes
Thiess, Holger; Strube, Jochen
2018-01-01
In this study, an engineering tool for the design and optimization of pervaporation processes is developed based on physico-chemical modelling coupled with laboratory/mini-plant experiments. The model incorporates the solution-diffusion-mechanism, polarization effects (concentration and temperature), axial dispersion, pressure drop and the temperature drop in the feed channel due to vaporization of the permeating components. The permeance, being the key model parameter, was determined via dehydration experiments on a mini-plant scale for the binary mixtures ethanol/water and ethyl acetate/water. A second set of experimental data was utilized for the validation of the model for two chemical systems. The industrially relevant ternary mixture, ethanol/ethyl acetate/water, was investigated close to its azeotropic point and compared to a simulation conducted with the determined binary permeance data. Experimental and simulation data proved to agree very well for the investigated process conditions. In order to test the scalability of the developed engineering tool, large-scale data from an industrial pervaporation plant used for the dehydration of ethanol was compared to a process simulation conducted with the validated physico-chemical model. Since the membranes employed in both mini-plant and industrial scale were of the same type, the permeance data could be transferred. The comparison of the measured and simulated data proved the scalability of the derived model. PMID:29342956
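The solution-diffusion, permeance-based description referred to above is commonly written as a partial flux driven by the difference between the feed-side activity term and the permeate-side partial pressure. The relation below is the textbook form with typical symbol definitions; it is a hedged sketch, not copied from the paper, and the exact formulation used there may differ.

```latex
% Typical permeance-based solution-diffusion flux for component i:
%   J_i          partial flux              Q_i      permeance
%   x_i          feed mole fraction        gamma_i  activity coefficient
%   p_i^sat(T)   saturation pressure       y_i      permeate mole fraction
%   p_p          permeate pressure
J_i \;=\; Q_i \left( \gamma_i \, x_i \, p_i^{\mathrm{sat}}(T) \;-\; y_i \, p_p \right)
```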
Entangling spin-spin interactions of ions in individually controlled potential wells
NASA Astrophysics Data System (ADS)
Wilson, Andrew; Colombe, Yves; Brown, Kenton; Knill, Emanuel; Leibfried, Dietrich; Wineland, David
2014-03-01
Physical systems that cannot be modeled with classical computers appear in many different branches of science, including condensed-matter physics, statistical mechanics, high-energy physics, atomic physics and quantum chemistry. Despite impressive progress on the control and manipulation of various quantum systems, implementation of scalable devices for quantum simulation remains a formidable challenge. As one approach to scalability in simulation, here we demonstrate an elementary building-block of a configurable quantum simulator based on atomic ions. Two ions are trapped in separate potential wells that can individually be tailored to emulate a number of different spin-spin couplings mediated by the ions' Coulomb interaction together with classical laser and microwave fields. We demonstrate deterministic tuning of this interaction by independent control of the local wells and emulate a particular spin-spin interaction to entangle the internal states of the two ions with 0.81(2) fidelity. Extension of the building-block demonstrated here to a 2D-network, which ion-trap micro-fabrication processes enable, may provide a new quantum simulator architecture with broad flexibility in designing and scaling the arrangement of ions and their mutual interactions. This research was funded by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), ONR, and the NIST Quantum Information Program.
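The engineered coupling between the two ions' internal states is conventionally summarized by an effective two-spin Hamiltonian; the generic Ising-type form below is a hedged illustration of that notion, with the coupling strength J set by the individually controlled wells and the applied fields, and is not necessarily the specific interaction realized in the experiment.

```latex
% Generic effective spin-spin interaction between the two ions' internal
% (pseudo-spin) states; J depends on the local potential wells and the
% applied laser/microwave fields.
H_{\mathrm{eff}} \;=\; \hbar J \, \hat{\sigma}_z^{(1)} \hat{\sigma}_z^{(2)}
```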
Scalable Metropolis Monte Carlo for simulation of hard shapes
NASA Astrophysics Data System (ADS)
Anderson, Joshua A.; Eric Irrgang, M.; Glotzer, Sharon C.
2016-07-01
We design and implement a scalable hard particle Monte Carlo simulation toolkit (HPMC), and release it open source as part of HOOMD-blue. HPMC runs in parallel on many CPUs and many GPUs using domain decomposition. We employ BVH trees instead of cell lists on the CPU for fast performance, especially with large particle size disparity, and optimize inner loops with SIMD vector intrinsics on the CPU. Our GPU kernel proposes many trial moves in parallel on a checkerboard and uses a block-level queue to redistribute work among threads and avoid divergence. HPMC supports a wide variety of shape classes, including spheres/disks, unions of spheres, convex polygons, convex spheropolygons, concave polygons, ellipsoids/ellipses, convex polyhedra, convex spheropolyhedra, spheres cut by planes, and concave polyhedra. NVT and NPT ensembles can be run in 2D or 3D triclinic boxes. Additional integration schemes permit Frenkel-Ladd free energy computations and implicit depletant simulations. In a benchmark system of a fluid of 4096 pentagons, HPMC performs 10 million sweeps in 10 min on 96 CPU cores on XSEDE Comet. The same simulation would take 7.6 h in serial. HPMC also scales to large system sizes, and the same benchmark with 16.8 million particles runs in 1.4 h on 2048 GPUs on OLCF Titan.
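For orientation, the move that HPMC parallelizes can be written down serially for the simplest case, hard disks in two dimensions: propose a random displacement and accept it only if it creates no overlap. The sketch below is that serial toy version; it omits the checkerboard decomposition, the GPU work queue, and the many shape classes listed above.

```python
import numpy as np

def hard_disk_sweep(pos, box, sigma, max_step, rng):
    """One Monte Carlo sweep for hard disks in a periodic square box:
    propose a random displacement per particle and accept it only if no
    overlap (center distance < sigma) is created. Serial illustration of
    the trial move that HPMC executes in parallel on CPUs and GPUs."""
    n = len(pos)
    accepted = 0
    for i in rng.permutation(n):
        trial = (pos[i] + rng.uniform(-max_step, max_step, 2)) % box
        delta = (pos - trial + box / 2) % box - box / 2   # minimum-image vectors
        dist2 = np.einsum("ij,ij->i", delta, delta)
        dist2[i] = np.inf                                  # ignore self
        if dist2.min() >= sigma**2:                        # no overlap -> accept
            pos[i] = trial
            accepted += 1
    return accepted / n

rng = np.random.default_rng(0)
box, sigma = 20.0, 1.0
# start from a dilute square lattice so the initial state has no overlaps
side = 10
pos = (np.indices((side, side)).reshape(2, -1).T + 0.5) * (box / side)
for sweep in range(5):
    acc = hard_disk_sweep(pos, box, sigma, max_step=0.3, rng=rng)
print(f"acceptance ratio in last sweep: {acc:.2f}")
```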
Accelerating cardiac bidomain simulations using graphics processing units.
Neic, A; Liebmann, M; Hoetzl, E; Mitchell, L; Vigmond, E J; Haase, G; Plank, G
2012-08-01
Anatomically realistic and biophysically detailed multiscale computer models of the heart are playing an increasingly important role in advancing our understanding of integrated cardiac function in health and disease. Such detailed simulations, however, are computationally vastly demanding, which is a limiting factor for a wider adoption of in-silico modeling. While current trends in high-performance computing (HPC) hardware promise to alleviate this problem, exploiting the potential of such architectures remains challenging since strongly scalable algorithms are necessitated to reduce execution times. Alternatively, acceleration technologies such as graphics processing units (GPUs) are being considered. While the potential of GPUs has been demonstrated in various applications, benefits in the context of bidomain simulations where large sparse linear systems have to be solved in parallel with advanced numerical techniques are less clear. In this study, the feasibility of multi-GPU bidomain simulations is demonstrated by running strong scalability benchmarks using a state-of-the-art model of rabbit ventricles. The model is spatially discretized using the finite element methods (FEM) on fully unstructured grids. The GPU code is directly derived from a large pre-existing code, the Cardiac Arrhythmia Research Package (CARP), with very minor perturbation of the code base. Overall, bidomain simulations were sped up by a factor of 11.8 to 16.3 in benchmarks running on 6-20 GPUs compared to the same number of CPU cores. To match the fastest GPU simulation which engaged 20 GPUs, 476 CPU cores were required on a national supercomputing facility.
DYNAMICO, an atmospheric dynamical core for high-performance climate modeling
NASA Astrophysics Data System (ADS)
Dubos, Thomas; Meurdesoif, Yann; Spiga, Aymeric; Millour, Ehouarn; Fita, Lluis; Hourdin, Frédéric; Kageyama, Masa; Traore, Abdoul-Khadre; Guerlet, Sandrine; Polcher, Jan
2017-04-01
Institut Pierre Simon Laplace has developed a very scalable atmospheric dynamical core, DYNAMICO, based on energy-conserving finite-difference/finite-volume numerics on a quasi-uniform icosahedral-hexagonal mesh. Scalability is achieved by combining hybrid MPI/OpenMP parallelism with asynchronous I/O. This dynamical core has been coupled to radiative transfer physics tailored to the atmosphere of Saturn, allowing unprecedented simulations of the climate of this giant planet. For terrestrial climate studies DYNAMICO is being integrated into the IPSL Earth System Model IPSL-CM. Preliminary aquaplanet and AMIP-style simulations yield reasonable results when compared to outputs from IPSL-CM5. The observed performance suggests that an order of magnitude may be gained with respect to IPSL-CM CMIP5 simulations, either in the duration of simulations or in their resolution. Longer simulations would be of interest for the study of paleoclimate, while higher resolution could improve certain aspects of the modeled climate such as extreme events, as will be explored in the HighResMIP project. Following IPSL's strategic vision of building a unified global-regional modelling system, a fully compressible, non-hydrostatic prototype of DYNAMICO has been developed, enabling future convection-resolving simulations. Work supported by ANR project "HEAT", grant number CE23_2014_HEAT. Dubos, T., Dubey, S., Tort, M., Mittal, R., Meurdesoif, Y., and Hourdin, F.: DYNAMICO-1.0, an icosahedral hydrostatic dynamical core designed for consistency and versatility, Geosci. Model Dev., 8, 3131-3150, doi:10.5194/gmd-8-3131-2015, 2015.
Zwier, Matthew C.; Adelman, Joshua L.; Kaus, Joseph W.; Pratt, Adam J.; Wong, Kim F.; Rego, Nicholas B.; Suárez, Ernesto; Lettieri, Steven; Wang, David W.; Grabe, Michael; Zuckerman, Daniel M.; Chong, Lillian T.
2015-01-01
The weighted ensemble (WE) path sampling approach orchestrates an ensemble of parallel calculations with intermittent communication to enhance the sampling of rare events, such as molecular associations or conformational changes in proteins or peptides. Trajectories are replicated and pruned in a way that focuses computational effort on under-explored regions of configuration space while maintaining rigorous kinetics. To enable the simulation of rare events at any scale (e.g. atomistic, cellular), we have developed an open-source, interoperable, and highly scalable software package for the execution and analysis of WE simulations: WESTPA (The Weighted Ensemble Simulation Toolkit with Parallelization and Analysis). WESTPA scales to thousands of CPU cores and includes a suite of analysis tools that have been implemented in a massively parallel fashion. The software has been designed to interface conveniently with any dynamics engine and has already been used with a variety of molecular dynamics (e.g. GROMACS, NAMD, OpenMM, AMBER) and cell-modeling packages (e.g. BioNetGen, MCell). WESTPA has been in production use for over a year, and its utility has been demonstrated for a broad set of problems, ranging from atomically detailed host-guest associations to non-spatial chemical kinetics of cellular signaling networks. The following describes the design and features of WESTPA, including the facilities it provides for running WE simulations, storing and analyzing WE simulation data, as well as examples of input and output. PMID:26392815
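The replicate-and-prune bookkeeping at the heart of WE can be illustrated with a toy resampling step: walkers carrying probability weights are binned along a progress coordinate, and within each occupied bin they are split or merged until the bin holds a target count, while the total weight is conserved. The sketch below is a conceptual illustration, not WESTPA's implementation; the 1D binning scheme and the merge rule are simplified assumptions.

```python
import random
from collections import defaultdict

def we_resample(walkers, bin_edges, target_per_bin, rng):
    """One weighted-ensemble split/merge step. Each walker is (x, weight).
    Within each occupied bin, walkers are merged (weights summed, survivor
    chosen with probability proportional to weight) or split (weight halved)
    until the bin holds `target_per_bin` walkers. Total weight is conserved."""
    bins = defaultdict(list)
    for x, w in walkers:
        idx = sum(x >= edge for edge in bin_edges)   # simple 1D binning
        bins[idx].append([x, w])
    out = []
    for group in bins.values():
        # merge while over-populated: absorb the two lightest walkers into one
        while len(group) > target_per_bin:
            group.sort(key=lambda p: p[1])
            a, b = group.pop(0), group.pop(0)
            total = a[1] + b[1]
            survivor = a if rng.random() < a[1] / total else b
            group.append([survivor[0], total])
        # split while under-populated: replicate the heaviest walker
        while len(group) < target_per_bin:
            group.sort(key=lambda p: -p[1])
            x, w = group[0]
            group[0][1] = w / 2
            group.append([x, w / 2])
        out.extend((x, w) for x, w in group)
    return out

rng = random.Random(0)
walkers = [(rng.random(), 1.0 / 20) for _ in range(20)]
resampled = we_resample(walkers, bin_edges=[0.25, 0.5, 0.75], target_per_bin=4, rng=rng)
print(len(resampled), "walkers, total weight =", round(sum(w for _, w in resampled), 6))
```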
A Multilevel, Hierarchical Sampling Technique for Spatially Correlated Random Fields
Osborn, Sarah; Vassilevski, Panayot S.; Villa, Umberto
2017-10-26
In this paper, we propose an alternative method to generate samples of a spatially correlated random field with applications to large-scale problems for forward propagation of uncertainty. A classical approach for generating these samples is the Karhunen-Loève (KL) decomposition. However, the KL expansion requires solving a dense eigenvalue problem and is therefore computationally infeasible for large-scale problems. Sampling methods based on stochastic partial differential equations provide a highly scalable way to sample Gaussian fields, but the resulting parametrization is mesh dependent. We propose a multilevel decomposition of the stochastic field to allow for scalable, hierarchical sampling based on solving a mixed finite element formulation of a stochastic reaction-diffusion equation with a random, white noise source function. Lastly, numerical experiments are presented to demonstrate the scalability of the sampling method as well as numerical results of multilevel Monte Carlo simulations for a subsurface porous media flow application using the proposed sampling method.
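The stochastic reaction-diffusion equation mentioned above is, in the common SPDE sampling approach, the Whittle-Matérn equation whose solution is a Gaussian field with Matérn covariance. The form below is that standard equation with typical symbol meanings; it is a hedged sketch and may differ in detail from the mixed formulation the authors solve.

```latex
% Whittle--Matern SPDE: u(x) is the sampled Gaussian random field, W(x) is
% spatial white noise, kappa sets the correlation length, alpha the smoothness.
\left( \kappa^{2} - \Delta \right)^{\alpha/2} u(x) \;=\; \mathcal{W}(x),
\qquad x \in D \subset \mathbb{R}^{d}
```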
Sensor-scheduling simulation of disparate sensors for Space Situational Awareness
NASA Astrophysics Data System (ADS)
Hobson, T.; Clarkson, I.
2011-09-01
The art and science of space situational awareness (SSA) has been practised and developed from the time of Sputnik. However, recent developments, such as the accelerating pace of satellite launches, the proliferation of launch-capable agencies, both commercial and sovereign, and recent well-publicised collisions involving man-made space objects, have further magnified the importance of timely and accurate SSA. The United States Strategic Command (USSTRATCOM) operates the Space Surveillance Network (SSN), a global network of sensors tasked with maintaining SSA. The rapidly increasing number of resident space objects will require commensurate improvements in the SSN. Sensors are scarce resources that must be scheduled judiciously to obtain measurements of maximum utility. Improvements in sensor scheduling and fusion can serve to reduce the number of additional sensors that may be required. Recently, Hill et al. [1] have proposed and developed a simulation environment named TASMAN (Tasking Autonomous Sensors in a Multiple Application Network) to enable testing of alternative scheduling strategies within a simulated multi-sensor, multi-target environment. TASMAN simulates a high-fidelity, hardware-in-the-loop system by running multiple machines with different roles in parallel. At present, TASMAN is limited to simulations involving electro-optic sensors. Its high fidelity is at once a feature and a limitation, since supercomputing is required to run simulations of appreciable scale. In this paper, we describe an alternative, modular and scalable SSA simulation system that can extend the work of Hill et al. with reduced complexity, albeit also with reduced fidelity. The tool has been developed in MATLAB and therefore can be run on a very wide range of computing platforms. It can also make use of MATLAB's parallel processing capabilities to obtain considerable speed-up. The speed and flexibility so obtained can be used to quickly test scheduling algorithms even with a relatively large number of space objects. We further describe an application of the tool by exploring how the relative mixture of electro-optical and radar sensors can impact the scheduling, fusion and achievable accuracy of an SSA system. By varying the mixture of sensor types, we are able to characterise the main advantages and disadvantages of each configuration.
NASA Astrophysics Data System (ADS)
Appel, Marius; Nüst, Daniel; Pebesma, Edzer
2017-04-01
Geoscientific analyses of Earth observation data typically involve a long path from data acquisition to scientific results and conclusions. Before starting the actual processing, scenes must be downloaded from the providers' platforms and the computing infrastructure needs to be prepared. The computing environment often requires specialized software, which in turn might have lots of dependencies. The software is often highly customized and provided without commercial support, which leads to rather ad-hoc systems and irreproducible results. To let other scientists reproduce the analyses, the full workspace including data, code, the computing environment, and documentation must be bundled and shared. Technologies such as virtualization or containerization allow for the creation of identical computing environments with relatively little effort. Challenges, however, arise when the volume of the data is too large, when computations are done in a cluster environment, or when complex software components such as databases are used. We discuss these challenges for the example of scalable Land use change detection on Landsat imagery. We present a reproducible implementation that runs R and the scalable data management and analytical system SciDB within a Docker container. Thanks to an explicit container recipe (the Dockerfile), this enables the all-in-one reproduction including the installation of software components, the ingestion of the data, and the execution of the analysis in a well-defined environment. We furthermore discuss possibilities how the implementation could be transferred to multi-container environments in order to support reproducibility on large cluster environments.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Slattery, Stuart R.
In this study we analyze and extend mesh-free algorithms for three-dimensional data transfer problems in partitioned multiphysics simulations. We first provide a direct comparison between a mesh-based weighted residual method using the common-refinement scheme and two mesh-free algorithms leveraging compactly supported radial basis functions: one using a spline interpolation and one using a moving least square reconstruction. Through the comparison we assess both the conservation and accuracy of the data transfer obtained from each of the methods. We do so for a varying set of geometries with and without curvature and sharp features and for functions with and without smoothness and with varying gradients. Our results show that the mesh-based and mesh-free algorithms are complementary, with cases where each was demonstrated to perform better than the other. We then focus on the mesh-free methods by developing a set of algorithms to parallelize them based on sparse linear algebra techniques. This includes a discussion of fast parallel radius searching in point clouds and restructuring the interpolation algorithms to leverage data structures and linear algebra services designed for large distributed computing environments. The scalability of our new algorithms is demonstrated on a leadership class computing facility using a set of basic scaling studies. Finally, these scaling studies show that for problems with reasonable load balance, our new algorithms for both spline interpolation and moving least square reconstruction demonstrate both strong and weak scalability using more than 100,000 MPI processes with billions of degrees of freedom in the data transfer operation.
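A dense, serial sketch of the spline-interpolation variant of mesh-free transfer follows: build the kernel matrix on the source point cloud with a compactly supported radial basis function, solve for coefficients, and evaluate at the target points. The Wendland C2 kernel and the support radius are illustrative choices; the parallel radius search and distributed sparse linear algebra described above are omitted.

```python
import numpy as np

def wendland_c2(r, radius):
    """Compactly supported Wendland C2 radial basis function."""
    s = np.clip(r / radius, 0.0, 1.0)
    return (1.0 - s) ** 4 * (4.0 * s + 1.0)

def rbf_transfer(src_pts, src_vals, tgt_pts, radius):
    """Spline-interpolation mesh-free transfer: solve for RBF coefficients on
    the source point cloud, then evaluate at the target points. Serial dense
    version; the paper's algorithms use sparse, distributed linear algebra
    and parallel radius searches."""
    d_ss = np.linalg.norm(src_pts[:, None, :] - src_pts[None, :, :], axis=-1)
    d_ts = np.linalg.norm(tgt_pts[:, None, :] - src_pts[None, :, :], axis=-1)
    A = wendland_c2(d_ss, radius)
    coeffs = np.linalg.solve(A, src_vals)
    return wendland_c2(d_ts, radius) @ coeffs

rng = np.random.default_rng(0)
src = rng.random((200, 3))                      # source mesh nodes (point cloud)
tgt = rng.random((50, 3))                       # target mesh nodes
f = lambda p: np.sin(p[:, 0]) + p[:, 1] ** 2    # field to transfer
approx = rbf_transfer(src, f(src), tgt, radius=0.5)
print("max transfer error:", np.abs(approx - f(tgt)).max())
```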
NASA Astrophysics Data System (ADS)
Qiao, Mu
2015-03-01
Service Oriented Architecture (SOA) is widely used in building flexible and scalable web sites and services. In most of the web or mobile photo book and gifting business space, the products ordered are highly variable, without a standard template into which one can substitute text or images, unlike commercial variable data printing. In this paper, the author describes an SOA workflow in a multi-site, multi-product-line fulfillment system where three major challenges are addressed: utilization of hardware and equipment, a high degree of automation with fault recovery, and high scalability and flexibility under order volume fluctuation.
XPRESS: eXascale PRogramming Environment and System Software
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brightwell, Ron; Sterling, Thomas; Koniges, Alice
The XPRESS Project is one of four major projects of the DOE Office of Science Advanced Scientific Computing Research X-stack Program initiated in September 2012. The purpose of XPRESS is to devise an innovative system software stack to enable practical and useful exascale computing around the end of the decade, with near-term contributions to efficient and scalable operation of trans-petaflops performance systems in the next two to three years, both for DOE mission-critical applications. To this end, XPRESS directly addresses critical challenges in computing of efficiency, scalability, and programmability through introspective methods of dynamic adaptive resource management and task scheduling.
NASA Technical Reports Server (NTRS)
2006-01-01
The Global Change Master Directory (GCMD) has been one of the best known Earth science and global change data discovery online resources throughout its extended operational history. The growing popularity of the system since its introduction on the World Wide Web in 1994 has created an environment where resolving issues of scalability, security, and interoperability have been critical to providing the best available service to the users and partners of the GCMD. Innovative approaches developed at the GCMD in these areas will be presented with a focus on how they relate to current and future GO-ESSP community needs.
NASA Astrophysics Data System (ADS)
Sastry, Kumara Narasimha
2007-03-01
Effective and efficient multiscale modeling is essential to advance both the science and synthesis in a wide array of fields such as physics, chemistry, materials science, biology, biotechnology, and pharmacology. This study investigates the efficacy and potential of using genetic algorithms for multiscale materials modeling and addresses some of the challenges involved in designing competent algorithms that solve hard problems quickly, reliably, and accurately. In particular, this thesis demonstrates the use of genetic algorithms (GAs) and genetic programming (GP) in multiscale modeling with the help of two non-trivial case studies in materials science and chemistry. The first case study explores the utility of genetic programming (GP) in multi-timescaling alloy kinetics simulations. In essence, GP is used to bridge molecular dynamics and kinetic Monte Carlo methods to span orders of magnitude in simulation time. Specifically, GP is used to regress symbolically an inline barrier function from a limited set of molecular dynamics simulations to enable kinetic Monte Carlo simulations that span seconds of real time. Results on a non-trivial example of vacancy-assisted migration on a surface of a face-centered cubic (fcc) copper-cobalt (CuxCo1-x) alloy show that GP predicts all barriers with 0.1% error from calculations for less than 3% of active configurations, independent of the type of potentials used to obtain the learning set of barriers via molecular dynamics. The resulting method enables a 2-9 orders-of-magnitude increase in real-time dynamics simulations, taking 4-7 orders of magnitude less CPU time. The second case study presents the application of multiobjective genetic algorithms (MOGAs) in multiscaling quantum chemistry simulations. Specifically, MOGAs are used to bridge high-level quantum chemistry and semiempirical methods to provide an accurate representation of complex molecular excited-state and ground-state behavior. Results on ethylene and benzene, two common building blocks in organic chemistry, indicate that MOGAs produce high-quality semiempirical methods that (1) are stable to small perturbations, (2) yield accurate configuration energies on untested and critical excited states, and (3) yield ab initio quality excited-state dynamics. The proposed method enables simulations of more complex systems to realistic, multi-picosecond timescales, well beyond previous attempts or the expectation of human experts, with a 2-3 orders-of-magnitude reduction in computational cost. While the two applications use simple evolutionary operators, in order to tackle more complex systems, their scalability and limitations have to be investigated. The second part of the thesis addresses some of the challenges involved with a successful design of genetic algorithms and genetic programming for multiscale modeling. The first issue addressed is the scalability of genetic programming, where facetwise models are built to assess the population size required by GP to ensure an adequate supply of raw building blocks and also to ensure accurate decision-making between competing building blocks. This study also presents a design of competent genetic programming, where traditional fixed recombination operators are replaced by building and sampling probabilistic models of promising candidate programs.
The proposed scalable GP, called extended compact GP (eCGP), combines the ideas from extended compact genetic algorithm (eCGA) and probabilistic incremental program evolution (PIPE) and adaptively identifies, propagates and exchanges important subsolutions of a search problem. Results show that eCGP scales cubically with problem size on both GP-easy and GP-hard problems. Finally, facetwise models are developed to explore limitations of scalability of MOGAs, where the scalability of multiobjective algorithms in reliably maintaining Pareto-optimal solutions is addressed. The results show that even when the building blocks are accurately identified, massive multimodality of the search problems can easily overwhelm the nicher (diversity preserving operator) and lead to exponential scale-up. Facetwise models are developed, which incorporate the combined effects of model accuracy, decision making, and sub-structure supply, as well as the effect of niching on the population sizing, to predict a limit on the growth rate of a maximum number of sub-structures that can compete in the two objectives to circumvent the failure of the niching method. The results show that if the number of competing building blocks between multiple objectives is less than the proposed limit, multiobjective GAs scale-up polynomially with the problem size on boundedly-difficult problems.
A Domain-Specific Language for Aviation Domain Interoperability
ERIC Educational Resources Information Center
Comitz, Paul
2013-01-01
Modern information systems require a flexible, scalable, and upgradeable infrastructure that allows communication and collaboration between heterogeneous information processing and computing environments. Aviation systems from different organizations often use differing representations and distribution policies for the same data and messages,…
CORE IT Services: The NDC and NSSC
NASA Technical Reports Server (NTRS)
Dischinger, Portia
2005-01-01
Contents include the following: Virtual NASA Data Center (NDC) overview. NDC challenges. NDC present and future. NDC environment. NDC decomposition. NDC opportunities. NDC solution: horizontally and vertically scalable. Adaptable to new business/mission areas. Competitive cost and quality. Superior service delivery.
ASDC Advances in the Utilization of Microservices and Hybrid Cloud Environments
NASA Astrophysics Data System (ADS)
Baskin, W. E.; Herbert, A.; Mazaika, A.; Walter, J.
2017-12-01
The Atmospheric Science Data Center (ASDC) is transitioning many of its software tools and applications to standalone microservices deployable in a hybrid cloud, offering benefits such as scalability and efficient environment management. This presentation features several projects the ASDC staff have implemented leveraging the OpenShift Container Application Platform and OpenStack Hybrid Cloud Environment, focusing on key tools and techniques applied to: Earth science data processing; spatial-temporal metadata generation, validation, repair, and curation; and archived data discovery, visualization, and access.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lorenz, Daniel; Wolf, Felix
2016-02-17
The PRIMA-X (Performance Retargeting of Instrumentation, Measurement, and Analysis Technologies for Exascale Computing) project is the successor of the DOE PRIMA (Performance Refactoring of Instrumentation, Measurement, and Analysis Technologies for Petascale Computing) project, which addressed the challenge of creating a core measurement infrastructure that would serve as a common platform for both integrating leading parallel performance systems (notably TAU and Scalasca) and developing next-generation scalable performance tools. The PRIMA-X project shifts the focus away from refactorization of robust performance tools towards a re-targeting of the parallel performance measurement and analysis architecture for extreme scales. The massive concurrency, asynchronous execution dynamics, hardware heterogeneity, and multi-objective prerequisites (performance, power, resilience) that identify exascale systems introduce fundamental constraints on the ability to carry forward existing performance methodologies. In particular, there must be a deemphasis of per-thread observation techniques to significantly reduce the otherwise unsustainable flood of redundant performance data. Instead, it will be necessary to assimilate multi-level resource observations into macroscopic performance views, from which resilient performance metrics can be attributed to the computational features of the application. This requires a scalable framework for node-level and system-wide monitoring and runtime analyses of dynamic performance information. Also, the interest in optimizing parallelism parameters with respect to performance and energy drives the integration of tool capabilities in the exascale environment further. Initially, PRIMA-X was a collaborative project between the University of Oregon (lead institution) and the German Research School for Simulation Sciences (GRS). Because Prof. Wolf, the PI at GRS, accepted a position as full professor at Technische Universität Darmstadt (TU Darmstadt) starting February 1st, 2015, the project ended at GRS on January 31st, 2015. This report reflects the work accomplished at GRS until then. The work of GRS is expected to be continued at TU Darmstadt. The first main accomplishment of GRS is the design of different thread-level aggregation techniques. We created a prototype capable of aggregating the thread-level information in performance profiles using these techniques. The next step will be the integration of the most promising techniques into the Score-P measurement system and their evaluation. The second main accomplishment is a substantial increase of Score-P's scalability, achieved by improving the design of the system-tree representation in Score-P's profile format. We developed a new representation and a distributed algorithm to create the scalable system tree representation. Finally, we developed a lightweight approach to MPI wait-state profiling. Former algorithms either needed piggy-backing, which can cause significant runtime overhead, or tracing, which comes with its own set of scaling challenges. Our approach works with local data only and, thus, is scalable and has very little overhead.
The GBS code for tokamak scrape-off layer simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Halpern, F.D., E-mail: federico.halpern@epfl.ch; Ricci, P.; Jolliet, S.
2016-06-15
We describe a new version of GBS, a 3D global, flux-driven plasma turbulence code to simulate the turbulent dynamics in the tokamak scrape-off layer (SOL), superseding the code presented by Ricci et al. (2012) [14]. The present work is driven by the objective of studying SOL turbulent dynamics in medium size tokamaks and beyond with a high-fidelity physics model. We emphasize an intertwining framework of improved physics models and the computational improvements that allow them. The model extensions include neutral atom physics, finite ion temperature, the addition of a closed field line region, and a non-Boussinesq treatment of the polarization drift. GBS has been completely refactored with the introduction of a 3-D Cartesian communicator and a scalable parallel multigrid solver. We report dramatically enhanced parallel scalability, with the possibility of treating electromagnetic fluctuations very efficiently. The method of manufactured solutions as a verification process has been carried out for this new code version, demonstrating the correct implementation of the physical model.
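For readers unfamiliar with the term, a 3-D Cartesian communicator maps MPI ranks onto a structured process grid so that each subdomain knows its neighbours for halo exchange. A minimal mpi4py sketch of the pattern follows; the decomposition and periodicity shown are illustrative, not GBS's actual layout.

from mpi4py import MPI

comm = MPI.COMM_WORLD
# Let MPI choose a balanced 3-D process grid for the available ranks.
dims = MPI.Compute_dims(comm.Get_size(), [0, 0, 0])
# Example periodicity: periodic in the third direction only.
cart = comm.Create_cart(dims, periods=[False, False, True], reorder=True)

coords = cart.Get_coords(cart.Get_rank())
# Neighbouring ranks along the first axis, as needed for halo exchange.
lower, upper = cart.Shift(direction=0, disp=1)
if cart.Get_rank() == 0:
    print("process grid:", dims)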
DOE Office of Scientific and Technical Information (OSTI.GOV)
Visweswara Sathanur, Arun; Choudhury, Sutanay; Joslyn, Cliff A.
Property graphs can be used to represent heterogeneous networks with attributed vertices and edges. Given one property graph, simulating another graph of the same or greater size with identical statistical properties with respect to the attributes and connectivity is critical for privacy preservation and benchmarking purposes. In this work we tackle the problem of capturing the statistical dependence of the edge connectivity on the vertex labels and using the same distribution to regenerate property graphs of the same or expanded size in a scalable manner. However, accurate simulation becomes a challenge when the attributes do not completely explain the network structure. We propose the Property Graph Model (PGM) approach that uses an attribute (or label) augmentation strategy to mitigate the problem and preserve the graph connectivity as measured via degree distribution, vertex label distributions and edge connectivity. Our proposed algorithm is scalable, with a linear complexity in the number of edges in the target graph. We illustrate the efficacy of the PGM approach in regenerating and expanding the datasets by leveraging two distinct illustrations.
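The core idea of conditioning edge probability on vertex labels can be sketched in a few lines; this is an illustrative toy, not the PGM algorithm itself (which also performs label augmentation and runs in time linear in the number of edges, whereas the all-pairs loop below is quadratic and used only for clarity).

import random
from collections import Counter
from itertools import combinations

def estimate_label_model(labels, edges):
    """Estimate P(edge | unordered label pair) from an observed property graph."""
    n_by_label = Counter(labels.values())
    pair_counts = Counter()
    for u, v in edges:
        pair_counts[frozenset((labels[u], labels[v]))] += 1
    probs = {}
    for pair, cnt in pair_counts.items():
        a, b = (tuple(pair) * 2)[:2]  # a == b when both endpoints share a label
        possible = (n_by_label[a] * (n_by_label[a] - 1) // 2 if a == b
                    else n_by_label[a] * n_by_label[b])
        probs[pair] = cnt / max(possible, 1)
    return probs

def regenerate(new_labels, probs):
    """Sample a new edge set whose connectivity follows the estimated label model."""
    edges = []
    for u, v in combinations(new_labels, 2):
        p = probs.get(frozenset((new_labels[u], new_labels[v])), 0.0)
        if random.random() < p:
            edges.append((u, v))
    return edges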
Evaluating the performance of parallel subsurface simulators: An illustrative example with PFLOTRAN
Hammond, G E; Lichtner, P C; Mills, R T
2014-01-01
To better inform the subsurface scientist on the expected performance of parallel simulators, this work investigates performance of the reactive multiphase flow and multicomponent biogeochemical transport code PFLOTRAN as it is applied to several realistic modeling scenarios run on the Jaguar supercomputer. After a brief introduction to the code's parallel layout and code design, PFLOTRAN's parallel performance (measured through strong and weak scalability analyses) is evaluated in the context of conceptual model layout, software and algorithmic design, and known hardware limitations. PFLOTRAN scales well (with regard to strong scaling) for three realistic problem scenarios: (1) in situ leaching of copper from a mineral ore deposit within a 5-spot flow regime, (2) transient flow and solute transport within a regional doublet, and (3) a real-world problem involving uranium surface complexation within a heterogeneous and extremely dynamic variably saturated flow field. Weak scalability is discussed in detail for the regional doublet problem, and several difficulties with its interpretation are noted. PMID:25506097
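Strong scaling of this kind is usually summarized as speedup and parallel efficiency relative to a baseline run. A small worked example follows; the timings are made-up placeholders, not PFLOTRAN measurements.

def strong_scaling(base_procs, base_time, procs, time):
    """Speedup and parallel efficiency relative to the smallest (baseline) run."""
    speedup = base_time / time
    efficiency = speedup / (procs / base_procs)
    return speedup, efficiency

# Illustrative timings only (seconds per time step), not measured data.
runs = [(1024, 100.0), (2048, 52.0), (4096, 28.0)]
base_p, base_t = runs[0]
for p, t in runs:
    s, e = strong_scaling(base_p, base_t, p, t)
    print(f"{p:5d} cores: speedup {s:5.2f}, efficiency {e:4.2f}")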
XNsim: Internet-Enabled Collaborative Distributed Simulation via an Extensible Network
NASA Technical Reports Server (NTRS)
Novotny, John; Karpov, Igor; Zhang, Chendi; Bedrossian, Nazareth S.
2007-01-01
In this paper, the XNsim approach to achieve Internet-enabled, dynamically scalable collaborative distributed simulation capabilities is presented. With this approach, a complete simulation can be assembled from shared component subsystems written in different formats that run on different computing platforms, with different sampling rates, in different geographic locations, and over single/multiple networks. The subsystems interact securely with each other via the Internet. Furthermore, the simulation topology can be dynamically modified. The distributed simulation uses a combination of hub-and-spoke and peer-to-peer network topology. A proof-of-concept demonstrator is also presented. The XNsim demonstrator can be accessed at http://www.jsc.draper.com/xn, which hosts various examples of Internet-enabled simulations.
Long-range interactions and parallel scalability in molecular simulations
NASA Astrophysics Data System (ADS)
Patra, Michael; Hyvönen, Marja T.; Falck, Emma; Sabouri-Ghomi, Mohsen; Vattulainen, Ilpo; Karttunen, Mikko
2007-01-01
Typical biomolecular systems such as cellular membranes, DNA, and protein complexes are highly charged. Thus, efficient and accurate treatment of electrostatic interactions is of great importance in computational modeling of such systems. We have employed the GROMACS simulation package to perform extensive benchmarking of different commonly used electrostatic schemes on a range of computer architectures (Pentium-4, IBM Power 4, and Apple/IBM G5) for single processor and parallel performance up to 8 nodes—we have also tested the scalability on four different networks, namely Infiniband, GigaBit Ethernet, Fast Ethernet, and nearly uniform memory architecture, i.e. communication between CPUs is possible by directly reading from or writing to other CPUs' local memory. It turns out that the particle-mesh Ewald method (PME) performs surprisingly well and offers competitive performance unless parallel runs on PC hardware with older network infrastructure are needed. Lipid bilayers of sizes 128, 512 and 2048 lipid molecules were used as the test systems representing typical cases encountered in biomolecular simulations. Our results enable an accurate prediction of computational speed on most current computing systems, both for serial and parallel runs. These results should be helpful in, for example, choosing the most suitable configuration for a small departmental computer cluster.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wilson, David G.; Cook, Marvin A.
This report summarizes collaborative efforts between Secure Scalable Microgrid and Korean Institute of Energy Research team members. The efforts aim to advance microgrid research and development towards the efficient utilization of networked microgrids. The collaboration resulted in the identification of experimental and real-time simulation capabilities that may be leveraged for networked microgrids research, development, and demonstration. Additional research was performed to support the demonstration of control techniques within real-time simulation and with hardware in the loop for DC microgrids.
Parameter Studies, time-dependent simulations and design with automated Cartesian methods
NASA Technical Reports Server (NTRS)
Aftosmis, Michael
2005-01-01
Over the past decade, NASA has made a substantial investment in developing adaptive Cartesian grid methods for aerodynamic simulation. Cartesian-based methods played a key role in both the Space Shuttle Accident Investigation and in NASA's return-to-flight activities. The talk will provide an overview of recent technological developments focusing on the generation of large-scale aerodynamic databases, automated CAD-based design, and time-dependent simulations of bodies in relative motion. Automation, scalability and robustness underlie all of these applications, and research in each of these topics will be presented.
He, Bo; Liu, Yang; Dong, Diya; Shen, Yue; Yan, Tianhong; Nian, Rui
2015-08-13
In this paper, a novel iterative sparse extended information filter (ISEIF) was proposed to solve the simultaneous localization and mapping (SLAM) problem, which is very crucial for autonomous vehicles. The proposed algorithm solves the measurement update equations with iterative methods adaptively to reduce linearization errors. The consistency and accuracy of SEIF are improved while its scalability advantage is kept. Simulations and practical experiments were carried out with both a land car benchmark and an autonomous underwater vehicle. Comparisons between iterative SEIF (ISEIF), standard EKF and SEIF are presented. All of the results convincingly show that ISEIF yields more consistent and accurate estimates compared to SEIF and preserves the scalability advantage over EKF, as well.
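In information form, a measurement is folded in additively, and an iterated variant simply relinearizes around the refreshed mean before re-applying the update to the prior. A minimal dense-matrix numpy sketch of that idea follows; it is illustrative only and does not reproduce ISEIF's sparsification or its adaptive iteration control.

import numpy as np

def iterated_information_update(Y_prior, b_prior, z, h, H_jac, R, iterations=3):
    """Iterated measurement update in information form (Y = Sigma^-1, b = Y @ mu)."""
    R_inv = np.linalg.inv(R)
    Y, b = Y_prior, b_prior
    for _ in range(iterations):
        mu = np.linalg.solve(Y, b)          # recover the mean as the linearization point
        H = H_jac(mu)
        innovation = z - h(mu) + H @ mu
        # Each pass re-applies the relinearized update to the *prior* information,
        # so iterating refines the linearization rather than double-counting the data.
        Y = Y_prior + H.T @ R_inv @ H
        b = b_prior + H.T @ R_inv @ innovation
    return Y, b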
Djukic, Maja; Fulmer, Terry; Adams, Jennifer G; Lee, Sabrina; Triola, Marc M
2012-09-01
Interprofessional education is a critical precursor to effective teamwork and the collaboration of health care professionals in clinical settings. Numerous barriers have been identified that preclude scalable and sustainable interprofessional education (IPE) efforts. This article describes NYU3T: Teaching, Technology, Teamwork, a model that uses novel technologies such as Web-based learning, virtual patients, and high-fidelity simulation to overcome some of the common barriers and drive implementation of evidence-based teamwork curricula. It outlines the program's curricular components, implementation strategy, evaluation methods, and lessons learned from the first year of delivery and describes implications for future large-scale IPE initiatives. Copyright © 2012 Elsevier Inc. All rights reserved.
PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences
Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong
2015-01-01
We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate—slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory. PMID:25549288
Xiang, Yu; Chen, Chen; Zhang, Chongfu; Qiu, Kun
2013-01-14
In this paper, we propose and demonstrate a novel integrated radio-over-fiber passive optical network (RoF-PON) system for both wired and wireless access. By utilizing the polarization-multiplexed four-wave mixing (FWM) effect in a semiconductor optical amplifier (SOA), scalable generation of multi-frequency millimeter-waves (MMWs) can be provided so as to assist the configuration of multi-frequency wireless access for the wired/wireless access integrated RoF-PON system. In order to obtain a better performance, the polarization-multiplexed FWM effect is investigated in detail. Simulation results successfully verify the feasibility of our proposed scheme.
A Scalable Implementation of Van der Waals Density Functionals
NASA Astrophysics Data System (ADS)
Wu, Jun; Gygi, Francois
2010-03-01
Recently developed Van der Waals density functionals [1] offer the promise to account for weak intermolecular interactions that are not described accurately by local exchange-correlation density functionals. In spite of recent progress [2], the computational cost of such calculations remains high. We present a scalable parallel implementation of the functional proposed by Dion et al. [1]. The method is implemented in the Qbox first-principles simulation code (http://eslab.ucdavis.edu/software/qbox). Application to large molecular systems will be presented. [1] M. Dion et al., Phys. Rev. Lett. 92, 246401 (2004). [2] G. Roman-Perez and J. M. Soler, Phys. Rev. Lett. 103, 096102 (2009).
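For context, the vdW-DF of Dion et al. [1] adds a nonlocal correlation term expressed as a double integral over the electron density, which is the part whose naive evaluation scales quadratically with system size and therefore motivates scalable implementations (the kernel factorization of Roman-Perez and Soler [2] reduces this cost via FFT-based convolutions). In standard notation:

E_{xc} = E_x^{\mathrm{GGA}} + E_c^{\mathrm{LDA}} + E_c^{\mathrm{nl}},
\qquad
E_c^{\mathrm{nl}} = \frac{1}{2}\int\!\!\int n(\mathbf{r})\,\phi(\mathbf{r},\mathbf{r}')\,n(\mathbf{r}')\,d^3r\,d^3r' .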
Dynamic Simulation of Human Thermoregulation and Heat Transfer for Spaceflight Applications
NASA Technical Reports Server (NTRS)
Miller, Thomas R.; Nelson, David A.; Bue, Grant; Kuznetz, Lawrence
2011-01-01
Models of human thermoregulation and heat transfer date from the early 1970s and have been developed for applications ranging from evaluating thermal comfort in spacecraft and aircraft cabin environments to predicting heat stress during EVAs. Most lumped or compartment models represent the body as an assemblage of cylindrical and spherical elements which may be subdivided into layers to describe tissue heterogeneity. Many existing models are of limited usefulness in asymmetric thermal environments, such as may be encountered during an EVA. Conventional whole-body clothing models also limit the ability to describe local surface thermal and evaporation effects in sufficient detail. A further limitation is that models based on a standard man model are not readily scalable to represent large or small subjects. This work describes development of a new human thermal model derived from the 41-node man model. Each segment is divided into four concentric, constant-thickness cylinders made up of a central core surrounded by muscle, fat, and skin, respectively. These cylinders are connected by the flow of blood from a central blood pool to each part. The central blood pool is updated at each time step, based on a whole-body energy balance. Results show the model simulates core and surface temperature histories, sweat evaporation and metabolic rates which generally are consistent with controlled exposures of human subjects. Scaling rules are developed to enable simulation of small and large subjects (5th percentile and 95th percentile). Future refinements will include a clothing model that addresses local surface insulation and permeation effects and developing control equations to describe thermoregulatory effects such as may occur with prolonged weightlessness or with aging.
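A compartment model of this kind advances each tissue node with an energy balance over metabolic heat, conduction between adjacent layers, and convective exchange with the central blood pool. The following single-segment sketch is illustrative only; the node layout matches the description above, but the coefficients and the explicit Euler update are placeholders, not the 41-node model's actual formulation.

def step_segment(T, T_blood, dt, params):
    """Advance one segment's core/muscle/fat/skin node temperatures by dt seconds.

    T      : list of four node temperatures (deg C), inner to outer
    params : per-node heat capacity C (J/K), metabolic heat M (W),
             inter-layer conductance K (W/K), blood-perfusion conductance B (W/K)
    """
    T_new = []
    for i in range(4):
        q = params["M"][i] + params["B"][i] * (T_blood - T[i])  # metabolism + perfusion
        if i > 0:
            q += params["K"][i - 1] * (T[i - 1] - T[i])         # conduction from inner layer
        if i < 3:
            q += params["K"][i] * (T[i + 1] - T[i])             # conduction from outer layer
        T_new.append(T[i] + dt * q / params["C"][i])
    return T_new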
Gorrindo, Tristan; Goldfarb, Elizabeth; Birnbaum, Robert J; Chevalier, Lydia; Meller, Benjamin; Alpert, Jonathan; Herman, John; Weiss, Anthony
2013-07-01
Ongoing professional practice evaluation (OPPE) activities consist of a quantitative, competency-based evaluation of clinical performance. Hospitals must design assessments that measure clinical competencies, are scalable, and minimize impact on clinicians' daily routines. A psychiatry department at a large academic medical center designed and implemented an interactive Web-based psychiatric simulation focusing on violence risk assessment as a tool for a department-wide OPPE. Of 412 invited clinicians in a large psychiatry department, 410 completed an online simulation in April-May 2012. Participants received scheduled e-mail reminders with instructions describing how to access the simulation. Using the Computer Simulation Assessment Tool, participants viewed an introductory video and were then asked to conduct a risk assessment, acting as a clinician in the encounter by selecting actions from a series of drop-down menus. Each action was paired with a corresponding video segment of a clinical encounter with a standardized patient. Participants were scored on the basis of their actions within the simulation (Measure 1) and by their responses to the open-ended questions in which they were asked to integrate the information from the simulation in a summative manner (Measure 2). Of the 410 clinicians, 381 (92.9%) passed Measure 1, 359 (87.6%) passed Measure 2, and 5 (1.2%) failed both measures. Seventy-five (18.3%) participants were referred for focused professional practice evaluation (FPPE) after failing Measure 1, Measure 2, or both. Overall, Web-based simulation and e-mail engagement tools were a scalable and efficient way to assess a large number of clinicians in OPPE and to identify those who required FPPE.
Schilling, Lisa M.; Kwan, Bethany M.; Drolshagen, Charles T.; Hosokawa, Patrick W.; Brandt, Elias; Pace, Wilson D.; Uhrich, Christopher; Kamerick, Michael; Bunting, Aidan; Payne, Philip R.O.; Stephens, William E.; George, Joseph M.; Vance, Mark; Giacomini, Kelli; Braddy, Jason; Green, Mika K.; Kahn, Michael G.
2013-01-01
Introduction: Distributed Data Networks (DDNs) offer infrastructure solutions for sharing electronic health data from across disparate data sources to support comparative effectiveness research. Data sharing mechanisms must address technical and governance concerns stemming from network security and data disclosure laws and best practices, such as HIPAA. Methods: The Scalable Architecture for Federated Translational Inquiries Network (SAFTINet) deploys TRIAD grid technology, a common data model, detailed technical documentation, and custom software for data harmonization to facilitate data sharing in collaboration with stakeholders in the care of safety net populations. Data sharing partners host TRIAD grid nodes containing harmonized clinical data within their internal or hosted network environments. Authorized users can use a central web-based query system to request analytic data sets. Discussion: SAFTINet DDN infrastructure achieved a number of data sharing objectives, including scalable and sustainable systems for ensuring harmonized data structures and terminologies and secure distributed queries. Initial implementation challenges were resolved through iterative discussions, development and implementation of technical documentation, governance, and technology solutions. PMID:25848567
SP-100 - The national space reactor power system program in response to future needs
NASA Astrophysics Data System (ADS)
Armijo, J. S.; Josloff, A. T.; Bailey, H. S.; Matteo, D. N.
The SP-100 system has been designed to meet comprehensive and demanding NASA/DOD/DOE requirements. The key requirements include: nuclear safety for all mission phases, scalability from 10's to 100's of kWe, reliable performance at full power for seven years or partial power for ten years, survivability in civil or military threat environments, capability to operate autonomously for up to six months, capability to protect payloads from excessive radiation, and compatibility with shuttle and expendable launch vehicles. The authors address major progress in terms of design, flexibility/scalability, survivability, and development. These areas, with the exception of survivability, are discussed in detail. There has been significant improvement in the generic flight system design, with substantial mass savings and simplification that enhance performance and reliability. Design activity has confirmed the scalability and flexibility of the system and the ability to efficiently meet NASA, AF, and SDIO needs. SP-100 development continues to make significant progress in all key technology areas.
ORNL Cray X1 evaluation status report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Agarwal, P.K.; Alexander, R.A.; Apra, E.
2004-05-01
On August 15, 2002 the Department of Energy (DOE) selected the Center for Computational Sciences (CCS) at Oak Ridge National Laboratory (ORNL) to deploy a new scalable vector supercomputer architecture for solving important scientific problems in climate, fusion, biology, nanoscale materials and astrophysics. "This program is one of the first steps in an initiative designed to provide U.S. scientists with the computational power that is essential to 21st century scientific leadership," said Dr. Raymond L. Orbach, director of the department's Office of Science. In FY03, CCS procured a 256-processor Cray X1 to evaluate the processors, memory subsystem, scalability of the architecture, and software environment, and to predict the expected sustained performance on key DOE applications codes. The results of the micro-benchmarks and kernel benchmarks show the architecture of the Cray X1 to be exceptionally fast for most operations. The best results are shown on large problems, where it is not possible to fit the entire problem into the cache of the processors. These large problems are exactly the types of problems that are important for the DOE and ultra-scale simulation. Application performance is found to be markedly improved by this architecture: - Large-scale simulations of high-temperature superconductors run 25 times faster than on an IBM Power4 cluster using the same number of processors. - Best performance of the parallel ocean program (POP v1.4.3) is 50 percent higher than on Japan's Earth Simulator and 5 times higher than on an IBM Power4 cluster. - A fusion application, global GYRO transport, was found to be 16 times faster on the X1 than on an IBM Power3. The increased performance allowed simulations to fully resolve questions raised by a prior study. - The transport kernel in the AGILE-BOLTZTRAN astrophysics code runs 15 times faster than on an IBM Power4 cluster using the same number of processors. - Molecular dynamics simulations related to the phenomenon of photon echo run 8 times faster than previously achieved. Even at 256 processors, the Cray X1 system is already outperforming other supercomputers with thousands of processors for a certain class of applications such as climate modeling and some fusion applications. This evaluation is the outcome of a number of meetings with both high-performance computing (HPC) system vendors and application experts over the past 9 months and has received broad-based support from the scientific community and other agencies.
Zary, Nabil; Johnson, Gunilla; Boberg, Jonas; Fors, Uno G H
2006-02-21
The Web-based Simulation of Patients (Web-SP) project was initiated in order to facilitate the use of realistic and interactive virtual patients (VP) in medicine and healthcare education. Web-SP focuses on moving beyond the technology-savvy teachers when integrating simulation-based education into health sciences curricula, by making the creation and use of virtual patients easier. The project strives to provide a common generic platform for design/creation, management, evaluation and sharing of web-based virtual patients. The aim of this study was to evaluate whether it was possible to develop a web-based virtual patient case simulation environment in which the entire case authoring process might be handled by teachers and which would be flexible enough to be used in different healthcare disciplines. The Web-SP system was constructed to support easy authoring, management and presentation of virtual patient cases. The case authoring environment was found to enable teachers to create full-fledged patient cases without the assistance of computer specialists. Web-SP was successfully implemented at several universities by taking into account key factors such as cost, access, security, scalability and flexibility. Pilot evaluations in medical, dentistry and pharmacy courses show that students regarded Web-SP as easy to use, engaging and of educational value. Cases adapted for all three disciplines were judged to be of significant educational value by the course leaders. The Web-SP system seems to fulfil the aim of providing a common generic platform for creation, management and evaluation of web-based virtual patient cases. The responses regarding the authoring environment indicated that the system might be user-friendly enough to appeal to a majority of the academic staff. In terms of implementation strengths, Web-SP seems to fulfil most needs of course directors and teachers from various educational institutions and disciplines. The system is currently in use or under implementation in several healthcare disciplines at more than ten universities worldwide. Future aims include structuring the exchange of cases between teachers and academic institutions by building a VP library function. We intend to follow up the positive results presented in this paper with other studies looking at learning outcomes, critical thinking and patient management. The potential of Web-SP as an assessment tool will also be studied. More information about Web-SP: http://websp.lime.ki.se.
NASA Technical Reports Server (NTRS)
Hsu, E.; Hung, C.; Kadowaki, N.; Yoshimura, N.; Takahashi, T.; Shopbell, P.; Walker, G.; Wellnitz, D.; Gary, P.; Clark, G.;
2000-01-01
This paper describes the technologies and services used in the experiments and demonstrations using the Trans-Pacific high data rate satellite communications infrastructure, and how that environment exercised protocol adaptability, scalability, efficiency, interoperability, and robustness.
GPAW - massively parallel electronic structure calculations with Python-based software.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Enkovaara, J.; Romero, N.; Shende, S.
2011-01-01
Electronic structure calculations are a widely used tool in materials science and a large consumer of supercomputing resources. Traditionally, the software packages for these kinds of simulations have been implemented in compiled languages, where Fortran in its different versions has been the most popular choice. While dynamic, interpreted languages such as Python can increase the efficiency of the programmer, they cannot compete directly with the raw performance of compiled languages. However, by using an interpreted language together with a compiled language, it is possible to have most of the productivity-enhancing features together with good numerical performance. We have used this approach in implementing the electronic structure simulation software GPAW using a combination of the Python and C programming languages. While the chosen approach works well on standard workstations and in Unix environments, massively parallel supercomputing systems can present some challenges in porting, debugging and profiling the software. In this paper we describe some details of the implementation and discuss the advantages and challenges of the combined Python/C approach. We show that despite the challenges it is possible to obtain good numerical performance and good parallel scalability with Python-based software.
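As a usage illustration, GPAW calculations are set up from Python through the ASE Atoms interface while the heavy numerics run in the compiled layer. The following minimal sketch assumes GPAW and ASE are installed; the molecule, plane-wave cutoff and functional are arbitrary example choices, not settings from the paper.

from ase.build import molecule
from gpaw import GPAW, PW

atoms = molecule('H2O')
atoms.center(vacuum=4.0)              # place the molecule in a box with vacuum padding
atoms.calc = GPAW(mode=PW(400),       # plane-wave mode, 400 eV cutoff (example value)
                  xc='PBE',
                  txt='h2o_gpaw.txt')
energy = atoms.get_potential_energy() # drives the compiled, parallel numerics
print("Total energy (eV):", energy)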
NASA Astrophysics Data System (ADS)
Vilotte, Jean-Pierre; Atkinson, Malcolm; Carpené, Michele; Casarotti, Emanuele; Frank, Anton; Igel, Heiner; Rietbrock, Andreas; Schwichtenberg, Horst; Spinuso, Alessandro
2016-04-01
Seismology pioneers global and open-data access -- with internationally approved data, metadata and exchange standards facilitated worldwide by the Federation of Digital Seismic Networks (FDSN) and in Europe the European Integrated Data Archives (EIDA). The growing wealth of data generated by dense observation and monitoring systems and recent advances in seismic wave simulation capabilities induces a change in paradigm. Data-intensive seismology research requires a new holistic approach combining scalable high-performance wave simulation codes and statistical data analysis methods, and integrating distributed data and computing resources. The European E-Infrastructure project "Virtual Earthquake and seismology Research Community e-science environment in Europe" (VERCE) pioneers the federation of autonomous organisations providing data and computing resources, together with a comprehensive, integrated and operational virtual research environment (VRE) and E-infrastructure devoted to the full path of data use in a research-driven context. VERCE delivers to a broad base of seismology researchers in Europe easy-to-use high-performance full waveform simulations and misfit calculations, together with a data-intensive framework for the collaborative development of innovative statistical data analysis methods, all of which were previously only accessible to a small number of well-resourced groups. It balances flexibility with new integrated capabilities to provide a fluent path from research innovation to production. As such, VERCE is a major contribution to the implementation phase of the "European Plate Observing System" (EPOS), the ESFRI initiative of the solid-Earth community. The VRE meets a range of seismic research needs by eliminating chores and technical difficulties to allow users to focus on their research questions. It empowers researchers to harvest the new opportunities provided by well-established and mature high-performance wave simulation codes of the community. It enables active researchers to invent and refine scalable methods for innovative statistical analysis of seismic waveforms in a wide range of application contexts. The VRE paves the way towards a flexible shared framework for seismic waveform inversion, lowering the barriers to uptake for the next generation of researchers. The VRE can be accessed through the science gateway that puts together computational and data-intensive research into the same framework, integrating multiple data sources and services. It provides a context for task-oriented and data-streaming workflows, and maps user actions to the full gamut of the federated platform resources and procurement policies, activating the necessary behind-the-scenes automation and transformation. The platform manages and produces domain metadata, coupling them with the provenance information describing the relationships and the dependencies which characterise the whole workflow process. This dynamic knowledge base can be explored for validation purposes via a graphical interface and a web API. Moreover, it fosters the assisted selection and re-use of the data within each phase of the scientific analysis. These phases can be identified as Simulation, Data Access, Preprocessing, Misfit and data processing, and are presented to the users of the gateway as dedicated and interactive workspaces.
By enabling researchers to share results and provenance information, VERCE steers open-science behaviour, allowing researchers to discover and build on prior work and thereby to progress faster. A key asset is the agile strategy that VERCE deployed in a multi-organisational context, engaging seismologists, data scientists, ICT researchers, HPC and data resource providers, and system administrators in short-lived tasks, each with a goal that is a seismology priority, and intimately coupling research thinking with technical innovation. This changes the focus from HPC production environments and community data services to user-focused scenarios, avoiding wasteful bouts of technology centricity where technologists collect requirements and develop a system that is not used because the ideas of the planned users have moved on. As such, the technologies and concepts developed in VERCE are relevant to many other disciplines in computational and data-driven Earth Sciences and can provide the key technologies for a Europe-wide computational and data-intensive framework in Earth Sciences.
Mian, Adnan Noor; Fatima, Mehwish; Khan, Raees; Prakash, Ravi
2014-01-01
Energy efficiency is an important design paradigm in Wireless Sensor Networks (WSNs) and its consumption in dynamic environment is even more critical. Duty cycling of sensor nodes is used to address the energy consumption problem. However, along with advantages, duty cycle aware networks introduce some complexities like synchronization and latency. Due to their inherent characteristics, many traditional routing protocols show low performance in densely deployed WSNs with duty cycle awareness, when sensor nodes are supposed to have high mobility. In this paper we first present a three messages exchange Lightweight Random Walk Routing (LRWR) protocol and then evaluate its performance in WSNs for routing low data rate packets. Through NS-2 based simulations, we examine the LRWR protocol by comparing it with DYMO, a widely used WSN protocol, in both static and dynamic environments with varying duty cycles, assuming the standard IEEE 802.15.4 in lower layers. Results for the three metrics, that is, reliability, end-to-end delay, and energy consumption, show that LRWR protocol outperforms DYMO in scalability, mobility, and robustness, showing this protocol as a suitable choice in low duty cycle and dense WSNs.
Quantum networks in divergence-free circuit QED
NASA Astrophysics Data System (ADS)
Parra-Rodriguez, A.; Rico, E.; Solano, E.; Egusquiza, I. L.
2018-04-01
Superconducting circuits are one of the leading quantum platforms for quantum technologies. With growing system complexity, it is of crucial importance to develop scalable circuit models that contain the minimum information required to predict the behaviour of the physical system. Based on microwave engineering methods, divergent and non-divergent Hamiltonian models in circuit quantum electrodynamics have been proposed to explain the dynamics of superconducting quantum networks coupled to infinite-dimensional systems, such as transmission lines and general impedance environments. Here, we study systematically common linear coupling configurations between networks and infinite-dimensional systems. The main result is that the simple Lagrangian models for these configurations present an intrinsic natural length that provides a natural ultraviolet cutoff. This length is due to the unavoidable dressing of the environment modes by the network. In this manner, the coupling parameters between their components correctly manifest their natural decoupling at high frequencies. Furthermore, we show the requirements to correctly separate infinite-dimensional coupled systems in local bases. We also compare our analytical results with other analytical and approximate methods available in the literature. Finally, we propose several applications of these general methods to analogue quantum simulation of multi-spin-boson models in non-perturbative coupling regimes.
Scalability Test of Multiscale Fluid-Platelet Model for Three Top Supercomputers
Zhang, Peng; Zhang, Na; Gao, Chao; Zhang, Li; Gao, Yuxiang; Deng, Yuefan; Bluestein, Danny
2016-01-01
We have tested the scalability of three supercomputers: the Tianhe-2, Stampede and CS-Storm, with multiscale fluid-platelet simulations, in which a highly resolved and efficient numerical model for nanoscale biophysics of platelets in microscale viscous biofluids is considered. Three experiments involving varying problem sizes were performed: Exp-S: 680,718-particle single-platelet; Exp-M: 2,722,872-particle 4-platelet; and Exp-L: 10,891,488-particle 16-platelet. Our implementation of the multiple time-stepping (MTS) algorithm improved on the performance of single time-stepping (STS) in all experiments. Using MTS, our model achieved the following simulation rates: 12.5, 25.0, 35.5 μs/day for Exp-S and 9.09, 6.25, 14.29 μs/day for Exp-M on Tianhe-2, CS-Storm 16-K80 and Stampede K20, respectively. The best rate for Exp-L was 6.25 μs/day on Stampede. Utilizing current advanced HPC resources, the simulation rates achieved by our algorithms bring within reach complex multiscale simulations for solving vexing problems at the interface of biology and engineering, such as thrombosis in blood flow, which combines millisecond-scale hematology with microscale blood flow at resolutions of micro-to-nanoscale cellular components of platelets. This study of the performance characteristics of supercomputers with advanced computational algorithms that offer an optimal trade-off to achieve enhanced computational performance serves to demonstrate that such simulations are feasible with currently available HPC resources. PMID:27570250
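Multiple time-stepping evaluates expensive, slowly varying force contributions less often than the cheap, fast ones. The following generic RESPA-style sketch illustrates the idea only; it is not the paper's platelet model, and the force splitting and step counts are placeholders.

def mts_integrate(x, v, fast_force, slow_force, dt, n_outer, k, mass=1.0):
    """Velocity-Verlet-style integrator that applies slow forces every k inner steps."""
    for _ in range(n_outer):
        v = v + 0.5 * (k * dt) * slow_force(x) / mass   # half kick from slow forces
        for _ in range(k):                              # inner loop: fast forces only
            v = v + 0.5 * dt * fast_force(x) / mass
            x = x + dt * v
            v = v + 0.5 * dt * fast_force(x) / mass
        v = v + 0.5 * (k * dt) * slow_force(x) / mass   # closing half kick
    return x, v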
Toward Interactive Scenario Analysis and Exploration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gayle, Thomas R.; Summers, Kenneth Lee; Jungels, John
2015-01-01
As Modeling and Simulation (M&S) tools have matured, their applicability and importance have increased across many national security challenges. In particular, they provide a way to test how something may behave without the need for real-world testing. However, current and future changes across several factors, including capabilities, policy, and funding, are driving a need for rapid response or evaluation in ways that many M&S tools cannot address. Issues around large data, computational requirements, delivery mechanisms, and analyst involvement already exist and pose significant challenges. Furthermore, rising expectations, rising input complexity, and increasing depth of analysis will only increase the difficulty of these challenges. In this study we examine whether innovations in M&S software coupled with advances in "cloud" computing and "big-data" methodologies can overcome many of these challenges. In particular, we propose a simple, horizontally scalable distributed computing environment that could provide the foundation (i.e. "cloud") for next-generation M&S-based applications based on the notion of "parallel multi-simulation". In our context, the goal of parallel multi-simulation is to consider as many simultaneous paths of execution as possible. Therefore, with sufficient resources, the complexity is dominated by the cost of single scenario runs as opposed to the number of runs required. We show the feasibility of this architecture through a stable prototype implementation coupled with the Umbra Simulation Framework [6]. Finally, we highlight the utility through multiple novel analysis tools and by showing the performance improvement compared to existing tools.
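At its simplest, parallel multi-simulation is an embarrassingly parallel scan over scenario parameters. The sketch below uses Python's standard multiprocessing pool as a stand-in; the scenario function and scoring are placeholders, and the prototype's Umbra coupling and distributed cloud layer are not shown.

from multiprocessing import Pool

def run_scenario(params):
    """Placeholder for one simulation run; returns a summary metric for the scenario."""
    # A real implementation would invoke the simulator here with `params`.
    return {"params": params, "score": sum(params)}

if __name__ == "__main__":
    # Each parameter tuple is one simultaneous "path of execution" to evaluate.
    scenarios = [(a, b) for a in range(10) for b in range(10)]
    with Pool() as pool:
        results = pool.map(run_scenario, scenarios)
    best = max(results, key=lambda r: r["score"])
    print("best scenario:", best)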
Yigzaw, Kassaye Yitbarek; Michalas, Antonis; Bellika, Johan Gustav
2017-01-03
Techniques have been developed to compute statistics on distributed datasets without revealing private information except the statistical results. However, duplicate records in a distributed dataset may lead to incorrect statistical results. Therefore, to increase the accuracy of the statistical analysis of a distributed dataset, secure deduplication is an important preprocessing step. We designed a secure protocol for the deduplication of horizontally partitioned datasets with deterministic record linkage algorithms. We provided a formal security analysis of the protocol in the presence of semi-honest adversaries. The protocol was implemented and deployed across three microbiology laboratories located in Norway, and we ran experiments on the datasets in which the number of records for each laboratory varied. Experiments were also performed on simulated microbiology datasets and data custodians connected through a local area network. The security analysis demonstrated that the protocol protects the privacy of individuals and data custodians under a semi-honest adversarial model. More precisely, the protocol remains secure with the collusion of up to N - 2 corrupt data custodians. The total runtime for the protocol scales linearly with the addition of data custodians and records. One million simulated records distributed across 20 data custodians were deduplicated within 45 s. The experimental results showed that the protocol is more efficient and scalable than previous protocols for the same problem. The proposed deduplication protocol is efficient and scalable for practical uses while protecting the privacy of patients and data custodians.
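For orientation, deterministic record linkage across sites is often built on keyed hashes of normalized identifiers, so that raw identifiers never leave a site and equal records map to equal keys. The sketch below is only an illustration of that general pattern with a shared secret key; it does not reproduce the paper's secure multi-party protocol or its collusion-resistance guarantees, and the field names are hypothetical.

import hashlib
import hmac

def linkage_key(record, secret):
    """Keyed hash over normalized identifying fields of one record (fields are examples)."""
    normalized = "|".join(str(record[f]).strip().lower() for f in ("name", "dob", "id"))
    return hmac.new(secret, normalized.encode(), hashlib.sha256).hexdigest()

def deduplicate(site_records, secret):
    """Union records from several sites, keeping one representative per linkage key."""
    seen = {}
    for site in site_records:
        for rec in site:
            seen.setdefault(linkage_key(rec, secret), rec)
    return list(seen.values())

# Example: two sites holding one overlapping record.
site_a = [{"name": "Ada", "dob": "1990-01-01", "id": "X1"}]
site_b = [{"name": "Ada", "dob": "1990-01-01", "id": "X1"},
          {"name": "Bob", "dob": "1985-05-05", "id": "Y2"}]
print(len(deduplicate([site_a, site_b], secret=b"shared-key")))  # -> 2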
NASA Astrophysics Data System (ADS)
Rodriguez, M.; Brualla, L.
2018-04-01
Monte Carlo simulation of radiation transport is computationally demanding if reasonably low statistical uncertainties of the estimated quantities are to be obtained. Therefore, it can benefit to a large extent from high-performance computing. This work is aimed at assessing the performance of the first generation of the many-integrated-core architecture (MIC) Xeon Phi coprocessor with respect to that of a CPU consisting of two 12-core Xeon processors in Monte Carlo simulation of coupled electron-photon showers. The comparison was made twofold: first, through a suite of basic tests including parallel versions of the random number generators Mersenne Twister and a modified implementation of RANECU. These tests were intended to establish a baseline comparison between both devices. Secondly, through the pDPM code developed in this work. pDPM is a parallel version of the Dose Planning Method (DPM) program for fast Monte Carlo simulation of radiation transport in voxelized geometries. A variety of techniques aimed at obtaining large scalability on the Xeon Phi were implemented in pDPM. Maximum scalabilities of 84.2× and 107.5× were obtained on the Xeon Phi for simulations of electron and photon beams, respectively. Nevertheless, in none of the tests involving radiation transport did the Xeon Phi perform better than the CPU. The disadvantage of the Xeon Phi with respect to the CPU stems from the low performance of a single core of the former. A single core of the Xeon Phi was more than 10 times less efficient than a single core of the CPU for all radiation transport simulations.
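Parallel Monte Carlo of this kind hinges on giving each worker an independent random stream. The sketch below shows one portable way to do that with numpy's SeedSequence spawning and a toy photon-path kernel; it is illustrative only and is not the RANECU or Mersenne Twister implementation benchmarked in the paper.

import numpy as np
from concurrent.futures import ThreadPoolExecutor

def make_streams(seed, n_workers):
    """One statistically independent Generator per worker."""
    return [np.random.default_rng(s) for s in np.random.SeedSequence(seed).spawn(n_workers)]

def mean_free_path_sample(rng, n):
    """Toy kernel: mean of exponentially distributed photon path lengths."""
    return rng.exponential(scale=1.0, size=n).mean()

streams = make_streams(seed=12345, n_workers=8)
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda rng: mean_free_path_sample(rng, 100_000), streams))
print("grand mean:", sum(results) / len(results))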
Ultrascale collaborative visualization using a display-rich global cyberinfrastructure.
Jeong, Byungil; Leigh, Jason; Johnson, Andrew; Renambot, Luc; Brown, Maxine; Jagodic, Ratko; Nam, Sungwon; Hur, Hyejung
2010-01-01
The scalable adaptive graphics environment (SAGE) is high-performance graphics middleware for ultrascale collaborative visualization using a display-rich global cyberinfrastructure. Dozens of sites worldwide use this cyberinfrastructure middleware, which connects high-performance-computing resources over high-speed networks to distributed ultraresolution displays.
Sandboxes for Model-Based Inquiry
ERIC Educational Resources Information Center
Brady, Corey; Holbert, Nathan; Soylu, Firat; Novak, Michael; Wilensky, Uri
2015-01-01
In this article, we introduce a class of constructionist learning environments that we call "Emergent Systems Sandboxes" ("ESSs"), which have served as a centerpiece of our recent work in developing curriculum to support scalable model-based learning in classroom settings. ESSs are a carefully specified form of virtual…
Toolbox for Urban Mobility Simulation: High Resolution Population Dynamics for Global Cities
NASA Astrophysics Data System (ADS)
Bhaduri, B. L.; Lu, W.; Liu, C.; Thakur, G.; Karthik, R.
2015-12-01
In this rapidly urbanizing world, an unprecedented rate of population growth is not only mirrored by increasing demand for energy, food, water, and other natural resources, but also has detrimental impacts on environmental and human security. Transportation simulations are frequently used for mobility assessment in urban planning, traffic operation, and emergency management. Previous research, ranging from purely analytical techniques to simulations capturing behavior, has investigated questions and scenarios regarding the relationships among energy, emissions, air quality, and transportation. Primary limitations of past attempts have been the availability of input data, useful "energy and behavior focused" models, validation data, and adequate computational capability that allows an adequate understanding of the interdependencies of our transportation system. With the increasing availability and quality of traditional and crowdsourced data, we have utilized the OpenStreetMap roads network and integrated high-resolution population data with traffic simulation to create a Toolbox for Urban Mobility Simulations (TUMS) at global scale. TUMS consists of three major components: data processing, traffic simulation models, and Internet-based visualizations. It integrates OpenStreetMap, LandScan population, and other open data (Census Transportation Planning Products, National Household Travel Survey, etc.) to generate both normal traffic operation and emergency evacuation scenarios. TUMS integrates TRANSIMS and MITSIM as traffic simulation engines, which are open-source and widely accepted for scalable traffic simulations. A consistent data and simulation platform allows quick adaptation to various geographic areas, which has been demonstrated for multiple cities across the world. We are combining the strengths of geospatial data sciences, high-performance simulations, transportation planning, and emissions, vehicle and energy technology development to design and develop a simulation framework to assist decision makers at all levels - local, state, regional, and federal. Using Cleveland, Tennessee as an example, in this presentation we illustrate how emerging cities could easily assess future land-use-scenario-driven impacts on energy and environment utilizing such a capability.
NASA Astrophysics Data System (ADS)
Yamamoto, H.; Nakajima, K.; Zhang, K.; Nanai, S.
2015-12-01
Powerful numerical codes that are capable of modeling complex coupled processes of physics and chemistry have been developed for predicting the fate of CO2 in reservoirs, as well as its potential impacts on groundwater and subsurface environments. However, they are often computationally demanding for solving highly non-linear models in sufficient spatial and temporal resolutions. Geological heterogeneity and uncertainties further increase the challenges in modeling work. Two-phase flow simulations in heterogeneous media usually require much longer computational time than those in homogeneous media. Uncertainties in reservoir properties may necessitate stochastic simulations with multiple realizations. Recently, massively parallel supercomputers with more than thousands of processors have become available in scientific and engineering communities. Such supercomputers may attract attention from geoscientists and reservoir engineers for solving large and non-linear models in higher resolutions within a reasonable time. However, to make them a useful tool, it is essential to tackle several practical obstacles to utilizing a large number of processors effectively in general-purpose reservoir simulators. We have implemented massively parallel versions of two TOUGH2 family codes (the multi-phase flow simulator TOUGH2 and the chemically reactive transport simulator TOUGHREACT) on two different types (vector- and scalar-type) of supercomputers with a thousand to tens of thousands of processors. After completing implementation and extensive tune-up on the supercomputers, the computational performance was measured for three simulations with multi-million-grid models, including a simulation of the dissolution-diffusion-convection process that requires high spatial and temporal resolutions to simulate the growth of small convective fingers of CO2-dissolved water into larger ones at reservoir scale. The performance measurements confirmed that both simulators exhibit excellent scalability, showing almost linear speedup against the number of processors up to over ten thousand cores. Generally this allows us to perform coupled multi-physics (THC) simulations on high-resolution geologic models with multi-million grids in a practical time (e.g., less than a second per time step).
NASA Astrophysics Data System (ADS)
White, Christopher Joseph
We describe the implementation of sophisticated numerical techniques for general-relativistic magnetohydrodynamics simulations in the Athena++ code framework. Improvements over many existing codes include the use of advanced Riemann solvers and of staggered-mesh constrained transport. Combined with considerations for computational performance and parallel scalability, these allow us to investigate black hole accretion flows with unprecedented accuracy. The capability of the code is demonstrated by exploring magnetically arrested disks.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rizzi, Silvio; Hereld, Mark; Insley, Joseph
In this work we perform in-situ visualization of molecular dynamics simulations, which can help scientists to visualize simulation output on-the-fly, without incurring storage overheads. We present a case study to couple LAMMPS, the large-scale molecular dynamics simulation code, with vl3, our parallel framework for large-scale visualization and analysis. Our motivation is to identify effective approaches for covisualization and exploration of large-scale atomistic simulations at interactive frame rates. We propose a system of coupled libraries and describe its architecture, with an implementation that runs on GPU-based clusters. We present the results of strong and weak scalability experiments, as well as future research avenues based on our results.
Simulation of the Impact of Packet Errors on the Kademlia Peer-to-Peer Routing
2010-09-01
during the routing process. Pastry [15] switches to a proximity based metric when approaching a node closely. This complicates the implementation...and Peter Druschel. Pastry : Scalable, distributed object location and routing for large-scale peer-to-peer systems. IFIP/ACM International Conference
Vroom: designing an augmented environment for remote collaboration in digital cinema production
NASA Astrophysics Data System (ADS)
Margolis, Todd; Cornish, Tracy
2013-03-01
As media technologies become increasingly affordable, compact and inherently networked, new generations of telecollaborative platforms continue to arise which integrate these new affordances. Virtual reality has been primarily concerned with creating simulations of environments that can transport participants to real or imagined spaces that replace the "real world". Meanwhile, Augmented Reality systems have evolved to interleave objects from Virtual Reality environments into the physical landscape. Perhaps now there is a new class of systems that reverse this precept to enhance dynamic media landscapes and immersive physical display environments to enable intuitive data exploration through collaboration. Vroom (Virtual Room) is a next-generation reconfigurable tiled display environment in development at the California Institute for Telecommunications and Information Technology (Calit2) at the University of California, San Diego. Vroom enables freely scalable digital collaboratories, connecting distributed, high-resolution visualization resources for collaborative work in the sciences, engineering and the arts. Vroom transforms a physical space into an immersive media environment with large format interactive display surfaces, video teleconferencing and spatialized audio built on a high-speed optical network backbone. Vroom enables group collaboration for local and remote participants to share knowledge and experiences. Possible applications include: remote learning, command and control, storyboarding, post-production editorial review, high resolution video playback, 3D visualization, screencasting and image, video and multimedia file sharing. To support these various scenarios, Vroom features support for multiple user interfaces (optical tracking, touch UI, gesture interface, etc.), support for directional and spatialized audio, giga-pixel image interactivity, 4K video streaming, 3D visualization and telematic production. This paper explains the design process that has been utilized to make Vroom an accessible and intuitive immersive environment for remote collaboration, specifically for digital cinema production.
Real-time micro-modelling of city evacuations
NASA Astrophysics Data System (ADS)
Löhner, Rainald; Haug, Eberhard; Zinggerling, Claudio; Oñate, Eugenio
2018-01-01
A methodology to integrate geographical information system (GIS) data with large-scale pedestrian simulations has been developed. Advances in automatic data acquisition and archiving from GIS databases, automatic input for pedestrian simulations, as well as scalable pedestrian simulation tools have made it possible to simulate pedestrians at the individual level for complete cities in real time. An example that simulates the evacuation of the city of Barcelona demonstrates that this is now possible. This is the first step towards a fully integrated crowd prediction and management tool that takes into account not only data gathered in real time from cameras, cell phones or other sensors, but also merges these with advanced simulation tools to predict the future state of the crowd.
NASA Astrophysics Data System (ADS)
Biercamp, Joachim; Adamidis, Panagiotis; Neumann, Philipp
2017-04-01
With the exa-scale era approaching, the length and time scales used for climate research on one hand and numerical weather prediction on the other blend into each other. The Centre of Excellence in Simulation of Weather and Climate in Europe (ESiWACE) represents a European consortium comprising partners from climate, weather and HPC in their effort to address key scientific challenges that both communities have in common. A particular challenge is to reach global models with spatial resolutions that allow simulating convective clouds and small-scale ocean eddies. These simulations would produce better predictions of trends and provide much more fidelity in the representation of high-impact regional events. However, running such models in operational mode, i.e., with sufficient throughput in ensemble mode, will clearly require exa-scale computing and data handling capability. We will discuss the ESiWACE initiative and relate it to work-in-progress on high-resolution simulations in Europe. We present recent strong scalability measurements from ESiWACE to demonstrate current computability in weather and climate simulation. A special focus in this particular talk is on the Icosahedral Nonhydrostatic (ICON) model, used for a comparison of high resolution regional and global simulations with high quality observation data. We demonstrate that close-to-optimal parallel efficiency can be achieved in strong-scaling global resolution experiments, e.g. 94% for 5 km resolution simulations using 36k cores on Mistral/DKRZ. Based on our scalability and high-resolution experiments, we deduce and extrapolate future capabilities for ICON that are expected for weather and climate research at exascale.
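Strong-scaling parallel efficiency figures such as the 94% quoted above are typically derived from wall-clock timings at fixed problem size. The sketch below shows the standard calculation; the core counts and timings are hypothetical illustrations, not the ESiWACE/ICON measurements.

```python
# Strong-scaling parallel efficiency from wall-clock timings.
# The runs listed below are hypothetical, purely for illustration.

def strong_scaling_efficiency(t_ref, n_ref, t_n, n):
    """Efficiency relative to a reference run: E = (t_ref * n_ref) / (t_n * n)."""
    return (t_ref * n_ref) / (t_n * n)

# Hypothetical timings (cores, seconds) for a fixed global problem size.
runs = [(9000, 3600.0), (18000, 1860.0), (36000, 980.0)]
n_ref, t_ref = runs[0]
for n, t in runs:
    print(f"{n:6d} cores: speedup {t_ref / t:5.2f}x, "
          f"efficiency {100 * strong_scaling_efficiency(t_ref, n_ref, t, n):5.1f}%")
```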
PhreeqcRM: A reaction module for transport simulators based on the geochemical model PHREEQC
Parkhurst, David L.; Wissmeier, Laurin
2015-01-01
PhreeqcRM is a geochemical reaction module designed specifically to perform equilibrium and kinetic reaction calculations for reactive transport simulators that use an operator-splitting approach. The basic function of the reaction module is to take component concentrations from the model cells of the transport simulator, run geochemical reactions, and return updated component concentrations to the transport simulator. If multicomponent diffusion is modeled (e.g., Nernst–Planck equation), then aqueous species concentrations can be used instead of component concentrations. The reaction capabilities are a complete implementation of the reaction capabilities of PHREEQC. In each cell, the reaction module maintains the composition of all of the reactants, which may include minerals, exchangers, surface complexers, gas phases, solid solutions, and user-defined kinetic reactants. PhreeqcRM assigns initial and boundary conditions for model cells based on standard PHREEQC input definitions (files or strings) of chemical compositions of solutions and reactants. Additional PhreeqcRM capabilities include methods to eliminate reaction calculations for inactive parts of a model domain, transfer concentrations and other model properties, and retrieve selected results. The module demonstrates good scalability for parallel processing by using multiprocessing with MPI (message passing interface) on distributed memory systems, and limited scalability using multithreading with OpenMP on shared memory systems. PhreeqcRM is written in C++, but interfaces allow methods to be called from C or Fortran. By using the PhreeqcRM reaction module, an existing multicomponent transport simulator can be extended to simulate a wide range of geochemical reactions. Results of the implementation of PhreeqcRM as the reaction engine for transport simulators PHAST and FEFLOW are shown by using an analytical solution and the reactive transport benchmark of MoMaS.
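A minimal sketch of the operator-splitting data flow described above is given below: a transport step updates component concentrations, which are then handed cell by cell to a reaction step and returned. The `transport_step` and `react_cells` functions are placeholders for illustration only; they are not the PhreeqcRM interface or chemistry.

```python
import numpy as np

# Operator-splitting sketch: transport the component concentrations, then pass them to a
# per-cell reaction step. "react_cells" is a stand-in for a geochemical reaction engine;
# it only mimics the data flow (concentrations in, updated concentrations out).

def transport_step(c, velocity, dx, dt):
    """First-order upwind advection of each component (placeholder transport solver)."""
    c_new = c.copy()
    c_new[1:, :] -= velocity * dt / dx * (c[1:, :] - c[:-1, :])
    return c_new

def react_cells(c, dt):
    """Placeholder reaction step: first-order decay of component 0 into component 1."""
    k = 1.0e-3
    dc = k * dt * c[:, 0]
    c[:, 0] -= dc
    c[:, 1] += dc
    return c

n_cells, n_components = 100, 2
conc = np.zeros((n_cells, n_components))
conc[0, 0] = 1.0  # inflow pulse of component 0

for step in range(500):
    conc = transport_step(conc, velocity=1.0e-2, dx=1.0, dt=10.0)
    conc = react_cells(conc, dt=10.0)
```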
NASA Astrophysics Data System (ADS)
Cary, John R.; Abell, D.; Amundson, J.; Bruhwiler, D. L.; Busby, R.; Carlsson, J. A.; Dimitrov, D. A.; Kashdan, E.; Messmer, P.; Nieter, C.; Smithe, D. N.; Spentzouris, P.; Stoltz, P.; Trines, R. M.; Wang, H.; Werner, G. R.
2006-09-01
As the size and cost of particle accelerators escalate, high-performance computing plays an increasingly important role; optimization through accurate, detailed computer modeling increases performance and reduces costs. Consequently, computer simulations face enormous challenges. Early approximation methods, such as expansions in distance from the design orbit, were unable to supply detailed accurate results, such as in the computation of wake fields in complex cavities. Since the advent of message-passing supercomputers with thousands of processors, earlier approximations are no longer necessary, and it is now possible to compute wake fields, the effects of dampers, and self-consistent dynamics in cavities accurately. In this environment, the focus has shifted towards the development and implementation of algorithms that scale to large numbers of processors. So-called charge-conserving algorithms evolve the electromagnetic fields without the need for any global solves (which are difficult to scale up to many processors). Using cut-cell (or embedded) boundaries, these algorithms can simulate the fields in complex accelerator cavities with curved walls. New implicit algorithms, which are stable for any time-step, conserve charge as well, allowing faster simulation of structures with details small compared to the characteristic wavelength. These algorithmic and computational advances have been implemented in the VORPAL framework, a flexible, object-oriented, massively parallel computational application that allows run-time assembly of algorithms and objects, thus composing an application on the fly.
Open system environment procurement
NASA Technical Reports Server (NTRS)
Fisher, Gary
1994-01-01
Relationships between the request for procurement (RFP) process and open system environment (OSE) standards are described. A guide was prepared to help Federal agency personnel overcome problems in writing an adequate statement of work and developing realistic evaluation criteria when transitioning to an OSE. The guide contains appropriate decision points and transition strategies for developing applications that are affordable, scalable and interoperable across a broad range of computing environments. While useful, the guide does not eliminate the requirement that agencies possess in-depth expertise in software development, communications, and database technology in order to evaluate open systems.
COMPREHENSIVE PBPK MODELING APPROACH USING THE EXPOSURE RELATED DOSE ESTIMATING MODEL (ERDEM)
ERDEM, a complex PBPK modeling system, is the result of the implementation of a comprehensive PBPK modeling approach. ERDEM provides a scalable and user-friendly environment that enables researchers to focus on data input values rather than writing program code. It efficiently ...
High performance MRI simulations of motion on multi-GPU systems.
Xanthis, Christos G; Venetis, Ioannis E; Aletras, Anthony H
2014-07-04
MRI physics simulators have been developed in the past for optimizing imaging protocols and for training purposes. However, these simulators have only addressed motion within a limited scope. The purpose of this study was the incorporation of realistic motion, such as cardiac motion, respiratory motion and flow, within MRI simulations in a high performance multi-GPU environment. Three different motion models were introduced in the Magnetic Resonance Imaging SIMULator (MRISIMUL) of this study: cardiac motion, respiratory motion and flow. Simulation of a simple Gradient Echo pulse sequence and a CINE pulse sequence on the corresponding anatomical model was performed. Myocardial tagging was also investigated. In pulse sequence design, software crushers were introduced to accommodate the long execution times in order to avoid spurious echo formation. The displacement of the anatomical model isochromats was calculated within the Graphics Processing Unit (GPU) kernel for every timestep of the pulse sequence. Experiments that would allow simulation of custom anatomical and motion models were also performed. Last, simulations of motion with MRISIMUL on single-node and multi-node multi-GPU systems were examined. Gradient Echo and CINE images of the three motion models were produced and motion-related artifacts were demonstrated. The temporal evolution of the contractility of the heart was presented through the application of myocardial tagging. Better simulation performance and image quality were presented through the introduction of software crushers without the need to further increase the computational load and GPU resources. Last, MRISIMUL demonstrated an almost linear scalable performance with the increasing number of available GPU cards, in both single-node and multi-node multi-GPU computer systems. MRISIMUL is the first MR physics simulator to have implemented motion with a 3D large computational load on a single computer multi-GPU configuration. The incorporation of realistic motion models, such as cardiac motion, respiratory motion and flow, may benefit the design and optimization of existing or new MR pulse sequences, protocols and algorithms that examine motion-related MR applications.
Proxy-equation paradigm: A strategy for massively parallel asynchronous computations
NASA Astrophysics Data System (ADS)
Mittal, Ankita; Girimaji, Sharath
2017-09-01
Massively parallel simulations of transport equation systems call for a paradigm change in algorithm development to achieve efficient scalability. Traditional approaches require time synchronization of processing elements (PEs), which severely restricts scalability. Relaxing the synchronization requirement introduces error and slows down convergence. In this paper, we propose and develop a novel "proxy equation" concept for a general transport equation that (i) tolerates asynchrony with minimal added error, (ii) preserves convergence order and, thus, (iii) is expected to scale efficiently on massively parallel machines. The central idea is to modify a priori the transport equation at the PE boundaries to offset asynchrony errors. Proof-of-concept computations are performed using a one-dimensional advection (convection) diffusion equation. The results demonstrate the promise and advantages of the present strategy.
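The sketch below illustrates only the asynchrony problem that the proxy-equation approach targets: two notional processing elements each advance half of a one-dimensional advection-diffusion grid, and the boundary value exchanged between them arrives several steps late. The proxy-equation modification itself (adjusting the equation coefficients near the PE boundary to offset this error) is not reproduced here; all parameters are illustrative assumptions.

```python
import numpy as np

# Two "PEs" own halves of a 1-D advection-diffusion grid. The value crossing the PE
# boundary is delivered "delay" steps late, mimicking relaxed synchronization; the stale
# value perturbs the explicit update, which is the error a proxy equation would offset.

def ftcs_step(u, c, nu, dx, dt, left_ghost, right_ghost):
    """One explicit (FTCS) advection-diffusion step on a subdomain with ghost values."""
    ext = np.concatenate(([left_ghost], u, [right_ghost]))
    adv = -c * (ext[2:] - ext[:-2]) / (2 * dx)
    dif = nu * (ext[2:] - 2 * ext[1:-1] + ext[:-2]) / dx**2
    return u + dt * (adv + dif)

nx, dx, dt, c, nu, delay = 200, 1.0 / 200, 2.0e-5, 1.0, 0.01, 3
x = np.linspace(0.0, 1.0, nx, endpoint=False)
u = np.exp(-200 * (x - 0.3) ** 2)
left, right = u[:nx // 2].copy(), u[nx // 2:].copy()
stale = [left[-1]] * delay  # boundary values delivered "delay" steps late

for step in range(1000):
    lagged = stale.pop(0)
    stale.append(left[-1])
    left = ftcs_step(left, c, nu, dx, dt, left_ghost=left[0], right_ghost=right[0])
    right = ftcs_step(right, c, nu, dx, dt, left_ghost=lagged, right_ghost=right[-1])
```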
He, Bo; Liu, Yang; Dong, Diya; Shen, Yue; Yan, Tianhong; Nian, Rui
2015-01-01
In this paper, a novel iterative sparse extended information filter (ISEIF) was proposed to solve the simultaneous localization and mapping problem (SLAM), which is very crucial for autonomous vehicles. The proposed algorithm solves the measurement update equations with iterative methods adaptively to reduce linearization errors. With the scalability advantage being kept, the consistency and accuracy of SEIF is improved. Simulations and practical experiments were carried out with both a land car benchmark and an autonomous underwater vehicle. Comparisons between iterative SEIF (ISEIF), standard EKF and SEIF are presented. All of the results convincingly show that ISEIF yields more consistent and accurate estimates compared to SEIF and preserves the scalability advantage over EKF, as well. PMID:26287194
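The building block that ISEIF re-solves iteratively is the standard extended-information-filter measurement update; a minimal sketch of that single update is given below. The iterative, adaptive relinearization of the published algorithm is not reproduced, and the small range-only example is purely illustrative.

```python
import numpy as np

# One extended-information-filter measurement update in information form:
# omega' = omega + H' R^-1 H,  xi' = xi + H' R^-1 (z - h(mu) + H mu).

def eif_measurement_update(xi, omega, z, h, H, R):
    mu = np.linalg.solve(omega, xi)          # current mean estimate
    R_inv = np.linalg.inv(R)
    omega_new = omega + H.T @ R_inv @ H
    xi_new = xi + H.T @ R_inv @ (z - h + H @ mu)
    return xi_new, omega_new

# Tiny 2-D example: range-only measurement of a landmark at the origin.
mu0 = np.array([1.0, 1.0])
omega = np.eye(2) * 4.0                      # information matrix (inverse covariance)
xi = omega @ mu0                             # information vector
r = np.linalg.norm(mu0)
H = (mu0 / r).reshape(1, 2)                  # Jacobian of the range w.r.t. position
xi, omega = eif_measurement_update(xi, omega, z=np.array([1.5]), h=np.array([r]),
                                   H=H, R=np.array([[0.01]]))
```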
Scalable Entity-Based Modeling of Population-Based Systems, Final LDRD Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cleary, A J; Smith, S G; Vassilevska, T K
2005-01-27
The goal of this project has been to develop tools, capabilities and expertise in the modeling of complex population-based systems via scalable entity-based modeling (EBM). Our initial focal application domain has been the dynamics of large populations exposed to disease-causing agents, a topic of interest to the Department of Homeland Security in the context of bioterrorism. In the academic community, discrete simulation technology based on individual entities has shown initial success, but the technology has not been scaled to the problem sizes or computational resources of LLNL. Our developmental emphasis has been on the extension of this technology to parallel computers and maturation of the technology from an academic to a lab setting.
Grundy, Lorena S; Lee, Victoria E; Li, Nannan; Sosa, Chris; Mulhearn, William D; Liu, Rui; Register, Richard A; Nikoubashman, Arash; Prud'homme, Robert K; Panagiotopoulos, Athanassios Z; Priestley, Rodney D
2018-05-08
Colloids with internally structured geometries have shown great promise in applications ranging from biosensors to optics to drug delivery, where the internal particle structure is paramount to performance. The growing demand for such nanomaterials necessitates the development of a scalable processing platform for their production. Flash nanoprecipitation (FNP), a rapid and inherently scalable colloid precipitation technology, is used to prepare internally structured colloids from blends of block copolymers and homopolymers. As revealed by a combination of experiments and simulations, colloids prepared from different molecular weight diblock copolymers adopt either an ordered lamellar morphology consisting of concentric shells or a disordered lamellar morphology when chain dynamics are sufficiently slow to prevent defect annealing during solvent exchange. Blends of homopolymer and block copolymer in the feed stream generate more complex internally structured colloids, such as those with hierarchically structured Janus and patchy morphologies, due to additional phase separation and kinetic trapping effects. The ability of the FNP process to generate such a wide range of morphologies using a simple and scalable setup provides a pathway to manufacturing internally structured colloids on an industrial scale.
Scalability of Several Asynchronous Many-Task Models for In Situ Statistical Analysis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pebay, Philippe Pierre; Bennett, Janine Camille; Kolla, Hemanth
This report is a sequel to [PB16], in which we provided a first progress report on research and development towards a scalable, asynchronous many-task, in situ statistical analysis engine using the Legion runtime system. This earlier work included a prototype implementation of a proposed solution, using a proxy mini-application as a surrogate for a full-scale scientific simulation code. The first scalability studies were conducted with the above on modestly-sized experimental clusters. In contrast, in the current work we have integrated our in situ analysis engines with a full-size scientific application (S3D, using the Legion-SPMD model), and have conducted numerical tests on the largest computational platform currently available for DOE science applications. We also provide details regarding the design and development of a light-weight asynchronous collectives library. We describe how this library is utilized within our SPMD-Legion S3D workflow, and compare the data aggregation technique deployed herein to the approach taken within our previous work.
Numerical simulation on a straight-bladed vertical axis wind turbine with auxiliary blade
NASA Astrophysics Data System (ADS)
Li, Y.; Zheng, Y. F.; Feng, F.; He, Q. B.; Wang, N. X.
2016-08-01
To improve the starting performance of the straight-bladed vertical axis wind turbine (SB-VAWT) at low wind speed, and the output characteristics at high wind speed, a flexible, scalable auxiliary vane mechanism was designed and installed into the rotor of the SB-VAWT in this study. This new vertical axis wind turbine is a kind of lift-to-drag combination wind turbine. At low rotational speed, the flexible blade expands and the driving force of the wind turbine comes mainly from drag. At higher rotational speed, the flexible blade retracts and the driving force comes primarily from lift. To research the effects of the flexible, scalable auxiliary module on the performance of the SB-VAWT and to find its best parameters, computational fluid dynamics (CFD) numerical calculations were carried out. The calculation results show that the flexible, scalable blades can automatically expand and retract with the rotational speed. The moment coefficient at low tip speed ratios increased substantially. Meanwhile, the moment coefficient was also improved at high tip speed ratios in certain ranges.
Scalable metagenomic taxonomy classification using a reference genome database
Ames, Sasha K.; Hysom, David A.; Gardner, Shea N.; Lloyd, G. Scott; Gokhale, Maya B.; Allen, Jonathan E.
2013-01-01
Motivation: Deep metagenomic sequencing of biological samples has the potential to recover otherwise difficult-to-detect microorganisms and accurately characterize biological samples with limited prior knowledge of sample contents. Existing metagenomic taxonomic classification algorithms, however, do not scale well to analyze large metagenomic datasets, and balancing classification accuracy with computational efficiency presents a fundamental challenge. Results: A method is presented to shift computational costs to an off-line computation by creating a taxonomy/genome index that supports scalable metagenomic classification. Scalable performance is demonstrated on real and simulated data to show accurate classification in the presence of novel organisms on samples that include viruses, prokaryotes, fungi and protists. Taxonomic classification of the previously published 150 giga-base Tyrolean Iceman dataset was found to take <20 h on a single node 40 core large memory machine and provide new insights on the metagenomic contents of the sample. Availability: Software was implemented in C++ and is freely available at http://sourceforge.net/projects/lmat Contact: allen99@llnl.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23828782
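The general pattern behind such an off-line taxonomy/genome index is sketched below: precompute a map from k-mers to taxa over the reference genomes, then classify each read by a vote over its k-mers. This is a toy illustration of the data flow only; it is not the LMAT index structure or scoring scheme, and the k-mer length and sequences are made up.

```python
from collections import defaultdict, Counter

K = 8  # toy k-mer length (real indexes use much longer k-mers)

def kmers(seq, k=K):
    return (seq[i:i + k] for i in range(len(seq) - k + 1))

def build_index(references):
    """references: dict taxon_id -> genome string; returns k-mer -> set of taxa."""
    index = defaultdict(set)
    for taxon, genome in references.items():
        for km in kmers(genome):
            index[km].add(taxon)
    return index

def classify(read, index):
    """Classify a read by majority vote over the taxa hit by its k-mers."""
    votes = Counter()
    for km in kmers(read):
        for taxon in index.get(km, ()):
            votes[taxon] += 1
    return votes.most_common(1)[0][0] if votes else None

refs = {"taxon_A": "ACGTACGTTTGACCAGGT", "taxon_B": "TTTTGGGGCCCCAAAATC"}
idx = build_index(refs)
print(classify("ACGTACGTTTGA", idx))   # expected: taxon_A
```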
Statistical exchange-coupling errors and the practicality of scalable silicon donor qubits
NASA Astrophysics Data System (ADS)
Song, Yang; Das Sarma, S.
2016-12-01
Recent experimental efforts have led to considerable interest in donor-based localized electron spins in Si as viable qubits for a scalable silicon quantum computer. With the use of isotopically purified 28Si and the realization of extremely long spin coherence time in single-donor electrons, the recent experimental focus is on two coupled donors, with the eventual goal of a scaled-up quantum circuit. Motivated by this development, we simulate the statistical distribution of the exchange coupling J between a pair of donors under realistic donor placement straggles, and quantify the errors relative to the intended J value. With J values in a broad range of donor-pair separations (5 < |R| < 60 nm), we work out various cases systematically, for a target donor separation R0 along the [001], [110] and [111] Si crystallographic directions, with |R0| = 10, 20 or 30 nm and standard deviation σR = 1, 2, 5 or 10 nm. Our extensive theoretical results demonstrate the great challenge for a prescribed J gate even with just a donor pair, a first step for any scalable Si-donor-based quantum computer.
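A hedged sketch of how such a J distribution could be sampled under Gaussian placement straggle is given below. The placeholder J(R), an isotropic exponential decay with an assumed effective Bohr radius, deliberately omits the valley-interference oscillations of the real Si:P exchange coupling, so the numbers are illustrative only.

```python
import numpy as np

# Monte Carlo sampling of the exchange-coupling spread from donor placement straggle.
# J_model is a placeholder (isotropic exponential decay); it is NOT the valley-resolved
# exchange coupling used in the paper.

rng = np.random.default_rng(0)
a_B = 1.8          # assumed effective Bohr radius, nm
J0 = 1.0           # coupling prefactor, arbitrary units

def J_model(R):
    """Placeholder exchange coupling vs donor separation vector R (nm)."""
    r = np.linalg.norm(R, axis=-1)
    return J0 * np.exp(-2 * r / a_B)

R0 = np.array([20.0, 0.0, 0.0])       # target separation vector, nm
sigma_R = 2.0                          # placement straggle, nm
samples = R0 + rng.normal(0.0, sigma_R, size=(100_000, 3))
J = J_model(samples)
print("median J / J(R0):", np.median(J) / J_model(R0))
print("fraction within a factor of 2 of target:",
      np.mean(np.abs(np.log(J / J_model(R0))) < np.log(2)))
```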
Efficient Measurement of Multiparticle Entanglement with Embedding Quantum Simulator.
Chen, Ming-Cheng; Wu, Dian; Su, Zu-En; Cai, Xin-Dong; Wang, Xi-Lin; Yang, Tao; Li, Li; Liu, Nai-Le; Lu, Chao-Yang; Pan, Jian-Wei
2016-02-19
The quantum measurement of entanglement is a demanding task in the field of quantum information. Here, we report the direct and scalable measurement of multiparticle entanglement with embedding photonic quantum simulators. In this embedding framework [R. Di Candia et al. Phys. Rev. Lett. 111, 240502 (2013)], the N-qubit entanglement, which does not associate with a physical observable directly, can be efficiently measured with only two (for even N) and six (for odd N) local measurement settings. Our experiment uses multiphoton quantum simulators to mimic dynamical concurrence and three-tangle entangled systems and to track their entanglement evolutions.
NASA Astrophysics Data System (ADS)
Gross, Lutz; Altinay, Cihan; Fenwick, Joel; Smith, Troy
2014-05-01
The program package escript has been designed for solving mathematical modeling problems using python, see Gross et al. (2013). Its development and maintenance has been funded by the Australian Commonwealth to provide open source software infrastructure for the Australian Earth Science community (recent funding by the Australian Geophysical Observing System EIF (AGOS) and the AuScope Collaborative Research Infrastructure Scheme (CRIS)). The key concepts of escript are based on the terminology of spatial functions and partial differential equations (PDEs) - an approach providing abstraction from the underlying spatial discretization method (i.e. the finite element method (FEM)). This feature presents a programming environment to the user which is easy to use even for complex models. Because implementations are independent of the underlying data structures, simulations are easily portable across desktop computers and scalable compute clusters without modifications to the program code. escript has been successfully applied in a variety of applications including modeling mantle convection, melting processes, volcanic flow, earthquakes, faulting, multi-phase flow, block caving and mineralization (see Poulet et al. 2013). The recent escript release (see Gross et al. (2013)) provides an open framework for solving joint inversion problems for geophysical data sets (potential field, seismic and electro-magnetic). The strategy is based on the idea of formulating the inversion problem as an optimization problem with PDE constraints, where the cost function is defined by the data defect and the regularization term for the rock properties, see Gross & Kemp (2013). This first-optimize-then-discretize approach avoids the assemblage of the, in general dense, sensitivity matrix used in conventional approaches, where discrete programming techniques are applied to the discretized problem (first-discretize-then-optimize). In this paper we will discuss the mathematical framework for inversion and appropriate solution schemes in escript. We will also give a brief introduction into escript's open framework for defining and solving geophysical inversion problems. Finally we will show some benchmark results to demonstrate the computational scalability of the inversion method across a large number of cores and compute nodes in a parallel computing environment. References: - L. Gross et al. (2013): Escript Solving Partial Differential Equations in Python Version 3.4, The University of Queensland, https://launchpad.net/escript-finley - L. Gross and C. Kemp (2013) Large Scale Joint Inversion of Geophysical Data using the Finite Element Method in escript. ASEG Extended Abstracts 2013, http://dx.doi.org/10.1071/ASEG2013ab306 - T. Poulet, L. Gross, D. Georgiev, J. Cleverley (2012): escript-RT: Reactive transport simulation in Python using escript, Computers & Geosciences, Volume 45, 168-176. http://dx.doi.org/10.1016/j.cageo.2011.11.005.
NASA Astrophysics Data System (ADS)
Shi, X.
2015-12-01
As NSF indicated - "Theory and experimentation have for centuries been regarded as two fundamental pillars of science. It is now widely recognized that computational and data-enabled science forms a critical third pillar." Geocomputation is the third pillar of GIScience and the geosciences. With the exponential growth of geodata, the challenge of scalable and high-performance computing for big data analytics becomes urgent, because many research activities are constrained by software or tools that cannot even complete the computation process. Heterogeneous geodata integration and analytics obviously magnify the complexity and operational time frame. Many large-scale geospatial problems may not be processable at all if the computer system does not have sufficient memory or computational power. Emerging computer architectures, such as Intel's Many Integrated Core (MIC) Architecture and the Graphics Processing Unit (GPU), and advanced computing technologies provide promising solutions to employ massive parallelism and hardware resources to achieve scalability and high performance for data-intensive computing over large spatiotemporal and social media data. Exploring novel algorithms and deploying the solutions in massively parallel computing environments to achieve scalable data processing and analytics over large-scale, complex, and heterogeneous geodata with consistent quality and high performance has been the central theme of our research team in the Department of Geosciences at the University of Arkansas (UARK). New multi-core architectures combined with application accelerators hold the promise of achieving scalability and high performance by exploiting task- and data-level parallelism that is not supported by conventional computing systems. Such a parallel or distributed computing environment is particularly suitable for large-scale geocomputation over big data, as demonstrated by our prior work, while the potential of such advanced infrastructure remains unexplored in this domain. In this presentation, our prior and on-going initiatives will be summarized to exemplify how we exploit multicore CPUs, GPUs, and MICs, and clusters of CPUs, GPUs and MICs, to accelerate geocomputation in different applications.
Yang, L. H.; Brooks III, E. D.; Belak, J.
1992-01-01
A molecular dynamics algorithm for performing large-scale simulations using the Parallel C Preprocessor (PCP) programming paradigm on the BBN TC2000, a massively parallel computer, is discussed. The algorithm uses a linked-cell data structure to obtain the near neighbors of each atom as time evolves. Each processor is assigned to a geometric domain containing many subcells and the storage for that domain is private to the processor. Within this scheme, the interdomain (i.e., interprocessor) communication is minimized.
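A serial sketch of the linked-cell idea is shown below: atoms are binned into cells no smaller than the interaction cutoff, so neighbor candidates for any atom lie in its own cell and the adjacent cells. The domain-decomposed, per-processor storage of the original PCP implementation is not reproduced, and the 2-D toy geometry is an assumption for brevity.

```python
import numpy as np
from collections import defaultdict

def build_cells(positions, box, cutoff):
    """Bin atom indices into square cells of width >= cutoff (2-D, periodic box)."""
    ncell = max(1, int(box // cutoff))
    width = box / ncell
    cells = defaultdict(list)
    for i, p in enumerate(positions):
        cells[(int(p[0] // width) % ncell, int(p[1] // width) % ncell)].append(i)
    return cells, ncell

def neighbors(i, positions, cells, ncell, box, cutoff):
    """Neighbors of atom i within the cutoff, searching only the 3x3 block of cells."""
    cx, cy = [int(c) for c in positions[i] // (box / ncell)]
    result = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for j in cells.get(((cx + dx) % ncell, (cy + dy) % ncell), ()):
                if j == i:
                    continue
                d = positions[j] - positions[i]
                d -= box * np.round(d / box)          # minimum-image convention
                if np.dot(d, d) < cutoff**2:
                    result.append(j)
    return result

rng = np.random.default_rng(1)
pos = rng.uniform(0.0, 10.0, size=(500, 2))
cells, ncell = build_cells(pos, box=10.0, cutoff=1.0)
print(neighbors(0, pos, cells, ncell, box=10.0, cutoff=1.0))
```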
Computer simulation of heterogeneous polymer photovoltaic devices
NASA Astrophysics Data System (ADS)
Kodali, Hari K.; Ganapathysubramanian, Baskar
2012-04-01
Polymer-based photovoltaic devices have the potential for widespread usage due to their low cost per watt and mechanical flexibility. Efficiencies close to 9.0% have been achieved recently in conjugated polymer based organic solar cells (OSCs). These devices were fabricated using solvent-based processing of electron-donating and electron-accepting materials into the so-called bulk heterojunction (BHJ) architecture. Experimental evidence suggests that a key property determining the power-conversion efficiency of such devices is the final morphological distribution of the donor and acceptor constituents. In order to understand the role of morphology on device performance, we develop a scalable computational framework that efficiently interrogates OSCs to investigate relationships between the morphology at the nano-scale and the device performance. In this work, we extend the Buxton and Clarke model (2007 Modelling Simul. Mater. Sci. Eng. 15 13-26) to simulate realistic devices with complex active layer morphologies using a dimensionally independent, scalable, finite-element method. We incorporate all stages involved in current generation, namely (1) exciton generation and diffusion, (2) charge generation and (3) charge transport in a modular fashion. The numerical challenges encountered during interrogation of realistic microstructures are detailed. We compare each stage of the photovoltaic process for two microstructures: a BHJ morphology and an idealized sawtooth morphology. The results are presented for both two- and three-dimensional structures.
Gamell, Marc; Teranishi, Keita; Kolla, Hemanth; ...
2017-10-26
In order to achieve exascale systems, application resilience needs to be addressed. Some programming models, such as task-DAG (directed acyclic graphs) architectures, currently embed resilience features whereas traditional SPMD (single program, multiple data) and message-passing models do not. Since a large part of the community's code base follows the latter models, it is still required to take advantage of application characteristics to minimize the overheads of fault tolerance. To that end, this paper explores how recovering from hard process/node failures in a local manner is a natural approach for certain applications to obtain resilience at lower costs in faulty environments. In particular, this paper targets enabling online, semitransparent local recovery for stencil computations on current leadership-class systems as well as presents programming support and scalable runtime mechanisms. Also described and demonstrated in this paper is the effect of failure masking, which allows the effective reduction of impact on total time to solution due to multiple failures. Furthermore, we discuss, implement, and evaluate ghost region expansion and cell-to-rank remapping to increase the probability of failure masking. To conclude, this paper shows the integration of all aforementioned mechanisms with the S3D combustion simulation through an experimental demonstration (using the Titan system) of the ability to tolerate high failure rates (i.e., node failures every five seconds) with low overhead while sustaining performance at large scales. In addition, this demonstration also displays the failure masking probability increase resulting from the combination of both ghost region expansion and cell-to-rank remapping.
Slattery, Stuart R.
2015-12-02
In this study we analyze and extend mesh-free algorithms for three-dimensional data transfer problems in partitioned multiphysics simulations. We first provide a direct comparison between a mesh-based weighted residual method using the common-refinement scheme and two mesh-free algorithms leveraging compactly supported radial basis functions: one using a spline interpolation and one using a moving least square reconstruction. Through the comparison we assess both the conservation and accuracy of the data transfer obtained from each of the methods. We do so for a varying set of geometries with and without curvature and sharp features and for functions with and without smoothness and with varying gradients. Our results show that the mesh-based and mesh-free algorithms are complementary with cases where each was demonstrated to perform better than the other. We then focus on the mesh-free methods by developing a set of algorithms to parallelize them based on sparse linear algebra techniques. This includes a discussion of fast parallel radius searching in point clouds and restructuring the interpolation algorithms to leverage data structures and linear algebra services designed for large distributed computing environments. The scalability of our new algorithms is demonstrated on a leadership class computing facility using a set of basic scaling studies. Finally, these scaling studies show that for problems with reasonable load balance, our new algorithms for both spline interpolation and moving least square reconstruction demonstrate both strong and weak scalability using more than 100,000 MPI processes with billions of degrees of freedom in the data transfer operation.
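A minimal, dense, serial sketch of compactly supported RBF interpolation for mesh-free data transfer is shown below, using the Wendland C2 kernel phi(r) = (1 - r)^4 (4r + 1) for r <= 1. It only illustrates the fit-then-evaluate pattern; the sparse, parallel radius-search machinery described above is not reproduced, and the field and point sets are made-up test data.

```python
import numpy as np

def wendland_c2(r):
    """Compactly supported Wendland C2 kernel on normalized distance r."""
    r = np.clip(r, 0.0, 1.0)
    return (1.0 - r) ** 4 * (4.0 * r + 1.0)

def rbf_transfer(src_pts, src_vals, tgt_pts, support_radius):
    """Fit RBF weights at source points, then evaluate the field at target points."""
    dists_ss = np.linalg.norm(src_pts[:, None, :] - src_pts[None, :, :], axis=-1)
    A = wendland_c2(dists_ss / support_radius)
    weights = np.linalg.solve(A, src_vals)
    dists_ts = np.linalg.norm(tgt_pts[:, None, :] - src_pts[None, :, :], axis=-1)
    return wendland_c2(dists_ts / support_radius) @ weights

rng = np.random.default_rng(2)
src = rng.uniform(size=(200, 3))
tgt = rng.uniform(size=(50, 3))
f = lambda p: np.sin(2 * np.pi * p[:, 0]) * p[:, 1]   # field to transfer
approx = rbf_transfer(src, f(src), tgt, support_radius=0.5)
print("max transfer error:", np.max(np.abs(approx - f(tgt))))
```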
ExSTraCS 2.0: Description and Evaluation of a Scalable Learning Classifier System.
Urbanowicz, Ryan J; Moore, Jason H
2015-09-01
Algorithmic scalability is a major concern for any machine learning strategy in this age of 'big data'. A large number of potentially predictive attributes is emblematic of problems in bioinformatics, genetic epidemiology, and many other fields. Previously, ExSTraCS was introduced as an extended Michigan-style supervised learning classifier system that combined a set of powerful heuristics to successfully tackle the challenges of classification, prediction, and knowledge discovery in complex, noisy, and heterogeneous problem domains. While Michigan-style learning classifier systems are powerful and flexible learners, they are not considered to be particularly scalable. For the first time, this paper presents a complete description of the ExSTraCS algorithm and introduces an effective strategy to dramatically improve learning classifier system scalability. ExSTraCS 2.0 addresses scalability with (1) a rule specificity limit, (2) new approaches to expert knowledge guided covering and mutation mechanisms, and (3) the implementation and utilization of the TuRF algorithm for improving the quality of expert knowledge discovery in larger datasets. Performance over a complex spectrum of simulated genetic datasets demonstrated that these new mechanisms dramatically improve nearly every performance metric on datasets with 20 attributes and made it possible for ExSTraCS to reliably scale up to perform on related 200 and 2000-attribute datasets. ExSTraCS 2.0 was also able to reliably solve the 6, 11, 20, 37, 70, and 135 multiplexer problems, and did so in similar or fewer learning iterations than previously reported, with smaller finite training sets, and without using building blocks discovered from simpler multiplexer problems. Furthermore, ExSTraCS usability was made simpler through the elimination of previously critical run parameters.
ERIC Educational Resources Information Center
Lundquist, Carol; Frieder, Ophir; Holmes, David O.; Grossman, David
1999-01-01
Describes a scalable, parallel, relational database-driven information retrieval engine. To support portability across a wide range of execution environments, all algorithms adhere to the SQL-92 standard. By incorporating relevance feedback algorithms, accuracy is enhanced over prior database-driven information retrieval efforts. Presents…
Cross-Layer Design for Robust and Scalable Video Transmission in Dynamic Wireless Environment
2011-02-01
code rate convolutional codes or prioritized Rate - Compatible Punctured ...34New rate - compatible punctured convolutional codes for Viterbi decoding," IEEE Trans. Communications, Volume 42, Issue 12, pp. 3073-3079, Dec...Quality of service RCPC Rate - compatible and punctured convolutional codes SNR Signal to noise
DEVELOPMENTAL NEUROTOX ASSAY USING SCALABLE NEURONS AND ASTROCYTES IN HIGH-CONTENT IMAGING - PHASE I
The objective of this proposed project and the EPA Computational Toxicology program’s goals are aligned in assessing chemicals for potential risk to human health and the environment through ...
Constrained tri-sphere kinematic positioning system
Viola, Robert J
2010-12-14
A scalable and adaptable, six-degree-of-freedom, kinematic positioning system is described. The system can position objects supported on top of, or suspended from, jacks comprising constrained joints. The system is compatible with extremely low temperature or high vacuum environments. When constant adjustment is not required, a removable motor unit is available.
The implementation of a comprehensive PBPK modeling approach resulted in ERDEM, a complex PBPK modeling system. ERDEM provides a scalable and user-friendly environment that enables researchers to focus on data input values rather than writing program code. ERDEM efficiently m...
SCIMITAR: Scalable Stream-Processing for Sensor Information Brokering
2013-11-01
IaaS) cloud frameworks including Amazon Web Services and Eucalyptus . For load testing, we used The Grinder [9], a Java load testing framework that...internal Eucalyptus cluster which we could not scale as large as the Amazon environment due to a lack of computation resources. We recreated our
2004-12-01
handling using the X10 home automation protocol. Each 3D graphics client renders its scene according to an assigned virtual camera position. By having...control protocol. DMX is a versatile and robust framework which overcomes limitations of the X10 home automation protocol which we are currently using
Towards New Multiplatform Hybrid Online Laboratory Models
ERIC Educational Resources Information Center
Rodriguez-Gil, Luis; García-Zubia, Javier; Orduña, Pablo; López-de-Ipiña, Diego
2017-01-01
Online laboratories have traditionally been split between virtual labs, with simulated components; and remote labs, with real components. The former tend to provide less realism but to be easily scalable and less expensive to maintain, while the latter are fully real but tend to require a higher maintenance effort and be more error-prone. This…
SSC San Diego Command History Calendar Year 2004
2005-03-01
operational capability for testing on 1 October. JTRS radios will be software- reprogrammable , multi-band/multi-mode capable, networkable, scalable in terms of...Simulator 6,710,737 B1 23 Mar 04 Scheps, Richard Automobile Engine Disabling Device 6,723,225 B2 20 Apr 04 Ramirez, Ayax D. Resonance Tunable Optical Filter
Scalable Nonlinear Solvers for Fully Implicit Coupled Nuclear Fuel Modeling. Final Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cai, Xiao-Chuan; Keyes, David; Yang, Chao
2014-09-29
The focus of the project is on the development and customization of some highly scalable domain decomposition based preconditioning techniques for the numerical solution of nonlinear, coupled systems of partial differential equations (PDEs) arising from nuclear fuel simulations. These high-order PDEs represent multiple interacting physical fields (for example, heat conduction, oxygen transport, solid deformation), each modeled by a certain type of Cahn-Hilliard and/or Allen-Cahn equations. Most existing approaches involve a careful splitting of the fields and the use of field-by-field iterations to obtain a solution of the coupled problem. Such approaches have many advantages such as ease of implementation since only single field solvers are needed, but also exhibit disadvantages. For example, certain nonlinear interactions between the fields may not be fully captured, and for unsteady problems, stable time integration schemes are difficult to design. In addition, when implemented on large scale parallel computers, the sequential nature of the field-by-field iterations substantially reduces the parallel efficiency. To overcome the disadvantages, fully coupled approaches have been investigated in order to obtain full physics simulations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
D'Azevedo, Eduardo; Abbott, Stephen; Koskela, Tuomas
The XGC fusion gyrokinetic code combines state-of-the-art, portable computational and algorithmic technologies to enable complicated multiscale simulations of turbulence and transport dynamics in ITER edge plasma on the largest US open-science computer, the CRAY XK7 Titan, at its maximal heterogeneous capability, which have not been possible before due to a factor of over 10 shortage in the time-to-solution for less than 5 days of wall-clock time for one physics case. Frontier techniques such as nested OpenMP parallelism, adaptive parallel I/O, staging I/O and data reduction using dynamic and asynchronous applications interactions, dynamic repartitioning for balancing computational work in pushing particles and in grid-related work, scalable and accurate discretization algorithms for non-linear Coulomb collisions, and communication-avoiding subcycling technology for pushing particles on both CPUs and GPUs are also utilized to dramatically improve the scalability and time-to-solution, hence enabling the difficult kinetic ITER edge simulation on a present-day leadership class computer.
NASA Astrophysics Data System (ADS)
Tabik, S.; Romero, L. F.; Mimica, P.; Plata, O.; Zapata, E. L.
2012-09-01
A broad area in astronomy focuses on simulating extragalactic objects based on Very Long Baseline Interferometry (VLBI) radio-maps. Several algorithms in this scope simulate what would be the observed radio-maps if emitted from a predefined extragalactic object. This work analyzes the performance and scaling of this kind of algorithms on multi-socket, multi-core architectures. In particular, we evaluate a sharing approach, a privatizing approach and a hybrid approach on systems with complex memory hierarchy that includes shared Last Level Cache (LLC). In addition, we investigate which manual processes can be systematized and then automated in future works. The experiments show that the data-privatizing model scales efficiently on medium scale multi-socket, multi-core systems (up to 48 cores) while regardless of algorithmic and scheduling optimizations, the sharing approach is unable to reach acceptable scalability on more than one socket. However, the hybrid model with a specific level of data-sharing provides the best scalability over all used multi-socket, multi-core systems.
De, Suvranu; Deo, Dhannanjay; Sankaranarayanan, Ganesh; Arikatla, Venkata S.
2012-01-01
Background While an update rate of 30 Hz is considered adequate for real time graphics, a much higher update rate of about 1 kHz is necessary for haptics. Physics-based modeling of deformable objects, especially when large nonlinear deformations and complex nonlinear material properties are involved, at these very high rates is one of the most challenging tasks in the development of real time simulation systems. While some specialized solutions exist, there is no general solution for arbitrary nonlinearities. Methods In this work we present PhyNNeSS - a Physics-driven Neural Networks-based Simulation System - to address this long-standing technical challenge. The first step is an off-line pre-computation step in which a database is generated by applying carefully prescribed displacements to each node of the finite element models of the deformable objects. In the next step, the data is condensed into a set of coefficients describing neurons of a Radial Basis Function network (RBFN). During real-time computation, these neural networks are used to reconstruct the deformation fields as well as the interaction forces. Results We present realistic simulation examples from interactive surgical simulation with real time force feedback. As an example, we have developed a deformable human stomach model and a Penrose-drain model used in the Fundamentals of Laparoscopic Surgery (FLS) training tool box. Conclusions A unique computational modeling system has been developed that is capable of simulating the response of nonlinear deformable objects in real time. The method distinguishes itself from previous efforts in that a systematic physics-based pre-computational step allows training of neural networks which may be used in real time simulations. We show, through careful error analysis, that the scheme is scalable, with the accuracy being controlled by the number of neurons used in the simulation. PhyNNeSS has been integrated into SoFMIS (Software Framework for Multimodal Interactive Simulation) for general use. PMID:22629108
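The two-stage pattern described above can be sketched as follows: an off-line step fits radial basis function network (RBFN) weights to precomputed responses, and the run-time step is a cheap evaluation of that network. The "precomputed" data below is a synthetic stand-in rather than finite element output, the network size and parameters are assumptions, and this is not the PhyNNeSS code.

```python
import numpy as np

def rbf_features(x, centers, gamma):
    """Gaussian RBF activations of inputs x against the neuron centers."""
    d2 = np.sum((x[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(3)
centers = rng.uniform(-1, 1, size=(64, 3))        # neuron centers (prescribed displacements)
gamma = 4.0

# Off-line "training": least-squares fit of output weights to precomputed responses.
X_train = rng.uniform(-1, 1, size=(500, 3))       # applied nodal displacements (synthetic)
Y_train = np.tanh(X_train) @ rng.normal(size=(3, 3))   # synthetic response (stand-in for FEM)
Phi = rbf_features(X_train, centers, gamma)
W, *_ = np.linalg.lstsq(Phi, Y_train, rcond=None)

# Run-time: reconstruct the response for a new input at haptic-loop cost.
x_new = rng.uniform(-1, 1, size=(1, 3))
response = rbf_features(x_new, centers, gamma) @ W
print(response)
```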
Modeling, Simulation and Analysis of Public Key Infrastructure
NASA Technical Reports Server (NTRS)
Liu, Yuan-Kwei; Tuey, Richard; Ma, Paul (Technical Monitor)
1998-01-01
Security is an essential part of network communication. The advances in cryptography have provided solutions to many of the network security requirements. Public Key Infrastructure (PKI) is the foundation of the cryptography applications. The main objective of this research is to design a model to simulate a reliable, scalable, manageable, and high-performance public key infrastructure. We build a model to simulate the NASA public key infrastructure by using SimProcess and MatLab Software. The simulation is from top level all the way down to the computation needed for encryption, decryption, digital signature, and secure web server. The application of secure web server could be utilized in wireless communications. The results of the simulation are analyzed and confirmed by using queueing theory.
OligoIS: Scalable Instance Selection for Class-Imbalanced Data Sets.
García-Pedrajas, Nicolás; Perez-Rodríguez, Javier; de Haro-García, Aida
2013-02-01
In current research, an enormous amount of information is constantly being produced, which poses a challenge for data mining algorithms. Many of the problems in extremely active research areas, such as bioinformatics, security and intrusion detection, or text mining, share the following two features: large data sets and class-imbalanced distribution of samples. Although many methods have been proposed for dealing with class-imbalanced data sets, most of these methods are not scalable to the very large data sets common to those research fields. In this paper, we propose a new approach to dealing with the class-imbalance problem that is scalable to data sets with many millions of instances and hundreds of features. This proposal is based on the divide-and-conquer principle combined with application of the selection process to balanced subsets of the whole data set. This divide-and-conquer principle allows the execution of the algorithm in linear time. Furthermore, the proposed method is easy to implement using a parallel environment and can work without loading the whole data set into memory. Using 40 class-imbalanced medium-sized data sets, we will demonstrate our method's ability to improve the results of state-of-the-art instance selection methods for class-imbalanced data sets. Using three very large data sets, we will show the scalability of our proposal to millions of instances and hundreds of features.
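The divide-and-conquer idea described above is sketched below: partition the data into small, class-balanced subsets, run an instance-selection step independently on each subset, and keep the survivors. The base selector used here (keep instances whose nearest neighbour in the subset shares their class) is a simple placeholder, not the published OligoIS selection or voting criterion, and the dataset is synthetic.

```python
import numpy as np

def base_selector(X, y):
    """Placeholder selector: keep instances whose nearest neighbour has the same class."""
    keep = []
    for i in range(len(X)):
        d = np.sum((X - X[i]) ** 2, axis=1)
        d[i] = np.inf
        if y[np.argmin(d)] == y[i]:
            keep.append(i)
    return keep

def balanced_subset_selection(X, y, subset_size=100, rng=None):
    """Run the base selector on many class-balanced subsets and pool the kept instances."""
    rng = rng or np.random.default_rng(0)
    minority, majority = (y == 1).nonzero()[0], (y == 0).nonzero()[0]
    selected, half = set(), subset_size // 2
    for start in range(0, len(majority), half):
        maj = majority[start:start + half]
        mino = rng.choice(minority, size=min(half, len(minority)), replace=False)
        idx = np.concatenate([maj, mino])
        for local_i in base_selector(X[idx], y[idx]):
            selected.add(int(idx[local_i]))
    return sorted(selected)

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 1, (5000, 5)), rng.normal(2, 1, (250, 5))])
y = np.concatenate([np.zeros(5000, dtype=int), np.ones(250, dtype=int)])
kept = balanced_subset_selection(X, y)
print(len(kept), "instances kept of", len(y))
```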
The TeraShake Computational Platform for Large-Scale Earthquake Simulations
NASA Astrophysics Data System (ADS)
Cui, Yifeng; Olsen, Kim; Chourasia, Amit; Moore, Reagan; Maechling, Philip; Jordan, Thomas
Geoscientific and computer science researchers with the Southern California Earthquake Center (SCEC) are conducting a large-scale, physics-based, computationally demanding earthquake system science research program with the goal of developing predictive models of earthquake processes. The computational demands of this program continue to increase rapidly as these researchers seek to perform physics-based numerical simulations of earthquake processes for larger and more complex scenarios. To meet the needs of this research program, a multiple-institution team coordinated by SCEC has integrated several scientific codes into a numerical modeling-based research tool we call the TeraShake computational platform (TSCP). A central component in the TSCP is a highly scalable earthquake wave propagation simulation program called the TeraShake anelastic wave propagation (TS-AWP) code. In this chapter, we describe how we extended an existing, stand-alone, well-validated, finite-difference, anelastic wave propagation modeling code into the highly scalable and widely used TS-AWP and then integrated this code into the TeraShake computational platform that provides end-to-end (initialization to analysis) research capabilities. We also describe the techniques used to enhance the TS-AWP parallel performance on TeraGrid supercomputers, as well as the TeraShake simulation phases including input preparation, run time, data archive management, and visualization. As a result of our efforts to improve its parallel efficiency, the TS-AWP has now shown highly efficient strong scaling on over 40K processors on IBM's BlueGene/L Watson computer. In addition, the TSCP has developed into a computational system that is useful to many members of the SCEC community for performing large-scale earthquake simulations.
NASA Astrophysics Data System (ADS)
Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen
2018-07-01
Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for the joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ~10^4 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.
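A hedged sketch of the "one number per parameter" compression idea is given below for a Gaussian-likelihood toy model: the data vector is projected onto the score of the likelihood at a fiducial parameter point, t = (dmu/dtheta)^T C^-1 (d - mu*). The straight-line model, covariance, and fiducial values are illustrative assumptions, not the supernova pipeline used in the text.

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(0.0, 1.0, 50)
C = 0.05**2 * np.eye(len(x))                 # assumed (known) noise covariance
C_inv = np.linalg.inv(C)

def mean_model(theta):
    """Toy mean model: a straight line in x."""
    slope, intercept = theta
    return slope * x + intercept

theta_fid = np.array([1.0, 0.0])             # fiducial parameters
mu_fid = mean_model(theta_fid)
# Jacobian dmu/dtheta, shape (n_data, n_params): columns are x and 1.
dmu = np.stack([x, np.ones_like(x)], axis=1)

def compress(data):
    """Map a 50-dimensional data vector to 2 numbers (one per parameter)."""
    return dmu.T @ C_inv @ (data - mu_fid)

data = mean_model([1.3, -0.2]) + rng.normal(0.0, 0.05, size=len(x))
print(compress(data))
```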
Scalable and expressive medical terminologies.
Mays, E; Weida, R; Dionne, R; Laker, M; White, B; Liang, C; Oles, F J
1996-01-01
The K-Rep system, based on description logic, is used to represent and reason with large and expressive controlled medical terminologies. Expressive concept descriptions incorporate semantically precise definitions composed using logical operators, together with important non-semantic information such as synonyms and codes. Examples are drawn from our experience with K-Rep in modeling the InterMed laboratory terminology and also developing a large clinical terminology now in production use at Kaiser-Permanente. System-level scalability of performance is achieved through an object-oriented database system which efficiently maps persistent memory to virtual memory. Equally important is conceptual scalability - the ability to support collaborative development, organization, and visualization of a substantial terminology as it evolves over time. K-Rep addresses this need by logically completing concept definitions and automatically classifying concepts in a taxonomy via subsumption inferences. The K-Rep system includes a general-purpose GUI environment for terminology development and browsing, a custom interface for formulary term maintenance, a C++ application program interface, and a distributed client-server mode which provides lightweight clients with efficient run-time access to K-Rep by means of a scripting language.
NASA Astrophysics Data System (ADS)
Singh, Surya P. N.; Thayer, Scott M.
2002-02-01
This paper presents a novel algorithmic architecture for the coordination and control of large-scale distributed robot teams derived from the constructs found within the human immune system. Using this as a guide, the Immunology-derived Distributed Autonomous Robotics Architecture (IDARA) distributes tasks so that broad, all-purpose actions are refined and followed by specific and mediated responses based on each unit's utility and capability to address the system's perceived needs in a timely manner. This method improves on initial developments in this area by including often overlooked interactions of the innate immune system, resulting in a stronger first-order, general response mechanism. This allows for rapid reactions in dynamic environments, especially those lacking significant a priori information. As characterized via computer simulation of a self-healing mobile minefield having up to 7,500 mines and 2,750 robots, IDARA provides an efficient, communications-light, and scalable architecture that yields significant operation and performance improvements for large-scale multi-robot coordination and control.
Laser-based pedestrian tracking in outdoor environments by multiple mobile robots.
Ozaki, Masataka; Kakimuma, Kei; Hashimoto, Masafumi; Takahashi, Kazuhiko
2012-10-29
This paper presents an outdoor, laser-based pedestrian tracking system using a group of mobile robots located near each other. Each robot detects pedestrians from its own laser scan image using an occupancy-grid-based method, and the robot tracks the detected pedestrians via Kalman filtering and global-nearest-neighbor (GNN)-based data association. The tracking data is broadcast to multiple robots through intercommunication and is combined using the covariance intersection (CI) method. For pedestrian tracking, each robot estimates its own pose using real-time-kinematic GPS (RTK-GPS) and laser scan matching. Using our cooperative tracking method, all the robots share the tracking data with each other; hence, each robot can recognize pedestrians that it cannot observe directly. The simulation and experimental results show that cooperative tracking provides better tracking performance than conventional individual tracking. Our tracking system functions in a decentralized manner without any central server, and therefore provides a degree of scalability and robustness that cannot be achieved by conventional centralized architectures.
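The covariance intersection fusion step referred to above can be sketched as follows; the trace-minimising grid search over the weight omega is an assumption made for illustration and may differ from the authors' implementation.

```python
# Minimal sketch of covariance intersection (CI) fusion of two track estimates.
# The omega grid search is an illustrative choice, not the paper's exact scheme.
import numpy as np

def covariance_intersection(x_a, P_a, x_b, P_b, n_grid=101):
    """Fuse two estimates (x_a, P_a) and (x_b, P_b) without knowing their
    cross-correlation, as when robots exchange tracks over the network."""
    Pa_inv, Pb_inv = np.linalg.inv(P_a), np.linalg.inv(P_b)
    best = None
    for omega in np.linspace(0.0, 1.0, n_grid):
        P = np.linalg.inv(omega * Pa_inv + (1.0 - omega) * Pb_inv)
        if best is None or np.trace(P) < np.trace(best[1]):
            x = P @ (omega * Pa_inv @ x_a + (1.0 - omega) * Pb_inv @ x_b)
            best = (x, P)
    return best

# Two robots' estimates of the same pedestrian position (hypothetical numbers).
x1, P1 = np.array([2.0, 1.0]), np.diag([0.5, 2.0])
x2, P2 = np.array([2.3, 0.8]), np.diag([2.0, 0.4])
x_fused, P_fused = covariance_intersection(x1, P1, x2, P2)
```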
Design and Development of a Flight Route Modification, Logging, and Communication Network
NASA Technical Reports Server (NTRS)
Merlino, Daniel K.; Wilson, C. Logan; Carboneau, Lindsey M.; Wilder, Andrew J.; Underwood, Matthew C.
2016-01-01
There is an overwhelming desire to create and enhance communication mechanisms between entities that operate within the National Airspace System. Furthermore, airlines are always extremely interested in increasing the efficiency of their flights. An innovative system prototype was developed and tested that improves collaborative decision making without modifying existing infrastructure or operational procedures within the current Air Traffic Management System. This system enables collaboration between flight crew and airline dispatchers to share and assess optimized flight routes through an Internet connection. Using a sophisticated medium-fidelity flight simulation environment, a rapid-prototyping development process, and a unified modeling language, the software was designed to ensure reliability and scalability for future growth and applications. Ensuring safety and security was a primary design goal; therefore, the software does not interact or interfere with major flight control or safety systems. The system prototype demonstrated an unprecedented use of in-flight Internet to facilitate effective communication with Airline Operations Centers, which may contribute to increased flight efficiency for airlines.
Design of a biochemical circuit motif for learning linear functions
Lakin, Matthew R.; Minnich, Amanda; Lane, Terran; Stefanovic, Darko
2014-01-01
Learning and adaptive behaviour are fundamental biological processes. A key goal in the field of bioengineering is to develop biochemical circuit architectures with the ability to adapt to dynamic chemical environments. Here, we present a novel design for a biomolecular circuit capable of supervised learning of linear functions, using a model based on chemical reactions catalysed by DNAzymes. To achieve this, we propose a novel mechanism of maintaining and modifying internal state in biochemical systems, thereby advancing the state of the art in biomolecular circuit architecture. We use simulations to demonstrate that the circuit is capable of learning behaviour and assess its asymptotic learning performance, scalability and robustness to noise. Such circuits show great potential for building autonomous in vivo nanomedical devices. While such a biochemical system can tell us a great deal about the fundamentals of learning in living systems and may have broad applications in biomedicine (e.g. autonomous and adaptive drugs), it also offers some intriguing challenges and surprising behaviours from a machine learning perspective. PMID:25401175
Towards real-time photon Monte Carlo dose calculation in the cloud
NASA Astrophysics Data System (ADS)
Ziegenhein, Peter; Kozin, Igor N.; Kamerling, Cornelis Ph; Oelfke, Uwe
2017-06-01
Near real-time application of Monte Carlo (MC) dose calculation in clinic and research is hindered by the long computational runtimes of established software. Currently, fast MC software solutions are available utilising accelerators such as graphical processing units (GPUs) or clusters based on central processing units (CPUs). Both platforms are expensive in terms of purchase costs and maintenance and, in the case of GPUs, provide only limited scalability. In this work we propose a cloud-based MC solution, which offers high scalability of accurate photon dose calculations. The MC simulations run on a private virtual supercomputer that is formed in the cloud. Computational resources can be provisioned dynamically at low cost without upfront investment in expensive hardware. A client-server software solution has been developed which controls the simulations and transports data to and from the cloud efficiently and securely. The client application integrates seamlessly into a treatment planning system. It runs the MC simulation workflow automatically and securely exchanges simulation data with the server-side application that controls the virtual supercomputer. Advanced encryption standards were used to add an additional security layer, which encrypts and decrypts patient data on-the-fly at the processor register level. We show that our cloud-based MC framework enables near real-time dose computation. It delivers excellent linear scaling for high-resolution datasets, with absolute runtimes of 1.1 to 10.9 seconds for simulating clinical prostate and liver cases to 1% statistical uncertainty. The computation runtimes include the transportation of data to and from the cloud as well as process scheduling and synchronisation overhead. Cloud-based MC simulations offer a fast, affordable and easily accessible alternative to currently used GPU or cluster solutions for near real-time accurate dose calculations.
Convergence in France facing Big Data era and Exascale challenges for Climate Sciences
NASA Astrophysics Data System (ADS)
Denvil, Sébastien; Dufresne, Jean-Louis; Salas, David; Meurdesoif, Yann; Valcke, Sophie; Caubel, Arnaud; Foujols, Marie-Alice; Servonnat, Jérôme; Sénési, Stéphane; Derouillat, Julien; Voury, Pascal
2014-05-01
The presentation will introduce a French national project, CONVERGENCE, which has been funded for four years. This project will tackle the big data and computational challenges faced by the climate modeling community in an HPC context. Model simulations are central to the study of complex mechanisms and feedbacks in the climate system and to providing estimates of future and past climate changes. Recent trends in climate modelling are to add more physical components to the modelled system, to increase the resolution of each individual component and to make more systematic use of large suites of simulations to address many scientific questions. Climate simulations may therefore differ in their initial state, parameter values, representation of physical processes, spatial resolution, model complexity, and degree of realism or degree of idealisation. In addition, there is a strong need for evaluating, improving and monitoring the performance of climate models using a large ensemble of diagnostics and better integration of model outputs and observational data. High performance computing is currently reaching the exascale and has the potential to produce this exponential increase in the size and number of simulations. However, post-processing, analysis, and exploration of the generated data have stalled and there is a strong need for new tools to cope with the growing size and complexity of the underlying simulations and datasets. Exascale simulations require new scalable software tools to generate, manage and mine those simulations and data, to extract the relevant information and to make the correct decisions. The primary purpose of this project is to develop a platform capable of running large ensembles of simulations with a suite of models, to handle the complex and voluminous datasets generated, to facilitate the evaluation and validation of the models and the use of higher resolution models. We propose to gather interdisciplinary skills to design, using a component-based approach, a specific programming environment for scalable scientific simulations and analytics, integrating new and efficient ways of deploying and analysing the applications on High Performance Computing (HPC) systems. CONVERGENCE, gathering HPC and informatics expertise that cuts across the individual partners and the broader HPC community, will allow the national climate community to leverage information technology (IT) innovations to address its specific needs. Our methodology consists in developing an ensemble of generic elements needed to run the French climate models with different grids and different resolutions, ensuring efficient and reliable execution of these models, managing large volumes and numbers of data and allowing analysis of the results and precise evaluation of the models. These elements include data structure definition and input-output (IO), code coupling and interpolation, as well as runtime and pre/post-processing environments. A common data and metadata structure will allow transferring consistent information between the various elements. All these generic elements will be open source and publicly available. The IPSL-CM and CNRM-CM climate models will make use of these elements, which will constitute a national platform for climate modelling. This platform will be used, in its entirety, to optimise and tune the next version of the IPSL-CM model and to develop a global coupled climate model with regional grid refinement.
It will also be used, at least partially, to run ensembles of the CNRM-CM model at relatively high resolution and to run a very-high-resolution prototype of this model. The climate models we have developed are already involved in many international projects. For instance, we participate in the CMIP (Coupled Model Intercomparison Project), which is very demanding but has high visibility: its results are widely used and are in particular synthesised in the IPCC (Intergovernmental Panel on Climate Change) assessment reports. The CONVERGENCE project will constitute an invaluable step for the French climate community in preparing for and better contributing to the next phase of the CMIP project.
Chelonia: A self-healing, replicated storage system
NASA Astrophysics Data System (ADS)
Kerr Nilsen, Jon; Toor, Salman; Nagy, Zsombor; Read, Alex
2011-12-01
Chelonia is a novel grid storage system designed to fill the requirements gap between those of large, sophisticated scientific collaborations which have adopted the grid paradigm for their distributed storage needs, and those of corporate business communities gravitating towards the cloud paradigm. Chelonia is an integrated system of heterogeneous, geographically dispersed storage sites which is easily and dynamically expandable and optimized for high availability and scalability. The architecture and implementation in terms of web services running inside the Advanced Resource Connector Hosting Environment Daemon (ARC HED) are described, and results of tests in both local-area and wide-area networks that demonstrate the fault tolerance, stability and scalability of Chelonia are presented. In addition, example setups for production deployments for small and medium-sized VOs are described.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rudkevich, Aleksandr; Goldis, Evgeniy
This research conducted by the Newton Energy Group, LLC (NEG) is dedicated to the development of pCloud: a cloud-based power market simulation environment. pCloud offers power industry stakeholders the capability to model electricity markets and is organized around the Software as a Service (SaaS) concept -- a software application delivery model in which software is centrally hosted and provided to many users via the internet. During Phase I of this project NEG developed a prototype design for pCloud as a SaaS-based commercial service offering and a system architecture supporting that design, ensured the feasibility of the architecture's key elements, formed technological partnerships and negotiated commercial agreements with partners, conducted market research and other related activities, and secured funding for continued development of pCloud between the end of Phase I and the beginning of Phase II, if awarded. Based on the results of Phase I activities, NEG has established that the development of a cloud-based power market simulation environment within the Windows Azure platform is technologically feasible and can be accomplished within the budget and timeframe available through the Phase II SBIR award with additional external funding. NEG believes that pCloud has the potential to become a game-changing technology for the modeling and analysis of electricity markets. This potential is due to the following critical advantages of pCloud over its competition: - Standardized access to advanced and proven power market simulators offered by third parties. - Automated parallelization of simulations and dynamic provisioning of computing resources on the cloud. This combination of automation and scalability dramatically reduces turn-around time while offering the capability to increase the number of analyzed scenarios by a factor of 10, 100 or even 1000. - Access to ready-to-use data and to cloud-based resources, leading to a reduction in software, hardware, and IT costs. - A competitive pricing structure, which will make high-volume usage of simulation services affordable. - Availability and affordability of high-quality power simulators, which presently only large corporate clients can afford, will level the playing field in developing regional energy policies, determining prudent cost recovery mechanisms and assuring just and reasonable rates to consumers. - Users that presently do not have the resources to internally maintain modeling capabilities will now be able to run simulations. This will invite more players into the industry, ultimately leading to more transparent and liquid power markets.
Molecular Simulations in Astrobiology
NASA Technical Reports Server (NTRS)
Pohorille, Andrew; Wilson, Michael A.; Schweighofer, Karl; Chipot, Christophe; New, Michael H.
2000-01-01
One of the main goals of astrobiology is to understand the origin of cellular life. The most direct approach to this problem is to construct laboratory models of protocells. Such efforts, currently underway in the NASA Astrobiology Program, are accompanied by computational studies aimed at explaining self-organization of simple molecules into ordered structures that are capable of performing protocellular functions. Many of these functions, such as importing nutrients, capturing energy and responding to changes in the environment, are carried out by proteins bound to membranes. We use computer simulations to address the following questions about these proteins: (1) How do small proteins self-organize into ordered structures at water-membrane interfaces and insert into membranes? (2) How do peptides form membrane-spanning structures (e.g. channels)? (3) By what mechanisms do such structures perform their functions? The simulations are performed using the molecular dynamics method. In this method, Newton's equations of motion for each atom in the system are solved iteratively. At each time step, the forces exerted on each atom by the remaining atoms are evaluated by dividing them into two parts: short-range forces are calculated in real space, while long-range forces are evaluated in reciprocal space using a particle-mesh algorithm of order O(N ln N). With a time step of 2 femtoseconds, problems occurring on multi-nanosecond time scales (10^6-10^8 time steps) are accessible. To address a broader range of problems, simulations need to be extended by three orders of magnitude, which requires algorithmic improvements and codes scalable to a large number of processors. Work in this direction is in progress. Two series of simulations are discussed. In one series, it is shown that nonpolar peptides, disordered in water, translocate to the nonpolar interior of the membrane and fold into helical structures (see Figure). Once in the membrane, the peptides exhibit orientational flexibility with changing conditions, which may have provided a mechanism for transmitting signals between the protocell and its environment. In another series of simulations, the mechanism by which a simple protein channel efficiently mediates proton transport across membranes was investigated. This process is a key step in cellular bioenergetics. In the channel under study, proton transport is gated by four histidines that occlude the channel pore. The simulations identify the mechanisms by which protons move through the gate.
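As a rough illustration of the molecular dynamics loop described above, the following sketch integrates Newton's equations with the velocity-Verlet scheme; a simple Lennard-Jones force stands in for the real force field and the short-range/reciprocal-space force splitting is omitted.

```python
# Generic molecular dynamics sketch: velocity-Verlet integration of Newton's
# equations. A Lennard-Jones force stands in for the real force field; the
# real-space/reciprocal-space splitting described above is omitted.
import numpy as np

def lj_forces(pos, eps=1.0, sigma=1.0):
    """Pairwise Lennard-Jones forces (O(N^2), illustration only)."""
    forces = np.zeros_like(pos)
    n = len(pos)
    for i in range(n):
        for j in range(i + 1, n):
            r = pos[i] - pos[j]
            d2 = np.dot(r, r)
            inv6 = (sigma * sigma / d2) ** 3
            f = 24.0 * eps * (2.0 * inv6 * inv6 - inv6) / d2 * r
            forces[i] += f
            forces[j] -= f
    return forces

def velocity_verlet(pos, vel, mass, dt, n_steps):
    f = lj_forces(pos)
    for _ in range(n_steps):
        vel += 0.5 * dt * f / mass
        pos += dt * vel
        f = lj_forces(pos)
        vel += 0.5 * dt * f / mass
    return pos, vel

# Particles on a small cubic lattice (spacing 1.5 sigma) to avoid overlaps.
pos = 1.5 * np.array([[i, j, k] for i in range(3) for j in range(3) for k in range(2)],
                     dtype=float)
vel = np.zeros_like(pos)
pos, vel = velocity_verlet(pos, vel, mass=1.0, dt=0.002, n_steps=100)
```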
The high performance parallel algorithm for Unified Gas-Kinetic Scheme
NASA Astrophysics Data System (ADS)
Li, Shiyi; Li, Qibing; Fu, Song; Xu, Jinxiu
2016-11-01
A high-performance parallel algorithm for UGKS is developed to simulate three-dimensional internal and external flows on arbitrary grid systems. The physical domain and velocity domain are divided into different blocks and distributed according to a two-dimensional Cartesian topology, with intra-communicators in the physical domain for data exchange and intra-communicators in the velocity domain for the sum reduction of moment integrals. Numerical results of three-dimensional cavity flow and flow past a sphere agree well with the results from existing studies and validate the applicability of the algorithm. The scalability of the algorithm is tested on both small (1-16) and large (729-5832) processor counts. The measured speed-up ratio is nearly linear and the efficiency is thus around 1, which reveals the good scalability of the present algorithm.
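The two-level communicator layout described above can be sketched with mpi4py as follows; the process-grid dimensions and array sizes are illustrative assumptions, not the authors' implementation.

```python
# Sketch of the communicator layout described above: a 2-D Cartesian process
# grid whose rows exchange physical-domain data and whose columns perform the
# sum reduction of moment integrals over velocity-space blocks. Illustrative.
import numpy as np
from mpi4py import MPI

world = MPI.COMM_WORLD
dims = MPI.Compute_dims(world.Get_size(), 2)       # e.g. 4 ranks -> [2, 2]
cart = world.Create_cart(dims, periods=[False, False], reorder=True)

phys_comm = cart.Sub([True, False])                # ranks sharing a velocity block
vel_comm = cart.Sub([False, True])                 # ranks sharing a physical block

# Halo exchange with physical-domain neighbours along the decomposed direction.
lo, hi = phys_comm.Shift(0, 1)                     # (source, dest) for disp=+1
local_field = np.full(8, float(cart.Get_rank()))
halo = np.zeros(8)
phys_comm.Sendrecv(sendbuf=local_field, dest=hi, recvbuf=halo, source=lo)

# Sum reduction of partial moment integrals over the velocity-space blocks.
partial_moment = np.array([1.0])
moment = np.empty(1)
vel_comm.Allreduce(partial_moment, moment, op=MPI.SUM)
```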
High-Fidelity Preservation of Quantum Information During Trapped-Ion Transport
NASA Astrophysics Data System (ADS)
Kaufmann, Peter; Gloger, Timm F.; Kaufmann, Delia; Johanning, Michael; Wunderlich, Christof
2018-01-01
A promising scheme for building scalable quantum simulators and computers is the synthesis of a scalable system using interconnected subsystems. A prerequisite for this approach is the ability to faithfully transfer quantum information between subsystems. With trapped atomic ions, this can be realized by transporting ions with quantum information encoded into their internal states. Here, we measure with high precision the fidelity of quantum information encoded into hyperfine states of a
A complexity-scalable software-based MPEG-2 video encoder.
Chen, Guo-bin; Lu, Xin-ning; Wang, Xing-guo; Liu, Ji-lin
2004-05-01
With the development of general-purpose processors (GPPs) and video signal processing algorithms, it is possible to implement a software-based real-time video encoder on a GPP, and its low cost and ease of upgrading encourage developers to move video encoding from specialized hardware to more flexible software. In this paper, the encoding structure is first set up to support complexity scalability; then a number of high-performance algorithms are applied to the key time-consuming modules of the coding process; finally, at the programming level, processor characteristics are exploited to improve data access efficiency and processing parallelism. Other programming methods, such as lookup tables, are adopted to reduce the computational complexity. Simulation results showed that these ideas could not only improve the global performance of video coding, but also provide great flexibility in complexity regulation.
Vashishta, Priya; Kalia, Rajiv K; Nakano, Aiichiro
2006-03-02
We have developed a first-principles-based hierarchical simulation framework, which seamlessly integrates (1) a quantum mechanical description based on the density functional theory (DFT), (2) multilevel molecular dynamics (MD) simulations based on a reactive force field (ReaxFF) that describes chemical reactions and polarization, a nonreactive force field that employs dynamic atomic charges, and an effective force field (EFF), and (3) an atomistically informed continuum model to reach macroscopic length scales. For scalable hierarchical simulations, we have developed parallel linear-scaling algorithms for (1) DFT calculation based on a divide-and-conquer algorithm on adaptive multigrids, (2) chemically reactive MD based on a fast ReaxFF (F-ReaxFF) algorithm, and (3) EFF-MD based on a space-time multiresolution MD (MRMD) algorithm. On 1920 Intel Itanium2 processors, we have demonstrated 1.4 million atom (0.12 trillion grid points) DFT, 0.56 billion atom F-ReaxFF, and 18.9 billion atom MRMD calculations, with parallel efficiency as high as 0.953. Through the use of these algorithms, multimillion atom MD simulations have been performed to study the oxidation of an aluminum nanoparticle. Structural and dynamic correlations in the oxide region are calculated as well as the evolution of charges, surface oxide thickness, diffusivities of atoms, and local stresses. In the microcanonical ensemble, the oxidizing reaction becomes explosive in both molecular and atomic oxygen environments, due to the enormous energy release associated with Al-O bonding. In the canonical ensemble, an amorphous oxide layer of a thickness of approximately 40 angstroms is formed after 466 ps, in good agreement with experiments. Simulations have been performed to study nanoindentation on crystalline, amorphous, and nanocrystalline silicon nitride and silicon carbide. Simulation on nanocrystalline silicon carbide reveals unusual deformation mechanisms in brittle nanophase materials, due to coexistence of brittle grains and soft amorphous-like grain boundary phases. Simulations predict a crossover from intergranular continuous deformation to intragrain discrete deformation at a critical indentation depth.
A domain specific language for performance portable molecular dynamics algorithms
NASA Astrophysics Data System (ADS)
Saunders, William Robert; Grant, James; Müller, Eike Hermann
2018-03-01
Developers of Molecular Dynamics (MD) codes face significant challenges when adapting existing simulation packages to new hardware. In a continuously diversifying hardware landscape it becomes increasingly difficult for scientists to be both experts in their own domain (physics/chemistry/biology) and specialists in the low-level parallelisation and optimisation of their codes. To address this challenge, we describe a "Separation of Concerns" approach for the development of parallel and optimised MD codes: the science specialist writes code at a high abstraction level in a domain specific language (DSL), which is then translated into efficient computer code by a scientific programmer. In a related context, an abstraction for the solution of partial differential equations with grid based methods has recently been implemented in the (Py)OP2 library. Inspired by this approach, we develop a Python code generation system for molecular dynamics simulations on different parallel architectures, including massively parallel distributed memory systems and GPUs. We demonstrate the efficiency of the auto-generated code by studying its performance and scalability on different hardware and compare it to other state-of-the-art simulation packages. With growing data volumes the extraction of physically meaningful information from the simulation becomes increasingly challenging and requires equally efficient implementations. A particular advantage of our approach is the easy expression of such analysis algorithms. We consider two popular methods for deducing the crystalline structure of a material from the local environment of each atom, show how they can be expressed in our abstraction and implement them in the code generation framework.
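A toy Python fragment can illustrate the separation-of-concerns idea: the domain scientist supplies only a per-pair kernel, while the looping strategy (here a plain double loop standing in for generated, parallelised code) lives in the framework; this is not the authors' code-generation system.

```python
# Toy illustration of "separation of concerns": the scientist writes only a
# per-pair kernel; the iteration/parallelisation strategy lives elsewhere. In
# the paper that role is played by a code generator; here a plain loop stands in.
import numpy as np

def pairwise_loop(positions, accumulators, kernel):
    """Framework side: apply `kernel` to every unordered particle pair."""
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            kernel(positions[i], positions[j], accumulators[i], accumulators[j])

def count_neighbours_kernel(p_i, p_j, acc_i, acc_j, cutoff=1.5):
    """Science side: increment both particles' neighbour counts within a cutoff."""
    if np.linalg.norm(p_i - p_j) < cutoff:
        acc_i[0] += 1
        acc_j[0] += 1

positions = np.random.default_rng(2).uniform(0.0, 4.0, size=(32, 3))
neighbour_counts = [np.zeros(1, dtype=int) for _ in range(32)]
pairwise_loop(positions, neighbour_counts, count_neighbours_kernel)
```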
Bridging FPGA and GPU technologies for AO real-time control
NASA Astrophysics Data System (ADS)
Perret, Denis; Lainé, Maxime; Bernard, Julien; Gratadour, Damien; Sevin, Arnaud
2016-07-01
Our team has developed a common environment for high performance simulations and real-time control of AO systems based on the use of Graphics Processing Units in the context of the COMPASS project. Such a solution, based on the ability of the real-time core in the simulation to provide adequate computing performance, limits the cost of developing AO RTC systems and makes them more scalable. A code developed and validated in the context of the simulation may be injected directly into the system and tested on sky. Furthermore, the use of relatively low cost components also offers significant advantages for the system hardware platform. However, the use of GPUs in an AO loop comes with drawbacks: the traditional way of offloading computation from CPU to GPUs - involving multiple copies and unacceptable overhead in kernel launching - is not well suited to a real-time context. The latter requires a solution enabling direct memory access (DMA) to GPU memory from a third-party device, bypassing the operating system. This allows the device to communicate directly with the real-time core of the simulation, feeding it the WFS camera pixel stream. We show that DMA between a custom FPGA-based frame-grabber and a computation unit (GPU, FPGA, or coprocessor such as Xeon Phi) across PCIe allows us to achieve latencies compatible with what will be needed on ELTs. As a fine-grained synchronization mechanism is not yet made available by GPU vendors, we propose the use of memory polling to avoid interrupt handling and CPU involvement. Network and vision protocols are handled by the FPGA-based Network Interface Card (NIC). We present the results we obtained on a complete AO loop using camera and deformable mirror simulators.
2012-01-01
Background: Efficient rule authoring tools are critical to allow clinical Knowledge Engineers (KEs), Software Engineers (SEs), and Subject Matter Experts (SMEs) to convert medical knowledge into machine executable clinical decision support rules. The goal of this analysis was to identify the critical success factors and challenges of a fully functioning Rule Authoring Environment (RAE) in order to define requirements for a scalable, comprehensive tool to manage enterprise level rules. Methods: The authors evaluated RAEs in active use across Partners Healthcare, including enterprise wide, ambulatory only, and system specific tools, with a focus on rule editors for reminder and medication rules. We conducted meetings with users of these RAEs to discuss their general experience and perceived advantages and limitations of these tools. Results: While the overall rule authoring process is similar across the 10 separate RAEs, the system capabilities and architecture vary widely. Most current RAEs limit the ability of the clinical decision support (CDS) interventions to be standardized, sharable, interoperable, and extensible. No existing system meets all requirements defined by knowledge management users. Conclusions: A successful, scalable, integrated rule authoring environment will need to support a number of key requirements and functions in the areas of knowledge representation, metadata, terminology, authoring collaboration, user interface, integration with electronic health record (EHR) systems, testing, and reporting. PMID:23145874
Cyber Security Research Frameworks For Coevolutionary Network Defense
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rush, George D.; Tauritz, Daniel Remy
Several architectures have been created for developing and testing systems used in network security, but most are meant to provide a platform for running cyber security experiments as opposed to automating experiment processes. In the first paper, we propose a framework termed Distributed Cyber Security Automation Framework for Experiments (DCAFE) that enables experiment automation and control in a distributed environment. Predictive analysis of adversaries is another thorny issue in cyber security. Game theory can be used to mathematically analyze adversary models, but its scalability limitations restrict its use. Computational game theory allows us to scale classical game theory to larger, more complex systems. In the second paper, we propose a framework termed Coevolutionary Agent-based Network Defense Lightweight Event System (CANDLES) that can coevolve attacker and defender agent strategies and capabilities and evaluate potential solutions with a custom network defense simulation. The third paper is a continuation of the CANDLES project in which we rewrote key parts of the framework. Attackers and defenders have been redesigned to evolve pure strategy, and a new network security simulation is devised which specifies network architecture and adds a temporal aspect. We also add a hill climber algorithm to evaluate the search space and justify the use of a coevolutionary algorithm.
NASA Astrophysics Data System (ADS)
Rossi, Francesco; Londrillo, Pasquale; Sgattoni, Andrea; Sinigardi, Stefano; Turchetti, Giorgio
2012-12-01
We present `jasmine', an implementation of a fully relativistic, 3D, electromagnetic Particle-In-Cell (PIC) code, capable of running simulations in various laser plasma acceleration regimes on Graphics Processing Unit (GPU) HPC clusters. Standard energy/charge preserving FDTD-based algorithms have been implemented using double precision and quadratic (or arbitrary sized) shape functions for the particle weighting. When porting a PIC scheme to the GPU architecture (or, in general, a shared memory environment), the particle-to-grid operations (e.g. the evaluation of the current density) require special care to avoid memory inconsistencies and conflicts. Here we present a robust implementation of this operation that is efficient for any number of particles per cell and particle shape function order. Our algorithm exploits the exposed GPU memory hierarchy and avoids the use of atomic operations, which can hurt performance especially when many particles lie in the same cell. We show the code's multi-GPU scalability results and present a dynamic load-balancing algorithm. The code is written using a Python-based C++ meta-programming technique which translates into a high level of modularity and allows for easy performance tuning and simple extension of the core algorithms to various simulation schemes.
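The particle-to-grid hazard mentioned above can be demonstrated with a small numpy example: a naive scatter silently drops contributions when several particles share a cell, whereas an accumulating scatter keeps them all; np.add.at stands in here for the conflict-free GPU deposition scheme and is not the jasmine implementation.

```python
# Illustration of the particle-to-grid conflict described above. When several
# particles deposit into the same cell, a naive scatter keeps only the last
# write; an accumulating scatter keeps every contribution. np.add.at is a
# serial stand-in for the conflict-free GPU scheme, not the jasmine code.
import numpy as np

rng = np.random.default_rng(3)
n_cells, n_particles = 16, 1000
cell_index = rng.integers(0, n_cells, size=n_particles)
charge = np.ones(n_particles)

# Wrong: with repeated indices only one contribution per cell survives.
rho_naive = np.zeros(n_cells)
rho_naive[cell_index] = charge            # silently drops collisions

# Correct: accumulate every contribution, even for repeated indices.
rho = np.zeros(n_cells)
np.add.at(rho, cell_index, charge)

print(rho_naive.sum(), rho.sum())         # rho.sum() equals n_particles
```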
DOE Office of Scientific and Technical Information (OSTI.GOV)
Song, Hyun-Seob; Thomas, Dennis G.; Stegen, James C.
In a recent study of denitrification dynamics in hyporheic zone sediments, we observed a significant time lag (up to several days) in enzymatic response to changes in substrate concentration. To explore an underlying mechanism and understand the interactive dynamics between enzymes and nutrients, we developed a trait-based model that associates a community's traits with functional enzymes, instead of the typically used species guilds (or functional guilds). This enzyme-based formulation makes it possible to collectively describe the biogeochemical functions of microbial communities without directly parameterizing the dynamics of species guilds, and is therefore scalable to complex communities. As a key component of the modeling, we accounted for microbial regulation occurring through transcriptional and translational processes, the dynamics of which was parameterized based on the temporal profiles of enzyme concentrations measured using a new signature peptide-based method. Simulations with the resulting model showed several days of time lag in enzymatic responses, as observed in experiments. Further, the model showed that the delayed enzymatic reactions could be primarily controlled by transcriptional responses and that the dynamics of transcripts and enzymes are closely correlated. The developed model can serve as a useful tool for predicting biogeochemical processes in natural environments, either independently or through integration with hydrologic flow simulators.
AsyncStageOut: Distributed user data management for CMS Analysis
NASA Astrophysics Data System (ADS)
Riahi, H.; Wildish, T.; Ciangottini, D.; Hernández, J. M.; Andreeva, J.; Balcas, J.; Karavakis, E.; Mascheroni, M.; Tanasijczuk, A. J.; Vaandering, E. W.
2015-12-01
AsyncStageOut (ASO) is a new component of the distributed data analysis system of CMS, CRAB, designed for managing users' data. It addresses a major weakness of the previous model, namely that mass storage of output data was part of the job execution resulting in inefficient use of job slots and an unacceptable failure rate at the end of the jobs. ASO foresees the management of up to 400k files per day of various sizes, spread worldwide across more than 60 sites. It must handle up to 1000 individual users per month, and work with minimal delay. This creates challenging requirements for system scalability, performance and monitoring. ASO uses FTS to schedule and execute the transfers between the storage elements of the source and destination sites. It has evolved from a limited prototype to a highly adaptable service, which manages and monitors the user file placement and bookkeeping. To ensure system scalability and data monitoring, it employs new technologies such as a NoSQL database and re-uses existing components of PhEDEx and the FTS Dashboard. We present the asynchronous stage-out strategy and the architecture of the solution we implemented to deal with those issues and challenges. The deployment model for the high availability and scalability of the service is discussed. The performance of the system during the commissioning and the first phase of production are also shown, along with results from simulations designed to explore the limits of scalability.
Architectural Principles and Experimentation of Distributed High Performance Virtual Clusters
ERIC Educational Resources Information Center
Younge, Andrew J.
2016-01-01
With the advent of virtualization and Infrastructure-as-a-Service (IaaS), the broader scientific computing community is considering the use of clouds for their scientific computing needs. This is due to the relative scalability, ease of use, advanced user environment customization abilities, and the many novel computing paradigms available for…
Scalability of Findability: Decentralized Search and Retrieval in Large Information Networks
ERIC Educational Resources Information Center
Ke, Weimao
2010-01-01
Amid the rapid growth of information today is the increasing challenge for people to survive and navigate its magnitude. Dynamics and heterogeneity of large information spaces such as the Web challenge information retrieval in these environments. Collection of information in advance and centralization of IR operations are hardly possible because…
Coordinating Decentralized Learning and Conflict Resolution across Agent Boundaries
ERIC Educational Resources Information Center
Cheng, Shanjun
2012-01-01
It is crucial for embedded systems to adapt to the dynamics of open environments. This adaptation process becomes especially challenging in the context of multiagent systems because of scalability, partial information accessibility and complex interaction of agents. It is a challenge for agents to learn good policies, when they need to plan and…
A Framework for Collaborative and Convenient Learning on Cloud Computing Platforms
ERIC Educational Resources Information Center
Sharma, Deepika; Kumar, Vikas
2017-01-01
The depth of learning resides in collaborative work with more engagement and fun. Technology can enhance collaboration with a higher level of convenience and cloud computing can facilitate this in a cost effective and scalable manner. However, to deploy a successful online learning environment, elementary components of learning pedagogy must be…
Evaluation of Scalable Applications of Information Technology to On-Campus Learning.
ERIC Educational Resources Information Center
Matthews, Harry R.; Sommer, Barbara; Maher, Michael; Acredolo, Curt; Ho, Arnold; Falk, Richard
The University of California, Davis, is transforming large general education courses in a cost-effective way that maintains high academic standards in the face of rising student enrollments. Two multi-year pilot projects have demonstrated the feasibility of using rich online learning environments to reduce the dependence on formal lectures. This…
ERIC Educational Resources Information Center
Anaya, Antonio R.; Boticario, Jesus G.
2009-01-01
Data mining methods are successful in educational environments to discover new knowledge or learner skills or features. Unfortunately, they have not been used in depth with collaboration. We have developed a scalable data mining method, whose objective is to infer information on the collaboration during the collaboration process in a…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Erez, Mattan; Yelick, Katherine; Sarkar, Vivek
The Dynamic, Exascale Global Address Space programming environment (DEGAS) project will develop the next generation of programming models and runtime systems to meet the challenges of Exascale computing. Our approach is to provide an efficient and scalable programming model that can be adapted to application needs through the use of dynamic runtime features and domain-specific languages for computational kernels. We address the following technical challenges: Programmability: Rich set of programming constructs based on a Hierarchical Partitioned Global Address Space (HPGAS) model, demonstrated in UPC++. Scalability: Hierarchical locality control, lightweight communication (extended GASNet), and efficient synchronization mechanisms (Phasers). Performance Portability: Just-in-time specialization (SEJITS) for generating hardware-specific code and scheduling libraries for domain-specific adaptive runtimes (Habanero). Energy Efficiency: Communication-optimal code generation to optimize energy efficiency by reducing data movement. Resilience: Containment Domains for flexible, domain-specific resilience, using state capture mechanisms and lightweight, asynchronous recovery mechanisms. Interoperability: Runtime and language interoperability with MPI and OpenMP to encourage broad adoption.
On delay adjustment for dynamic load balancing in distributed virtual environments.
Deng, Yunhua; Lau, Rynson W H
2012-04-01
Distributed virtual environments (DVEs) have become very popular in recent years, due to the rapid growth of applications such as massive multiplayer online games (MMOGs). As the number of concurrent users increases, scalability becomes one of the major challenges in designing an interactive DVE system. One solution to this scalability problem is to adopt a multi-server architecture. While some methods focus on the quality of partitioning the load among the servers, others focus on the efficiency of the partitioning process itself. However, all these methods neglect the effect of network delay among the servers on the accuracy of the load balancing solutions. As we show in this paper, the change in the load of the servers due to network delay affects the performance of the load balancing algorithm. In this work, we conduct a formal analysis of this problem and discuss two efficient delay adjustment schemes to address it. Our experimental results show that our proposed schemes can significantly improve the performance of the load balancing algorithm with negligible computational overhead.
NASA Astrophysics Data System (ADS)
Bednar, Earl; Drager, Steven L.
2007-04-01
The objective of quantum information processing is to harness the paradigm shift offered by quantum computing to solve classically hard, computationally challenging problems. Some of our computationally challenging problems of interest include rapid image processing, rapid optimization of logistics, protecting information, secure distributed simulation, and massively parallel computation. Currently, one important problem with quantum information processing is that quantum computers are difficult to realize due to poor scalability and a high incidence of errors. Therefore, we have supported the development of Quantum eXpress and QuIDD Pro, two quantum computer simulators running on classical computers for the development and testing of new quantum algorithms and processes. This paper examines the different methods used by these two quantum computing simulators. It reviews both simulators, highlighting each simulator's background, interface, and special features. It also demonstrates the implementation of current quantum algorithms on each simulator. It concludes with summary comments on both simulators.
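Neither simulator's internals are reproduced here, but the following toy state-vector sketch shows what a classical quantum-computer simulator does at its simplest: hold the 2^n amplitudes of an n-qubit register and apply gates as matrix products.

```python
# Toy state-vector simulation of an n-qubit register. Generic illustration of
# what a classical quantum-computer simulator does; unrelated to the internals
# of Quantum eXpress or QuIDD Pro.
import numpy as np

H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)   # Hadamard gate
I2 = np.eye(2)

def apply_gate(state, gate, target, n_qubits):
    """Apply a single-qubit gate to `target` by building the full 2^n operator."""
    op = np.array([[1.0]])
    for q in range(n_qubits):
        op = np.kron(op, gate if q == target else I2)
    return op @ state

n_qubits = 3
state = np.zeros(2 ** n_qubits, dtype=complex)
state[0] = 1.0                                            # |000>
for q in range(n_qubits):
    state = apply_gate(state, H, q, n_qubits)             # uniform superposition
probabilities = np.abs(state) ** 2                        # each outcome has prob 1/8
```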
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jain, Atul K.
The overall objective of this DOE-funded project is to address combined scientific and computational challenges in climate modeling by expanding our understanding of the biogeophysical-biogeochemical processes and their interactions in the northern high latitudes (NHLs) using an earth system modeling (ESM) approach, and by adopting an adaptive parallel runtime system in an ESM to achieve efficient and scalable climate simulations through improved load-balancing algorithms.
Optimization of a Circularly Polarized Patch Antenna for Two Frequency Bands
2015-09-01
This report examines the various techniques that can be used to improve the performance of a circularly polarized microstrip patch antenna; the adjustments considered are modifications to the microstrip antenna. The High Frequency Structural Simulator (HFSS) has allowed engineers to create scalable multiband microstrip antennas, and several factors were taken into account.
Scalable Deployment of Advanced Building Energy Management Systems
2013-06-01
A prototype toolkit to seamlessly and automatically transfer a Building Information Model (BIM) to a Building Energy Model (BEM) has been developed to circumvent the need to manually construct and maintain a detailed building energy simulation model.
Design and testing of a mesocosm-scale habitat for culturing the endangered Devils Hole Pupfish
Feuerbacher, Olin; Bonar, Scott A.; Barrett, Paul J.
2016-01-01
Captive propagation of desert spring fishes, whether for conservation or research, is often difficult, given the unique and often challenging environments these fish utilize in nature. High temperatures, low dissolved oxygen, minimal water flow, and highly variable lighting are some conditions a researcher might need to recreate to simulate their natural environments. Here we describe a mesocosm-scale habitat created to maintain hybrid Devils Hole × Ash Meadows Amargosa Pupfish (Cyprinodon diabolis × C. nevadensis mionectes) under conditions similar to those found in Devils Hole, Nevada. This 13,000-L system utilized flow control and natural processes to maintain these conditions rather than utilizing complex and expensive automation. We designed a rotating solar collector to control natural sunlight, a biological reactor to consume oxygen while buffering water quality, and a reverse-daylight photosynthesis sump system to stabilize nighttime pH and swings in dissolved oxygen levels. This system successfully controlled many desired parameters and helped inform development of a larger, more permanent desert fish conservation facility at the U.S. Fish and Wildlife Service’s Ash Meadows National Wildlife Refuge, Nevada. For others who need to raise fish from unique habitats, many components of the scalable and modular design of this system can be adapted at reasonable cost.
NASA Astrophysics Data System (ADS)
Nightingale, James; Wang, Qi; Grecos, Christos; Goma, Sergio
2014-02-01
High Efficiency Video Coding (HEVC), the latest video compression standard (also known as H.265), can deliver video streams of comparable quality to the current H.264 Advanced Video Coding (H.264/AVC) standard with a 50% reduction in bandwidth. Research into SHVC, the scalable extension to the HEVC standard, is still in its infancy. One important area for investigation is whether, given the greater compression ratio of HEVC (and SHVC), the loss of packets containing video content will have a greater impact on the quality of delivered video than is the case with H.264/AVC or its scalable extension H.264/SVC. In this work we empirically evaluate the layer-based, in-network adaptation of video streams encoded using SHVC in situations where dynamically changing bandwidths and datagram loss ratios require the real-time adaptation of video streams. Through extensive experimentation, we establish a comprehensive set of benchmarks for SHVC-based high-definition video streaming in loss-prone network environments such as those commonly found in mobile networks. Among other results, we highlight that packet losses of only 1% can lead to a substantial reduction in PSNR of over 3 dB and error propagation in over 130 pictures following the one in which the loss occurred. This work is one of the earliest studies in this cutting-edge area to report benchmark evaluation results for the effects of datagram loss on SHVC picture quality and to offer empirical and analytical insights into SHVC adaptation to lossy, mobile networking conditions.
Can High Bandwidth and Latency Justify Large Cache Blocks in Scalable Multiprocessors?
1994-01-01
400 MB/second. Dubnicki’s work used trace-driven simulation, with traces collected on an 8-processor machine. We would expect such small-scale ... (Figure 17: Miss rate of Ind Blocked LU. Figure 18: MCPR of Ind Blocked LU.) ... the overall miss rate of TGauss is a factor of ... This approach assumes that the model parameters we collect from simulations with infinite bandwidth (such as the miss rate and the
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dmitriy Morozov, Tom Peterka
2014-07-29
Computing a Voronoi or Delaunay tessellation from a set of points is a core part of the analysis of many simulated and measured datasets. As the scale of simulations and observations surpasses billions of particles, a distributed-memory scalable parallel algorithm is the only feasible approach. The primary contribution of this software is a distributed-memory parallel Delaunay and Voronoi tessellation algorithm based on existing serial computational geometry libraries that automatically determines which neighbor points need to be exchanged among the subdomains of a spatial decomposition. Other contributions include the addition of periodic and wall boundary conditions.
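A serial sketch of the local step can convey the idea: each subdomain tessellates its own points together with ghost points taken from near the subdomain boundary; the fixed-width ghost layer used below is a simplification of the adaptive neighbour-point exchange the software actually performs.

```python
# Sketch of the local step of a distributed tessellation: each subdomain
# triangulates its own points plus ghost points near the subdomain boundary.
# The fixed-width ghost layer is a simplification of the adaptive neighbour
# exchange performed by the actual software.
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(4)
points = rng.uniform(0.0, 1.0, size=(2000, 2))

# Two subdomains split at x = 0.5, with a ghost layer of width 0.05.
ghost_width = 0.05
local = points[points[:, 0] < 0.5]
ghosts = points[(points[:, 0] >= 0.5) & (points[:, 0] < 0.5 + ghost_width)]

tess = Delaunay(np.vstack([local, ghosts]))

# Keep only simplices that involve at least one truly local vertex.
n_local = len(local)
owned = [s for s in tess.simplices if np.any(s < n_local)]
```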
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seal, Sudip K; Perumalla, Kalyan S; Hirshman, Steven Paul
2013-01-01
Simulations that require solutions of block tridiagonal systems of equations rely on fast parallel solvers for runtime efficiency. Leading parallel solvers that are highly effective for general systems of equations, dense or sparse, are limited in scalability when applied to block tridiagonal systems. This paper presents scalability results as well as detailed analyses of two parallel solvers that exploit the special structure of block tridiagonal matrices to deliver superior performance, often by orders of magnitude. A rigorous analysis of their relative parallel runtimes is shown to reveal the existence of a critical block size that separates the parameter space spanned by the number of block rows, the block size and the processor count, into distinct regions that favor one or the other of the two solvers. Dependence of this critical block size on the above parameters as well as on machine-specific constants is established. These formal insights are supported by empirical results on up to 2,048 cores of a Cray XT4 system. To the best of our knowledge, this is the highest reported scalability for parallel block tridiagonal solvers to date.
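For reference, a serial block Thomas solve for a block tridiagonal system is sketched below; the scalable parallel solvers analysed in the paper are not reproduced here, and the block sizes in the example are arbitrary.

```python
# Serial block Thomas algorithm for a block tridiagonal system, for reference.
# A, B, C hold the sub-, main- and super-diagonal blocks; A[0] and C[-1] are unused.
import numpy as np

def block_thomas(A, B, C, d):
    """Solve A[i] x[i-1] + B[i] x[i] + C[i] x[i+1] = d[i] for i = 0..n-1."""
    n = len(B)
    Cp, dp = [None] * n, [None] * n
    Cp[0] = np.linalg.solve(B[0], C[0])
    dp[0] = np.linalg.solve(B[0], d[0])
    for i in range(1, n):                      # forward elimination
        denom = B[i] - A[i] @ Cp[i - 1]
        if i < n - 1:
            Cp[i] = np.linalg.solve(denom, C[i])
        dp[i] = np.linalg.solve(denom, d[i] - A[i] @ dp[i - 1])
    x = [None] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = dp[i] - Cp[i] @ x[i + 1]
    return np.array(x)

# Small diagonally dominant example with hypothetical sizes.
rng = np.random.default_rng(5)
n, m = 6, 3
A = [rng.normal(size=(m, m)) for _ in range(n)]
C = [rng.normal(size=(m, m)) for _ in range(n)]
B = [rng.normal(size=(m, m)) + 10.0 * np.eye(m) for _ in range(n)]
d = [rng.normal(size=m) for _ in range(n)]
x = block_thomas(A, B, C, d)
```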
Effect of asynchrony on numerical simulations of fluid flow phenomena
NASA Astrophysics Data System (ADS)
Konduri, Aditya; Mahoney, Bryan; Donzis, Diego
2015-11-01
Designing scalable CFD codes for massively parallel computers is a challenge. This is mainly due to the large number of communications between processing elements (PEs) and their synchronization, leading to idling of PEs. Indeed, communication will likely be the bottleneck in the scalability of codes on Exascale machines. Our recent work on asynchronous computing for PDEs based on finite differences has shown that it is possible to relax synchronization between PEs at a mathematical level. Computations then proceed regardless of the status of communication, reducing the idle time of PEs and improving the scalability. However, the accuracy of the schemes is greatly affected. We have proposed asynchrony-tolerant (AT) schemes to address this issue. In this work, we study the effect of asynchrony on the solution of fluid flow problems using standard and AT schemes. We show that asynchrony creates additional scales with low energy content. The specific wavenumbers affected can be traced to two distinct effects: the randomness in the arrival of messages and the corresponding switching between schemes. Understanding these errors allows us to control them effectively, demonstrating the method's feasibility for solving turbulent flows at realistic conditions on future computing systems.
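A serial caricature of the effect described above is sketched below: a 1-D heat-equation update in which the value seen at one interior "processor boundary" point may lag by a random number of steps, mimicking a late message; the authors' asynchrony-tolerant schemes are not reproduced here.

```python
# Toy serial model of asynchronous finite differences: the left-neighbour value
# seen at one interior "processor boundary" point may be several steps old,
# mimicking a late message. Caricature only; not the authors' AT schemes.
import numpy as np

rng = np.random.default_rng(6)
nx, nsteps, alpha = 64, 200, 0.4
x = np.linspace(0.0, 1.0, nx)
u = np.sin(2.0 * np.pi * x)
boundary = nx // 2                 # the point whose neighbour value may be stale
history = [u.copy()]

for step in range(nsteps):
    u_new = u.copy()
    for i in range(1, nx - 1):
        left = u[i - 1]
        if i == boundary:
            # Use a randomly delayed value for the left neighbour (0-3 steps old).
            delay = rng.integers(0, 4)
            left = history[max(0, len(history) - 1 - delay)][i - 1]
        u_new[i] = u[i] + alpha * (left - 2.0 * u[i] + u[i + 1])
    u = u_new
    history.append(u.copy())
```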
Palomar, Esther; Chen, Xiaohong; Liu, Zhiming; Maharjan, Sabita; Bowen, Jonathan
2016-10-28
Smart city systems embrace major challenges associated with climate change, energy efficiency, mobility and future services by embedding the virtual space into a complex cyber-physical system. Those systems are constantly evolving and scaling up, involving a wide range of integration among users, devices, utilities, public services and also policies. Modelling such complex dynamic systems' architectures has always been essential for the development and application of techniques/tools to support the design and deployment of integrations of new components, as well as for the analysis, verification, simulation and testing needed to ensure trustworthiness. This article reports on the definition and implementation of a scalable component-based architecture that supports a cooperative energy demand response (DR) system coordinating energy usage between neighbouring households. The proposed architecture, called refinement of Cyber-Physical Component Systems (rCPCS), which extends the refinement calculus for component and object system (rCOS) modelling method, is implemented using Eclipse Extensible Coordination Tools (ECT), i.e., the Reo coordination language. With the rCPCS implementation in Reo, we specify the communication, synchronisation and co-operation amongst the heterogeneous components of the system, assuring by design the scalability, interoperability and correctness of component cooperation.
Palomar, Esther; Chen, Xiaohong; Liu, Zhiming; Maharjan, Sabita; Bowen, Jonathan
2016-01-01
Smart city systems embrace major challenges associated with climate change, energy efficiency, mobility and future services by embedding the virtual space into a complex cyber-physical system. Those systems are constantly evolving and scaling up, involving a wide range of integration among users, devices, utilities, public services and also policies. Modelling such complex dynamic systems' architectures has always been essential for the development and application of techniques/tools to support design and deployment of integration of new components, as well as for the analysis, verification, simulation and testing to ensure trustworthiness. This article reports on the definition and implementation of a scalable component-based architecture that supports a cooperative energy demand response (DR) system coordinating energy usage between neighbouring households. The proposed architecture, called refinement of Cyber-Physical Component Systems (rCPCS), which extends the refinement calculus for component and object system (rCOS) modelling method, is implemented using Eclipse Extensible Coordination Tools (ECT), i.e., the Reo coordination language. With the rCPCS implementation in Reo, we specify the communication, synchronisation and co-operation amongst the heterogeneous components of the system, assuring, by design, scalability, interoperability and correctness of component cooperation. PMID:27801829
Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment.
Meng, Bowen; Pratx, Guillem; Xing, Lei
2011-12-01
Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. In this work, we accelerated the Feldkamp-Davis-Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from a 4DCT scan were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and a Reduce function to aggregate the partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Speedup of reconstruction time is found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10⁻⁷. Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment.
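The abstract describes the Map/Reduce decomposition but not the code itself. A minimal plain-Python sketch of the same split is shown below, using a simplified 2-D parallel-beam filtered backprojection instead of the cone-beam FDK weighting; the subset structure and helper names are illustrative assumptions, not the authors' Hadoop implementation.

```python
import numpy as np
from numpy.fft import fft, ifft, fftfreq
from scipy.ndimage import rotate
from functools import reduce

def ramp_filter(proj):
    """Ramp-filter one 1-D projection in the frequency domain."""
    return np.real(ifft(fft(proj) * np.abs(fftfreq(proj.size))))

def map_backproject(subset, size):
    """Map: filter and backproject a subset of (angle_deg, projection) pairs
    into a partial image (simplified 2-D parallel-beam geometry)."""
    partial = np.zeros((size, size))
    for angle, proj in subset:
        smear = np.tile(ramp_filter(proj), (size, 1))          # smear along rays
        partial += rotate(smear, angle, reshape=False, order=1)
    return partial

def reduce_sum(a, b):
    """Reduce: backprojection is linear, so partial images simply add."""
    return a + b

# image = reduce(reduce_sum, (map_backproject(s, 256) for s in subsets))
```

Because backprojection is linear in the projections, the Reduce stage is a plain summation, which is what makes this reconstruction map so naturally onto MapReduce.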
Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment
Meng, Bowen; Pratx, Guillem; Xing, Lei
2011-01-01
Purpose: Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. Methods: In this work, we accelerated the Feldkamp–Davis–Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from a 4DCT scan were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and a Reduce function to aggregate the partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Results: Speedup of reconstruction time is found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10⁻⁷. Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. Conclusions: An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment. PMID:22149842
NASA Astrophysics Data System (ADS)
Wong, N.; Grace, J. M.; Liang, J.; Owyang, S.; Storrs, A.; Zhou, J.; Rothschild, L. J.; Gentry, D.
2014-12-01
Life acclimated to harsh conditions is frequently difficult to study using normal lab techniques and conventional equipment. Simplified studies using in-lab 'simulated' extreme environments, such as UV bulbs or cold blocks, are manually intensive, error-prone, and lose many complexities of the microbe/environment interaction. We have built a prototype instrument to address this dilemma by allowing automated iterations of microbial cultures to be subject to combinations of modular environmental pressures such as heat, radiation, and chemical exposure. The presence of multiple sensors allows the state of the culture and internal environment to be continuously monitored and adjusted in response. Our first prototype showed successful iterations of microbial growth and thermal exposure. Our second prototype, presented here, performs a demonstration of repeated exposure of Escherichia coli to ultraviolet radiation, a well-established procedure. As the E. coli becomes more resistant to ultraviolet radiation, the device detects its increased survival and growth and increases the dosage accordingly. Calibration data for the instrument were generated by performing the same proof-of-concept exposure experiment, at a smaller scale, by hand. Current performance data indicate that our finalized instrument will have the ability to run hundreds of iterations with multiple selection pressures. The automated sensing and adaptive exposure that the device provides will inform the challenges of managing and culturing life tailored to uncommon environmental stresses. We have designed this device to be flexible, extensible, low-cost and easy to reproduce. We hope that it will enter wide use as a tool for conducting scalable studies of the interaction between extremophiles and multiple environmental stresses, and potentially for generating artificial extremophiles as analogues for life we might find in extreme environments here on Earth or elsewhere.
NASA Astrophysics Data System (ADS)
Cacace, Mauro; Jacquey, Antoine B.
2017-09-01
Theory and numerical implementation describing groundwater flow and the transport of heat and solute mass in fully saturated fractured rocks with elasto-plastic mechanical feedbacks are developed. In our formulation, fractures are considered as being of lower dimension than the hosting deformable porous rock and we consider their hydraulic and mechanical apertures as scaling parameters to ensure continuous exchange of fluid mass and energy within the fracture-solid matrix system. The coupled system of equations is implemented in a new simulator code that makes use of a Galerkin finite-element technique. The code builds on a flexible, object-oriented numerical framework (MOOSE, Multiphysics Object Oriented Simulation Environment) which provides an extensive scalable parallel and implicit coupling to solve for the multiphysics problem. The governing equations of groundwater flow, heat and mass transport, and rock deformation are solved in a weak sense (either by classical Newton-Raphson or by Jacobian-free inexact Newton-Krylov schemes) on an underlying unstructured mesh. Nonlinear feedbacks among the active processes are enforced by considering evolving fluid and rock properties depending on the thermo-hydro-mechanical state of the system and the local structure, i.e. degree of connectivity, of the fracture system. A suite of applications is presented to illustrate the flexibility and capability of the new simulator to address problems of increasing complexity and occurring at different spatial (from centimetres to tens of kilometres) and temporal scales (from minutes to hundreds of years).
Duro, Francisco Rodrigo; Blas, Javier Garcia; Isaila, Florin; ...
2016-10-06
The increasing volume of scientific data and the limited scalability and performance of storage systems are currently presenting a significant limitation for the productivity of the scientific workflows running on both high-performance computing (HPC) and cloud platforms. Clearly needed is better integration of storage systems and workflow engines to address this problem. This paper presents and evaluates a novel solution that leverages codesign principles for integrating Hercules—an in-memory data store—with a workflow management system. We consider four main aspects: workflow representation, task scheduling, task placement, and task termination. As a result, the experimental evaluation on both cloud and HPC systems demonstrates significant performance and scalability improvements over existing state-of-the-art approaches.
Leverage hadoop framework for large scale clinical informatics applications.
Dong, Xiao; Bahroos, Neil; Sadhu, Eugene; Jackson, Tommie; Chukhman, Morris; Johnson, Robert; Boyd, Andrew; Hynes, Denise
2013-01-01
In this manuscript, we present our experiences using the Apache Hadoop framework for high data volume and computationally intensive applications, and discuss some best practice guidelines in a clinical informatics setting. There are three main aspects in our approach: (a) process and integrate diverse, heterogeneous data sources using standard Hadoop programming tools and customized MapReduce programs; (b) after fine-grained aggregate results are obtained, perform data analysis using the Mahout data mining library; (c) leverage the column oriented features in HBase for patient centric modeling and complex temporal reasoning. This framework provides a scalable solution to meet the rapidly increasing, imperative "Big Data" needs of clinical and translational research. The intrinsic advantage of fault tolerance, high availability and scalability of Hadoop platform makes these applications readily deployable at the enterprise level cluster environment.
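The customized MapReduce programs mentioned in point (a) are not shown in the abstract. A minimal Hadoop Streaming style pair of Python scripts sketches the fine-grained aggregation step, here counting records per patient; the tab-separated layout and the patient-id column position are assumptions for illustration.

```python
#!/usr/bin/env python3
# mapper.py -- emit (patient_id, 1) for every input record.
import sys

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")
    if fields and fields[0]:
        print(f"{fields[0]}\t1")
```

```python
#!/usr/bin/env python3
# reducer.py -- Hadoop Streaming delivers keys sorted, so counts per patient
# accumulate in a single pass.
import sys

current, count = None, 0
for line in sys.stdin:
    key, value = line.rstrip("\n").split("\t", 1)
    if key != current:
        if current is not None:
            print(f"{current}\t{count}")
        current, count = key, 0
    count += int(value)
if current is not None:
    print(f"{current}\t{count}")
```

Aggregates produced this way could then feed downstream analysis in Mahout or be loaded into patient-centric HBase tables, along the lines the manuscript describes.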
NASA Technical Reports Server (NTRS)
Kikuchi, Hideaki; Kalia, Rajiv; Nakano, Aiichiro; Vashishta, Priya; Iyetomi, Hiroshi; Ogata, Shuji; Kouno, Takahisa; Shimojo, Fuyuki; Tsuruta, Kanji; Saini, Subhash;
2002-01-01
A multidisciplinary, collaborative simulation has been performed on a Grid of geographically distributed PC clusters. The multiscale simulation approach seamlessly combines i) atomistic simulation based on the molecular dynamics (MD) method and ii) quantum mechanical (QM) calculation based on the density functional theory (DFT), so that accurate but less scalable computations are performed only where they are needed. The multiscale MD/QM simulation code has been Grid-enabled using i) a modular, additive hybridization scheme, ii) multiple QM clustering, and iii) computation/communication overlapping. The Gridified MD/QM simulation code has been used to study environmental effects of water molecules on fracture in silicon. A preliminary run of the code has achieved a parallel efficiency of 94% on 25 PCs distributed over 3 PC clusters in the US and Japan, and a larger test involving 154 processors on 5 distributed PC clusters is in progress.
On the energy footprint of I/O management in Exascale HPC systems
Dorier, Matthieu; Yildiz, Orcun; Ibrahim, Shadi; ...
2016-03-21
The advent of unprecedentedly scalable yet energy hungry Exascale supercomputers poses a major challenge in sustaining a high performance-per-watt ratio. With I/O management acquiring a crucial role in supporting scientific simulations, various I/O management approaches have been proposed to achieve high performance and scalability. However, the details of how these approaches affect energy consumption have not yet been studied. Therefore, this paper aims to explore how much energy a supercomputer consumes while running scientific simulations when adopting various I/O management approaches. In particular, we closely examine three radically different I/O schemes including time partitioning, dedicated cores, and dedicated nodes. To accomplish this, we implement the three approaches within the Damaris I/O middleware and perform extensive experiments with one of the target HPC applications of the Blue Waters sustained-petaflop supercomputer project: the CM1 atmospheric model. Our experimental results obtained on the French Grid'5000 platform highlight the differences among these three approaches and illustrate in which way various configurations of the application and of the system can impact performance and energy consumption. Moreover, we propose and validate a mathematical model that estimates the energy consumption of an HPC simulation under different I/O approaches. This proposed model gives hints to pre-select the most energy-efficient I/O approach for a particular simulation on a particular HPC system and therefore provides a step towards energy-efficient HPC simulations in Exascale systems. To the best of our knowledge, our work provides the first in-depth look into the energy-performance tradeoffs of I/O management approaches.
Flexible high-resolution display systems for the next generation of radiology reading rooms
NASA Astrophysics Data System (ADS)
Caban, Jesus J.; Wood, Bradford J.; Park, Adrian
2007-03-01
A flexible, scalable, high-resolution display system is presented to support the next generation of radiology reading rooms or interventional radiology suites. The project aims to create an environment for radiologists that will simultaneously facilitate image interpretation, analysis, and understanding while lowering visual and cognitive stress. Displays currently in use present radiologists with technical challenges to exploring complex datasets that we seek to address. These include resolution and brightness, display and ambient lighting differences, and degrees of complexity in addition to side-by-side comparison of time-variant and 2D/3D images. We address these issues through a scalable projector-based system that uses our custom-designed geometrical and photometrical calibration process to create a seamless, bright, high-resolution display environment that can reduce the visual fatigue commonly experienced by radiologists. The system we have designed uses an array of casually aligned projectors to cooperatively increase overall resolution and brightness. Images from a set of projectors in their narrowest zoom are combined at a shared projection surface, thus increasing the global "pixels per inch" (PPI) of the display environment. Two primary challenges - geometric calibration and photometric calibration - remained to be resolved before our high-resolution display system could be used in a radiology reading room or procedure suite. In this paper we present a method that accomplishes those calibrations and creates a flexible high-resolution display environment that appears seamless, sharp, and uniform across different devices.
A Grid Metadata Service for Earth and Environmental Sciences
NASA Astrophysics Data System (ADS)
Fiore, Sandro; Negro, Alessandro; Aloisio, Giovanni
2010-05-01
Critical challenges for climate modeling researchers are strongly connected with the increasingly complex simulation models and the huge quantities of produced datasets. Future trends in climate modeling will only increase computational and storage requirements. For this reason the ability to transparently access both computational and data resources for large-scale complex climate simulations must be considered a key requirement for Earth Science and Environmental distributed systems. From the data management perspective (i) the quantity of data will continuously increase, (ii) data will become more and more distributed and widespread, (iii) data sharing/federation will represent a key challenging issue among different sites distributed worldwide, (iv) the potential community of users (large and heterogeneous) will be interested in discovering experimental results, searching metadata, browsing collections of files, comparing different results, displaying output, etc. A key element to carry out data search and discovery, and to manage and access huge and distributed amounts of data, is the metadata handling framework. What we propose for the management of distributed datasets is the GRelC service (a data grid solution focusing on metadata management). Unlike classical approaches, the proposed data-grid solution is able to address scalability, transparency, security, efficiency and interoperability. The GRelC service we propose is able to provide access to metadata stored in different and widespread data sources (relational databases running on top of MySQL, Oracle, DB2, etc. leveraging SQL as query language, as well as XML databases - XIndice, eXist, and libxml2 based documents, adopting either XPath or XQuery) providing a strong data virtualization layer in a grid environment. Such a technological solution for distributed metadata management (i) leverages well-known adopted standards (W3C, OASIS, etc.); (ii) supports role-based management (based on VOMS), which increases flexibility and scalability; (iii) provides full support for the Grid Security Infrastructure (authorization, mutual authentication, data integrity, data confidentiality and delegation); (iv) is compatible with existing grid middleware such as gLite and Globus; and finally (v) is currently adopted at the Euro-Mediterranean Centre for Climate Change (CMCC - Italy) to manage the entire CMCC data production activity, as well as in the international Climate-G testbed.
High Performance Parallel Computational Nanotechnology
NASA Technical Reports Server (NTRS)
Saini, Subhash; Craw, James M. (Technical Monitor)
1995-01-01
At a recent press conference, NASA Administrator Dan Goldin encouraged NASA Ames Research Center to take a lead role in promoting research and development of advanced, high-performance computer technology, including nanotechnology. Manufacturers of leading-edge microprocessors currently perform large-scale simulations in the design and verification of semiconductor devices and microprocessors. Recently, the need for this intensive simulation and modeling analysis has greatly increased, due in part to the ever-increasing complexity of these devices, as well as the lessons of experiences such as the Pentium fiasco. Simulation, modeling, testing, and validation will be even more important for designing molecular computers because of the complex specification of millions of atoms, thousands of assembly steps, as well as the simulation and modeling needed to ensure reliable, robust and efficient fabrication of the molecular devices. The software for this capacity does not exist today, but it can be extrapolated from the software currently used in molecular modeling for other applications: semi-empirical methods, ab initio methods, self-consistent field methods, Hartree-Fock methods, molecular mechanics, and simulation methods for diamondoid structures. Inasmuch as it seems clear that the application of such methods in nanotechnology will require powerful, highly parallel systems, this talk will discuss techniques and issues for performing these types of computations on parallel systems. We will describe system design issues (memory, I/O, mass storage, operating system requirements, special user interface issues, interconnects, bandwidths, and programming languages) involved in parallel methods for scalable classical, semiclassical, quantum, molecular mechanics, and continuum models; molecular nanotechnology computer-aided design (NanoCAD) techniques; visualization using virtual reality techniques of structural models and assembly sequences; software required to control mini robotic manipulators for positional control; and scalable numerical algorithms for reliability, verification and testability. There appears to be no fundamental obstacle to simulating molecular compilers and molecular computers on high performance parallel computers, just as the Boeing 777 was simulated on a computer before manufacturing it.
Scalability of Parallel Spatial Direct Numerical Simulations on Intel Hypercube and IBM SP1 and SP2
NASA Technical Reports Server (NTRS)
Joslin, Ronald D.; Hanebutte, Ulf R.; Zubair, Mohammad
1995-01-01
The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube and IBM SP1 and SP2 parallel computers is documented. Spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows are computed with the PSDNS code. The feasibility of using the PSDNS to perform transition studies on these computers is examined. The results indicate that the PSDNS approach can effectively be parallelized on a distributed-memory parallel machine by remapping the distributed data structure during the course of the calculation. Scalability information is provided to estimate computational costs to match the actual costs relative to changes in the number of grid points. By increasing the number of processors, slower-than-linear speedups are achieved with optimized (machine-dependent library) routines. This slower-than-linear speedup results because the computational cost is dominated by the FFT routine, which yields less than ideal speedups. By using appropriate compile options and optimized library routines on the SP1, the serial code achieves 52-56 Mflops on a single node of the SP1 (45 percent of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a "real world" simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP supercomputer. For the same simulation, 32 nodes of the SP1 and SP2 are required to reach the performance of a Cray C-90. A 32-node SP1 (SP2) configuration is 2.9 (4.6) times faster than a Cray Y/MP for this simulation, while the hypercube is roughly 2 times slower than the Y/MP for this application. KEY WORDS: Spatial direct numerical simulations; incompressible viscous flows; spectral methods; finite differences; parallel computing.
Status of Solar Sail Technology Within NASA
NASA Technical Reports Server (NTRS)
Johnson, Les; Young, Roy; Montgomery, Edward; Alhorn, Dean
2010-01-01
In the early 2000s, NASA made substantial progress in the development of solar sail propulsion systems for use in robotic science and exploration of the solar system. Two different 20-m solar sail systems were produced and they successfully completed functional vacuum testing in NASA Glenn Research Center's (GRC's) Space Power Facility at Plum Brook Station, Ohio. The sails were designed and developed by ATK Space Systems and L'Garde, respectively. The sail systems consist of a central structure with four deployable booms that support the sails. These sail designs are robust enough for deployment in a one-atmosphere, one-gravity environment and were scalable to much larger solar sails, perhaps as large as 150 m on a side. Computational modeling and analytical simulations were also performed to assess the scalability of the technology to the large sizes required to implement the first generation of missions using solar sails. Life and space environmental effects testing of sail and component materials were also conducted. NASA terminated funding for solar sails and other advanced space propulsion technologies shortly after these ground demonstrations were completed. In order to capitalize on the $30M investment made in solar sail technology to that point, NASA Marshall Space Flight Center (MSFC) funded the NanoSail-D, a subscale solar sail system designed for possible small spacecraft applications. The NanoSail-D mission flew on board the ill-fated Falcon-1 rocket launched August 2, 2008, and due to the failure of that rocket, never achieved orbit. The NanoSail-D flight spare will be flown in the Fall of 2010. This paper will summarize NASA's investment in solar sail technology to date and discuss future opportunities.
Status of solar sail technology within NASA
NASA Astrophysics Data System (ADS)
Johnson, Les; Young, Roy; Montgomery, Edward; Alhorn, Dean
2011-12-01
In the early 2000s, NASA made substantial progress in the development of solar sail propulsion systems for use in robotic science and exploration of the solar system. Two different 20-m solar sail systems were produced. NASA has successfully completed functional vacuum testing in their Glenn Research Center's Space Power Facility at Plum Brook Station, Ohio. The sails were designed and developed by Alliant Techsystems Space Systems and L'Garde, respectively. The sail systems consist of a central structure with four deployable booms that support each sail. These sail designs are robust enough for deployment in a one-atmosphere, one-gravity environment and are scalable to much larger solar sails - perhaps as large as 150 m on a side. Computational modeling and analytical simulations were performed in order to assess the scalability of the technology to the larger sizes that are required to implement the first generation of missions using solar sails. Furthermore, life and space environmental effects testing of sail and component materials was also conducted. NASA terminated funding for solar sails and other advanced space propulsion technologies shortly after these ground demonstrations were completed. In order to capitalize on the $30 M investment made in solar sail technology to that point, NASA Marshall Space Flight Center funded the NanoSail-D, a subscale solar sail system designed for possible small spacecraft applications. The NanoSail-D mission flew on board a Falcon-1 rocket, launched August 2, 2008. As a result of the failure of that rocket, the NanoSail-D was never successfully given the opportunity to achieve orbit. The NanoSail-D flight spare was flown in the Fall of 2010. This review paper summarizes NASA's investment in solar sail technology to date and discusses future opportunities.
NASA Technical Reports Server (NTRS)
Long, M. S.; Yantosca, R.; Nielsen, J. E; Keller, C. A.; Da Silva, A.; Sulprizio, M. P.; Pawson, S.; Jacob, D. J.
2015-01-01
The GEOS-Chem global chemical transport model (CTM), used by a large atmospheric chemistry research community, has been re-engineered to also serve as an atmospheric chemistry module for Earth system models (ESMs). This was done using an Earth System Modeling Framework (ESMF) interface that operates independently of the GEOS-Chem scientific code, permitting the exact same GEOS-Chem code to be used as an ESM module or as a standalone CTM. In this manner, the continual stream of updates contributed by the CTM user community is automatically passed on to the ESM module, which remains state of science and referenced to the latest version of the standard GEOS-Chem CTM. A major step in this re-engineering was to make GEOS-Chem grid independent, i.e., capable of using any geophysical grid specified at run time. GEOS-Chem data sockets were also created for communication between modules and with external ESM code. The grid-independent, ESMF-compatible GEOS-Chem is now the standard version of the GEOS-Chem CTM. It has been implemented as an atmospheric chemistry module into the NASA GEOS-5 ESM. The coupled GEOS-5-GEOS-Chem system was tested for scalability and performance with a tropospheric oxidant-aerosol simulation (120 coupled species, 66 transported tracers) using 48-240 cores and message-passing interface (MPI) distributed-memory parallelization. Numerical experiments demonstrate that the GEOS-Chem chemistry module scales efficiently for the number of cores tested, with no degradation as the number of cores increases. Although inclusion of atmospheric chemistry in ESMs is computationally expensive, the excellent scalability of the chemistry module means that the relative cost goes down with increasing number of cores in a massively parallel environment.
Tan, Apphia Jia Qi; Lee, Cindy Ching Siang; Lin, Patrick Yongxing; Cooper, Simon; Lau, Lydia Siew Tiang; Chua, Wei Ling; Liaw, Sok Ying
2017-08-01
Preparing nursing students for the knowledge and skills required for the administration and monitoring of blood components is crucial for entry into clinical practice. Serious games create opportunities to develop this competency, which can be used as a self-directed learning strategy to complement existing didactic learning and simulation-based strategies. To describe the development and evaluation of a serious game to improve nursing students' knowledge, confidence, and performance in blood transfusion. An experiential gaming model was applied to guide the design of the serious game environment. A clustered, randomized controlled trial was conducted with 103 second-year undergraduate nursing students who were randomized into control or experimental groups. After a baseline evaluation of the participants' knowledge and confidence on blood transfusion procedure, the experimental group undertook a blood transfusion serious game and completed a questionnaire to evaluate their learning experience. All participants' clinical performances were evaluated in a simulated environment. The post-test knowledge and confidence mean scores of the experimental group improved significantly (p<0.001) after the serious game intervention compared to pre-test mean scores and to post-test mean scores of the control group (p<0.001). However, no significant difference (p=0.11) was found between the experimental and control groups on the post-test performance mean scores. The participants evaluated the serious game positively. The study provided evidence on the effectiveness of a serious game in improving the knowledge and confidence of nursing students on blood transfusion practice. The features of this serious game could be further developed to incorporate additional scenarios with repetitive exercises and feedback to enhance the impact on clinical performance. Given the flexibility, practicality, and scalability of such a game, it can serve as a promising approach to optimize learning when blended with high-fidelity simulation.
ERIC Educational Resources Information Center
Islam, Muhammad Faysal
2013-01-01
Cloud computing offers the advantage of on-demand, reliable and cost efficient computing solutions without the capital investment and management resources to build and maintain in-house data centers and network infrastructures. Scalability of cloud solutions enable consumers to upgrade or downsize their services as needed. In a cloud environment,…
CLON: Overlay Networks and Gossip Protocols for Cloud Environments
NASA Astrophysics Data System (ADS)
Matos, Miguel; Sousa, António; Pereira, José; Oliveira, Rui; Deliot, Eric; Murray, Paul
Although epidemic or gossip-based multicast is a robust and scalable approach to reliable data dissemination, its inherent redundancy results in high resource consumption on both links and nodes. This problem is aggravated in settings that have costlier or resource-constrained links, as happens in Cloud Computing infrastructures composed of several interconnected data centers across the globe.
Sze, Yan Yan; Stein, Jeffrey S; Bickel, Warren K; Paluch, Rocco A; Epstein, Leonard H
2017-07-01
Obesity is associated with steep discounting of the future and increased food reinforcement. Episodic future thinking (EFT), a type of prospective thinking, has been observed to reduce delay discounting (DD) and improve dietary decision making. In contrast, negative income shock (i.e., abrupt transitions to poverty) has been shown to increase discounting and may worsen dietary decision-making. Scalability of EFT training and protective effects of EFT against simulated negative income shock on DD and demand for food were assessed. In two experiments we showed online-administered EFT reliably reduced DD. Furthermore, EFT reduced DD and demand for fast foods even when challenged by negative income shock. Our findings suggest EFT is a scalable intervention that has implications for improving public health by reducing discounting of the future and demand for high energy dense food.
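The abstract does not state the discounting model used, but delay discounting studies commonly fit the hyperbolic form V = A/(1 + kD). The sketch below fits the discount rate k to made-up indifference points; the data values are purely illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

def hyperbolic(delay_days, k):
    """Hyperbolic discounting: subjective value of a unit reward after a delay."""
    return 1.0 / (1.0 + k * delay_days)

# Hypothetical indifference points (fraction of the delayed amount judged
# equivalent to an immediate reward).
delays = np.array([1.0, 7.0, 30.0, 90.0, 365.0])
indifference = np.array([0.95, 0.85, 0.60, 0.40, 0.20])

k_fit, _ = curve_fit(hyperbolic, delays, indifference, p0=[0.01])
print(f"fitted discount rate k = {k_fit[0]:.4f} per day")
# Larger k means steeper discounting; interventions such as EFT aim to lower k.
```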
Simplified Parallel Domain Traversal
DOE Office of Scientific and Technical Information (OSTI.GOV)
Erickson III, David J
2011-01-01
Many data-intensive scientific analysis techniques require global domain traversal, which over the years has been a bottleneck for efficient parallelization across distributed-memory architectures. Inspired by MapReduce and other simplified parallel programming approaches, we have designed DStep, a flexible system that greatly simplifies efficient parallelization of domain traversal techniques at scale. In order to deliver both simplicity to users as well as scalability on HPC platforms, we introduce a novel two-tiered communication architecture for managing and exploiting asynchronous communication loads. We also integrate our design with advanced parallel I/O techniques that operate directly on native simulation output. We demonstrate DStep by performing teleconnection analysis across ensemble runs of terascale atmospheric CO₂ and climate data, and we show scalability results on up to 65,536 IBM BlueGene/P cores.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gerrits, Thomas; Lita, Adriana E.; Calkins, Brice
Integration is currently the only feasible route toward scalable photonic quantum processing devices that are sufficiently complex to be genuinely useful in computing, metrology, and simulation. Embedded on-chip detection will be critical to such devices. We demonstrate an integrated photon-number-resolving detector, operating in the telecom band at 1550 nm, employing an evanescently coupled design that allows it to be placed at arbitrary locations within a planar circuit. Up to five photons are resolved in the guided optical mode via absorption from the evanescent field into a tungsten transition-edge sensor. The detection efficiency is 7.2 ± 0.5%. The polarization sensitivity of the detector is also demonstrated. Detailed modeling of device designs shows a clear and feasible route to reaching high detection efficiencies.
ERIC Educational Resources Information Center
Repenning, Alexander; Webb, David C.; Koh, Kyu Han; Nickerson, Hilarie; Miller, Susan B.; Brand, Catharine; Her Many Horses, Ian; Basawapatna, Ashok; Gluck, Fred; Grover, Ryan; Gutierrez, Kris; Repenning, Nadia
2015-01-01
An educated citizenry that participates in and contributes to science technology engineering and mathematics innovation in the 21st century will require broad literacy and skills in computer science (CS). School systems will need to give increased attention to opportunities for students to engage in computational thinking and ways to promote a…
A Framework for Developing Scalable Geodesign Products
2017-09-01
Jean S. Noellsch Writer/Editor (CTR) Information Science and Knowledge Management Branch Engineer Research and Development Center ERDC/CERL MP-17...a design proposal with impact simulations informed by geographic context" (Flaxman 2009, 29). Stephen Ervin, a landscape architect, emphasizes the...communications technologies to foster collaborative, information-based design projects, and that depends upon timely feedback about impacts and implications of
NASA Astrophysics Data System (ADS)
Wang, Shengtao
The ability to precisely and coherently control atomic systems has improved dramatically in the last two decades, driving remarkable advancements in quantum computation and simulation. In recent years, atomic and atom-like systems have also served as a platform to study topological phases of matter and non-equilibrium many-body physics. Integrated with rapid theoretical progress, the employment of these systems is expanding the realm of our understanding of a range of physical phenomena. In this dissertation, I draw on state-of-the-art experimental technology to develop several new ideas for controlling and applying atomic systems. In the first part of this dissertation, we propose several novel schemes to realize, detect, and probe topological phases in atomic and atom-like systems. We first theoretically study the intriguing properties of Hopf insulators, a peculiar type of topological insulators beyond the standard classification paradigm of topological phases. Using a solid-state quantum simulator, we report the first experimental observation of Hopf insulators. We demonstrate the Hopf fibration with fascinating topological links in the experiment, showing clear signals of topological phase transitions for the underlying Hamiltonian. Next, we propose a feasible experimental scheme to realize the chiral topological insulator in three dimensions. They are a type of topological insulators protected by the chiral symmetry and have thus far remained unobserved in experiment. We then introduce a method to directly measure topological invariants in cold-atom experiments. This detection scheme is general and applicable to probing different topological insulators in any spatial dimension. In another study, we theoretically discover a new type of topological gapless rings, dubbed a Weyl exceptional ring, in three-dimensional dissipative cold atomic systems. In the second part of this dissertation, we focus on the application of atomic systems in quantum computation and simulation. Trapped atomic ions are one of the leading platforms to build a scalable, universal quantum computer. The common one-dimensional setup, however, greatly limits the system's scalability. By solving the critical problem of micromotion, we propose a two-dimensional architecture for scalable trapped-ion quantum computation. Hamiltonian tomography for many-body quantum systems is essential for benchmarking quantum computation and simulation. By employing dynamical decoupling, we propose a scalable scheme for full Hamiltonian tomography. The required number of measurements increases only polynomially with the system size, in contrast to an exponential scaling in common methods. Finally, we work toward the goal of demonstrating quantum supremacy. A number of sampling tasks, such as the boson sampling problem, have been proposed to be classically intractable under mild assumptions. An intermediate quantum computer can efficiently solve the sampling problem, but the correct operation of the device is not known to be classically verifiable. Toward practical verification, we present an experimentally friendly scheme to extract useful and robust information from quantum boson samplers based on coarse-grained measurements. In a separate study, we introduce a new model built from translation-invariant Ising-interacting spins. This model possesses several advantageous properties, catalyzing the ultimate experimental demonstration of quantum supremacy.
NASA Astrophysics Data System (ADS)
Watari, S.; Morikawa, Y.; Yamamoto, K.; Inoue, S.; Tsubouchi, K.; Fukazawa, K.; Kimura, E.; Tatebe, O.; Kato, H.; Shimojo, S.; Murata, K. T.
2010-12-01
In the Solar-Terrestrial Physics (STP) field, spatio-temporal resolution of computer simulations is getting higher and higher because of the tremendous advancement of supercomputers. A more advanced technology is Grid Computing, which integrates distributed computational resources to provide scalable computing resources. In simulation research, it is effective for researchers to design their own physical models, perform calculations on a supercomputer, and analyze and visualize the results using familiar methods. A supercomputer, however, is typically located far from the analysis and visualization environment. In general, a researcher analyzes and visualizes data on a locally managed workstation (WS) because installing and operating software on the WS is easy. Therefore, it is necessary to copy the data from the supercomputer to the WS manually. In practice, the time required for data transfer over long-delay networks hinders high-accuracy simulations. For usability, it is therefore important to integrate a supercomputer and an analysis and visualization environment seamlessly with a researcher's familiar methods. NICT has been developing a cloud computing environment (NICT Space Weather Cloud). In the NICT Space Weather Cloud, disk servers are located near its supercomputer and WSs for data analysis and visualization. They are connected to JGN2plus, a high-speed network for research and development. Distributed virtual high-capacity storage is also constructed by Grid Datafarm (Gfarm v2). Huge data output from the supercomputer is transferred to the virtual storage through JGN2plus. A researcher can thus concentrate on research using familiar methods, regardless of the distance between the supercomputer and the analysis and visualization environment. Currently, a total of 16 disk servers are set up at NICT headquarters (Koganei, Tokyo), the JGN2plus NOC (Otemachi, Tokyo), the Okinawa Subtropical Environment Remote-Sensing Center, and the Cybermedia Center, Osaka University. They are connected over JGN2plus and constitute 1 PB (physical size) of virtual storage via Gfarm v2. These disk servers are connected to the supercomputers of NICT and Osaka University. A system has been built in which data output from the supercomputers is automatically transferred to the virtual storage. The measured transfer rate is about 50 GB/h. This performance is estimated to be reasonable for a typical simulation and analysis for reconstruction of the coronal magnetic field. This research also serves as an experiment with the system, and verification of its practicality is advanced at the same time. Herein we give an overview of the space weather cloud system we have developed so far. We also demonstrate several scientific results using the space weather cloud system. We also introduce several web applications provided as a service of the space weather cloud, named "e-SpaceWeather" (e-SW). The e-SW provides a variety of online space weather services from many aspects.
Perspectives on the Future of CFD
NASA Technical Reports Server (NTRS)
Kwak, Dochan
2000-01-01
This viewgraph presentation gives an overview of the future of computational fluid dynamics (CFD), which in the past has pioneered the field of flow simulation. Over time, CFD has progressed along with computing power. Numerical methods have advanced as CPU and memory capacity have increased. Complex configurations are routinely computed now, and direct numerical simulations (DNS) and large eddy simulations (LES) are used to study turbulence. As the computing resources changed to parallel and distributed platforms, computer science aspects such as scalability (algorithmic and implementation), portability, and transparent coding have advanced. Examples of potential future (or current) challenges include risk assessment, limitations of the heuristic model, and the development of CFD and information technology (IT) tools.
A highly efficient 3D level-set grain growth algorithm tailored for ccNUMA architecture
NASA Astrophysics Data System (ADS)
Mießen, C.; Velinov, N.; Gottstein, G.; Barrales-Mora, L. A.
2017-12-01
A highly efficient simulation model for 2D and 3D grain growth was developed based on the level-set method. The model introduces modern computational concepts to achieve excellent performance on parallel computer architectures. Strong scalability was measured on cache-coherent non-uniform memory access (ccNUMA) architectures. To achieve this, the proposed approach considers the application of local level-set functions at the grain level. Ideal and non-ideal grain growth was simulated in 3D with the objective to study the evolution of statistical representative volume elements in polycrystals. In addition, microstructure evolution in an anisotropic magnetic material affected by an external magnetic field was simulated.
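The paper's local level-set formulation and ccNUMA optimizations are not reproducible from the abstract. As a minimal illustration of the kind of per-grain update involved, the sketch below performs one explicit step of curvature-driven level-set evolution on a single grain's signed-distance function, using first-order central differences (all parameters are assumptions).

```python
import numpy as np

def curvature_step(phi, dt, h):
    """One explicit step of mean-curvature flow, phi_t = kappa * |grad phi|,
    applied to one grain's (local) level-set function phi."""
    eps = 1e-12
    px, py = np.gradient(phi, h)
    norm = np.sqrt(px**2 + py**2) + eps
    nx, ny = px / norm, py / norm
    kappa = np.gradient(nx, h, axis=0) + np.gradient(ny, h, axis=1)  # div(grad phi / |grad phi|)
    return phi + dt * kappa * norm

# Example: a circular grain of radius 0.6 shrinks under curvature flow.
n = 64
h = 2.0 / (n - 1)
x = np.linspace(-1.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing="ij")
phi = 0.6 - np.sqrt(X**2 + Y**2)           # signed distance to the grain boundary
for _ in range(100):
    phi = curvature_step(phi, dt=0.2 * h**2, h=h)
```

Keeping one such function per grain restricts each update to a small working set, which is the property a local level-set approach can exploit on ccNUMA hardware.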
NASA Astrophysics Data System (ADS)
Mapakshi, N. K.; Chang, J.; Nakshatrala, K. B.
2018-04-01
Mathematical models for flow through porous media typically enjoy the so-called maximum principles, which place bounds on the pressure field. It is highly desirable to preserve these bounds on the pressure field in predictive numerical simulations, that is, one needs to satisfy discrete maximum principles (DMP). Unfortunately, many of the existing formulations for flow through porous media models do not satisfy DMP. This paper presents a robust, scalable numerical formulation based on variational inequalities (VI), to model non-linear flows through heterogeneous, anisotropic porous media without violating DMP. VI is an optimization technique that places bounds on the numerical solutions of partial differential equations. To crystallize the ideas, a modification to Darcy equations by taking into account pressure-dependent viscosity will be discretized using the lowest-order Raviart-Thomas (RT0) and Variational Multi-scale (VMS) finite element formulations. It will be shown that these formulations violate DMP, and, in fact, these violations increase with an increase in anisotropy. It will be shown that the proposed VI-based formulation provides a viable route to enforce DMP. Moreover, it will be shown that the proposed formulation is scalable, and can work with any numerical discretization and weak form. A series of numerical benchmark problems are solved to demonstrate the effects of heterogeneity, anisotropy and non-linearity on DMP violations under the two chosen formulations (RT0 and VMS), and that of non-linearity on solver convergence for the proposed VI-based formulation. Parallel scalability on modern computational platforms will be illustrated through strong-scaling studies, which will prove the efficiency of the proposed formulation in a parallel setting. Algorithmic scalability as the problem size is scaled up will be demonstrated through novel static-scaling studies. The performed static-scaling studies can serve as a guide for users to be able to select an appropriate discretization for a given problem size.
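As a simple illustration of enforcing bounds of the kind the discrete maximum principle requires (not the paper's RT0/VMS formulation or solver), the sketch below solves a small symmetric system subject to box constraints with projected Gauss-Seidel, one of the basic iterative methods for such bound-constrained variational inequalities; the matrix and bounds are made up.

```python
import numpy as np

def projected_gauss_seidel(A, b, lo, hi, iters=1000, tol=1e-10):
    """Solve the box-constrained variational inequality: find x in [lo, hi]
    with (A x - b)^T (y - x) >= 0 for all y in the box.  For symmetric
    positive-definite A this is the constrained minimiser of
    0.5 x^T A x - b^T x, so the solution never leaves the bounds."""
    x = np.clip(np.zeros_like(b), lo, hi)
    for _ in range(iters):
        x_old = x.copy()
        for i in range(len(b)):
            r = b[i] - A[i] @ x + A[i, i] * x[i]     # residual excluding x[i]
            x[i] = np.clip(r / A[i, i], lo[i], hi[i])
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x

# Made-up example: 1-D diffusion stiffness matrix with bounds 0 <= p <= 1.
n = 20
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.full(n, 0.3)
p = projected_gauss_seidel(A, b, lo=np.zeros(n), hi=np.ones(n))
```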
High performance MRI simulations of motion on multi-GPU systems
2014-01-01
Background: MRI physics simulators have been developed in the past for optimizing imaging protocols and for training purposes. However, these simulators have only addressed motion within a limited scope. The purpose of this study was the incorporation of realistic motion, such as cardiac motion, respiratory motion and flow, within MRI simulations in a high performance multi-GPU environment. Methods: Three different motion models were introduced in the Magnetic Resonance Imaging SIMULator (MRISIMUL) of this study: cardiac motion, respiratory motion and flow. Simulation of a simple Gradient Echo pulse sequence and a CINE pulse sequence on the corresponding anatomical model was performed. Myocardial tagging was also investigated. In pulse sequence design, software crushers were introduced to accommodate the long execution times in order to avoid spurious echo formation. The displacement of the anatomical model isochromats was calculated within the Graphics Processing Unit (GPU) kernel for every timestep of the pulse sequence. Experiments that would allow simulation of custom anatomical and motion models were also performed. Last, simulations of motion with MRISIMUL on single-node and multi-node multi-GPU systems were examined. Results: Gradient Echo and CINE images of the three motion models were produced and motion-related artifacts were demonstrated. The temporal evolution of the contractility of the heart was presented through the application of myocardial tagging. Better simulation performance and image quality were presented through the introduction of software crushers without the need to further increase the computational load and GPU resources. Last, MRISIMUL demonstrated almost linearly scalable performance with the increasing number of available GPU cards, in both single-node and multi-node multi-GPU computer systems. Conclusions: MRISIMUL is the first MR physics simulator to have implemented motion with a 3D large computational load on a single computer multi-GPU configuration. The incorporation of realistic motion models, such as cardiac motion, respiratory motion and flow may benefit the design and optimization of existing or new MR pulse sequences, protocols and algorithms, which examine motion related MR applications. PMID:24996972
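MRISIMUL's GPU kernels are not available from the abstract, but the per-isochromat, per-timestep operation of any MR physics simulator is a Bloch update: precession about the local longitudinal field followed by T1/T2 relaxation. A single-isochromat numpy sketch is given below (parameter values are assumptions); in a motion simulation the local field would be re-evaluated from the isochromat's displaced position at every timestep, consistent with the abstract's description of computing displacements in the GPU kernel.

```python
import numpy as np

GAMMA = 2 * np.pi * 42.577e6      # proton gyromagnetic ratio [rad/s/T]

def bloch_step(M, dt, b_z, T1=1.0, T2=0.1, M0=1.0):
    """One timestep for a single isochromat: precession about the local
    longitudinal field b_z (off-resonance plus gradient field, in tesla),
    followed by T1/T2 relaxation."""
    theta = GAMMA * b_z * dt
    c, s = np.cos(theta), np.sin(theta)
    Mx = c * M[0] + s * M[1]
    My = -s * M[0] + c * M[1]
    e1, e2 = np.exp(-dt / T1), np.exp(-dt / T2)
    return np.array([e2 * Mx, e2 * My, e1 * M[2] + M0 * (1.0 - e1)])

M = np.array([1.0, 0.0, 0.0])      # magnetization right after a 90-degree pulse
for _ in range(1000):
    M = bloch_step(M, dt=1e-5, b_z=1e-6)   # 1 microtesla off-resonance
```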
Reversible Parallel Discrete-Event Execution of Large-scale Epidemic Outbreak Models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perumalla, Kalyan S; Seal, Sudip K
2010-01-01
The spatial scale, runtime speed and behavioral detail of epidemic outbreak simulations together require the use of large-scale parallel processing. In this paper, an optimistic parallel discrete event execution of a reaction-diffusion simulation model of epidemic outbreaks is presented, with an implementation over the µsik simulator. Rollback support is achieved with the development of a novel reversible model that combines reverse computation with a small amount of incremental state saving. Parallel speedup and other runtime performance metrics of the simulation are tested on a small (8,192-core) Blue Gene/P system, while scalability is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes (up to several hundred million individuals in the largest case) are exercised.
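The reversible model itself is not detailed in the abstract. The toy sketch below shows the general reverse-computation pattern it combines with incremental state saving: a forward event applies an increment and returns the minimal information (here one integer) that its reverse handler needs to undo it on rollback. The event semantics are invented for illustration.

```python
class ReversibleCell:
    """Susceptible/infected counts in one spatial cell of a toy
    reaction-diffusion epidemic model, with reversible events."""

    def __init__(self, susceptible, infected):
        self.S, self.I = susceptible, infected

    def infect_forward(self, k):
        # Clamp to the available susceptibles; the actual change is the only
        # state that must be saved to support rollback.
        k = min(k, self.S)
        self.S -= k
        self.I += k
        return k

    def infect_reverse(self, k):
        # Exactly undo the forward event using the saved increment.
        self.S += k
        self.I -= k

cell = ReversibleCell(susceptible=100, infected=1)
delta = cell.infect_forward(5)     # optimistic execution
cell.infect_reverse(delta)         # rollback after a causality violation
assert (cell.S, cell.I) == (100, 1)
```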
2HOT: An Improved Parallel Hashed Oct-Tree N-Body Algorithm for Cosmological Simulation
Warren, Michael S.
2014-01-01
We report on improvements made over the past two decades to our adaptive treecode N-body method (HOT). A mathematical and computational approach to the cosmological N-body problem is described, with performance and scalability measured up to 256k (2¹⁸) processors. We present error analysis and scientific application results from a series of more than ten 69 billion (4096³) particle cosmological simulations, accounting for 4×10²⁰ floating point operations. These results include the first simulations using the new constraints on the standard model of cosmology from the Planck satellite. Our simulations set a new standard for accuracy and scientific throughput, while meeting or exceeding the computational efficiency of the latest generation of hybrid TreePM N-body methods.
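The hashing scheme itself is not given in this abstract; hashed oct-tree codes conventionally derive cell keys by interleaving the bits of quantized particle coordinates into a Morton (Z-order) key, as sketched below under that assumption.

```python
def morton_key(ix, iy, iz, bits=21):
    """Interleave the bits of three quantized coordinates (integers in
    [0, 2**bits)) into a single Z-order key usable as an oct-tree hash."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key

# Particles sorted by key end up near their spatial neighbours, which keeps
# both domain decomposition and tree traversal cache- and network-friendly.
print(morton_key(3, 5, 1, bits=3))
```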
Parallel Grand Canonical Monte Carlo (ParaGrandMC) Simulation Code
NASA Technical Reports Server (NTRS)
Yamakov, Vesselin I.
2016-01-01
This report provides an overview of the Parallel Grand Canonical Monte Carlo (ParaGrandMC) simulation code. This is a highly scalable parallel FORTRAN code for simulating the thermodynamic evolution of metal alloy systems at the atomic level, and predicting the thermodynamic state, phase diagram, chemical composition and mechanical properties. The code is designed to simulate multi-component alloy systems, predict solid-state phase transformations such as austenite-martensite transformations, precipitate formation, recrystallization, capillary effects at interfaces, surface absorption, etc., which can aid the design of novel metallic alloys. While the software is mainly tailored for modeling metal alloys, it can also be used for other types of solid-state systems, and to some degree for liquid or gaseous systems, including multiphase systems forming solid-liquid-gas interfaces.
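The report does not describe ParaGrandMC's internals, so the sketch below only shows the core grand canonical Monte Carlo moves for the simplest possible (ideal, non-interacting) system, using the standard μVT acceptance probabilities; with no interaction energy the average particle number should converge to zV, where z is the fugacity-like factor exp(βμ)/Λ³.

```python
import numpy as np

rng = np.random.default_rng(1)

def gcmc_ideal_gas(z, volume, steps=200_000):
    """Grand canonical MC for an ideal gas.  With no interaction energy the
    standard muVT acceptance rules reduce to the ratios below, and the mean
    particle number <N> converges to z * volume."""
    N, samples = 0, []
    for _ in range(steps):
        if rng.random() < 0.5:                         # attempt insertion
            if rng.random() < z * volume / (N + 1):
                N += 1
        elif N > 0:                                    # attempt deletion
            if rng.random() < N / (z * volume):
                N -= 1
        samples.append(N)
    return np.mean(samples[steps // 2:])

print(gcmc_ideal_gas(z=0.05, volume=1000.0))           # expect roughly 50
```

A production alloy code would additionally carry interaction-energy changes from interatomic potentials and composition-exchange moves, but the accept/reject structure of the loop is the same.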
Parallel Simulation of Three-Dimensional Free Surface Fluid Flow Problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baer, Thomas A.; Sackinger, Philip A.; Subia, Samuel R.
1999-10-14
Simulation of viscous three-dimensional fluid flow typically involves a large number of unknowns. When free surfaces are included, the number of unknowns increases dramatically. Consequently, this class of problem is an obvious application of parallel high performance computing. We describe parallel computation of viscous, incompressible, free surface, Newtonian fluid flow problems that include dynamic contact lines. The Galerkin finite element method was used to discretize the fully-coupled governing conservation equations and a "pseudo-solid" mesh mapping approach was used to determine the shape of the free surface. In this approach, the finite element mesh is allowed to deform to satisfy quasi-static solid mechanics equations subject to geometric or kinematic constraints on the boundaries. As a result, nodal displacements must be included in the set of unknowns. Other issues discussed are the proper constraints appearing along the dynamic contact line in three dimensions. Issues affecting efficient parallel simulations include problem decomposition to equally distribute computational work across an SPMD computer and determination of robust, scalable preconditioners for the distributed matrix systems that must be solved. Solution continuation strategies important for serial simulations have an enhanced relevance in a parallel computing environment due to the difficulty of solving large scale systems. Parallel computations will be demonstrated on an example taken from the coating flow industry: flow in the vicinity of a slot coater edge. This is a three dimensional free surface problem possessing a contact line that advances at the web speed in one region but transitions to static behavior in another region. As such, a significant fraction of the computational time is devoted to processing boundary data. Discussion focuses on parallel speed ups for fixed problem size, a class of problems of immediate practical importance.
Multivariable Robust Control of a Simulated Hybrid Solid Oxide Fuel Cell Gas Turbine Plant
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tsai, Alex; Banta, Larry; Tucker, David
2010-08-01
This work presents a systematic approach to the multivariable robust control of a hybrid fuel cell gas turbine plant. The hybrid configuration under investigation built by the National Energy Technology Laboratory comprises a physical simulation of a 300kW fuel cell coupled to a 120kW auxiliary power unit single spool gas turbine. The public facility provides for the testing and simulation of different fuel cell models that in turn help identify the key difficulties encountered in the transient operation of such systems. An empirical model of the built facility comprising a simulated fuel cell cathode volume and balance of plant components is derived via frequency response data. Through the modulation of various airflow bypass valves within the hybrid configuration, Bode plots are used to derive key input/output interactions in transfer function format. A multivariate system is then built from individual transfer functions, creating a matrix that serves as the nominal plant in an H∞ robust control algorithm. The controller's main objective is to track and maintain hybrid operational constraints in the fuel cell's cathode airflow, and the turbo machinery states of temperature and speed, under transient disturbances. This algorithm is then tested on a Simulink/MatLab platform for various perturbations of load and fuel cell heat effluence. As a complementary tool to the aforementioned empirical plant, a nonlinear analytical model faithful to the existing process and instrumentation arrangement is evaluated and designed in the Simulink environment. This parallel task intends to serve as a building block to scalable hybrid configurations that might require a more detailed nonlinear representation for a wide variety of controller schemes and hardware implementations.
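Neither the identified transfer functions nor the H∞ synthesis are reproduced in the abstract. The sketch below only illustrates the intermediate step described there: collecting per-channel transfer functions (here assumed first-order fits to Bode data, with invented gains and time constants) into a matrix that can serve as a nominal MIMO plant model.

```python
import numpy as np
from scipy import signal

def first_order(gain, tau):
    """First-order lag assumed as a fit to frequency-response (Bode) data."""
    return signal.TransferFunction([gain], [tau, 1.0])

# Invented 2x2 slice of the plant: inputs are two bypass valves, outputs are
# cathode airflow and turbine speed.
G = [[first_order(0.8, 2.0), first_order(-0.3, 5.0)],   # cathode airflow row
     [first_order(0.1, 8.0), first_order(0.6, 3.0)]]    # turbine speed row

w = np.logspace(-3, 1, 200)                             # rad/s
for i, row in enumerate(G):
    for j, g in enumerate(row):
        _, mag_db, _ = signal.bode(g, w)
        print(f"G[{i}][{j}] low-frequency gain: {mag_db[0]:.1f} dB")
```

An H∞ controller would then be synthesized against such a nominal plant plus uncertainty weights, for example in MATLAB, consistent with the Simulink/MatLab testing mentioned above.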
Millimeter-Wave Wireless Power Transfer Technology for Space Applications
NASA Technical Reports Server (NTRS)
Chattopadhyay, Goutam; Manohara, Harish; Mojarradi, Mohammad M.; Vo, Tuan A.; Mojarradi, Hadi; Bae, Sam Y.; Marzwell, Neville
2008-01-01
In this paper we present a new compact, scalable, and low-cost technology for efficiently receiving power using RF waves at 94 GHz. This technology employs a highly innovative array of slot antennas integrated on a substrate composed of gold (Au), silicon (Si), and silicon dioxide (SiO2) layers. The length of the slots and the spacing between them are optimized for a highly efficient beam through a 3-D electromagnetic simulation process. Antenna simulation results show a good beam profile with very low side lobe levels and better than 93% antenna efficiency.
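The following sketch only illustrates the array idea at 94 GHz with a uniform linear array factor; the element count and half-wavelength spacing are assumptions for illustration, not the optimized slot layout from the paper.

```python
# Hedged sketch: broadside array factor of a uniform linear array of slots at
# 94 GHz. Element count and spacing are illustrative assumptions.
import numpy as np

c = 3.0e8
f = 94e9
lam = c / f                 # ~3.19 mm wavelength
k = 2 * np.pi / lam
N = 16                      # assumed number of slots
d = lam / 2                 # assumed slot spacing

theta = np.linspace(-np.pi / 2, np.pi / 2, 1801)
n = np.arange(N)
# Uniform excitation; zero phase progression gives a broadside beam.
af = np.abs(np.exp(1j * k * d * np.outer(n, np.sin(theta))).sum(axis=0)) / N
af_db = 20 * np.log10(np.maximum(af, 1e-6))

# Rough sidelobe level: largest lobe outside the main-beam region.
outside_main_beam = np.abs(theta) > 2 * lam / (N * d)
print("approx. peak sidelobe level [dB]:", af_db[outside_main_beam].max())
```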
NASA Astrophysics Data System (ADS)
Poulet, T.; Veveakis, M.; Paesold, M.; Regenauer-Lieb, K.
2014-12-01
Multiphysics modelling has become an indispensable tool for geoscientists to simulate the complex behaviours observed in their various fields of study where multiple processes are involved, including thermal, hydraulic, mechanical and chemical (THMC) laws. This modelling activity involves simulations that are computationally expensive, and its soaring uptake is tightly linked to the increasing availability of supercomputing power and easy access to powerful nonlinear solvers such as PETSc (http://www.mcs.anl.gov/petsc/). The Multiphysics Object-Oriented Simulation Environment (MOOSE) is a finite-element, multiphysics framework (http://mooseframework.org) that can harness such computational power and allows scientists to easily develop tightly coupled, fully implicit multiphysics simulations that run automatically in parallel on large clusters. This open-source framework provides a powerful tool for collaborating on numerical modelling activities, and we are contributing to its development with REDBACK (https://github.com/pou036/redback), a module for Rock mEchanics with Dissipative feedBACKs. REDBACK builds on the tensor mechanics finite-strain implementation available in MOOSE to provide a THMC simulator whose energetic formulation highlights the importance of all dissipative terms in the coupled system of equations. We show first applications of fully coupled dehydration reactions triggering episodic fluid transfer through shear zones (Alevizos et al., 2014). The dimensionless approach used allows focusing on the critical underlying variables which drive the resulting behaviours, and the tool is specifically designed to study material instabilities underpinning geological features such as faulting, folding, boudinage, shearing and fracturing. REDBACK provides a collaborative and educational tool which captures the physical and mathematical understanding of such material instabilities and provides an easy way to apply this knowledge to realistic scenarios, where the size and complexity of the geometries considered, along with the material parameter distributions, introduce additional sources of instability. References: Alevizos, S., T. Poulet, and E. Veveakis (2014), J. Geophys. Res., 119, 4558-4582, doi:10.1002/2013JB010070.
Medical image archive node simulation and architecture
NASA Astrophysics Data System (ADS)
Chiang, Ted T.; Tang, Yau-Kuo
1996-05-01
Managed care and new treatment technologies are revolutionizing the health care provider world. Community Health Information Network and Computer-based Patient Record projects are underway throughout the United States. More and more hospitals are installing digital, 'filmless' radiology (and other imagery) systems. They generate a staggering amount of information around the clock. For example, a typical 500-bed hospital might accumulate more than 5 terabytes of image data in a period of 30 years for conventional x-ray images and digital images such as Magnetic Resonance Imaging and Computed Tomography images. With several hospitals contributing to the archive, the storage required will be in the hundreds of terabytes. Systems for reliable, secure, and inexpensive storage and retrieval of digital medical information do not exist today. In this paper, we present a Medical Image Archive and Distribution Service (MIADS) concept. MIADS is a system shared by individual and community hospitals, laboratories, and doctors' offices that need to store and retrieve medical images. Due to the large volume and complexity of the data, as well as the diversified user access requirements, implementation of the MIADS will be a complex procedure. One of the key challenges to implementing a MIADS is to select a cost-effective, scalable system architecture that meets the ingest/retrieval performance requirements. We have performed an in-depth system engineering study and developed a sophisticated simulation model to address this key challenge. This paper describes the overall system architecture based on our system engineering study and simulation results. In particular, we emphasize system scalability and upgradability issues. Furthermore, we discuss our simulation results in detail. The simulations study the ingest/retrieval performance requirements for different system configurations and architectures using variables such as workload, tape access time, number of drives, number of exams per patient, number of Central Processing Units, patient grouping, and priority impacts. The MIADS, which could be a key component of a broader data repository system, will be able to communicate with and obtain data from existing hospital information systems. We discuss the external interfaces enabling MIADS to communicate with and obtain data from existing Radiology Information Systems such as the Picture Archiving and Communication System (PACS). Our system design encompasses the broader aspects of the archive node, which could include multimedia data such as image, audio, video, and free text data. The system is designed to be integrated with current hospital PACS through a Digital Imaging and Communications in Medicine interface. However, the system can also be accessed through the Internet using Hypertext Transport Protocol or Simple File Transport Protocol. Our design and simulation work will be key to implementing a successful, scalable medical image archive and distribution system.
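The ingest/retrieval performance study lends itself to a small discrete-event sketch. The toy model below simulates exams contending for a bank of tape drives; the arrival rate, service time and drive count are made-up values, and it relies on the SimPy package, which is an assumption of this sketch rather than anything described in the paper.

```python
# Toy discrete-event queueing sketch of exam ingest against tape drives.
import random
import simpy

ARRIVAL_RATE = 0.5        # exams per minute (assumed)
SERVICE_MEAN = 6.0        # minutes per exam on one drive (assumed)
NUM_DRIVES = 4            # assumed drive count

def ingest(env, drives, waits):
    arrived = env.now
    with drives.request() as req:
        yield req                                   # wait for a free drive
        yield env.timeout(random.expovariate(1.0 / SERVICE_MEAN))
    waits.append(env.now - arrived)

def source(env, drives, waits):
    while True:
        yield env.timeout(random.expovariate(ARRIVAL_RATE))
        env.process(ingest(env, drives, waits))

random.seed(1)
env = simpy.Environment()
drives = simpy.Resource(env, capacity=NUM_DRIVES)
waits = []
env.process(source(env, drives, waits))
env.run(until=10_000)
print(f"mean turnaround: {sum(waits) / len(waits):.1f} min over {len(waits)} exams")
```

Varying the drive count or arrival rate in this kind of model is one way to explore the workload and tape-access-time trade-offs the paper studies at much greater fidelity.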
Core Flight System (cFS) a Low Cost Solution for SmallSats
NASA Technical Reports Server (NTRS)
McComas, David; Strege, Susanne; Wilmot, Jonathan
2015-01-01
The cFS is an FSW (flight software) product line that uses a layered architecture and compile-time configuration parameters, which make it portable and scalable for a wide range of platforms. The software layers that define the application run-time environment are now under a NASA-wide configuration control board, with the goal of sustaining an open-source application ecosystem.
TENTACLE Multi-Camera Immersive Surveillance System Phase 2
2015-04-16
...successful in solving the most challenging video analytics problems and taking the advanced research concepts into working systems for end-users in both... commercial, space and military applications. Notable successes include winning the DARPA Urban Challenge, software autonomy to guide the NASA robots (Spirit... challenging urban environments. CMU is developing a scalable and extensible architecture, improving search/pursuit/tracking capabilities, and addressing...
Architectures and Applications for Scalable Quantum Information Systems
2007-01-01
...quantum computation models, such as adiabatic quantum computing, can be converted to quantum circuits. Therefore, in our design flow's first phase... vol. 26, no. 5, pp. 1484-1509, 1997. [19] A. Childs, E. Farhi, and J. Preskill, "Robustness of adiabatic quantum computation," Phys. Rev. A, vol. 65... magnetic resonance computer with three quantum bits that simulates an adiabatic quantum optimization algorithm. Adiabatic...
Tera-node Network Technology (Task 3) Scalable Personal Telecommunications
2000-03-14
Simulation results of this work may be found at http://north.east.isi.edu/spt/audio.html. 6. Internet Research Task Force Reliable Multicast... Adaptation, 4. Multimedia Proxy Caching, 5. Experiments with the Rate Adaptation Protocol (RAP), 6. Providing leadership and innovation to the Internet... Research Task Force (IRTF) Reliable Multicast Research Group (RMRG) 1. End-to-end Architecture for Quality-adaptive Streaming Applications over the...
Optimal quantum control of multimode couplings between trapped ion qubits for scalable entanglement.
Choi, T; Debnath, S; Manning, T A; Figgatt, C; Gong, Z-X; Duan, L-M; Monroe, C
2014-05-16
We demonstrate entangling quantum gates within a chain of five trapped ion qubits by optimally shaping optical fields that couple to multiple collective modes of motion. We individually address qubits with segmented optical pulses to construct multipartite entangled states in a programmable way. This approach enables high-fidelity gates that can be scaled to larger qubit registers for quantum computation and simulation.
Navier-Stokes Aerodynamic Simulation of the V-22 Osprey on the Intel Paragon MPP
NASA Technical Reports Server (NTRS)
Vadyak, Joseph; Shrewsbury, George E.; Narramore, Jim C.; Montry, Gary; Holst, Terry; Kwak, Dochan (Technical Monitor)
1995-01-01
The paper will describe the development of a general three-dimensional multiple-grid-zone Navier-Stokes flowfield simulation program (ENS3D-MPP) designed for efficient execution on the Intel Paragon Massively Parallel Processor (MPP) supercomputer, and the subsequent application of this method to the prediction of the viscous flowfield about the V-22 Osprey tiltrotor vehicle. The flowfield simulation code solves the thin-layer or full Navier-Stokes equations for viscous flow modeling, or the Euler equations for inviscid flow modeling, on a structured multi-zone mesh. In the present paper only viscous simulations will be shown. The governing difference equations are solved using a time-marching implicit approximate factorization method, with either TVD upwind or central differencing used for the convective terms and central differencing used for the viscous diffusion terms. Steady-state or time-accurate solutions can be calculated. The present paper will focus on steady-state applications, although time-accurate solution analysis is the ultimate goal of this effort. Laminar viscosity is calculated using Sutherland's law, and the Baldwin-Lomax two-layer algebraic turbulence model is used to compute the eddy viscosity. The simulation method uses an arbitrary-block, curvilinear grid topology. An automatic grid adaption scheme is incorporated which concentrates grid points in regions of high density gradient. A variety of user-specified boundary conditions are available. This paper will present the application of the scalable and superscalable versions to the steady-state viscous flow analysis of the V-22 Osprey using a multiple-zone global mesh. The mesh consists of a series of sheared Cartesian grid blocks with polar grids embedded within to better simulate the wing-tip-mounted nacelle. MPP solutions will be shown in comparison to equivalent Cray C-90 results and also in comparison to experimental data. Discussions on meshing considerations, wall clock execution time, load balancing, and scalability will be provided.
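Sutherland's law, referenced above for the laminar viscosity, is simple enough to state as a short sketch. The constants below are the commonly tabulated reference values for air and may differ from those used in ENS3D-MPP.

```python
# Minimal sketch of Sutherland's law for molecular (laminar) viscosity of air.
MU_REF = 1.716e-5   # kg/(m*s) at T_REF (commonly tabulated value for air)
T_REF = 273.15      # K
S = 110.4           # Sutherland constant for air, K

def sutherland(T):
    """Dynamic viscosity of air as a function of temperature [K]."""
    return MU_REF * (T / T_REF) ** 1.5 * (T_REF + S) / (T + S)

for T in (250.0, 300.0, 500.0, 1000.0):
    print(f"T = {T:6.1f} K   mu = {sutherland(T):.3e} kg/(m s)")
```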
Six-Tube Freezable Radiator Testing and Model Correlation
NASA Technical Reports Server (NTRS)
Lillibridge, Sean; Navarro, Moses
2011-01-01
Freezable radiators offer an attractive solution to the issue of thermal control system scalability. As thermal environments change, a freezable radiator will effectively scale the total heat rejection it is capable of as a function of the thermal environment and flow rate through the radiator. Scalable thermal control systems are a critical technology for spacecraft that will endure missions with widely varying thermal requirements. These changing requirements are a result of the spacecraft's surroundings and of different thermal loads rejected during different mission phases. However, freezing and thawing (recovering) a freezable radiator is a process that has historically proven very difficult to predict through modeling, resulting in highly inaccurate predictions of recovery time. These predictions are a critical step in gaining the capability to quickly design and produce optimized freezable radiators for a range of mission requirements. This paper builds upon previous efforts made to correlate a Thermal Desktop (TM) model with empirical testing data from two test articles, with additional model modifications and empirical data from a sub-component radiator for a full-scale design. Two working fluids were tested, namely MultiTherm WB-58 and a 50-50 mixture of DI water and Amsoil ANT.
Six-Tube Freezable Radiator Testing and Model Correlation
NASA Technical Reports Server (NTRS)
Lilibridge, Sean T.; Navarro, Moses
2012-01-01
Freezable radiators offer an attractive solution to the issue of thermal control system scalability. As thermal environments change, a freezable radiator will effectively scale the total heat rejection it is capable of as a function of the thermal environment and flow rate through the radiator. Scalable thermal control systems are a critical technology for spacecraft that will endure missions with widely varying thermal requirements. These changing requirements are a result of the spacecraft's surroundings and of different thermal loads rejected during different mission phases. However, freezing and thawing (recovering) a freezable radiator is a process that has historically proven very difficult to predict through modeling, resulting in highly inaccurate predictions of recovery time. These predictions are a critical step in gaining the capability to quickly design and produce optimized freezable radiators for a range of mission requirements. This paper builds upon previous efforts made to correlate a Thermal Desktop (TM) model with empirical testing data from two test articles, with additional model modifications and empirical data from a sub-component radiator for a full-scale design. Two working fluids were tested: MultiTherm WB-58 and a 50-50 mixture of DI water and Amsoil ANT.
NASA Astrophysics Data System (ADS)
Xu, Boyi; Xu, Li Da; Fei, Xiang; Jiang, Lihong; Cai, Hongming; Wang, Shuai
2017-08-01
Facing rapidly changing business environments, implementation of flexible business processes is crucial, but difficult, especially in data-intensive application areas. This study aims to provide scalable and easily accessible information resources to leverage business process management. In this article, with a resource-oriented approach, enterprise data resources are represented as data-centric Web services, grouped on demand according to business requirements and configured dynamically to adapt to changing business processes. First, a configurable architecture, CIRPA, involving an information resource pool is proposed to act as a scalable and dynamic platform for virtualising enterprise information resources as data-centric Web services. By exposing data-centric resources as REST services in larger granularities, tenant-isolated information resources can be accessed during business process execution. Second, a dynamic information resource pool is designed to support configurable, on-demand data access in business process execution. CIRPA also isolates transaction data from business processes while supporting the composition of diverse business processes. Finally, a case study applying our method to a logistics application shows that CIRPA provides enhanced performance in both static service encapsulation and dynamic service execution in a cloud computing environment.
NASA Astrophysics Data System (ADS)
Sanan, P.; Tackley, P. J.; Gerya, T.; Kaus, B. J. P.; May, D.
2017-12-01
StagBL is an open-source parallel solver and discretization library for geodynamic simulation, encapsulating and optimizing operations essential to staggered-grid finite volume Stokes flow solvers. It provides a parallel staggered-grid abstraction with a high-level interface in C and Fortran. On top of this abstraction, tools are available to define boundary conditions and interact with particle systems. Tools and examples to efficiently solve Stokes systems defined on the grid are provided in small (direct solver), medium (simple preconditioners), and large (block factorization and multigrid) model regimes. By working directly with leading application codes (StagYY, I3ELVIS, and LaMEM) and providing an API and examples to integrate with others, StagBL aims to become a community tool supplying scalable, portable, reproducible performance toward novel science in regional- and planet-scale geodynamics and planetary science. By implementing kernels used by many research groups beneath a uniform abstraction layer, the library will enable optimization for modern hardware, thus reducing community barriers to large- or extreme-scale parallel simulation on modern architectures. In particular, the library will include CPU-, manycore-, and GPU-optimized variants of matrix-free operators and multigrid components. The common layer provides a framework upon which to introduce innovative new tools. StagBL will leverage p4est to provide distributed adaptive meshes, and incorporate a multigrid convergence analysis tool. These options, in addition to a wealth of solver options provided by an interface to PETSc, will make the most modern solution techniques available from a common interface. StagBL in turn provides a PETSc interface, DMStag, to its central staggered grid abstraction. We present public version 0.5 of StagBL, including preliminary integration with application codes and demonstrations with its own demonstration application, StagBLDemo. Central to StagBL is the notion of an uninterrupted pipeline from toy/teaching codes to high-performance, extreme-scale solves. StagBLDemo replicates the functionality of an advanced MATLAB-style regional geodynamics code, thus providing users with a concrete procedure to exceed the performance and scalability limitations of smaller-scale tools.
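The following NumPy sketch only illustrates the staggered ("MAC") grid layout that such libraries abstract: x-velocities on vertical faces, y-velocities on horizontal faces, pressure at cell centres. StagBL's actual API and data structures differ; the array shapes and divergence stencil here are a generic textbook arrangement, not StagBL code.

```python
# Hedged sketch of a 2-D staggered-grid layout and a discrete divergence.
import numpy as np

nx, ny, h = 8, 6, 1.0
vx = np.random.rand(ny, nx + 1)   # x-velocity on vertical faces (one extra column)
vy = np.random.rand(ny + 1, nx)   # y-velocity on horizontal faces (one extra row)
p = np.zeros((ny, nx))            # cell-centred pressure

# Discrete divergence in each cell: (vx_e - vx_w)/h + (vy_n - vy_s)/h.
div = (vx[:, 1:] - vx[:, :-1]) / h + (vy[1:, :] - vy[:-1, :]) / h
assert div.shape == p.shape
print("max |div v| over cells:", np.abs(div).max())
```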
Introducing a distributed unstructured mesh into gyrokinetic particle-in-cell code, XGC
NASA Astrophysics Data System (ADS)
Yoon, Eisung; Shephard, Mark; Seol, E. Seegyoung; Kalyanaraman, Kaushik
2017-10-01
XGC has shown good scalability on large leadership supercomputers. The current production version uses a copy of the entire unstructured finite element mesh on every MPI rank. Besides being an obvious scalability issue if mesh sizes are to be dramatically increased, the current approach is also not optimal with respect to data locality of particles and mesh information. To address these issues we have initiated the development of a distributed-mesh PIC method. This approach directly addresses the base scalability issue with respect to mesh size and, through the use of a mesh-entity-centric view of the particle-mesh relationship, provides opportunities to address the data locality needs of many-core and GPU-supported heterogeneous systems. The parallel mesh PIC capabilities are being built on the Parallel Unstructured Mesh Infrastructure (PUMI). The presentation will first give an overview of the form of mesh distribution used and indicate the structures and functions used to support the mesh, the particles and their interaction. Attention will then focus on the node-level optimizations being carried out to ensure performant operation of all PIC operations on the distributed mesh. This work is supported by the Partnership for Edge Physics Simulation (EPSI), Grant No. DE-SC0008449, and the Center for Extended Magnetohydrodynamic Modeling (CEMM), Grant No. DE-SC0006618.
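A small NumPy sketch of the data-locality idea behind the mesh-entity-centric view: reorder particles so that those owned by the same cell are contiguous in memory. This is an illustration only, not PUMI or XGC code, and the random cell assignment stands in for a real particle-to-element search.

```python
# Illustrative sketch: sort particles by owning mesh cell for data locality.
import numpy as np

rng = np.random.default_rng(0)
n_particles, n_cells = 1_000_000, 4096

cell_of = rng.integers(0, n_cells, n_particles)   # owning cell per particle
coords = rng.random((n_particles, 3))             # particle positions

order = np.argsort(cell_of, kind="stable")        # group particles by cell
cell_sorted = cell_of[order]
coords_sorted = coords[order]

# CSR-style offsets: particles of cell c occupy [starts[c], starts[c+1]).
counts = np.bincount(cell_sorted, minlength=n_cells)
starts = np.concatenate(([0], np.cumsum(counts)))

c = 17
mine = coords_sorted[starts[c]:starts[c + 1]]
print(f"cell {c} owns {len(mine)} particles, stored contiguously")
```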
Performance and scalability evaluation of "Big Memory" on Blue Gene Linux.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoshii, K.; Iskra, K.; Naik, H.
2011-05-01
We address memory performance issues observed in Blue Gene Linux and discuss the design and implementation of 'Big Memory' - an alternative, transparent memory space introduced to eliminate the memory performance issues. We evaluate the performance of Big Memory using custom memory benchmarks, the NAS Parallel Benchmarks, and the Parallel Ocean Program, at a scale of up to 4,096 nodes. We find that Big Memory successfully resolves the performance issues normally encountered in Blue Gene Linux. For the ocean simulation program, we even find that Linux with Big Memory provides better scalability than does the lightweight compute node kernel designed solely for high-performance applications. Originally intended exclusively for compute node tasks, our new memory subsystem dramatically improves the performance of certain I/O node applications as well. We demonstrate this performance using the central processor of the LOw Frequency ARray (LOFAR) radio telescope as an example.
Algorithmically scalable block preconditioner for fully implicit shallow-water equations in CAM-SE
Lott, P. Aaron; Woodward, Carol S.; Evans, Katherine J.
2014-10-19
Performing accurate and efficient numerical simulation of global atmospheric climate models is challenging due to the disparate length and time scales over which physical processes interact. Implicit solvers enable the physical system to be integrated with a time step commensurate with the processes being studied. The dominant cost of an implicit time step is the ancillary linear system solves, so we have developed a preconditioner aimed at improving the efficiency of these linear system solves. Our preconditioner is based on an approximate block factorization of the linearized shallow-water equations and has been implemented within the spectral element dynamical core of the Community Atmospheric Model (CAM-SE). Furthermore, in this paper we discuss the development and scalability of the preconditioner for a suite of test cases with the implicit shallow-water solver within CAM-SE.
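A generic sketch of the block-factorization idea, assuming SciPy: precondition a 2x2 block system with the block upper-triangular factor built from a Schur complement, applied inside GMRES. The blocks here are small random stand-ins, not the linearized shallow-water operators of CAM-SE, and the Schur complement is formed exactly only because the example is tiny.

```python
# Generic sketch of an approximate block factorization preconditioner in GMRES.
import numpy as np
import scipy.sparse.linalg as spla

rng = np.random.default_rng(2)
n = 40
A = np.eye(n) * 4 + rng.standard_normal((n, n)) * 0.1
B = rng.standard_normal((n, n)) * 0.1
C = rng.standard_normal((n, n)) * 0.1
D = np.eye(n) * 3 + rng.standard_normal((n, n)) * 0.1

J = np.block([[A, B], [C, D]])          # 2x2 block "Jacobian"
b = rng.standard_normal(2 * n)

S = D - C @ np.linalg.solve(A, B)       # Schur complement (exact, for the sketch)

def apply_prec(r):
    """Solve the block upper-triangular system [[A, B], [0, S]] z = r."""
    r1, r2 = r[:n], r[n:]
    z2 = np.linalg.solve(S, r2)
    z1 = np.linalg.solve(A, r1 - B @ z2)
    return np.concatenate([z1, z2])

M = spla.LinearOperator((2 * n, 2 * n), matvec=apply_prec)
x, info = spla.gmres(J, b, M=M)
print("GMRES converged:", info == 0,
      " residual norm:", np.linalg.norm(J @ x - b))
```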
A fully electric field driven scalable magnetoelectric switching element
NASA Astrophysics Data System (ADS)
Ahmed, R.; Victora, R. H.
2018-04-01
A technique for micromagnetic simulation of the magnetoelectric (ME) effect in Cr2O3 based structures has been developed. It has been observed that the microscopic ME susceptibility differs significantly from the experimentally measured values. The deviation between the two susceptibilities becomes more prominent near the Curie temperature, affecting the operation of the device at room temperature. A fully electric field controlled ME switching element has been proposed for use at technologically interesting densities: it employs quantum mechanical exchange at the boundaries instead of the applied magnetic field needed in traditional switching schemes. After establishing temperature dependent physics-based parameters, switching performances have been studied for different temperatures, applied electric fields, and Cr2O3 cross-sections. It has been found that our proposed use of quantum mechanical exchange favors reduced electric field operation and enhanced scalability while retaining reliable thermal stability.
Multi-service small-cell cloud wired/wireless access network based on tunable optical frequency comb
NASA Astrophysics Data System (ADS)
Xiang, Yu; Zhou, Kun; Yang, Liu; Pan, Lei; Liao, Zhen-wan; Zhang, Qiang
2015-11-01
In this paper, we demonstrate a novel multi-service wired/wireless integrated access architecture for cloud radio access networks (C-RAN) based on a radio-over-fiber passive optical network (RoF-PON) system, which utilizes scalable multiple-frequency millimeter-wave (MF-MMW) generation based on a tunable optical frequency comb (TOFC). In the baseband unit (BBU) pool, the generated optical comb lines are modulated into wired, RoF and WiFi/WiMAX signals, respectively. The multi-frequency RoF signals are generated by beating the optical comb line pairs in the small cell. The WiFi/WiMAX signals are demodulated after passing through the band pass filter (BPF) and band stop filter (BSF), respectively, whereas the wired signal can be received directly. The feasibility and scalability of the proposed multi-service wired/wireless integrated C-RAN are confirmed by simulations.
Robot formation control in stealth mode with scalable team size
NASA Astrophysics Data System (ADS)
Yu, Hongjun; Shi, Peng; Lim, Cheng-Chew
2016-11-01
In situations where robots need to maintain electromagnetic silence in a formation, communication channels become unavailable. Moreover, as passive displacement sensors are used, limited sensing ranges are inevitable due to power insufficiency and limited noise reduction. To address the formation control problem for a scalable team of robots subject to the above restrictions, a flexible strategy is necessary. In this paper, under the assumption that data transmission among the robots is not available, a novel controller and a protocol are designed that do not rely on communication. As the controller only drives the robots to a partially desired formation, a distributed coordination protocol is proposed to resolve the imperfections. It is shown that the effectiveness of the controller and the protocol relies on the formation connectivity, and a condition is given on the sensing range. Simulations are conducted to illustrate the feasibility and advantages of the new design scheme.
Status Report on NEAMS PROTEUS/ORIGEN Integration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wieselquist, William A
2016-02-18
The US Department of Energy's Nuclear Energy Advanced Modeling and Simulation (NEAMS) Program has contributed significantly to the development of the PROTEUS neutron transport code at Argonne National Laboratory and to the Oak Ridge Isotope Generation and Depletion Code (ORIGEN) depletion/decay code at Oak Ridge National Laboratory. PROTEUS's key capability is its efficient and scalable (up to hundreds of thousands of cores) neutron transport solver on general, unstructured, three-dimensional finite-element-type meshes. The scalability and mesh generality enable the transfer of neutron and power distributions to other codes in the NEAMS toolkit for advanced multiphysics analysis. Recently, ORIGEN has received considerable modernization to provide the high-performance depletion/decay capability within the NEAMS toolkit. This work presents a description of the initial integration of ORIGEN in PROTEUS, mainly performed during FY 2015, with minor updates in FY 2016.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Katti, Amogh; Di Fatta, Giuseppe; Naughton III, Thomas J
Future extreme-scale high-performance computing systems will be required to work under frequent component failures. The MPI Forum's User Level Failure Mitigation proposal has introduced an operation, MPI_Comm_shrink, to synchronize the alive processes on the list of failed processes, so that applications can continue to execute even in the presence of failures by adopting algorithm-based fault tolerance techniques. This MPI_Comm_shrink operation requires a fault-tolerant failure detection and consensus algorithm. This paper presents and compares two novel failure detection and consensus algorithms. The proposed algorithms are based on Gossip protocols and are inherently fault-tolerant and scalable. The proposed algorithms were implemented and tested using the Extreme-scale Simulator. The results show that in both algorithms the number of Gossip cycles needed to achieve global consensus scales logarithmically with system size. The second algorithm also shows better scalability in terms of memory and network bandwidth usage and perfect synchronization in achieving global consensus.
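A toy push-gossip simulation can illustrate the logarithmic scaling of cycles with system size reported above. This sketch is only a simplified stand-in; it is not the paper's algorithm, and the failed-process set and system sizes are illustrative.

```python
# Simplified sketch: gossip-style dissemination of a failed-process list.
# Each cycle, every alive process pushes its current view to one random peer.
import math
import random

def cycles_to_consensus(n_procs, failed, seed=0):
    rng = random.Random(seed)
    alive = [p for p in range(n_procs) if p not in failed]
    view = {p: set() for p in alive}
    view[alive[0]] = set(failed)          # one process detects the failures
    cycles = 0
    while any(view[p] != set(failed) for p in alive):
        cycles += 1
        for p in alive:
            q = rng.choice(alive)
            view[q] |= view[p]            # push local view to a random peer
    return cycles

for n in (64, 256, 1024, 4096):
    c = cycles_to_consensus(n, failed={0, 1})
    print(f"n={n:5d}  cycles={c:3d}  log2(n)={math.log2(n):.1f}")
```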
permGPU: Using graphics processing units in RNA microarray association studies.
Shterev, Ivo D; Jung, Sin-Ho; George, Stephen L; Owzar, Kouros
2010-06-16
Many analyses of microarray association studies involve permutation, bootstrap resampling and cross-validation, which are ideally formulated as embarrassingly parallel computing problems. Given that these analyses are computationally intensive, scalable approaches that can take advantage of multi-core processor systems need to be developed. We have developed a CUDA-based implementation, permGPU, that employs graphics processing units in microarray association studies. We illustrate the performance and applicability of permGPU within the context of permutation resampling for a number of test statistics. An extensive simulation study demonstrates a dramatic increase in performance when using permGPU on an NVIDIA GTX 280 card compared to an optimized C/C++ solution running on a conventional Linux server. permGPU is available as an open-source stand-alone application and as an extension package for the R statistical environment. It provides a dramatic increase in performance for permutation resampling analysis in the context of microarray association studies. The current version offers six test statistics for carrying out permutation resampling analyses for binary, quantitative and censored time-to-event traits.
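A minimal CPU sketch of the kind of permutation resampling permGPU accelerates: a mean-difference statistic for one gene with a permutation p-value. Group sizes, effect size and permutation count are arbitrary illustrative values; permGPU's actual statistics and implementation differ.

```python
# Minimal permutation-resampling sketch (NumPy, single gene).
import numpy as np

rng = np.random.default_rng(7)
n_case, n_ctrl, n_perm = 30, 30, 10_000
expr = np.concatenate([rng.normal(0.6, 1.0, n_case),   # cases, shifted mean
                       rng.normal(0.0, 1.0, n_ctrl)])  # controls
labels = np.array([1] * n_case + [0] * n_ctrl)

def mean_diff(x, y):
    return x[y == 1].mean() - x[y == 0].mean()

observed = mean_diff(expr, labels)
null = np.empty(n_perm)
for i in range(n_perm):
    null[i] = mean_diff(expr, rng.permutation(labels))

p_value = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
print(f"observed diff = {observed:.3f}, permutation p = {p_value:.4f}")
```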
An overview of the DII-HEP OpenStack based CMS data analysis
NASA Astrophysics Data System (ADS)
Osmani, L.; Tarkoma, S.; Eerola, P.; Komu, M.; Kortelainen, M. J.; Kraemer, O.; Lindén, T.; Toor, S.; White, J.
2015-05-01
An OpenStack-based private cloud with the Cluster File System has been built and used with both CMS analysis and Monte Carlo simulation jobs in the Datacenter Indirection Infrastructure for Secure High Energy Physics (DII-HEP) project. On the cloud we run the ARC middleware, which allows running CMS applications without changes on the job submission side. Our test results indicate that the adopted approach provides a scalable and resilient solution for managing resources without compromising on performance and high availability. To manage the virtual machines (VM) dynamically in an elastic fashion, we are testing the EMI authorization service (Argus) and the Execution Environment Service (Argus-EES). An OpenStack plugin has been developed for Argus-EES. The Host Identity Protocol (HIP) has been designed for mobile networks and provides a secure method for IP multihoming. HIP separates the end-point identifier and locator roles of the IP address, which increases network availability for the applications. Our solution leverages HIP for traffic management. This presentation gives an update on the status of the work and our lessons learned in creating an OpenStack-based cloud for HEP.
A suffix arrays based approach to semantic search in P2P systems
NASA Astrophysics Data System (ADS)
Shi, Qingwei; Zhao, Zheng; Bao, Hu
2007-09-01
Building a semantic search system on top of peer-to-peer (P2P) networks is becoming an attractive and promising alternative scheme for reasons of scalability, data freshness and search cost. In this paper, we present a Suffix Arrays based algorithm for Semantic Search (SASS) in P2P systems, which generates a distributed Semantic Overlay Network (SONs) construction for full-text search in P2P networks. For each node in the P2P network, SASS distributes document indices based on a set of suffix arrays, by which clusters are created depending on words or phrases shared between documents; therefore, the search cost for a given query is decreased by scanning only semantically related documents. In contrast to recently announced SONs schemes designed using metadata or predefined classes, SASS is an unsupervised approach for decentralized generation of SONs. SASS is also an incremental, linear-time algorithm, which efficiently handles the problem of node updates in P2P networks. Our simulation results demonstrate that SASS yields high search efficiency in dynamic environments.
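A toy single-document suffix-array sketch (standard library only) shows the indexing primitive involved: build the sorted suffix index and locate every occurrence of a query term. In SASS these indices are distributed across peers; the construction and search below are a generic textbook version, not the paper's distributed algorithm.

```python
# Toy suffix-array build and substring search for one document.
def build_suffix_array(text):
    # O(n^2 log n) construction -- fine for a small illustrative document.
    return sorted(range(len(text)), key=lambda i: text[i:])

def find(text, sa, query):
    # Binary search for the first suffix >= query.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:] < query:
            lo = mid + 1
        else:
            hi = mid
    # Collect the contiguous block of suffixes that start with the query.
    hits = []
    while lo < len(sa) and text[sa[lo]:sa[lo] + len(query)] == query:
        hits.append(sa[lo])
        lo += 1
    return sorted(hits)

doc = "semantic search in peer to peer networks enables semantic overlays"
sa = build_suffix_array(doc)
print(find(doc, sa, "semantic"))   # start offsets of every match
print(find(doc, sa, "peer"))
```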
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kurt Derr
Mobile Ad hoc NETworks (MANETs) are distributed self-organizing networks that can change locations and configure themselves on the fly. This paper focuses on an algorithmic approach for the deployment of a MANET within an enclosed area, such as a building in a disaster scenario, which can provide a robust communication infrastructure for search and rescue operations. While a virtual spring mesh (VSM) algorithm provides the scalable, self-organizing, and fault-tolerant capabilities required by a MANET, the VSM lacks the MANET's capabilities of deployment mechanisms for blanket coverage of an area and does not provide an obstacle avoidance mechanism. This paper presents a new technique, an extended VSM (EVSM) algorithm, that provides the following novelties: (1) new control laws for exploration and expansion to provide blanket coverage, (2) virtual adaptive springs enabling the mesh to expand as necessary, (3) adaptation to communications disturbances by varying the density and movement of mobile nodes, and (4) new metrics to assess the performance of the EVSM algorithm. Simulation results show that EVSM provides up to 16% more coverage and is 3.5 times faster than VSM in environments with eight obstacles.
Laser-Based Pedestrian Tracking in Outdoor Environments by Multiple Mobile Robots
Ozaki, Masataka; Kakimuma, Kei; Hashimoto, Masafumi; Takahashi, Kazuhiko
2012-01-01
This paper presents an outdoor laser-based pedestrian tracking system using a group of mobile robots located near each other. Each robot detects pedestrians from its own laser scan image using an occupancy-grid-based method, and the robot tracks the detected pedestrians via Kalman filtering and global-nearest-neighbor (GNN)-based data association. The tracking data is broadcast to multiple robots through intercommunication and is combined using the covariance intersection (CI) method. For pedestrian tracking, each robot identifies its own posture using real-time-kinematic GPS (RTK-GPS) and laser scan matching. Using our cooperative tracking method, all the robots share the tracking data with each other; hence, individual robots can always recognize pedestrians that are invisible to any other robot. The simulation and experimental results show that cooperative tracking provides better tracking performance than conventional individual tracking. Our tracking system functions in a decentralized manner without any central server, and therefore provides a degree of scalability and robustness that cannot be achieved by conventional centralized architectures. PMID:23202171
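Covariance intersection, used above to fuse track estimates whose cross-correlation is unknown, has a compact form that can be sketched directly. The estimates below are illustrative values, and the weight is chosen here by a coarse grid search over the fused covariance trace; the paper's implementation may choose the weight differently.

```python
# Hedged sketch of covariance intersection (CI) for two track estimates.
import numpy as np

def covariance_intersection(x1, P1, x2, P2, n_grid=101):
    best = None
    for w in np.linspace(0.0, 1.0, n_grid):
        info = w * np.linalg.inv(P1) + (1.0 - w) * np.linalg.inv(P2)
        P = np.linalg.inv(info)
        if best is None or np.trace(P) < np.trace(best[1]):
            x = P @ (w * np.linalg.inv(P1) @ x1 + (1.0 - w) * np.linalg.inv(P2) @ x2)
            best = (x, P)
    return best

# Two robots' position estimates of the same pedestrian (illustrative values).
x1, P1 = np.array([2.0, 1.0]), np.diag([0.30, 0.05])
x2, P2 = np.array([2.3, 0.8]), np.diag([0.05, 0.40])
x_f, P_f = covariance_intersection(x1, P1, x2, P2)
print("fused estimate:", x_f)
print("fused covariance:\n", P_f)
```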
Scalable approximate policies for Markov decision process models of hospital elective admissions.
Zhu, George; Lizotte, Dan; Hoey, Jesse
2014-05-01
To demonstrate the feasibility of using stochastic simulation methods for the solution of a large-scale Markov decision process model of on-line patient admissions scheduling. The problem of admissions scheduling is modeled as a Markov decision process in which the states represent numbers of patients using each of a number of resources. We investigate current state-of-the-art real-time planning methods to compute solutions to this Markov decision process. Due to the complexity of the model, traditional model-based planners are limited in scalability since they require an explicit enumeration of the model dynamics. To overcome this challenge, we apply sample-based planners along with efficient simulation techniques that, given an initial start state, generate an action on demand while avoiding portions of the model that are irrelevant to the start state. We also propose a novel variant of a popular sample-based planner that is particularly well suited to the elective admissions problem. Results show that the stochastic simulation methods allow the problem size to be scaled by a factor of almost 10 in the action space, and exponentially in the state space. We have demonstrated our approach on a problem with 81 actions, four specialities and four treatment patterns, and shown that we can generate solutions that are near-optimal in about 100 seconds. Sample-based planners are a viable alternative to state-based planners for large Markov decision process models of elective admissions scheduling. Copyright © 2014 Elsevier B.V. All rights reserved.
al3c: high-performance software for parameter inference using Approximate Bayesian Computation.
Stram, Alexander H; Marjoram, Paul; Chen, Gary K
2015-11-01
The development of Approximate Bayesian Computation (ABC) algorithms for parameter inference which are both computationally efficient and scalable in parallel computing environments is an important area of research. Monte Carlo rejection sampling, a fundamental component of ABC algorithms, is trivial to distribute over multiple processors but is inherently inefficient. While the development of algorithms such as ABC Sequential Monte Carlo (ABC-SMC) helps address the inherent inefficiencies of rejection sampling, such approaches are not as easily scaled on multiple processors. As a result, current Bayesian inference software offerings that use ABC-SMC lack the ability to scale in parallel computing environments. We present al3c, a C++ framework for implementing ABC-SMC in parallel. By requiring only that users define essential functions such as the simulation model and prior distribution function, al3c abstracts the user from both the complexities of parallel programming and the details of the ABC-SMC algorithm. By using the al3c framework, the user is able to scale the ABC-SMC algorithm in parallel computing environments for his or her specific application, with minimal programming overhead. al3c is offered as a static binary for Linux and OS X computing environments. The user completes an XML configuration file and a C++ plug-in template for the specific application, which are used by al3c to obtain the desired results. Users can download the static binaries, source code, reference documentation and examples (including those in this article) by visiting https://github.com/ahstram/al3c. Contact: astram@usc.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
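For readers unfamiliar with the rejection-sampling building block that ABC-SMC refines, a minimal sketch follows. It is not the al3c API: it infers a Poisson rate from one observed count by accepting prior draws whose simulated summary falls within a tolerance; the observation, prior and tolerance are illustrative. An ABC-SMC scheme would re-weight and perturb these particles over a decreasing tolerance schedule.

```python
# Minimal ABC rejection sketch (NumPy), not the al3c framework.
import numpy as np

rng = np.random.default_rng(3)
observed = 23                                # observed event count (illustrative)
n_draws, epsilon = 200_000, 1

theta = rng.uniform(0.0, 60.0, n_draws)      # prior on the Poisson rate
simulated = rng.poisson(theta)               # one forward simulation per draw
accepted = theta[np.abs(simulated - observed) <= epsilon]

print(f"accepted {accepted.size} of {n_draws} draws")
print(f"posterior mean ~= {accepted.mean():.2f}, "
      f"95% interval ~= {np.percentile(accepted, [2.5, 97.5])}")
```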
Lemkul, Justin A; Roux, Benoît; van der Spoel, David; MacKerell, Alexander D
2015-07-15
Explicit treatment of electronic polarization in empirical force fields used for molecular dynamics simulations represents an important advancement in simulation methodology. A straightforward means of treating electronic polarization in these simulations is the inclusion of Drude oscillators, which are auxiliary, charge-carrying particles bonded to the cores of atoms in the system. The additional degrees of freedom make these simulations more computationally expensive relative to simulations using traditional fixed-charge (additive) force fields. Thus, efficient tools are needed for conducting these simulations. Here, we present the implementation of highly scalable algorithms in the GROMACS simulation package that allow for the simulation of polarizable systems using extended Lagrangian dynamics with a dual Nosé-Hoover thermostat as well as simulations using a full self-consistent field treatment of polarization. The performance of systems of varying size is evaluated, showing that the present code parallelizes efficiently and is the fastest implementation of the extended Lagrangian methods currently available for simulations using the Drude polarizable force field. © 2015 Wiley Periodicals, Inc.
Establishing a Novel Modeling Tool: A Python-Based Interface for a Neuromorphic Hardware System
Brüderle, Daniel; Müller, Eric; Davison, Andrew; Muller, Eilif; Schemmel, Johannes; Meier, Karlheinz
2008-01-01
Neuromorphic hardware systems provide new possibilities for the neuroscience modeling community. Due to the intrinsic parallelism of the micro-electronic emulation of neural computation, such models are highly scalable without a loss of speed. However, the communities of software simulator users and neuromorphic engineering in neuroscience are rather disjoint. We present a software concept that provides the possibility to establish such hardware devices as valuable modeling tools. It is based on the integration of the hardware interface into a simulator-independent language which allows for unified experiment descriptions that can be run on various simulation platforms without modification, implying experiment portability and a huge simplification of the quantitative comparison of hardware and simulator results. We introduce an accelerated neuromorphic hardware device and describe the implementation of the proposed concept for this system. An example setup and results acquired by utilizing both the hardware system and a software simulator are demonstrated. PMID:19562085
Establishing a novel modeling tool: a python-based interface for a neuromorphic hardware system.
Brüderle, Daniel; Müller, Eric; Davison, Andrew; Muller, Eilif; Schemmel, Johannes; Meier, Karlheinz
2009-01-01
Neuromorphic hardware systems provide new possibilities for the neuroscience modeling community. Due to the intrinsic parallelism of the micro-electronic emulation of neural computation, such models are highly scalable without a loss of speed. However, the communities of software simulator users and neuromorphic engineering in neuroscience are rather disjoint. We present a software concept that provides the possibility to establish such hardware devices as valuable modeling tools. It is based on the integration of the hardware interface into a simulator-independent language which allows for unified experiment descriptions that can be run on various simulation platforms without modification, implying experiment portability and a huge simplification of the quantitative comparison of hardware and simulator results. We introduce an accelerated neuromorphic hardware device and describe the implementation of the proposed concept for this system. An example setup and results acquired by utilizing both the hardware system and a software simulator are demonstrated.
A framework for service enterprise workflow simulation with multi-agents cooperation
NASA Astrophysics Data System (ADS)
Tan, Wenan; Xu, Wei; Yang, Fujun; Xu, Lida; Jiang, Chuanqun
2013-11-01
Process dynamic modelling for service business is the key technique for service-oriented information systems and service business management, and the workflow model of business processes is the core part of service systems. Service business workflow simulation is the prevalent approach used for dynamic analysis of service business processes. The generic method for service business workflow simulation is based on discrete-event queuing theory, which lacks flexibility and scalability. In this paper, we propose a service-workflow-oriented framework for the process simulation of service businesses using multi-agent cooperation to address the above issues. Social rationality of agents is introduced into the proposed framework. Adopting rationality as one social factor for decision-making strategies, a flexible scheduling for activity instances has been implemented. A system prototype has been developed to validate the proposed simulation framework through a business case study.
Parallel Dynamics Simulation Using a Krylov-Schwarz Linear Solution Scheme
Abhyankar, Shrirang; Constantinescu, Emil M.; Smith, Barry F.; ...
2016-11-07
Fast dynamics simulation of large-scale power systems is a computational challenge because of the need to solve a large set of stiff, nonlinear differential-algebraic equations at every time step. The main bottleneck in dynamic simulations is the solution of a linear system during each nonlinear iteration of Newton's method. In this paper, we present a parallel Krylov-Schwarz linear solution scheme that uses the Krylov subspace-based iterative linear solver GMRES with an overlapping restricted additive Schwarz preconditioner. Performance tests of the proposed Krylov-Schwarz scheme for several large test cases ranging from 2,000 to 20,000 buses, including a real utility network, show good scalability on different computing architectures.
Parallel Dynamics Simulation Using a Krylov-Schwarz Linear Solution Scheme
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abhyankar, Shrirang; Constantinescu, Emil M.; Smith, Barry F.
Fast dynamics simulation of large-scale power systems is a computational challenge because of the need to solve a large set of stiff, nonlinear differential-algebraic equations at every time step. The main bottleneck in dynamic simulations is the solution of a linear system during each nonlinear iteration of Newton's method. In this paper, we present a parallel Krylov-Schwarz linear solution scheme that uses the Krylov subspace-based iterative linear solver GMRES with an overlapping restricted additive Schwarz preconditioner. Performance tests of the proposed Krylov-Schwarz scheme for several large test cases ranging from 2,000 to 20,000 buses, including a real utility network, show good scalability on different computing architectures.
Simulating advanced life support systems to test integrated control approaches
NASA Astrophysics Data System (ADS)
Kortenkamp, D.; Bell, S.
Simulations allow for testing of life support control approaches before hardware is designed and built. Simulations also allow for the safe exploration of alternative control strategies during life support operation. As such, they are an important component of any life support research program and testbed. This paper describes a specific advanced life support simulation being created at NASA Johnson Space Center. It is a discrete-event simulation that is dynamic and stochastic. It simulates all major components of an advanced life support system, including crew (with variable ages, weights and genders), biomass production (with scalable plantings of ten different crops), water recovery, air revitalization, food processing, solid waste recycling and energy production. Each component is modeled as a producer of certain resources and a consumer of certain resources. The control system must monitor (via sensors) and control (via actuators) the flow of resources throughout the system to provide life support functionality. The simulation is written in an object-oriented paradigm that makes it portable, extensible and reconfigurable.
Scalable nanohelices for predictive studies and enhanced 3D visualization.
Meagher, Kwyn A; Doblack, Benjamin N; Ramirez, Mercedes; Davila, Lilian P
2014-11-12
Spring-like materials are ubiquitous in nature and of interest in nanotechnology for energy harvesting, hydrogen storage, and biological sensing applications. For predictive simulations, it has become increasingly important to be able to model the structure of nanohelices accurately. To study the effect of local structure on the properties of these complex geometries one must develop realistic models. To date, software packages are rather limited in creating atomistic helical models. This work focuses on producing atomistic models of silica glass (SiO₂) nanoribbons and nanosprings for molecular dynamics (MD) simulations. Using an MD model of "bulk" silica glass, two computational procedures to precisely create the shape of nanoribbons and nanosprings are presented. The first method employs the AWK programming language and open-source software to effectively carve various shapes of silica nanoribbons from the initial bulk model, using desired dimensions and parametric equations to define a helix. With this method, accurate atomistic silica nanoribbons can be generated for a range of pitch values and dimensions. The second method involves a more robust code which allows flexibility in modeling nanohelical structures. This approach utilizes a C++ code particularly written to implement pre-screening methods as well as the mathematical equations for a helix, resulting in greater precision and efficiency when creating nanospring models. Using these codes, well-defined and scalable nanoribbons and nanosprings suited for atomistic simulations can be effectively created. An added value in both open-source codes is that they can be adapted to reproduce different helical structures, independent of material. In addition, a MATLAB graphical user interface (GUI) is used to enhance learning through visualization and interaction for a general user with the atomistic helical structures. One application of these methods is the recent study of nanohelices via MD simulations for mechanical energy harvesting purposes.
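The parametric-helix idea described above can be sketched compactly: generate points along a helix and keep only those atoms of a "bulk" model that lie within a wire radius of the curve. The point cloud below is a random stand-in for the silica glass coordinates used in the paper, the helix dimensions are assumed values, and the nearest-neighbor query relies on SciPy rather than the AWK or C++ tools the paper describes.

```python
# Hedged sketch: carve a nanospring from a stand-in bulk model using a
# parametric helix and a nearest-point distance test.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(5)
atoms = rng.uniform(-30.0, 30.0, (200_000, 3))   # stand-in "bulk" coordinates

R, pitch, turns, wire_r = 15.0, 10.0, 3.0, 2.5   # assumed helix dimensions
t = np.linspace(0.0, 2.0 * np.pi * turns, 3000)
helix = np.column_stack([R * np.cos(t),
                         R * np.sin(t),
                         pitch * t / (2.0 * np.pi) - 15.0])

# Keep an atom if its distance to the nearest helix sample is below wire_r.
dist, _ = cKDTree(helix).query(atoms)
keep = atoms[dist <= wire_r]
print(f"carved nanospring keeps {len(keep)} of {len(atoms)} candidate atoms")
```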
A scalable moment-closure approximation for large-scale biochemical reaction networks
Kazeroonian, Atefeh; Theis, Fabian J.; Hasenauer, Jan
2017-01-01
Motivation: Stochastic molecular processes are a leading cause of cell-to-cell variability. Their dynamics are often described by continuous-time discrete-state Markov chains and simulated using stochastic simulation algorithms. As these stochastic simulations are computationally demanding, ordinary differential equation models for the dynamics of the statistical moments have been developed. The number of state variables of these approximating models, however, grows at least quadratically with the number of biochemical species. This limits their application to small- and medium-sized processes. Results: In this article, we present a scalable moment-closure approximation (sMA) for the simulation of statistical moments of large-scale stochastic processes. The sMA exploits the structure of the biochemical reaction network to reduce the covariance matrix. We prove that sMA yields approximating models whose number of state variables depends predominantly on local properties, i.e. the average node degree of the reaction network, instead of the overall network size. The resulting complexity reduction is assessed by studying a range of medium- and large-scale biochemical reaction networks. To evaluate the approximation accuracy and the improvement in computational efficiency, we study models for JAK2/STAT5 signalling and NFκB signalling. Our method is applicable to generic biochemical reaction networks and we provide an implementation, including an SBML interface, which renders the sMA easily accessible. Availability and implementation: The sMA is implemented in the open-source MATLAB toolbox CERENA and is available from https://github.com/CERENADevelopers/CERENA. Contact: jan.hasenauer@helmholtz-muenchen.de or atefeh.kazeroonian@tum.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28881983
Wang, Sibo; Wu, Yunchao; Miao, Ran; ...
2017-07-26
Scalable and cost-effective synthesis and assembly of technologically important nanostructures in three-dimensional (3D) substrates hold keys to bridging the nanotechnologies demonstrated in academia with industrially relevant scalable manufacturing. In this paper, using ZnO nanorod arrays as an example, a hydrothermal-based continuous flow synthesis (CFS) method is successfully used to integrate the nano-arrays in multi-channeled monolithic cordierite. Compared to the batch process, CFS enhances the average growth rate of nano-arrays by 125%, with the average length increasing from 2 μm to 4.5 μm within the same growth time of 4 hours. The precursor utilization efficiency of CFS is enhanced 9-fold compared to that of the batch process by preserving the majority of precursors in recyclable solution. Computational fluid dynamic simulation suggests a steady-state solution flow and mass transport inside the channels of the honeycomb substrates, giving rise to steady and consecutive growth of ZnO nano-arrays with an average length of 10 μm in 12 h. The monolithic ZnO nano-array-integrated cordierite obtained through CFS shows enhanced low-temperature (200 °C) desulfurization capacity and recyclability in comparison to ZnO powder wash-coated cordierite. This can be attributed to exposed ZnO {10-10} planes, better dispersion and stronger interactions between sorbent and reactant in the ZnO nanorod arrays, as well as the sintering resistance of nano-array configurations during sulfidation-regeneration cycles. Finally, with the demonstrated scalable synthesis and desulfurization performance of ZnO nano-arrays, a promising, industrially relevant integration strategy is provided to fabricate metal oxide nano-array-based monolithic devices for various environmental and energy applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Sibo; Wu, Yunchao; Miao, Ran
Scalable and cost-effective synthesis and assembly of technologically important nanostructures in three-dimensional (3D) substrates hold keys to bridging the nanotechnologies demonstrated in academia with industrially relevant scalable manufacturing. In this paper, using ZnO nanorod arrays as an example, a hydrothermal-based continuous flow synthesis (CFS) method is successfully used to integrate the nano-arrays in multi-channeled monolithic cordierite. Compared to the batch process, CFS enhances the average growth rate of nano-arrays by 125%, with the average length increasing from 2 μm to 4.5 μm within the same growth time of 4 hours. The precursor utilization efficiency of CFS is enhanced 9-fold compared to that of the batch process by preserving the majority of precursors in recyclable solution. Computational fluid dynamic simulation suggests a steady-state solution flow and mass transport inside the channels of the honeycomb substrates, giving rise to steady and consecutive growth of ZnO nano-arrays with an average length of 10 μm in 12 h. The monolithic ZnO nano-array-integrated cordierite obtained through CFS shows enhanced low-temperature (200 °C) desulfurization capacity and recyclability in comparison to ZnO powder wash-coated cordierite. This can be attributed to exposed ZnO {10-10} planes, better dispersion and stronger interactions between sorbent and reactant in the ZnO nanorod arrays, as well as the sintering resistance of nano-array configurations during sulfidation-regeneration cycles. Finally, with the demonstrated scalable synthesis and desulfurization performance of ZnO nano-arrays, a promising, industrially relevant integration strategy is provided to fabricate metal oxide nano-array-based monolithic devices for various environmental and energy applications.
NASA Astrophysics Data System (ADS)
Rababaah, Haroun; Shirkhodaie, Amir
2009-04-01
Rapidly advancing hardware technology, smart sensors, and sensor networks are advancing environmental sensing. One major application of this technology is Large-Scale Surveillance Systems (LS3), especially for homeland security, battlefield intelligence, facility guarding and other civilian applications. The efficient and effective deployment of LS3 requires addressing a number of aspects impacting the scalability of such systems. The scalability factors are related to: computation and memory utilization efficiency; communication bandwidth utilization; network topology (e.g., centralized, ad-hoc, hierarchical or hybrid); network communication protocol and data routing schemes; and local and global data/information fusion schemes for situational awareness. Although many models have been proposed to address one aspect or another of these issues, few have addressed the need for a multi-modality multi-agent data/information fusion approach with characteristics satisfying the requirements of current and future intelligent sensors and sensor networks. In this paper, we present a novel scalable fusion engine for multi-modality multi-agent information fusion for LS3. The new fusion engine is based on a concept we call Energy Logic. Experimental results of this work, compared to a fuzzy logic model, strongly support the validity of the new model and inspire future directions for different levels of fusion and different applications.
ANCA: Anharmonic Conformational Analysis of Biomolecular Simulations.
Parvatikar, Akash; Vacaliuc, Gabriel S; Ramanathan, Arvind; Chennubhotla, S Chakra
2018-05-08
Anharmonicity in time-dependent conformational fluctuations is noted to be a key feature of the functional dynamics of biomolecules. Although anharmonic events are rare, long-timescale (μs-ms and beyond) simulations facilitate probing of such events. We have previously developed quasi-anharmonic analysis to resolve higher-order spatial correlations and characterize anharmonicity in biomolecular simulations. In this article, we have extended this toolbox to resolve higher-order temporal correlations and built a scalable Python package called anharmonic conformational analysis (ANCA). ANCA has modules to: 1) measure anharmonicity in the form of higher-order statistics and its variation as a function of time, 2) output a storyboard representation of the simulations to identify key anharmonic conformational events, and 3) identify putative anharmonic conformational substates and visualize transitions between these substates. Copyright © 2018 Biophysical Society. Published by Elsevier Inc. All rights reserved.
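A hedged sketch of the idea behind module (1): track a higher-order statistic (excess kurtosis) of a projected coordinate in sliding windows, so that windows containing rare anharmonic excursions stand out. The synthetic trajectory below is Gaussian noise with a few injected jumps; it is not ANCA's actual estimator or API.

```python
# Toy windowed higher-order-statistics sketch (NumPy + SciPy).
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(11)
traj = rng.normal(0.0, 1.0, 50_000)   # stand-in projected coordinate
traj[20_000:20_050] += 6.0            # a brief anharmonic-like excursion

window = 1_000
kurt = np.array([kurtosis(traj[i:i + window])          # excess kurtosis
                 for i in range(0, len(traj) - window, window)])

print("per-window excess kurtosis:")
print(np.array_str(kurt, precision=2))
print("most anharmonic window index:", int(np.argmax(np.abs(kurt))))
```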