parallel virtual machine: Topics by Science.gov

Sample records for parallel virtual machine

Paging memory from random access memory to backing storage in a parallel computer

DOEpatents

Archer, Charles J; Blocksome, Michael A; Inglett, Todd A; Ratterman, Joseph D; Smith, Brian E

2013-05-21

Paging memory from random access memory (`RAM`) to backing storage in a parallel computer that includes a plurality of compute nodes, including: executing a data processing application on a virtual machine operating system in a virtual machine on a first compute node; providing, by a second compute node, backing storage for the contents of RAM on the first compute node; and swapping, by the virtual machine operating system in the virtual machine on the first compute node, a page of memory from RAM on the first compute node to the backing storage on the second compute node.
Productive High Performance Parallel Programming with Auto-tuned Domain-Specific Embedded Languages

DTIC Science & Technology

2013-01-02

Compilation JVM Java Virtual Machine KB Kilobyte KDT Knowledge Discovery Toolbox LAPACK Linear Algebra Package LLVM Low-Level Virtual Machine LOC Lines...different starting points. Leo Meyerovich also helped solidify some of the ideas here in discussions during Par Lab retreats. I would also like to thank...multi-timestep computations by blocking in both time and space. 88 Implementation Output Approx DSL Type Language Language Parallelism LoC Graphite
Virtual Manufacturing Techniques Designed and Applied to Manufacturing Activities in the Manufacturing Integration and Technology Branch

NASA Technical Reports Server (NTRS)

Shearrow, Charles A.

1999-01-01

One of the identified goals of EM3 is to implement virtual manufacturing by the time the year 2000 has ended. To realize this goal of a true virtual manufacturing enterprise the initial development of a machinability database and the infrastructure must be completed. This will consist of the containment of the existing EM-NET problems and developing machine, tooling, and common materials databases. To integrate the virtual manufacturing enterprise with normal day to day operations the development of a parallel virtual manufacturing machinability database, virtual manufacturing database, virtual manufacturing paradigm, implementation/integration procedure, and testable verification models must be constructed. Common and virtual machinability databases will include the four distinct areas of machine tools, available tooling, common machine tool loads, and a materials database. The machine tools database will include the machine envelope, special machine attachments, tooling capacity, location within NASA-JSC or with a contractor, and availability/scheduling. The tooling database will include available standard tooling, custom in-house tooling, tool properties, and availability. The common materials database will include materials thickness ranges, strengths, types, and their availability. The virtual manufacturing databases will consist of virtual machines and virtual tooling directly related to the common and machinability databases. The items to be completed are the design and construction of the machinability databases, virtual manufacturing paradigm for NASA-JSC, implementation timeline, VNC model of one bridge mill and troubleshoot existing software and hardware problems with EN4NET. The final step of this virtual manufacturing project will be to integrate other production sites into the databases bringing JSC's EM3 into a position of becoming a clearing house for NASA's digital manufacturing needs creating a true virtual manufacturing enterprise.
Parallel computation for biological sequence comparison: comparing a portable model to the native model for the Intel Hypercube.

PubMed

Nadkarni, P M; Miller, P L

1991-01-01

A parallel program for inter-database sequence comparison was developed on the Intel Hypercube using two models of parallel programming. One version was built using machine-specific Hypercube parallel programming commands. The other version was built using Linda, a machine-independent parallel programming language. The two versions of the program provide a case study comparing these two approaches to parallelization in an important biological application area. Benchmark tests with both programs gave comparable results with a small number of processors. As the number of processors was increased, the Linda version was somewhat less efficient. The Linda version was also run without change on Network Linda, a virtual parallel machine running on a network of desktop workstations.
Parallel computation for biological sequence comparison: comparing a portable model to the native model for the Intel Hypercube.

PubMed Central

Nadkarni, P. M.; Miller, P. L.

1991-01-01

A parallel program for inter-database sequence comparison was developed on the Intel Hypercube using two models of parallel programming. One version was built using machine-specific Hypercube parallel programming commands. The other version was built using Linda, a machine-independent parallel programming language. The two versions of the program provide a case study comparing these two approaches to parallelization in an important biological application area. Benchmark tests with both programs gave comparable results with a small number of processors. As the number of processors was increased, the Linda version was somewhat less efficient. The Linda version was also run without change on Network Linda, a virtual parallel machine running on a network of desktop workstations. PMID:1807632
Means and method of balancing multi-cylinder reciprocating machines

DOEpatents

Corey, John A.; Walsh, Michael M.

1985-01-01

A virtual balancing axis arrangement is described for multi-cylinder reciprocating piston machines for effectively balancing out imbalanced forces and minimizing residual imbalance moments acting on the crankshaft of such machines without requiring the use of additional parallel-arrayed balancing shafts or complex and expensive gear arrangements. The novel virtual balancing axis arrangement is capable of being designed into multi-cylinder reciprocating piston and crankshaft machines for substantially reducing vibrations induced during operation of such machines with only minimal number of additional component parts. Some of the required component parts may be available from parts already required for operation of auxiliary equipment, such as oil and water pumps used in certain types of reciprocating piston and crankshaft machine so that by appropriate location and dimensioning in accordance with the teachings of the invention, the virtual balancing axis arrangement can be built into the machine at little or no additional cost.
PISCES: An environment for parallel scientific computation

NASA Technical Reports Server (NTRS)

Pratt, T. W.

1985-01-01

The parallel implementation of scientific computing environment (PISCES) is a project to provide high-level programming environments for parallel MIMD computers. Pisces 1, the first of these environments, is a FORTRAN 77 based environment which runs under the UNIX operating system. The Pisces 1 user programs in Pisces FORTRAN, an extension of FORTRAN 77 for parallel processing. The major emphasis in the Pisces 1 design is in providing a carefully specified virtual machine that defines the run-time environment within which Pisces FORTRAN programs are executed. Each implementation then provides the same virtual machine, regardless of differences in the underlying architecture. The design is intended to be portable to a variety of architectures. Currently Pisces 1 is implemented on a network of Apollo workstations and on a DEC VAX uniprocessor via simulation of the task level parallelism. An implementation for the Flexible Computing Corp. FLEX/32 is under construction. An introduction to the Pisces 1 virtual computer and the FORTRAN 77 extensions is presented. An example of an algorithm for the iterative solution of a system of equations is given. The most notable features of the design are the provision for several granularities of parallelism in programs and the provision of a window mechanism for distributed access to large arrays of data.
Research in Parallel Algorithms and Software for Computational Aerosciences

NASA Technical Reports Server (NTRS)

Domel, Neal D.

1996-01-01

Phase I is complete for the development of a Computational Fluid Dynamics parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.
Research in Parallel Algorithms and Software for Computational Aerosciences

NASA Technical Reports Server (NTRS)

Domel, Neal D.

1996-01-01

Phase 1 is complete for the development of a computational fluid dynamics CFD) parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.
Optimized Hypervisor Scheduler for Parallel Discrete Event Simulations on Virtual Machine Platforms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yoginath, Srikanth B; Perumalla, Kalyan S

2013-01-01

With the advent of virtual machine (VM)-based platforms for parallel computing, it is now possible to execute parallel discrete event simulations (PDES) over multiple virtual machines, in contrast to executing in native mode directly over hardware as is traditionally done over the past decades. While mature VM-based parallel systems now offer new, compelling benefits such as serviceability, dynamic reconfigurability and overall cost effectiveness, the runtime performance of parallel applications can be significantly affected. In particular, most VM-based platforms are optimized for general workloads, but PDES execution exhibits unique dynamics significantly different from other workloads. Here we first present results frommore » experiments that highlight the gross deterioration of the runtime performance of VM-based PDES simulations when executed using traditional VM schedulers, quantitatively showing the bad scaling properties of the scheduler as the number of VMs is increased. The mismatch is fundamental in nature in the sense that any fairness-based VM scheduler implementation would exhibit this mismatch with PDES runs. We also present a new scheduler optimized specifically for PDES applications, and describe its design and implementation. Experimental results obtained from running PDES benchmarks (PHOLD and vehicular traffic simulations) over VMs show over an order of magnitude improvement in the run time of the PDES-optimized scheduler relative to the regular VM scheduler, with over 20 reduction in run time of simulations using up to 64 VMs. The observations and results are timely in the context of emerging systems such as cloud platforms and VM-based high performance computing installations, highlighting to the community the need for PDES-specific support, and the feasibility of significantly reducing the runtime overhead for scalable PDES on VM platforms.« less
The core legion object model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lewis, M.; Grimshaw, A.

1996-12-31

The Legion project at the University of Virginia is an architecture for designing and building system services that provide the illusion of a single virtual machine to users, a virtual machine that provides secure shared object and shared name spaces, application adjustable fault-tolerance, improved response time, and greater throughput. Legion targets wide area assemblies of workstations, supercomputers, and parallel supercomputers, Legion tackles problems not solved by existing workstation based parallel processing tools; the system will enable fault-tolerance, wide area parallel processing, inter-operability, heterogeneity, a single global name space, protection, security, efficient scheduling, and comprehensive resource management. This paper describes themore » core Legion object model, which specifies the composition and functionality of Legion`s core objects-those objects that cooperate to create, locate, manage, and remove objects in the Legion system. The object model facilitates a flexible extensible implementation, provides a single global name space, grants site autonomy to participating organizations, and scales to millions of sites and trillions of objects.« less
Distributed computing methodology for training neural networks in an image-guided diagnostic application.

PubMed

Plagianakos, V P; Magoulas, G D; Vrahatis, M N

2006-03-01

Distributed computing is a process through which a set of computers connected by a network is used collectively to solve a single problem. In this paper, we propose a distributed computing methodology for training neural networks for the detection of lesions in colonoscopy. Our approach is based on partitioning the training set across multiple processors using a parallel virtual machine. In this way, interconnected computers of varied architectures can be used for the distributed evaluation of the error function and gradient values, and, thus, training neural networks utilizing various learning methods. The proposed methodology has large granularity and low synchronization, and has been implemented and tested. Our results indicate that the parallel virtual machine implementation of the training algorithms developed leads to considerable speedup, especially when large network architectures and training sets are used.
Highly parallel sparse Cholesky factorization

NASA Technical Reports Server (NTRS)

Gilbert, John R.; Schreiber, Robert

1990-01-01

Several fine grained parallel algorithms were developed and compared to compute the Cholesky factorization of a sparse matrix. The experimental implementations are on the Connection Machine, a distributed memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special purpose algorithms in which the matrix structure conforms to the connection structure of the machine, the focus is on matrices with arbitrary sparsity structure. The most promising algorithm is one whose inner loop performs several dense factorizations simultaneously on a 2-D grid of processors. Virtually any massively parallel dense factorization algorithm can be used as the key subroutine. The sparse code attains execution rates comparable to those of the dense subroutine. Although at present architectural limitations prevent the dense factorization from realizing its potential efficiency, it is concluded that a regular data parallel architecture can be used efficiently to solve arbitrarily structured sparse problems. A performance model is also presented and it is used to analyze the algorithms.
Performance prediction: A case study using a multi-ring KSR-1 machine

NASA Technical Reports Server (NTRS)

Sun, Xian-He; Zhu, Jianping

1995-01-01

While computers with tens of thousands of processors have successfully delivered high performance power for solving some of the so-called 'grand-challenge' applications, the notion of scalability is becoming an important metric in the evaluation of parallel machine architectures and algorithms. In this study, the prediction of scalability and its application are carefully investigated. A simple formula is presented to show the relation between scalability, single processor computing power, and degradation of parallelism. A case study is conducted on a multi-ring KSR1 shared virtual memory machine. Experimental and theoretical results show that the influence of topology variation of an architecture is predictable. Therefore, the performance of an algorithm on a sophisticated, heirarchical architecture can be predicted and the best algorithm-machine combination can be selected for a given application.
Using PVM to host CLIPS in distributed environments

NASA Technical Reports Server (NTRS)

Myers, Leonard; Pohl, Kym

1994-01-01

It is relatively easy to enhance CLIPS (C Language Integrated Production System) to support multiple expert systems running in a distributed environment with heterogeneous machines. The task is minimized by using the PVM (Parallel Virtual Machine) code from Oak Ridge Labs to provide the distributed utility. PVM is a library of C and FORTRAN subprograms that supports distributive computing on many different UNIX platforms. A PVM deamon is easily installed on each CPU that enters the virtual machine environment. Any user with rsh or rexec access to a machine can use the one PVM deamon to obtain a generous set of distributed facilities. The ready availability of both CLIPS and PVM makes the combination of software particularly attractive for budget conscious experimentation of heterogeneous distributive computing with multiple CLIPS executables. This paper presents a design that is sufficient to provide essential message passing functions in CLIPS and enable the full range of PVM facilities.
A Distributed Parallel Genetic Algorithm of Placement Strategy for Virtual Machines Deployment on Cloud Platform

PubMed Central

Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong

2014-01-01

The cloud platform provides various services to users. More and more cloud centers provide infrastructure as the main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) of placement strategy for virtual machines deployment on cloud platform. It executes the genetic algorithm parallelly and distributedly on several selected physical hosts in the first stage. Then it continues to execute the genetic algorithm of the second stage with solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and it is more effective and more energy efficient than other placement strategies on the cloud platform. PMID:25097872
A distributed parallel genetic algorithm of placement strategy for virtual machines deployment on cloud platform.

PubMed

Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong

2014-01-01

The cloud platform provides various services to users. More and more cloud centers provide infrastructure as the main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) of placement strategy for virtual machines deployment on cloud platform. It executes the genetic algorithm parallelly and distributedly on several selected physical hosts in the first stage. Then it continues to execute the genetic algorithm of the second stage with solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and it is more effective and more energy efficient than other placement strategies on the cloud platform.
Global Detection of Live Virtual Machine Migration Based on Cellular Neural Networks

PubMed Central

Xie, Kang; Yang, Yixian; Zhang, Ling; Jing, Maohua; Xin, Yang; Li, Zhongxian

2014-01-01

In order to meet the demands of operation monitoring of large scale, autoscaling, and heterogeneous virtual resources in the existing cloud computing, a new method of live virtual machine (VM) migration detection algorithm based on the cellular neural networks (CNNs), is presented. Through analyzing the detection process, the parameter relationship of CNN is mapped as an optimization problem, in which improved particle swarm optimization algorithm based on bubble sort is used to solve the problem. Experimental results demonstrate that the proposed method can display the VM migration processing intuitively. Compared with the best fit heuristic algorithm, this approach reduces the processing time, and emerging evidence has indicated that this new approach is affordable to parallelism and analog very large scale integration (VLSI) implementation allowing the VM migration detection to be performed better. PMID:24959631
Global detection of live virtual machine migration based on cellular neural networks.

PubMed

Xie, Kang; Yang, Yixian; Zhang, Ling; Jing, Maohua; Xin, Yang; Li, Zhongxian

2014-01-01

In order to meet the demands of operation monitoring of large scale, autoscaling, and heterogeneous virtual resources in the existing cloud computing, a new method of live virtual machine (VM) migration detection algorithm based on the cellular neural networks (CNNs), is presented. Through analyzing the detection process, the parameter relationship of CNN is mapped as an optimization problem, in which improved particle swarm optimization algorithm based on bubble sort is used to solve the problem. Experimental results demonstrate that the proposed method can display the VM migration processing intuitively. Compared with the best fit heuristic algorithm, this approach reduces the processing time, and emerging evidence has indicated that this new approach is affordable to parallelism and analog very large scale integration (VLSI) implementation allowing the VM migration detection to be performed better.
The engine design engine. A clustered computer platform for the aerodynamic inverse design and analysis of a full engine

NASA Technical Reports Server (NTRS)

Sanz, J.; Pischel, K.; Hubler, D.

1992-01-01

An application for parallel computation on a combined cluster of powerful workstations and supercomputers was developed. A Parallel Virtual Machine (PVM) is used as message passage language on a macro-tasking parallelization of the Aerodynamic Inverse Design and Analysis for a Full Engine computer code. The heterogeneous nature of the cluster is perfectly handled by the controlling host machine. Communication is established via Ethernet with the TCP/IP protocol over an open network. A reasonable overhead is imposed for internode communication, rendering an efficient utilization of the engaged processors. Perhaps one of the most interesting features of the system is its versatile nature, that permits the usage of the computational resources available that are experiencing less use at a given point in time.

Shared virtual memory and generalized speedup

NASA Technical Reports Server (NTRS)

Sun, Xian-He; Zhu, Jianping

1994-01-01

Generalized speedup is defined as parallel speed over sequential speed. The generalized speedup and its relation with other existing performance metrics, such as traditional speedup, efficiency, scalability, etc., are carefully studied. In terms of the introduced asymptotic speed, it was shown that the difference between the generalized speedup and the traditional speedup lies in the definition of the efficiency of uniprocessor processing, which is a very important issue in shared virtual memory machines. A scientific application was implemented on a KSR-1 parallel computer. Experimental and theoretical results show that the generalized speedup is distinct from the traditional speedup and provides a more reasonable measurement. In the study of different speedups, various causes of superlinear speedup are also presented.
A distributed version of the NASA Engine Performance Program

NASA Technical Reports Server (NTRS)

Cours, Jeffrey T.; Curlett, Brian P.

1993-01-01

Distributed NEPP, a version of the NASA Engine Performance Program, uses the original NEPP code but executes it in a distributed computer environment. Multiple workstations connected by a network increase the program's speed and, more importantly, the complexity of the cases it can handle in a reasonable time. Distributed NEPP uses the public domain software package, called Parallel Virtual Machine, allowing it to execute on clusters of machines containing many different architectures. It includes the capability to link with other computers, allowing them to process NEPP jobs in parallel. This paper discusses the design issues and granularity considerations that entered into programming Distributed NEPP and presents the results of timing runs.
Parallel Compilation on Virtual Machines in a Development Cloud Environment

DTIC Science & Technology

2013-09-01

the potential impact of a possible course of action. 1-2 2. Approach We performed a simple experiment to determine whether the multiple CPUs...PERFORMING ORGANIZATION NAME(S) AND ADDRESSES 8. PERFORMING ORGANIZATION REPORT NUMBER D-4996 H13 -001206 Institute for Defense Analyses 4850 Mark
A Parallel Vector Machine for the PM Programming Language

NASA Astrophysics Data System (ADS)

Bellerby, Tim

2016-04-01

PM is a new programming language which aims to make the writing of computational geoscience models on parallel hardware accessible to scientists who are not themselves expert parallel programmers. It is based around the concept of communicating operators: language constructs that enable variables local to a single invocation of a parallelised loop to be viewed as if they were arrays spanning the entire loop domain. This mechanism enables different loop invocations (which may or may not be executing on different processors) to exchange information in a manner that extends the successful Communicating Sequential Processes idiom from single messages to collective communication. Communicating operators avoid the additional synchronisation mechanisms, such as atomic variables, required when programming using the Partitioned Global Address Space (PGAS) paradigm. Using a single loop invocation as the fundamental unit of concurrency enables PM to uniformly represent different levels of parallelism from vector operations through shared memory systems to distributed grids. This paper describes an implementation of PM based on a vectorised virtual machine. On a single processor node, concurrent operations are implemented using masked vector operations. Virtual machine instructions operate on vectors of values and may be unmasked, masked using a Boolean field, or masked using an array of active vector cell locations. Conditional structures (such as if-then-else or while statement implementations) calculate and apply masks to the operations they control. A shift in mask representation from Boolean to location-list occurs when active locations become sufficiently sparse. Parallel loops unfold data structures (or vectors of data structures for nested loops) into vectors of values that may additionally be distributed over multiple computational nodes and then split into micro-threads compatible with the size of the local cache. Inter-node communication is accomplished using standard OpenMP and MPI. Performance analyses of the PM vector machine, demonstrating its scaling properties with respect to domain size and the number of processor nodes will be presented for a range of hardware configurations. The PM software and language definition are being made available under unrestrictive MIT and Creative Commons Attribution licenses respectively: www.pm-lang.org.
Automated Instrumentation, Monitoring and Visualization of PVM Programs Using AIMS

NASA Technical Reports Server (NTRS)

Mehra, Pankaj; VanVoorst, Brian; Yan, Jerry; Lum, Henry, Jr. (Technical Monitor)

1994-01-01

We present views and analysis of the execution of several PVM (Parallel Virtual Machine) codes for Computational Fluid Dynamics on a networks of Sparcstations, including: (1) NAS Parallel Benchmarks CG and MG; (2) a multi-partitioning algorithm for NAS Parallel Benchmark SP; and (3) an overset grid flowsolver. These views and analysis were obtained using our Automated Instrumentation and Monitoring System (AIMS) version 3.0, a toolkit for debugging the performance of PVM programs. We will describe the architecture, operation and application of AIMS. The AIMS toolkit contains: (1) Xinstrument, which can automatically instrument various computational and communication constructs in message-passing parallel programs; (2) Monitor, a library of runtime trace-collection routines; (3) VK (Visual Kernel), an execution-animation tool with source-code clickback; and (4) Tally, a tool for statistical analysis of execution profiles. Currently, Xinstrument can handle C and Fortran 77 programs using PVM 3.2.x; Monitor has been implemented and tested on Sun 4 systems running SunOS 4.1.2; and VK uses XIIR5 and Motif 1.2. Data and views obtained using AIMS clearly illustrate several characteristic features of executing parallel programs on networked workstations: (1) the impact of long message latencies; (2) the impact of multiprogramming overheads and associated load imbalance; (3) cache and virtual-memory effects; and (4) significant skews between workstation clocks. Interestingly, AIMS can compensate for constant skew (zero drift) by calibrating the skew between a parent and its spawned children. In addition, AIMS' skew-compensation algorithm can adjust timestamps in a way that eliminates physically impossible communications (e.g., messages going backwards in time). Our current efforts are directed toward creating new views to explain the observed performance of PVM programs. Some of the features planned for the near future include: (1) ConfigView, showing the physical topology of the virtual machine, inferred using specially formatted IP (Internet Protocol) packets: and (2) LoadView, synchronous animation of PVM-program execution and resource-utilization patterns.
Dust Dynamics in Protoplanetary Disks: Parallel Computing with PVM

NASA Astrophysics Data System (ADS)

de La Fuente Marcos, Carlos; Barge, Pierre; de La Fuente Marcos, Raúl

2002-03-01

We describe a parallel version of our high-order-accuracy particle-mesh code for the simulation of collisionless protoplanetary disks. We use this code to carry out a massively parallel, two-dimensional, time-dependent, numerical simulation, which includes dust particles, to study the potential role of large-scale, gaseous vortices in protoplanetary disks. This noncollisional problem is easy to parallelize on message-passing multicomputer architectures. We performed the simulations on a cache-coherent nonuniform memory access Origin 2000 machine, using both the parallel virtual machine (PVM) and message-passing interface (MPI) message-passing libraries. Our performance analysis suggests that, for our problem, PVM is about 25% faster than MPI. Using PVM and MPI made it possible to reduce CPU time and increase code performance. This allows for simulations with a large number of particles (N ~ 105-106) in reasonable CPU times. The performances of our implementation of the pa! rallel code on an Origin 2000 supercomputer are presented and discussed. They exhibit very good speedup behavior and low load unbalancing. Our results confirm that giant gaseous vortices can play a dominant role in giant planet formation.
Making extreme computations possible with virtual machines

NASA Astrophysics Data System (ADS)

Reuter, J.; Chokoufe Nejad, B.; Ohl, T.

2016-10-01

State-of-the-art algorithms generate scattering amplitudes for high-energy physics at leading order for high-multiplicity processes as compiled code (in Fortran, C or C++). For complicated processes the size of these libraries can become tremendous (many GiB). We show that amplitudes can be translated to byte-code instructions, which even reduce the size by one order of magnitude. The byte-code is interpreted by a Virtual Machine with runtimes comparable to compiled code and a better scaling with additional legs. We study the properties of this algorithm, as an extension of the Optimizing Matrix Element Generator (O'Mega). The bytecode matrix elements are available as alternative input for the event generator WHIZARD. The bytecode interpreter can be implemented very compactly, which will help with a future implementation on massively parallel GPUs.
A framework for grand scale parallelization of the combined finite discrete element method in 2d

NASA Astrophysics Data System (ADS)

Lei, Z.; Rougier, E.; Knight, E. E.; Munjiza, A.

2014-09-01

Within the context of rock mechanics, the Combined Finite-Discrete Element Method (FDEM) has been applied to many complex industrial problems such as block caving, deep mining techniques (tunneling, pillar strength, etc.), rock blasting, seismic wave propagation, packing problems, dam stability, rock slope stability, rock mass strength characterization problems, etc. The reality is that most of these were accomplished in a 2D and/or single processor realm. In this work a hardware independent FDEM parallelization framework has been developed using the Virtual Parallel Machine for FDEM, (V-FDEM). With V-FDEM, a parallel FDEM software can be adapted to different parallel architecture systems ranging from just a few to thousands of cores.
Performance Evaluation in Network-Based Parallel Computing

NASA Technical Reports Server (NTRS)

Dezhgosha, Kamyar

1996-01-01

Network-based parallel computing is emerging as a cost-effective alternative for solving many problems which require use of supercomputers or massively parallel computers. The primary objective of this project has been to conduct experimental research on performance evaluation for clustered parallel computing. First, a testbed was established by augmenting our existing SUNSPARCs' network with PVM (Parallel Virtual Machine) which is a software system for linking clusters of machines. Second, a set of three basic applications were selected. The applications consist of a parallel search, a parallel sort, a parallel matrix multiplication. These application programs were implemented in C programming language under PVM. Third, we conducted performance evaluation under various configurations and problem sizes. Alternative parallel computing models and workload allocations for application programs were explored. The performance metric was limited to elapsed time or response time which in the context of parallel computing can be expressed in terms of speedup. The results reveal that the overhead of communication latency between processes in many cases is the restricting factor to performance. That is, coarse-grain parallelism which requires less frequent communication between processes will result in higher performance in network-based computing. Finally, we are in the final stages of installing an Asynchronous Transfer Mode (ATM) switch and four ATM interfaces (each 155 Mbps) which will allow us to extend our study to newer applications, performance metrics, and configurations.
Computer network defense system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Urias, Vincent; Stout, William M. S.; Loverro, Caleb

A method and apparatus for protecting virtual machines. A computer system creates a copy of a group of the virtual machines in an operating network in a deception network to form a group of cloned virtual machines in the deception network when the group of the virtual machines is accessed by an adversary. The computer system creates an emulation of components from the operating network in the deception network. The components are accessible by the group of the cloned virtual machines as if the group of the cloned virtual machines was in the operating network. The computer system moves networkmore » connections for the group of the virtual machines in the operating network used by the adversary from the group of the virtual machines in the operating network to the group of the cloned virtual machines, enabling protecting the group of the virtual machines from actions performed by the adversary.« less
Efficient operating system level virtualization techniques for cloud resources

NASA Astrophysics Data System (ADS)

Ansu, R.; Samiksha; Anju, S.; Singh, K. John

2017-11-01

Cloud computing is an advancing technology which provides the servcies of Infrastructure, Platform and Software. Virtualization and Computer utility are the keys of Cloud computing. The numbers of cloud users are increasing day by day. So it is the need of the hour to make resources available on demand to satisfy user requirements. The technique in which resources namely storage, processing power, memory and network or I/O are abstracted is known as Virtualization. For executing the operating systems various virtualization techniques are available. They are: Full System Virtualization and Para Virtualization. In Full Virtualization, the whole architecture of hardware is duplicated virtually. No modifications are required in Guest OS as the OS deals with the VM hypervisor directly. In Para Virtualization, modifications of OS is required to run in parallel with other OS. For the Guest OS to access the hardware, the host OS must provide a Virtual Machine Interface. OS virtualization has many advantages such as migrating applications transparently, consolidation of server, online maintenance of OS and providing security. This paper briefs both the virtualization techniques and discusses the issues in OS level virtualization.
1001 Ways to run AutoDock Vina for virtual screening

NASA Astrophysics Data System (ADS)

Jaghoori, Mohammad Mahdi; Bleijlevens, Boris; Olabarriaga, Silvia D.

2016-03-01

Large-scale computing technologies have enabled high-throughput virtual screening involving thousands to millions of drug candidates. It is not trivial, however, for biochemical scientists to evaluate the technical alternatives and their implications for running such large experiments. Besides experience with the molecular docking tool itself, the scientist needs to learn how to run it on high-performance computing (HPC) infrastructures, and understand the impact of the choices made. Here, we review such considerations for a specific tool, AutoDock Vina, and use experimental data to illustrate the following points: (1) an additional level of parallelization increases virtual screening throughput on a multi-core machine; (2) capturing of the random seed is not enough (though necessary) for reproducibility on heterogeneous distributed computing systems; (3) the overall time spent on the screening of a ligand library can be improved by analysis of factors affecting execution time per ligand, including number of active torsions, heavy atoms and exhaustiveness. We also illustrate differences among four common HPC infrastructures: grid, Hadoop, small cluster and multi-core (virtual machine on the cloud). Our analysis shows that these platforms are suitable for screening experiments of different sizes. These considerations can guide scientists when choosing the best computing platform and set-up for their future large virtual screening experiments.
1001 Ways to run AutoDock Vina for virtual screening.

PubMed

Jaghoori, Mohammad Mahdi; Bleijlevens, Boris; Olabarriaga, Silvia D

2016-03-01

Large-scale computing technologies have enabled high-throughput virtual screening involving thousands to millions of drug candidates. It is not trivial, however, for biochemical scientists to evaluate the technical alternatives and their implications for running such large experiments. Besides experience with the molecular docking tool itself, the scientist needs to learn how to run it on high-performance computing (HPC) infrastructures, and understand the impact of the choices made. Here, we review such considerations for a specific tool, AutoDock Vina, and use experimental data to illustrate the following points: (1) an additional level of parallelization increases virtual screening throughput on a multi-core machine; (2) capturing of the random seed is not enough (though necessary) for reproducibility on heterogeneous distributed computing systems; (3) the overall time spent on the screening of a ligand library can be improved by analysis of factors affecting execution time per ligand, including number of active torsions, heavy atoms and exhaustiveness. We also illustrate differences among four common HPC infrastructures: grid, Hadoop, small cluster and multi-core (virtual machine on the cloud). Our analysis shows that these platforms are suitable for screening experiments of different sizes. These considerations can guide scientists when choosing the best computing platform and set-up for their future large virtual screening experiments.
Parallel implementation of D-Phylo algorithm for maximum likelihood clusters.

PubMed

Malik, Shamita; Sharma, Dolly; Khatri, Sunil Kumar

2017-03-01

This study explains a newly developed parallel algorithm for phylogenetic analysis of DNA sequences. The newly designed D-Phylo is a more advanced algorithm for phylogenetic analysis using maximum likelihood approach. The D-Phylo while misusing the seeking capacity of k -means keeps away from its real constraint of getting stuck at privately conserved motifs. The authors have tested the behaviour of D-Phylo on Amazon Linux Amazon Machine Image(Hardware Virtual Machine)i2.4xlarge, six central processing unit, 122 GiB memory, 8 × 800 Solid-state drive Elastic Block Store volume, high network performance up to 15 processors for several real-life datasets. Distributing the clusters evenly on all the processors provides us the capacity to accomplish a near direct speed if there should arise an occurrence of huge number of processors.
Dynamically programmable cache

NASA Astrophysics Data System (ADS)

Nakkar, Mouna; Harding, John A.; Schwartz, David A.; Franzon, Paul D.; Conte, Thomas

1998-10-01

Reconfigurable machines have recently been used as co- processors to accelerate the execution of certain algorithms or program subroutines. The problems with the above approach include high reconfiguration time and limited partial reconfiguration. By far the most critical problems are: (1) the small on-chip memory which results in slower execution time, and (2) small FPGA areas that cannot implement large subroutines. Dynamically Programmable Cache (DPC) is a novel architecture for embedded processors which offers solutions to the above problems. To solve memory access problems, DPC processors merge reconfigurable arrays with the data cache at various cache levels to create a multi-level reconfigurable machines. As a result DPC machines have both higher data accessibility and FPGA memory bandwidth. To solve the limited FPGA resource problem, DPC processors implemented multi-context switching (Virtualization) concept. Virtualization allows implementation of large subroutines with fewer FPGA cells. Additionally, DPC processors can parallelize the execution of several operations resulting in faster execution time. In this paper, the speedup improvement for DPC machines are shown to be 5X faster than an Altera FLEX10K FPGA chip and 2X faster than a Sun Ultral SPARC station for two different algorithms (convolution and motion estimation).
Design and implementation of a reliable and cost-effective cloud computing infrastructure: the INFN Napoli experience

NASA Astrophysics Data System (ADS)

Capone, V.; Esposito, R.; Pardi, S.; Taurino, F.; Tortone, G.

2012-12-01

Over the last few years we have seen an increasing number of services and applications needed to manage and maintain cloud computing facilities. This is particularly true for computing in high energy physics, which often requires complex configurations and distributed infrastructures. In this scenario a cost effective rationalization and consolidation strategy is the key to success in terms of scalability and reliability. In this work we describe an IaaS (Infrastructure as a Service) cloud computing system, with high availability and redundancy features, which is currently in production at INFN-Naples and ATLAS Tier-2 data centre. The main goal we intended to achieve was a simplified method to manage our computing resources and deliver reliable user services, reusing existing hardware without incurring heavy costs. A combined usage of virtualization and clustering technologies allowed us to consolidate our services on a small number of physical machines, reducing electric power costs. As a result of our efforts we developed a complete solution for data and computing centres that can be easily replicated using commodity hardware. Our architecture consists of 2 main subsystems: a clustered storage solution, built on top of disk servers running GlusterFS file system, and a virtual machines execution environment. GlusterFS is a network file system able to perform parallel writes on multiple disk servers, providing this way live replication of data. High availability is also achieved via a network configuration using redundant switches and multiple paths between hypervisor hosts and disk servers. We also developed a set of management scripts to easily perform basic system administration tasks such as automatic deployment of new virtual machines, adaptive scheduling of virtual machines on hypervisor hosts, live migration and automated restart in case of hypervisor failures.
Modeling and simulation of five-axis virtual machine based on NX

NASA Astrophysics Data System (ADS)

Li, Xiaoda; Zhan, Xianghui

2018-04-01

Virtual technology in the machinery manufacturing industry has shown the role of growing. In this paper, the Siemens NX software is used to model the virtual CNC machine tool, and the parameters of the virtual machine are defined according to the actual parameters of the machine tool so that the virtual simulation can be carried out without loss of the accuracy of the simulation. How to use the machine builder of the CAM module to define the kinematic chain and machine components of the machine is described. The simulation of virtual machine can provide alarm information of tool collision and over cutting during the process to users, and can evaluate and forecast the rationality of the technological process.
Development of a Big Data Application Architecture for Navy Manpower, Personnel, Training, and Education

DTIC Science & Technology

2016-03-01

science IT information technology JBOD just a bunch of disks JDBC java database connectivity xviii JPME Joint Professional Military Education JSO...Joint Service Officer JVM java virtual machine MPP massively parallel processing MPTE Manpower, Personnel, Training, and Education NAVMAC Navy...27 external database, whether it is MySQL , Oracle, DB2, or SQL Server (Teller, 2015). Connectors optimize the data transfer by obtaining metadata
Trajectory Tracking of a Planer Parallel Manipulator by Using Computed Force Control Method

NASA Astrophysics Data System (ADS)

Bayram, Atilla

2017-03-01

Despite small workspace, parallel manipulators have some advantages over their serial counterparts in terms of higher speed, acceleration, rigidity, accuracy, manufacturing cost and payload. Accordingly, this type of manipulators can be used in many applications such as in high-speed machine tools, tuning machine for feeding, sensitive cutting, assembly and packaging. This paper presents a special type of planar parallel manipulator with three degrees of freedom. It is constructed as a variable geometry truss generally known planar Stewart platform. The reachable and orientation workspaces are obtained for this manipulator. The inverse kinematic analysis is solved for the trajectory tracking according to the redundancy and joint limit avoidance. Then, the dynamics model of the manipulator is established by using Virtual Work method. The simulations are performed to follow the given planar trajectories by using the dynamic equations of the variable geometry truss manipulator and computed force control method. In computed force control method, the feedback gain matrices for PD control are tuned with fixed matrices by trail end error and variable ones by means of optimization with genetic algorithm.
Efficiently modeling neural networks on massively parallel computers

NASA Technical Reports Server (NTRS)

Farber, Robert M.

1993-01-01

Neural networks are a very useful tool for analyzing and modeling complex real world systems. Applying neural network simulations to real world problems generally involves large amounts of data and massive amounts of computation. To efficiently handle the computational requirements of large problems, we have implemented at Los Alamos a highly efficient neural network compiler for serial computers, vector computers, vector parallel computers, and fine grain SIMD computers such as the CM-2 connection machine. This paper describes the mapping used by the compiler to implement feed-forward backpropagation neural networks for a SIMD (Single Instruction Multiple Data) architecture parallel computer. Thinking Machines Corporation has benchmarked our code at 1.3 billion interconnects per second (approximately 3 gigaflops) on a 64,000 processor CM-2 connection machine (Singer 1990). This mapping is applicable to other SIMD computers and can be implemented on MIMD computers such as the CM-5 connection machine. Our mapping has virtually no communications overhead with the exception of the communications required for a global summation across the processors (which has a sub-linear runtime growth on the order of O(log(number of processors)). We can efficiently model very large neural networks which have many neurons and interconnects and our mapping can extend to arbitrarily large networks (within memory limitations) by merging the memory space of separate processors with fast adjacent processor interprocessor communications. This paper will consider the simulation of only feed forward neural network although this method is extendable to recurrent networks.

Build and Execute Environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Guan, Qiang

At exascale, the challenge becomes to develop applications that run at scale and use exascale platforms reliably, efficiently, and flexibly. Workflows become much more complex because they must seamlessly integrate simulation and data analytics. They must include down-sampling, post-processing, feature extraction, and visualization. Power and data transfer limitations require these analysis tasks to be run in-situ or in-transit. We expect successful workflows will comprise multiple linked simulations along with tens of analysis routines. Users will have limited development time at scale and, therefore, must have rich tools to develop, debug, test, and deploy applications. At this scale, successful workflows willmore » compose linked computations from an assortment of reliable, well-defined computation elements, ones that can come and go as required, based on the needs of the workflow over time. We propose a novel framework that utilizes both virtual machines (VMs) and software containers to create a workflow system that establishes a uniform build and execution environment (BEE) beyond the capabilities of current systems. In this environment, applications will run reliably and repeatably across heterogeneous hardware and software. Containers, both commercial (Docker and Rocket) and open-source (LXC and LXD), define a runtime that isolates all software dependencies from the machine operating system. Workflows may contain multiple containers that run different operating systems, different software, and even different versions of the same software. We will run containers in open-source virtual machines (KVM) and emulators (QEMU) so that workflows run on any machine entirely in user-space. On this platform of containers and virtual machines, we will deliver workflow software that provides services, including repeatable execution, provenance, checkpointing, and future proofing. We will capture provenance about how containers were launched and how they interact to annotate workflows for repeatable and partial re-execution. We will coordinate the physical snapshots of virtual machines with parallel programming constructs, such as barriers, to automate checkpoint and restart. We will also integrate with HPC-specific container runtimes to gain access to accelerators and other specialized hardware to preserve native performance. Containers will link development to continuous integration. When application developers check code in, it will automatically be tested on a suite of different software and hardware architectures.« less
LAVA web-based remote simulation: enhancements for education and technology innovation

NASA Astrophysics Data System (ADS)

Lee, Sang Il; Ng, Ka Chun; Orimoto, Takashi; Pittenger, Jason; Horie, Toshi; Adam, Konstantinos; Cheng, Mosong; Croffie, Ebo H.; Deng, Yunfei; Gennari, Frank E.; Pistor, Thomas V.; Robins, Garth; Williamson, Mike V.; Wu, Bo; Yuan, Lei; Neureuther, Andrew R.

2001-09-01

The Lithography Analysis using Virtual Access (LAVA) web site at http://cuervo.eecs.berkeley.edu/Volcano/ has been enhanced with new optical and deposition applets, graphical infrastructure and linkage to parallel execution on networks of workstations. More than ten new graphical user interface applets have been designed to support education, illustrate novel concepts from research, and explore usage of parallel machines. These applets have been improved through feedback and classroom use. Over the last year LAVA provided industry and other academic communities 1,300 session and 700 rigorous simulations per month among the SPLAT, SAMPLE2D, SAMPLE3D, TEMPEST, STORM, and BEBS simulators.
: A Scalable and Transparent System for Simulating MPI Programs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Perumalla, Kalyan S

2010-01-01

is a scalable, transparent system for experimenting with the execution of parallel programs on simulated computing platforms. The level of simulated detail can be varied for application behavior as well as for machine characteristics. Unique features of are repeatability of execution, scalability to millions of simulated (virtual) MPI ranks, scalability to hundreds of thousands of host (real) MPI ranks, portability of the system to a variety of host supercomputing platforms, and the ability to experiment with scientific applications whose source-code is available. The set of source-code interfaces supported by is being expanded to support a wider set of applications, andmore » MPI-based scientific computing benchmarks are being ported. In proof-of-concept experiments, has been successfully exercised to spawn and sustain very large-scale executions of an MPI test program given in source code form. Low slowdowns are observed, due to its use of purely discrete event style of execution, and due to the scalability and efficiency of the underlying parallel discrete event simulation engine, sik. In the largest runs, has been executed on up to 216,000 cores of a Cray XT5 supercomputer, successfully simulating over 27 million virtual MPI ranks, each virtual rank containing its own thread context, and all ranks fully synchronized by virtual time.« less
Effects of virtualization on a scientific application - Running a hyperspectral radiative transfer code on virtual machines.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tikotekar, Anand A; Vallee, Geoffroy R; Naughton III, Thomas J

2008-01-01

The topic of system-level virtualization has recently begun to receive interest for high performance computing (HPC). This is in part due to the isolation and encapsulation offered by the virtual machine. These traits enable applications to customize their environments and maintain consistent software configurations in their virtual domains. Additionally, there are mechanisms that can be used for fault tolerance like live virtual machine migration. Given these attractive benefits to virtualization, a fundamental question arises, how does this effect my scientific application? We use this as the premise for our paper and observe a real-world scientific code running on a Xenmore » virtual machine. We studied the effects of running a radiative transfer simulation, Hydrolight, on a virtual machine. We discuss our methodology and report observations regarding the usage of virtualization with this application.« less
Request queues for interactive clients in a shared file system of a parallel computing system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bent, John M.; Faibish, Sorin

Interactive requests are processed from users of log-in nodes. A metadata server node is provided for use in a file system shared by one or more interactive nodes and one or more batch nodes. The interactive nodes comprise interactive clients to execute interactive tasks and the batch nodes execute batch jobs for one or more batch clients. The metadata server node comprises a virtual machine monitor; an interactive client proxy to store metadata requests from the interactive clients in an interactive client queue; a batch client proxy to store metadata requests from the batch clients in a batch client queue;more » and a metadata server to store the metadata requests from the interactive client queue and the batch client queue in a metadata queue based on an allocation of resources by the virtual machine monitor. The metadata requests can be prioritized, for example, based on one or more of a predefined policy and predefined rules.« less
Managing virtual machines with Vac and Vcycle

NASA Astrophysics Data System (ADS)

McNab, A.; Love, P.; MacMahon, E.

2015-12-01

We compare the Vac and Vcycle virtual machine lifecycle managers and our experiences in providing production job execution services for ATLAS, CMS, LHCb, and the GridPP VO at sites in the UK, France and at CERN. In both the Vac and Vcycle systems, the virtual machines are created outside of the experiment's job submission and pilot framework. In the case of Vac, a daemon runs on each physical host which manages a pool of virtual machines on that host, and a peer-to-peer UDP protocol is used to achieve the desired target shares between experiments across the site. In the case of Vcycle, a daemon manages a pool of virtual machines on an Infrastructure-as-a-Service cloud system such as OpenStack, and has within itself enough information to create the types of virtual machines to achieve the desired target shares. Both systems allow unused shares for one experiment to temporarily taken up by other experiements with work to be done. The virtual machine lifecycle is managed with a minimum of information, gathered from the virtual machine creation mechanism (such as libvirt or OpenStack) and using the proposed Machine/Job Features API from WLCG. We demonstrate that the same virtual machine designs can be used to run production jobs on Vac and Vcycle/OpenStack sites for ATLAS, CMS, LHCb, and GridPP, and that these technologies allow sites to be operated in a reliable and robust way.
Environmental concept for engineering software on MIMD computers

NASA Technical Reports Server (NTRS)

Lopez, L. A.; Valimohamed, K.

1989-01-01

The issues related to developing an environment in which engineering systems can be implemented on MIMD machines are discussed. The problem is presented in terms of implementing the finite element method under such an environment. However, neither the concepts nor the prototype implementation environment are limited to this application. The topics discussed include: the ability to schedule and synchronize tasks efficiently; granularity of tasks; load balancing; and the use of a high level language to specify parallel constructs, manage data, and achieve portability. The objective of developing a virtual machine concept which incorporates solutions to the above issues leads to a design that can be mapped onto loosely coupled, tightly coupled, and hybrid systems.
LHCb experience with running jobs in virtual machines

NASA Astrophysics Data System (ADS)

McNab, A.; Stagni, F.; Luzzi, C.

2015-12-01

The LHCb experiment has been running production jobs in virtual machines since 2013 as part of its DIRAC-based infrastructure. We describe the architecture of these virtual machines and the steps taken to replicate the WLCG worker node environment expected by user and production jobs. This relies on the uCernVM system for providing root images for virtual machines. We use the CernVM-FS distributed filesystem to supply the root partition files, the LHCb software stack, and the bootstrapping scripts necessary to configure the virtual machines for us. Using this approach, we have been able to minimise the amount of contextualisation which must be provided by the virtual machine managers. We explain the process by which the virtual machine is able to receive payload jobs submitted to DIRAC by users and production managers, and how this differs from payloads executed within conventional DIRAC pilot jobs on batch queue based sites. We describe our operational experiences in running production on VM based sites managed using Vcycle/OpenStack, Vac, and HTCondor Vacuum. Finally we show how our use of these resources is monitored using Ganglia and DIRAC.
Software platform virtualization in chemistry research and university teaching

PubMed Central

2009-01-01

Background Modern chemistry laboratories operate with a wide range of software applications under different operating systems, such as Windows, LINUX or Mac OS X. Instead of installing software on different computers it is possible to install those applications on a single computer using Virtual Machine software. Software platform virtualization allows a single guest operating system to execute multiple other operating systems on the same computer. We apply and discuss the use of virtual machines in chemistry research and teaching laboratories. Results Virtual machines are commonly used for cheminformatics software development and testing. Benchmarking multiple chemistry software packages we have confirmed that the computational speed penalty for using virtual machines is low and around 5% to 10%. Software virtualization in a teaching environment allows faster deployment and easy use of commercial and open source software in hands-on computer teaching labs. Conclusion Software virtualization in chemistry, mass spectrometry and cheminformatics is needed for software testing and development of software for different operating systems. In order to obtain maximum performance the virtualization software should be multi-core enabled and allow the use of multiprocessor configurations in the virtual machine environment. Server consolidation, by running multiple tasks and operating systems on a single physical machine, can lead to lower maintenance and hardware costs especially in small research labs. The use of virtual machines can prevent software virus infections and security breaches when used as a sandbox system for internet access and software testing. Complex software setups can be created with virtual machines and are easily deployed later to multiple computers for hands-on teaching classes. We discuss the popularity of bioinformatics compared to cheminformatics as well as the missing cheminformatics education at universities worldwide. PMID:20150997
Software platform virtualization in chemistry research and university teaching.

PubMed

Kind, Tobias; Leamy, Tim; Leary, Julie A; Fiehn, Oliver

2009-11-16

Modern chemistry laboratories operate with a wide range of software applications under different operating systems, such as Windows, LINUX or Mac OS X. Instead of installing software on different computers it is possible to install those applications on a single computer using Virtual Machine software. Software platform virtualization allows a single guest operating system to execute multiple other operating systems on the same computer. We apply and discuss the use of virtual machines in chemistry research and teaching laboratories. Virtual machines are commonly used for cheminformatics software development and testing. Benchmarking multiple chemistry software packages we have confirmed that the computational speed penalty for using virtual machines is low and around 5% to 10%. Software virtualization in a teaching environment allows faster deployment and easy use of commercial and open source software in hands-on computer teaching labs. Software virtualization in chemistry, mass spectrometry and cheminformatics is needed for software testing and development of software for different operating systems. In order to obtain maximum performance the virtualization software should be multi-core enabled and allow the use of multiprocessor configurations in the virtual machine environment. Server consolidation, by running multiple tasks and operating systems on a single physical machine, can lead to lower maintenance and hardware costs especially in small research labs. The use of virtual machines can prevent software virus infections and security breaches when used as a sandbox system for internet access and software testing. Complex software setups can be created with virtual machines and are easily deployed later to multiple computers for hands-on teaching classes. We discuss the popularity of bioinformatics compared to cheminformatics as well as the missing cheminformatics education at universities worldwide.
Using virtual machine monitors to overcome the challenges of monitoring and managing virtualized cloud infrastructures

NASA Astrophysics Data System (ADS)

Bamiah, Mervat Adib; Brohi, Sarfraz Nawaz; Chuprat, Suriayati

2012-01-01

Virtualization is one of the hottest research topics nowadays. Several academic researchers and developers from IT industry are designing approaches for solving security and manageability issues of Virtual Machines (VMs) residing on virtualized cloud infrastructures. Moving the application from a physical to a virtual platform increases the efficiency, flexibility and reduces management cost as well as effort. Cloud computing is adopting the paradigm of virtualization, using this technique, memory, CPU and computational power is provided to clients' VMs by utilizing the underlying physical hardware. Beside these advantages there are few challenges faced by adopting virtualization such as management of VMs and network traffic, unexpected additional cost and resource allocation. Virtual Machine Monitor (VMM) or hypervisor is the tool used by cloud providers to manage the VMs on cloud. There are several heterogeneous hypervisors provided by various vendors that include VMware, Hyper-V, Xen and Kernel Virtual Machine (KVM). Considering the challenge of VM management, this paper describes several techniques to monitor and manage virtualized cloud infrastructures.
Hardware assisted hypervisor introspection.

PubMed

Shi, Jiangyong; Yang, Yuexiang; Tang, Chuan

2016-01-01

In this paper, we introduce hypervisor introspection, an out-of-box way to monitor the execution of hypervisors. Similar to virtual machine introspection which has been proposed to protect virtual machines in an out-of-box way over the past decade, hypervisor introspection can be used to protect hypervisors which are the basis of cloud security. Virtual machine introspection tools are usually deployed either in hypervisor or in privileged virtual machines, which might also be compromised. By utilizing hardware support including nested virtualization, EPT protection and #BP, we are able to monitor all hypercalls belongs to the virtual machines of one hypervisor, include that of privileged virtual machine and even when the hypervisor is compromised. What's more, hypercall injection method is used to simulate hypercall-based attacks and evaluate the performance of our method. Experiment results show that our method can effectively detect hypercall-based attacks with some performance cost. Lastly, we discuss our furture approaches of reducing the performance cost and preventing the compromised hypervisor from detecting the existence of our introspector, in addition with some new scenarios to apply our hypervisor introspection system.
Kinematics and dynamics of robotic systems with multiple closed loops

NASA Astrophysics Data System (ADS)

Zhang, Chang-De

The kinematics and dynamics of robotic systems with multiple closed loops, such as Stewart platforms, walking machines, and hybrid manipulators, are studied. In the study of kinematics, focus is on the closed-form solutions of the forward position analysis of different parallel systems. A closed-form solution means that the solution is expressed as a polynomial in one variable. If the order of the polynomial is less than or equal to four, the solution has analytical closed-form. First, the conditions of obtaining analytical closed-form solutions are studied. For a Stewart platform, the condition is found to be that one rotational degree of freedom of the output link is decoupled from the other five. Based on this condition, a class of Stewart platforms which has analytical closed-form solution is formulated. Conditions of analytical closed-form solution for other parallel systems are also studied. Closed-form solutions of forward kinematics for walking machines and multi-fingered grippers are then studied. For a parallel system with three three-degree-of-freedom subchains, there are 84 possible ways to select six independent joints among nine joints. These 84 ways can be classified into three categories: Category 3:3:0, Category 3:2:1, and Category 2:2:2. It is shown that the first category has no solutions; the solutions of the second category have analytical closed-form; and the solutions of the last category are higher order polynomials. The study is then extended to a nearly general Stewart platform. The solution is a 20th order polynomial and the Stewart platform has a maximum of 40 possible configurations. Also, the study is extended to a new class of hybrid manipulators which consists of two serially connected parallel mechanisms. In the study of dynamics, a computationally efficient method for inverse dynamics of manipulators based on the virtual work principle is developed. Although this method is comparable with the recursive Newton-Euler method for serial manipulators, its advantage is more noteworthy when applied to parallel systems. An approach of inverse dynamics of a walking machine is also developed, which includes inverse dynamic modeling, foot force distribution, and joint force/torque allocation.

A robot arm simulation with a shared memory multiprocessor machine

NASA Technical Reports Server (NTRS)

Kim, Sung-Soo; Chuang, Li-Ping

1989-01-01

A parallel processing scheme for a single chain robot arm is presented for high speed computation on a shared memory multiprocessor. A recursive formulation that is derived from a virtual work form of the d'Alembert equations of motion is utilized for robot arm dynamics. A joint drive system that consists of a motor rotor and gears is included in the arm dynamics model, in order to take into account gyroscopic effects due to the spinning of the rotor. The fine grain parallelism of mechanical and control subsystem models is exploited, based on independent computation associated with bodies, joint drive systems, and controllers. Efficiency and effectiveness of the parallel scheme are demonstrated through simulations of a telerobotic manipulator arm. Two different mechanical subsystem models, i.e., with and without gyroscopic effects, are compared, to show the trade-off between efficiency and accuracy.
Virtual Machine Language 2.1

NASA Technical Reports Server (NTRS)

Riedel, Joseph E.; Grasso, Christopher A.

2012-01-01

VML (Virtual Machine Language) is an advanced computing environment that allows spacecraft to operate using mechanisms ranging from simple, time-oriented sequencing to advanced, multicomponent reactive systems. VML has developed in four evolutionary stages. VML 0 is a core execution capability providing multi-threaded command execution, integer data types, and rudimentary branching. VML 1 added named parameterized procedures, extensive polymorphism, data typing, branching, looping issuance of commands using run-time parameters, and named global variables. VML 2 added for loops, data verification, telemetry reaction, and an open flight adaptation architecture. VML 2.1 contains major advances in control flow capabilities for executable state machines. On the resource requirements front, VML 2.1 features a reduced memory footprint in order to fit more capability into modestly sized flight processors, and endian-neutral data access for compatibility with Intel little-endian processors. Sequence packaging has been improved with object-oriented programming constructs and the use of implicit (rather than explicit) time tags on statements. Sequence event detection has been significantly enhanced with multi-variable waiting, which allows a sequence to detect and react to conditions defined by complex expressions with multiple global variables. This multi-variable waiting serves as the basis for implementing parallel rule checking, which in turn, makes possible executable state machines. The new state machine feature in VML 2.1 allows the creation of sophisticated autonomous reactive systems without the need to develop expensive flight software. Users specify named states and transitions, along with the truth conditions required, before taking transitions. Transitions with the same signal name allow separate state machines to coordinate actions: the conditions distributed across all state machines necessary to arm a particular signal are evaluated, and once found true, that signal is raised. The selected signal then causes all identically named transitions in all present state machines to be taken simultaneously. VML 2.1 has relevance to all potential space missions, both manned and unmanned. It was under consideration for use on Orion.
Computer Associates International, CA-ACF2/VM Release 3.1

DTIC Science & Technology

1987-09-09

Associates CA-ACF2/VM Bibliography International Business Machines Corporation, IBM Virtual Machine/Directory Maintenance Program Logic Manual...publication number LY20-0889 International Business Machines International Business Machines Corporation, IBM System/370 Principles of Operation...publication number GA22-7000 International Business Machines Corporation, IBM Virtual Machine/Directory Maintenance Installation and System Administrator’s
An Effective Mechanism for Virtual Machine Placement using Aco in IAAS Cloud

NASA Astrophysics Data System (ADS)

Shenbaga Moorthy, Rajalakshmi; Fareentaj, U.; Divya, T. K.

2017-08-01

Cloud computing provides an effective way to dynamically provide numerous resources to meet customer demands. A major challenging problem for cloud providers is designing efficient mechanisms for optimal virtual machine Placement (OVMP). Such mechanisms enable the cloud providers to effectively utilize their available resources and obtain higher profits. In order to provide appropriate resources to the clients an optimal virtual machine placement algorithm is proposed. Virtual machine placement is NP-Hard problem. Such NP-Hard problem can be solved using heuristic algorithm. In this paper, Ant Colony Optimization based virtual machine placement is proposed. Our proposed system focuses on minimizing the cost spending in each plan for hosting virtual machines in a multiple cloud provider environment and the response time of each cloud provider is monitored periodically, in such a way to minimize delay in providing the resources to the users. The performance of the proposed algorithm is compared with greedy mechanism. The proposed algorithm is simulated in Eclipse IDE. The results clearly show that the proposed algorithm minimizes the cost, response time and also number of migrations.
Parallel computation and the basis system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Smith, G.R.

1993-05-01

A software package has been written that can facilitate efforts to develop powerful, flexible, and easy-to use programs that can run in single-processor, massively parallel, and distributed computing environments. Particular attention has been given to the difficulties posed by a program consisting of many science packages that represent subsystems of a complicated, coupled system. Methods have been found to maintain independence of the packages by hiding data structures without increasing the communications costs in a parallel computing environment. Concepts developed in this work are demonstrated by a prototype program that uses library routines from two existing software systems, Basis andmore » Parallel Virtual Machine (PVM). Most of the details of these libraries have been encapsulated in routines and macros that could be rewritten for alternative libraries that possess certain minimum capabilities. The prototype software uses a flexible master-and-slaves paradigm for parallel computation and supports domain decomposition with message passing for partitioning work among slaves. Facilities are provided for accessing variables that are distributed among the memories of slaves assigned to subdomains. The software is named PROTOPAR.« less
Role of Open Source Tools and Resources in Virtual Screening for Drug Discovery.

PubMed

Karthikeyan, Muthukumarasamy; Vyas, Renu

2015-01-01

Advancement in chemoinformatics research in parallel with availability of high performance computing platform has made handling of large scale multi-dimensional scientific data for high throughput drug discovery easier. In this study we have explored publicly available molecular databases with the help of open-source based integrated in-house molecular informatics tools for virtual screening. The virtual screening literature for past decade has been extensively investigated and thoroughly analyzed to reveal interesting patterns with respect to the drug, target, scaffold and disease space. The review also focuses on the integrated chemoinformatics tools that are capable of harvesting chemical data from textual literature information and transform them into truly computable chemical structures, identification of unique fragments and scaffolds from a class of compounds, automatic generation of focused virtual libraries, computation of molecular descriptors for structure-activity relationship studies, application of conventional filters used in lead discovery along with in-house developed exhaustive PTC (Pharmacophore, Toxicophores and Chemophores) filters and machine learning tools for the design of potential disease specific inhibitors. A case study on kinase inhibitors is provided as an example.
Scalable and reusable emulator for evaluating the performance of SS7 networks

NASA Astrophysics Data System (ADS)

Lazar, Aurel A.; Tseng, Kent H.; Lim, Koon Seng; Choe, Winston

1994-04-01

A scalable and reusable emulator was designed and implemented for studying the behavior of SS7 networks. The emulator design was largely based on public domain software. It was developed on top of an environment supported by PVM, the Parallel Virtual Machine, and managed by OSIMIS-the OSI Management Information Service platform. The emulator runs on top of a commercially available ATM LAN interconnecting engineering workstations. As a case study for evaluating the emulator, the behavior of the Singapore National SS7 Network under fault and unbalanced loading conditions was investigated.

Dots and dashes: art, virtual reality, and the telegraph

NASA Astrophysics Data System (ADS)

Ruzanka, Silvia; Chang, Ben

2009-02-01

Dots and Dashes is a virtual reality artwork that explores online romance over the telegraph, based on Ella Cheever Thayer's novel Wired Love - a Romance in Dots and Dashes (an Old Story Told in a New Way)1. The uncanny similarities between this story and the world of today's virtual environments provides the springboard for an exploration of a wealth of anxieties and dreams, including the construction of identities in an electronically mediated environment, the shifting boundaries between the natural and machine worlds, and the spiritual dimensions of science and technology. In this paper we examine the parallels between the telegraph networks and our current conceptions of cyberspace, as well as unique social and cultural impacts specific to the telegraph. These include the new opportunities and roles available to women in the telegraph industry and the connection between the telegraph and the Spiritualist movement. We discuss the development of the artwork, its structure and aesthetics, and the technical development of the work.
Scalable computing for evolutionary genomics.

PubMed

Prins, Pjotr; Belhachemi, Dominique; Möller, Steffen; Smant, Geert

2012-01-01

Genomic data analysis in evolutionary biology is becoming so computationally intensive that analysis of multiple hypotheses and scenarios takes too long on a single desktop computer. In this chapter, we discuss techniques for scaling computations through parallelization of calculations, after giving a quick overview of advanced programming techniques. Unfortunately, parallel programming is difficult and requires special software design. The alternative, especially attractive for legacy software, is to introduce poor man's parallelization by running whole programs in parallel as separate processes, using job schedulers. Such pipelines are often deployed on bioinformatics computer clusters. Recent advances in PC virtualization have made it possible to run a full computer operating system, with all of its installed software, on top of another operating system, inside a "box," or virtual machine (VM). Such a VM can flexibly be deployed on multiple computers, in a local network, e.g., on existing desktop PCs, and even in the Cloud, to create a "virtual" computer cluster. Many bioinformatics applications in evolutionary biology can be run in parallel, running processes in one or more VMs. Here, we show how a ready-made bioinformatics VM image, named BioNode, effectively creates a computing cluster, and pipeline, in a few steps. This allows researchers to scale-up computations from their desktop, using available hardware, anytime it is required. BioNode is based on Debian Linux and can run on networked PCs and in the Cloud. Over 200 bioinformatics and statistical software packages, of interest to evolutionary biology, are included, such as PAML, Muscle, MAFFT, MrBayes, and BLAST. Most of these software packages are maintained through the Debian Med project. In addition, BioNode contains convenient configuration scripts for parallelizing bioinformatics software. Where Debian Med encourages packaging free and open source bioinformatics software through one central project, BioNode encourages creating free and open source VM images, for multiple targets, through one central project. BioNode can be deployed on Windows, OSX, Linux, and in the Cloud. Next to the downloadable BioNode images, we provide tutorials online, which empower bioinformaticians to install and run BioNode in different environments, as well as information for future initiatives, on creating and building such images.
A parallel adaptive mesh refinement algorithm

NASA Technical Reports Server (NTRS)

Quirk, James J.; Hanebutte, Ulf R.

1993-01-01

Over recent years, Adaptive Mesh Refinement (AMR) algorithms which dynamically match the local resolution of the computational grid to the numerical solution being sought have emerged as powerful tools for solving problems that contain disparate length and time scales. In particular, several workers have demonstrated the effectiveness of employing an adaptive, block-structured hierarchical grid system for simulations of complex shock wave phenomena. Unfortunately, from the parallel algorithm developer's viewpoint, this class of scheme is quite involved; these schemes cannot be distilled down to a small kernel upon which various parallelizing strategies may be tested. However, because of their block-structured nature such schemes are inherently parallel, so all is not lost. In this paper we describe the method by which Quirk's AMR algorithm has been parallelized. This method is built upon just a few simple message passing routines and so it may be implemented across a broad class of MIMD machines. Moreover, the method of parallelization is such that the original serial code is left virtually intact, and so we are left with just a single product to support. The importance of this fact should not be underestimated given the size and complexity of the original algorithm.
Future Cyborgs: Human-Machine Interface for Virtual Reality Applications

DTIC Science & Technology

2007-04-01

FUTURE CYBORGS : HUMAN-MACHINE INTERFACE FOR VIRTUAL REALITY APPLICATIONS Robert R. Powell, Major, USAF April 2007 Blue Horizons...SUBTITLE Future Cyborgs : Human-Machine Interface for Virtual Reality Applications 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER...Nicholas Negroponte, Being Digital (New York: Alfred A Knopf, Inc, 1995), 123. 23 Ibid. 24 Andy Clark, Natural-Born Cyborgs (New York: Oxford
Debugging Fortran on a shared memory machine

DOE Office of Scientific and Technical Information (OSTI.GOV)

Allen, T.R.; Padua, D.A.

1987-01-01

Debugging on a parallel processor is more difficult than debugging on a serial machine because errors in a parallel program may introduce nondeterminism. The approach to parallel debugging presented here attempts to reduce the problem of debugging on a parallel machine to that of debugging on a serial machine by automatically detecting nondeterminism. 20 refs., 6 figs.
An imperialist competitive algorithm for virtual machine placement in cloud computing

NASA Astrophysics Data System (ADS)

Jamali, Shahram; Malektaji, Sepideh; Analoui, Morteza

2017-05-01

Cloud computing, the recently emerged revolution in IT industry, is empowered by virtualisation technology. In this paradigm, the user's applications run over some virtual machines (VMs). The process of selecting proper physical machines to host these virtual machines is called virtual machine placement. It plays an important role on resource utilisation and power efficiency of cloud computing environment. In this paper, we propose an imperialist competitive-based algorithm for the virtual machine placement problem called ICA-VMPLC. The base optimisation algorithm is chosen to be ICA because of its ease in neighbourhood movement, good convergence rate and suitable terminology. The proposed algorithm investigates search space in a unique manner to efficiently obtain optimal placement solution that simultaneously minimises power consumption and total resource wastage. Its final solution performance is compared with several existing methods such as grouping genetic and ant colony-based algorithms as well as bin packing heuristic. The simulation results show that the proposed method is superior to other tested algorithms in terms of power consumption, resource wastage, CPU usage efficiency and memory usage efficiency.
Porting Gravitational Wave Signal Extraction to Parallel Virtual Machine (PVM)

NASA Technical Reports Server (NTRS)

Thirumalainambi, Rajkumar; Thompson, David E.; Redmon, Jeffery

2009-01-01

Laser Interferometer Space Antenna (LISA) is a planned NASA-ESA mission to be launched around 2012. The Gravitational Wave detection is fundamentally the determination of frequency, source parameters, and waveform amplitude derived in a specific order from the interferometric time-series of the rotating LISA spacecrafts. The LISA Science Team has developed a Mock LISA Data Challenge intended to promote the testing of complicated nested search algorithms to detect the 100-1 millihertz frequency signals at amplitudes of 10E-21. However, it has become clear that, sequential search of the parameters is very time consuming and ultra-sensitive; hence, a new strategy has been developed. Parallelization of existing sequential search algorithms of Gravitational Wave signal identification consists of decomposing sequential search loops, beginning with outermost loops and working inward. In this process, the main challenge is to detect interdependencies among loops and partitioning the loops so as to preserve concurrency. Existing parallel programs are based upon either shared memory or distributed memory paradigms. In PVM, master and node programs are used to execute parallelization and process spawning. The PVM can handle process management and process addressing schemes using a virtual machine configuration. The task scheduling and the messaging and signaling can be implemented efficiently for the LISA Gravitational Wave search process using a master and 6 nodes. This approach is accomplished using a server that is available at NASA Ames Research Center, and has been dedicated to the LISA Data Challenge Competition. Historically, gravitational wave and source identification parameters have taken around 7 days in this dedicated single thread Linux based server. Using PVM approach, the parameter extraction problem can be reduced to within a day. The low frequency computation and a proxy signal-to-noise ratio are calculated in separate nodes that are controlled by the master using message and vector of data passing. The message passing among nodes follows a pattern of synchronous and asynchronous send-and-receive protocols. The communication model and the message buffers are allocated dynamically to address rapid search of gravitational wave source information in the Mock LISA data sets.
Comparative analysis of machine learning methods in ligand-based virtual screening of large compound libraries.

PubMed

Ma, Xiao H; Jia, Jia; Zhu, Feng; Xue, Ying; Li, Ze R; Chen, Yu Z

2009-05-01

Machine learning methods have been explored as ligand-based virtual screening tools for facilitating drug lead discovery. These methods predict compounds of specific pharmacodynamic, pharmacokinetic or toxicological properties based on their structure-derived structural and physicochemical properties. Increasing attention has been directed at these methods because of their capability in predicting compounds of diverse structures and complex structure-activity relationships without requiring the knowledge of target 3D structure. This article reviews current progresses in using machine learning methods for virtual screening of pharmacodynamically active compounds from large compound libraries, and analyzes and compares the reported performances of machine learning tools with those of structure-based and other ligand-based (such as pharmacophore and clustering) virtual screening methods. The feasibility to improve the performance of machine learning methods in screening large libraries is discussed.
Virtual machine-based simulation platform for mobile ad-hoc network-based cyber infrastructure

DOE PAGES

Yoginath, Srikanth B.; Perumalla, Kayla S.; Henz, Brian J.

2015-09-29

In modeling and simulating complex systems such as mobile ad-hoc networks (MANETs) in de-fense communications, it is a major challenge to reconcile multiple important considerations: the rapidity of unavoidable changes to the software (network layers and applications), the difficulty of modeling the critical, implementation-dependent behavioral effects, the need to sustain larger scale scenarios, and the desire for faster simulations. Here we present our approach in success-fully reconciling them using a virtual time-synchronized virtual machine(VM)-based parallel ex-ecution framework that accurately lifts both the devices as well as the network communications to a virtual time plane while retaining full fidelity. At themore » core of our framework is a scheduling engine that operates at the level of a hypervisor scheduler, offering a unique ability to execute multi-core guest nodes over multi-core host nodes in an accurate, virtual time-synchronized manner. In contrast to other related approaches that suffer from either speed or accuracy issues, our framework provides MANET node-wise scalability, high fidelity of software behaviors, and time-ordering accuracy. The design and development of this framework is presented, and an ac-tual implementation based on the widely used Xen hypervisor system is described. Benchmarks with synthetic and actual applications are used to identify the benefits of our approach. The time inaccuracy of traditional emulation methods is demonstrated, in comparison with the accurate execution of our framework verified by theoretically correct results expected from analytical models of the same scenarios. In the largest high fidelity tests, we are able to perform virtual time-synchronized simulation of 64-node VM-based full-stack, actual software behaviors of MANETs containing a mix of static and mobile (unmanned airborne vehicle) nodes, hosted on a 32-core host, with full fidelity of unmodified ad-hoc routing protocols, unmodified application executables, and user-controllable physical layer effects including inter-device wireless signal strength, reachability, and connectivity.« less
Virtual machine-based simulation platform for mobile ad-hoc network-based cyber infrastructure

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yoginath, Srikanth B.; Perumalla, Kayla S.; Henz, Brian J.

In modeling and simulating complex systems such as mobile ad-hoc networks (MANETs) in de-fense communications, it is a major challenge to reconcile multiple important considerations: the rapidity of unavoidable changes to the software (network layers and applications), the difficulty of modeling the critical, implementation-dependent behavioral effects, the need to sustain larger scale scenarios, and the desire for faster simulations. Here we present our approach in success-fully reconciling them using a virtual time-synchronized virtual machine(VM)-based parallel ex-ecution framework that accurately lifts both the devices as well as the network communications to a virtual time plane while retaining full fidelity. At themore » core of our framework is a scheduling engine that operates at the level of a hypervisor scheduler, offering a unique ability to execute multi-core guest nodes over multi-core host nodes in an accurate, virtual time-synchronized manner. In contrast to other related approaches that suffer from either speed or accuracy issues, our framework provides MANET node-wise scalability, high fidelity of software behaviors, and time-ordering accuracy. The design and development of this framework is presented, and an ac-tual implementation based on the widely used Xen hypervisor system is described. Benchmarks with synthetic and actual applications are used to identify the benefits of our approach. The time inaccuracy of traditional emulation methods is demonstrated, in comparison with the accurate execution of our framework verified by theoretically correct results expected from analytical models of the same scenarios. In the largest high fidelity tests, we are able to perform virtual time-synchronized simulation of 64-node VM-based full-stack, actual software behaviors of MANETs containing a mix of static and mobile (unmanned airborne vehicle) nodes, hosted on a 32-core host, with full fidelity of unmodified ad-hoc routing protocols, unmodified application executables, and user-controllable physical layer effects including inter-device wireless signal strength, reachability, and connectivity.« less
CloudMC: a cloud computing application for Monte Carlo simulation.

PubMed

Miras, H; Jiménez, R; Miras, C; Gomà, C

2013-04-21

This work presents CloudMC, a cloud computing application-developed in Windows Azure®, the platform of the Microsoft® cloud-for the parallelization of Monte Carlo simulations in a dynamic virtual cluster. CloudMC is a web application designed to be independent of the Monte Carlo code in which the simulations are based-the simulations just need to be of the form: input files → executable → output files. To study the performance of CloudMC in Windows Azure®, Monte Carlo simulations with penelope were performed on different instance (virtual machine) sizes, and for different number of instances. The instance size was found to have no effect on the simulation runtime. It was also found that the decrease in time with the number of instances followed Amdahl's law, with a slight deviation due to the increase in the fraction of non-parallelizable time with increasing number of instances. A simulation that would have required 30 h of CPU on a single instance was completed in 48.6 min when executed on 64 instances in parallel (speedup of 37 ×). Furthermore, the use of cloud computing for parallel computing offers some advantages over conventional clusters: high accessibility, scalability and pay per usage. Therefore, it is strongly believed that cloud computing will play an important role in making Monte Carlo dose calculation a reality in future clinical practice.
Parallel computation and the Basis system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Smith, G.R.

1992-12-16

A software package has been written that can facilitate efforts to develop powerful, flexible, and easy-to-use programs that can run in single-processor, massively parallel, and distributed computing environments. Particular attention has been given to the difficulties posed by a program consisting of many science packages that represent subsystems of a complicated, coupled system. Methods have been found to maintain independence of the packages by hiding data structures without increasing the communication costs in a parallel computing environment. Concepts developed in this work are demonstrated by a prototype program that uses library routines from two existing software systems, Basis and Parallelmore » Virtual Machine (PVM). Most of the details of these libraries have been encapsulated in routines and macros that could be rewritten for alternative libraries that possess certain minimum capabilities. The prototype software uses a flexible master-and-slaves paradigm for parallel computation and supports domain decomposition with message passing for partitioning work among slaves. Facilities are provided for accessing variables that are distributed among the memories of slaves assigned to subdomains. The software is named PROTOPAR.« less
Integration of the virtual 3D model of a control system with the virtual controller

NASA Astrophysics Data System (ADS)

Herbuś, K.; Ociepka, P.

2015-11-01

Nowadays the design process includes simulation analysis of different components of a constructed object. It involves the need for integration of different virtual object to simulate the whole investigated technical system. The paper presents the issues related to the integration of a virtual 3D model of a chosen control system of with a virtual controller. The goal of integration is to verify the operation of an adopted object of in accordance with the established control program. The object of the simulation work is the drive system of a tunneling machine for trenchless work. In the first stage of work was created an interactive visualization of functioning of the 3D virtual model of a tunneling machine. For this purpose, the software of the VR (Virtual Reality) class was applied. In the elaborated interactive application were created adequate procedures allowing controlling the drive system of a translatory motion, a rotary motion and the drive system of a manipulator. Additionally was created the procedure of turning on and off the output crushing head, mounted on the last element of the manipulator. In the elaborated interactive application have been established procedures for receiving input data from external software, on the basis of the dynamic data exchange (DDE), which allow controlling actuators of particular control systems of the considered machine. In the next stage of work, the program on a virtual driver, in the ladder diagram (LD) language, was created. The control program was developed on the basis of the adopted work cycle of the tunneling machine. The element integrating the virtual model of the tunneling machine for trenchless work with the virtual controller is the application written in a high level language (Visual Basic). In the developed application was created procedures responsible for collecting data from the running, in a simulation mode, virtual controller and transferring them to the interactive application, in which is verified the operation of the adopted research object. The carried out work allowed foot the integration of the virtual model of the control system of the tunneling machine with the virtual controller, enabling the verification of its operation.
GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation.

PubMed

Hess, Berk; Kutzner, Carsten; van der Spoel, David; Lindahl, Erik

2008-03-01

Molecular simulation is an extremely useful, but computationally very expensive tool for studies of chemical and biomolecular systems. Here, we present a new implementation of our molecular simulation toolkit GROMACS which now both achieves extremely high performance on single processors from algorithmic optimizations and hand-coded routines and simultaneously scales very well on parallel machines. The code encompasses a minimal-communication domain decomposition algorithm, full dynamic load balancing, a state-of-the-art parallel constraint solver, and efficient virtual site algorithms that allow removal of hydrogen atom degrees of freedom to enable integration time steps up to 5 fs for atomistic simulations also in parallel. To improve the scaling properties of the common particle mesh Ewald electrostatics algorithms, we have in addition used a Multiple-Program, Multiple-Data approach, with separate node domains responsible for direct and reciprocal space interactions. Not only does this combination of algorithms enable extremely long simulations of large systems but also it provides that simulation performance on quite modest numbers of standard cluster nodes.
HeNCE: A Heterogeneous Network Computing Environment

DOE PAGES

Beguelin, Adam; Dongarra, Jack J.; Geist, George Al; ...

1994-01-01

Network computing seeks to utilize the aggregate resources of many networked computers to solve a single problem. In so doing it is often possible to obtain supercomputer performance from an inexpensive local area network. The drawback is that network computing is complicated and error prone when done by hand, especially if the computers have different operating systems and data formats and are thus heterogeneous. The heterogeneous network computing environment (HeNCE) is an integrated graphical environment for creating and running parallel programs over a heterogeneous collection of computers. It is built on a lower level package called parallel virtual machine (PVM).more » The HeNCE philosophy of parallel programming is to have the programmer graphically specify the parallelism of a computation and to automate, as much as possible, the tasks of writing, compiling, executing, debugging, and tracing the network computation. Key to HeNCE is a graphical language based on directed graphs that describe the parallelism and data dependencies of an application. Nodes in the graphs represent conventional Fortran or C subroutines and the arcs represent data and control flow. This article describes the present state of HeNCE, its capabilities, limitations, and areas of future research.« less
Bioinformatics Pipelines for Targeted Resequencing and Whole-Exome Sequencing of Human and Mouse Genomes: A Virtual Appliance Approach for Instant Deployment

PubMed Central

Saeed, Isaam; Wong, Stephen Q.; Mar, Victoria; Goode, David L.; Caramia, Franco; Doig, Ken; Ryland, Georgina L.; Thompson, Ella R.; Hunter, Sally M.; Halgamuge, Saman K.; Ellul, Jason; Dobrovic, Alexander; Campbell, Ian G.; Papenfuss, Anthony T.; McArthur, Grant A.; Tothill, Richard W.

2014-01-01

Targeted resequencing by massively parallel sequencing has become an effective and affordable way to survey small to large portions of the genome for genetic variation. Despite the rapid development in open source software for analysis of such data, the practical implementation of these tools through construction of sequencing analysis pipelines still remains a challenging and laborious activity, and a major hurdle for many small research and clinical laboratories. We developed TREVA (Targeted REsequencing Virtual Appliance), making pre-built pipelines immediately available as a virtual appliance. Based on virtual machine technologies, TREVA is a solution for rapid and efficient deployment of complex bioinformatics pipelines to laboratories of all sizes, enabling reproducible results. The analyses that are supported in TREVA include: somatic and germline single-nucleotide and insertion/deletion variant calling, copy number analysis, and cohort-based analyses such as pathway and significantly mutated genes analyses. TREVA is flexible and easy to use, and can be customised by Linux-based extensions if required. TREVA can also be deployed on the cloud (cloud computing), enabling instant access without investment overheads for additional hardware. TREVA is available at http://bioinformatics.petermac.org/treva/. PMID:24752294
Hybrid polylingual object model: an efficient and seamless integration of Java and native components on the Dalvik virtual machine.

PubMed

Huang, Yukun; Chen, Rong; Wei, Jingbo; Pei, Xilong; Cao, Jing; Prakash Jayaraman, Prem; Ranjan, Rajiv

2014-01-01

JNI in the Android platform is often observed with low efficiency and high coding complexity. Although many researchers have investigated the JNI mechanism, few of them solve the efficiency and the complexity problems of JNI in the Android platform simultaneously. In this paper, a hybrid polylingual object (HPO) model is proposed to allow a CAR object being accessed as a Java object and as vice in the Dalvik virtual machine. It is an acceptable substitute for JNI to reuse the CAR-compliant components in Android applications in a seamless and efficient way. The metadata injection mechanism is designed to support the automatic mapping and reflection between CAR objects and Java objects. A prototype virtual machine, called HPO-Dalvik, is implemented by extending the Dalvik virtual machine to support the HPO model. Lifespan management, garbage collection, and data type transformation of HPO objects are also handled in the HPO-Dalvik virtual machine automatically. The experimental result shows that the HPO model outweighs the standard JNI in lower overhead on native side, better executing performance with no JNI bridging code being demanded.
The ASSERT Virtual Machine Kernel: Support for Preservation of Temporal Properties

NASA Astrophysics Data System (ADS)

Zamorano, J.; de la Puente, J. A.; Pulido, J. A.; Urueña

2008-08-01

A new approach to building embedded real-time software has been developed in the ASSERT project. One of its key elements is the concept of a virtual machine preserving the non-functional properties of the system, and especially real-time properties, all the way down from high- level design models down to executable code. The paper describes one instance of the virtual machine concept that provides support for the preservation of temporal properties both at the source code level —by accept- ing only "legal" entities, i.e. software components with statically analysable real-tim behaviour— and at run-time —by monitoring the temporal behaviour of the system. The virtual machine has been validated on several pilot projects carried out by aerospace companies in the framework of the ASSERT project.
Implementation and performance of parallel Prolog interpreter

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wei, S.; Kale, L.V.; Balkrishna, R.

1988-01-01

In this paper, the authors discuss the implementation of a parallel Prolog interpreter on different parallel machines. The implementation is based on the REDUCE--OR process model which exploits both AND and OR parallelism in logic programs. It is machine independent as it runs on top of the chare-kernel--a machine-independent parallel programming system. The authors also give the performance of the interpreter running a diverse set of benchmark pargrams on parallel machines including shared memory systems: an Alliant FX/8, Sequent and a MultiMax, and a non-shared memory systems: Intel iPSC/32 hypercube, in addition to its performance on a multiprocessor simulation system.
A high performance scientific cloud computing environment for materials simulations

NASA Astrophysics Data System (ADS)

Jorissen, K.; Vila, F. D.; Rehr, J. J.

2012-09-01

We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.

Hybrid Cloud Computing Environment for EarthCube and Geoscience Community

NASA Astrophysics Data System (ADS)

Yang, C. P.; Qin, H.

2016-12-01

The NSF EarthCube Integration and Test Environment (ECITE) has built a hybrid cloud computing environment to provides cloud resources from private cloud environments by using cloud system software - OpenStack and Eucalyptus, and also manages public cloud - Amazon Web Service that allow resource synchronizing and bursting between private and public cloud. On ECITE hybrid cloud platform, EarthCube and geoscience community can deploy and manage the applications by using base virtual machine images or customized virtual machines, analyze big datasets by using virtual clusters, and real-time monitor the virtual resource usage on the cloud. Currently, a number of EarthCube projects have deployed or started migrating their projects to this platform, such as CHORDS, BCube, CINERGI, OntoSoft, and some other EarthCube building blocks. To accomplish the deployment or migration, administrator of ECITE hybrid cloud platform prepares the specific needs (e.g. images, port numbers, usable cloud capacity, etc.) of each project in advance base on the communications between ECITE and participant projects, and then the scientists or IT technicians in those projects launch one or multiple virtual machines, access the virtual machine(s) to set up computing environment if need be, and migrate their codes, documents or data without caring about the heterogeneity in structure and operations among different cloud platforms.
Cooperating reduction machines

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kluge, W.E.

1983-11-01

This paper presents a concept and a system architecture for the concurrent execution of program expressions of a concrete reduction language based on lamda-expressions. If formulated appropriately, these expressions are well-suited for concurrent execution, following a demand-driven model of computation. In particular, recursive program expressions with nonlinear expansion may, at run time, recursively be partitioned into a hierarchy of independent subexpressions which can be reduced by a corresponding hierarchy of virtual reduction machines. This hierarchy unfolds and collapses dynamically, with virtual machines recursively assuming the role of masters that create and eventually terminate, or synchronize with, slaves. The paper alsomore » proposes a nonhierarchically organized system of reduction machines, each featuring a stack architecture, that effectively supports the allocation of virtual machines to the real machines of the system in compliance with their hierarchical order of creation and termination. 25 references.« less
Complementary Machine Intelligence and Human Intelligence in Virtual Teaching Assistant for Tutoring Program Tracing

ERIC Educational Resources Information Center

Chou, Chih-Yueh; Huang, Bau-Hung; Lin, Chi-Jen

2011-01-01

This study proposes a virtual teaching assistant (VTA) to share teacher tutoring tasks in helping students practice program tracing and proposes two mechanisms of complementing machine intelligence and human intelligence to develop the VTA. The first mechanism applies machine intelligence to extend human intelligence (teacher answers) to evaluate…
"Pack[superscript2]": VM Resource Scheduling for Fine-Grained Application SLAs in Highly Consolidated Environment

ERIC Educational Resources Information Center

Sukwong, Orathai

2013-01-01

Virtualization enables the ability to consolidate multiple servers on a single physical machine, increasing the infrastructure utilization. Maximizing the ratio of server virtual machines (VMs) to physical machines, namely the consolidation ratio, becomes an important goal toward infrastructure cost saving in a cloud. However, the consolidation…
A general purpose subroutine for fast fourier transform on a distributed memory parallel machine

NASA Technical Reports Server (NTRS)

Dubey, A.; Zubair, M.; Grosch, C. E.

1992-01-01

One issue which is central in developing a general purpose Fast Fourier Transform (FFT) subroutine on a distributed memory parallel machine is the data distribution. It is possible that different users would like to use the FFT routine with different data distributions. Thus, there is a need to design FFT schemes on distributed memory parallel machines which can support a variety of data distributions. An FFT implementation on a distributed memory parallel machine which works for a number of data distributions commonly encountered in scientific applications is presented. The problem of rearranging the data after computing the FFT is also addressed. The performance of the implementation on a distributed memory parallel machine Intel iPSC/860 is evaluated.
Simplified Virtualization in a HEP/NP Environment with Condor

NASA Astrophysics Data System (ADS)

Strecker-Kellogg, W.; Caramarcu, C.; Hollowell, C.; Wong, T.

2012-12-01

In this work we will address the development of a simple prototype virtualized worker node cluster, using Scientific Linux 6.x as a base OS, KVM and the libvirt API for virtualization, and the Condor batch software to manage virtual machines. The discussion in this paper provides details on our experience with building, configuring, and deploying the various components from bare metal, including the base OS, creation and distribution of the virtualized OS images and the integration of batch services with the virtual machines. Our focus was on simplicity and interoperability with our existing architecture.
Elevating Virtual Machine Introspection for Fine-Grained Process Monitoring: Techniques and Applications

ERIC Educational Resources Information Center

Srinivasan, Deepa

2013-01-01

Recent rapid malware growth has exposed the limitations of traditional in-host malware-defense systems and motivated the development of secure virtualization-based solutions. By running vulnerable systems as virtual machines (VMs) and moving security software from inside VMs to the outside, the out-of-VM solutions securely isolate the anti-malware…
Hybrid PolyLingual Object Model: An Efficient and Seamless Integration of Java and Native Components on the Dalvik Virtual Machine

PubMed Central

Huang, Yukun; Chen, Rong; Wei, Jingbo; Pei, Xilong; Cao, Jing; Prakash Jayaraman, Prem; Ranjan, Rajiv

2014-01-01

JNI in the Android platform is often observed with low efficiency and high coding complexity. Although many researchers have investigated the JNI mechanism, few of them solve the efficiency and the complexity problems of JNI in the Android platform simultaneously. In this paper, a hybrid polylingual object (HPO) model is proposed to allow a CAR object being accessed as a Java object and as vice in the Dalvik virtual machine. It is an acceptable substitute for JNI to reuse the CAR-compliant components in Android applications in a seamless and efficient way. The metadata injection mechanism is designed to support the automatic mapping and reflection between CAR objects and Java objects. A prototype virtual machine, called HPO-Dalvik, is implemented by extending the Dalvik virtual machine to support the HPO model. Lifespan management, garbage collection, and data type transformation of HPO objects are also handled in the HPO-Dalvik virtual machine automatically. The experimental result shows that the HPO model outweighs the standard JNI in lower overhead on native side, better executing performance with no JNI bridging code being demanded. PMID:25110745
An incremental anomaly detection model for virtual machines.

PubMed

Zhang, Hancui; Chen, Shuyu; Liu, Jun; Zhou, Zhen; Wu, Tianshu

2017-01-01

Self-Organizing Map (SOM) algorithm as an unsupervised learning method has been applied in anomaly detection due to its capabilities of self-organizing and automatic anomaly prediction. However, because of the algorithm is initialized in random, it takes a long time to train a detection model. Besides, the Cloud platforms with large scale virtual machines are prone to performance anomalies due to their high dynamic and resource sharing characters, which makes the algorithm present a low accuracy and a low scalability. To address these problems, an Improved Incremental Self-Organizing Map (IISOM) model is proposed for anomaly detection of virtual machines. In this model, a heuristic-based initialization algorithm and a Weighted Euclidean Distance (WED) algorithm are introduced into SOM to speed up the training process and improve model quality. Meanwhile, a neighborhood-based searching algorithm is presented to accelerate the detection time by taking into account the large scale and high dynamic features of virtual machines on cloud platform. To demonstrate the effectiveness, experiments on a common benchmark KDD Cup dataset and a real dataset have been performed. Results suggest that IISOM has advantages in accuracy and convergence velocity of anomaly detection for virtual machines on cloud platform.
An incremental anomaly detection model for virtual machines

PubMed Central

Zhang, Hancui; Chen, Shuyu; Liu, Jun; Zhou, Zhen; Wu, Tianshu

2017-01-01

Self-Organizing Map (SOM) algorithm as an unsupervised learning method has been applied in anomaly detection due to its capabilities of self-organizing and automatic anomaly prediction. However, because of the algorithm is initialized in random, it takes a long time to train a detection model. Besides, the Cloud platforms with large scale virtual machines are prone to performance anomalies due to their high dynamic and resource sharing characters, which makes the algorithm present a low accuracy and a low scalability. To address these problems, an Improved Incremental Self-Organizing Map (IISOM) model is proposed for anomaly detection of virtual machines. In this model, a heuristic-based initialization algorithm and a Weighted Euclidean Distance (WED) algorithm are introduced into SOM to speed up the training process and improve model quality. Meanwhile, a neighborhood-based searching algorithm is presented to accelerate the detection time by taking into account the large scale and high dynamic features of virtual machines on cloud platform. To demonstrate the effectiveness, experiments on a common benchmark KDD Cup dataset and a real dataset have been performed. Results suggest that IISOM has advantages in accuracy and convergence velocity of anomaly detection for virtual machines on cloud platform. PMID:29117245
Supersonic civil airplane study and design: Performance and sonic boom

NASA Technical Reports Server (NTRS)

Cheung, Samson

1995-01-01

Since aircraft configuration plays an important role in aerodynamic performance and sonic boom shape, the configuration of the next generation supersonic civil transport has to be tailored to meet high aerodynamic performance and low sonic boom requirements. Computational fluid dynamics (CFD) can be used to design airplanes to meet these dual objectives. The work and results in this report are used to support NASA's High Speed Research Program (HSRP). CFD tools and techniques have been developed for general usages of sonic boom propagation study and aerodynamic design. Parallel to the research effort on sonic boom extrapolation, CFD flow solvers have been coupled with a numeric optimization tool to form a design package for aircraft configuration. This CFD optimization package has been applied to configuration design on a low-boom concept and an oblique all-wing concept. A nonlinear unconstrained optimizer for Parallel Virtual Machine has been developed for aerodynamic design and study.
Gilgamesh: A Multithreaded Processor-In-Memory Architecture for Petaflops Computing

NASA Technical Reports Server (NTRS)

Sterling, T. L.; Zima, H. P.

2002-01-01

Processor-in-Memory (PIM) architectures avoid the von Neumann bottleneck in conventional machines by integrating high-density DRAM and CMOS logic on the same chip. Parallel systems based on this new technology are expected to provide higher scalability, adaptability, robustness, fault tolerance and lower power consumption than current MPPs or commodity clusters. In this paper we describe the design of Gilgamesh, a PIM-based massively parallel architecture, and elements of its execution model. Gilgamesh extends existing PIM capabilities by incorporating advanced mechanisms for virtualizing tasks and data and providing adaptive resource management for load balancing and latency tolerance. The Gilgamesh execution model is based on macroservers, a middleware layer which supports object-based runtime management of data and threads allowing explicit and dynamic control of locality and load balancing. The paper concludes with a discussion of related research activities and an outlook to future work.
Analysis towards VMEM File of a Suspended Virtual Machine

NASA Astrophysics Data System (ADS)

Song, Zheng; Jin, Bo; Sun, Yongqing

With the popularity of virtual machines, forensic investigators are challenged with more complicated situations, among which discovering the evidences in virtualized environment is of significant importance. This paper mainly analyzes the file suffixed with .vmem in VMware Workstation, which stores all pseudo-physical memory into an image. The internal file structure of .vmem file is studied and disclosed. Key information about processes and threads of a suspended virtual machine is revealed. Further investigation into the Windows XP SP3 heap contents is conducted and a proof-of-concept tool is provided. Different methods to obtain forensic memory images are introduced, with both advantages and limits analyzed. We conclude with an outlook.
Memory access in shared virtual memory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berrendorf, R.

1992-01-01

Shared virtual memory (SVM) is a virtual memory layer with a single address space on top of a distributed real memory on parallel computers. We examine the behavior and performance of SVM running a parallel program with medium-grained, loop-level parallelism on top of it. A simulator for the underlying parallel architecture can be used to examine the behavior of SVM more deeply. The influence of several parameters, such as the number of processors, page size, cold or warm start, and restricted page replication, is studied.
Memory access in shared virtual memory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berrendorf, R.

1992-09-01

Shared virtual memory (SVM) is a virtual memory layer with a single address space on top of a distributed real memory on parallel computers. We examine the behavior and performance of SVM running a parallel program with medium-grained, loop-level parallelism on top of it. A simulator for the underlying parallel architecture can be used to examine the behavior of SVM more deeply. The influence of several parameters, such as the number of processors, page size, cold or warm start, and restricted page replication, is studied.
Intel NX to PVM 3.2 message passing conversion library

NASA Technical Reports Server (NTRS)

Arthur, Trey; Nelson, Michael L.

1993-01-01

NASA Langley Research Center has developed a library that allows Intel NX message passing codes to be executed under the more popular and widely supported Parallel Virtual Machine (PVM) message passing library. PVM was developed at Oak Ridge National Labs and has become the defacto standard for message passing. This library will allow the many programs that were developed on the Intel iPSC/860 or Intel Paragon in a Single Program Multiple Data (SPMD) design to be ported to the numerous architectures that PVM (version 3.2) supports. Also, the library adds global operations capability to PVM. A familiarity with Intel NX and PVM message passing is assumed.
Feasibility of Virtual Machine and Cloud Computing Technologies for High Performance Computing

DTIC Science & Technology

2014-05-01

Hat Enterprise Linux SaaS software as a service VM virtual machine vNUMA virtual non-uniform memory access WRF weather research and forecasting...previously mentioned in Chapter I Section B1 of this paper, which is used to run the weather research and forecasting ( WRF ) model in their experiments...against a VMware virtualization solution of WRF . The experiment consisted of running WRF in a standard configuration between the D-VTM and VMware while
The HEPiX Virtualisation Working Group: Towards a Grid of Clouds

NASA Astrophysics Data System (ADS)

Cass, Tony

2012-12-01

The use of virtual machine images, as for example with Cloud services such as Amazon's Elastic Compute Cloud, is attractive for users as they have a guaranteed execution environment, something that cannot today be provided across sites participating in computing grids such as the Worldwide LHC Computing Grid. However, Grid sites often operate within computer security frameworks which preclude the use of remotely generated images. The HEPiX Virtualisation Working Group was setup with the objective to enable use of remotely generated virtual machine images at Grid sites and, to this end, has introduced the idea of trusted virtual machine images which are guaranteed to be secure and configurable by sites such that security policy commitments can be met. This paper describes the requirements and details of these trusted virtual machine images and presents a model for their use to facilitate the integration of Grid- and Cloud-based computing environments for High Energy Physics.
Status and Roadmap of CernVM

NASA Astrophysics Data System (ADS)

Berzano, D.; Blomer, J.; Buncic, P.; Charalampidis, I.; Ganis, G.; Meusel, R.

2015-12-01

Cloud resources nowadays contribute an essential share of resources for computing in high-energy physics. Such resources can be either provided by private or public IaaS clouds (e.g. OpenStack, Amazon EC2, Google Compute Engine) or by volunteers computers (e.g. LHC@Home 2.0). In any case, experiments need to prepare a virtual machine image that provides the execution environment for the physics application at hand. The CernVM virtual machine since version 3 is a minimal and versatile virtual machine image capable of booting different operating systems. The virtual machine image is less than 20 megabyte in size. The actual operating system is delivered on demand by the CernVM File System. CernVM 3 has matured from a prototype to a production environment. It is used, for instance, to run LHC applications in the cloud, to tune event generators using a network of volunteer computers, and as a container for the historic Scientific Linux 5 and Scientific Linux 4 based software environments in the course of long-term data preservation efforts of the ALICE, CMS, and ALEPH experiments. We present experience and lessons learned from the use of CernVM at scale. We also provide an outlook on the upcoming developments. These developments include adding support for Scientific Linux 7, the use of container virtualization, such as provided by Docker, and the streamlining of virtual machine contextualization towards the cloud-init industry standard.
Parallel machine architecture and compiler design facilities

NASA Technical Reports Server (NTRS)

Kuck, David J.; Yew, Pen-Chung; Padua, David; Sameh, Ahmed; Veidenbaum, Alex

1990-01-01

The objective is to provide an integrated simulation environment for studying and evaluating various issues in designing parallel systems, including machine architectures, parallelizing compiler techniques, and parallel algorithms. The status of Delta project (which objective is to provide a facility to allow rapid prototyping of parallelized compilers that can target toward different machine architectures) is summarized. Included are the surveys of the program manipulation tools developed, the environmental software supporting Delta, and the compiler research projects in which Delta has played a role.

Human Machine Interfaces for Teleoperators and Virtual Environments

NASA Technical Reports Server (NTRS)

Durlach, Nathaniel I. (Compiler); Sheridan, Thomas B. (Compiler); Ellis, Stephen R. (Compiler)

1991-01-01

In Mar. 1990, a meeting organized around the general theme of teleoperation research into virtual environment display technology was conducted. This is a collection of conference-related fragments that will give a glimpse of the potential of the following fields and how they interplay: sensorimotor performance; human-machine interfaces; teleoperation; virtual environments; performance measurement and evaluation methods; and design principles and predictive models.
Finding Tropical Cyclones on a Cloud Computing Cluster: Using Parallel Virtualization for Large-Scale Climate Simulation Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hasenkamp, Daren; Sim, Alexander; Wehner, Michael

Extensive computing power has been used to tackle issues such as climate changes, fusion energy, and other pressing scientific challenges. These computations produce a tremendous amount of data; however, many of the data analysis programs currently only run a single processor. In this work, we explore the possibility of using the emerging cloud computing platform to parallelize such sequential data analysis tasks. As a proof of concept, we wrap a program for analyzing trends of tropical cyclones in a set of virtual machines (VMs). This approach allows the user to keep their familiar data analysis environment in the VMs, whilemore » we provide the coordination and data transfer services to ensure the necessary input and output are directed to the desired locations. This work extensively exercises the networking capability of the cloud computing systems and has revealed a number of weaknesses in the current cloud system software. In our tests, we are able to scale the parallel data analysis job to a modest number of VMs and achieve a speedup that is comparable to running the same analysis task using MPI. However, compared to MPI based parallelization, the cloud-based approach has a number of advantages. The cloud-based approach is more flexible because the VMs can capture arbitrary software dependencies without requiring the user to rewrite their programs. The cloud-based approach is also more resilient to failure; as long as a single VM is running, it can make progress while as soon as one MPI node fails the whole analysis job fails. In short, this initial work demonstrates that a cloud computing system is a viable platform for distributed scientific data analyses traditionally conducted on dedicated supercomputing systems.« less
Virtual C Machine and Integrated Development Environment for ATMS Controllers.

DOT National Transportation Integrated Search

2000-04-01

The overall objective of this project is to develop a prototype virtual machine that fits on current Advanced Traffic Management Systems (ATMS) controllers and provides functionality for complex traffic operations.;Prepared in cooperation with Utah S...
System-Level Virtualization Research at Oak Ridge National Laboratory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Scott, Stephen L; Vallee, Geoffroy R; Naughton, III, Thomas J

2010-01-01

System-level virtualization is today enjoying a rebirth as a technique to effectively share what were then considered large computing resources to subsequently fade from the spotlight as individual workstations gained in popularity with a one machine - one user approach. One reason for this resurgence is that the simple workstation has grown in capability to rival that of anything available in the past. Thus, computing centers are again looking at the price/performance benefit of sharing that single computing box via server consolidation. However, industry is only concentrating on the benefits of using virtualization for server consolidation (enterprise computing) whereas ourmore » interest is in leveraging virtualization to advance high-performance computing (HPC). While these two interests may appear to be orthogonal, one consolidating multiple applications and users on a single machine while the other requires all the power from many machines to be dedicated solely to its purpose, we propose that virtualization does provide attractive capabilities that may be exploited to the benefit of HPC interests. This does raise the two fundamental questions of: is the concept of virtualization (a machine sharing technology) really suitable for HPC and if so, how does one go about leveraging these virtualization capabilities for the benefit of HPC. To address these questions, this document presents ongoing studies on the usage of system-level virtualization in a HPC context. These studies include an analysis of the benefits of system-level virtualization for HPC, a presentation of research efforts based on virtualization for system availability, and a presentation of research efforts for the management of virtual systems. The basis for this document was material presented by Stephen L. Scott at the Collaborative and Grid Computing Technologies meeting held in Cancun, Mexico on April 12-14, 2007.« less
Taming Wild Horses: The Need for Virtual Time-based Scheduling of VMs in Network Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yoginath, Srikanth B; Perumalla, Kalyan S; Henz, Brian J

2012-01-01

The next generation of scalable network simulators employ virtual machines (VMs) to act as high-fidelity models of traffic producer/consumer nodes in simulated networks. However, network simulations could be inaccurate if VMs are not scheduled according to virtual time, especially when many VMs are hosted per simulator core in a multi-core simulator environment. Since VMs are by default free-running, on the outset, it is not clear if, and to what extent, their untamed execution affects the results in simulated scenarios. Here, we provide the first quantitative basis for establishing the need for generalized virtual time scheduling of VMs in network simulators,more » based on an actual prototyped implementations. To exercise breadth, our system is tested with multiple disparate applications: (a) a set of message passing parallel programs, (b) a computer worm propagation phenomenon, and (c) a mobile ad-hoc wireless network simulation. We define and use error metrics and benchmarks in scaled tests to empirically report the poor match of traditional, fairness-based VM scheduling to VM-based network simulation, and also clearly show the better performance of our simulation-specific scheduler, with up to 64 VMs hosted on a 12-core simulator node.« less
An element search ant colony technique for solving virtual machine placement problem

NASA Astrophysics Data System (ADS)

Srija, J.; Rani John, Rose; Kanaga, Grace Mary, Dr.

2017-09-01

The data centres in the cloud environment play a key role in providing infrastructure for ubiquitous computing, pervasive computing, mobile computing etc. This computing technique tries to utilize the available resources in order to provide services. Hence maintaining the resource utilization without wastage of power consumption has become a challenging task for the researchers. In this paper we propose the direct guidance ant colony system for effective mapping of virtual machines to the physical machine with maximal resource utilization and minimal power consumption. The proposed algorithm has been compared with the existing ant colony approach which is involved in solving virtual machine placement problem and thus the proposed algorithm proves to provide better result than the existing technique.
Performance Analysis of Ivshmem for High-Performance Computing in Virtual Machines

NASA Astrophysics Data System (ADS)

Ivanovic, Pavle; Richter, Harald

2018-01-01

High-Performance computing (HPC) is rarely accomplished via virtual machines (VMs). In this paper, we present a remake of ivshmem which can change this. Ivshmem was a shared memory (SHM) between virtual machines on the same server, with SHM-access synchronization included, until about 5 years ago when newer versions of Linux and its virtualization library libvirt evolved. We restored that SHM-access synchronization feature because it is indispensable for HPC and made ivshmem runnable with contemporary versions of Linux, libvirt, KVM, QEMU and especially MPICH, which is an implementation of MPI - the standard HPC communication library. Additionally, MPICH was transparently modified by us to get ivshmem included, resulting in a three to ten times performance improvement compared to TCP/IP. Furthermore, we have transparently replaced MPI_PUT, a single-side MPICH communication mechanism, by an own MPI_PUT wrapper. As a result, our ivshmem even surpasses non-virtualized SHM data transfers for block lengths greater than 512 KBytes, showing the benefits of virtualization. All improvements were possible without using SR-IOV.
Preconditioned implicit solvers for the Navier-Stokes equations on distributed-memory machines

NASA Technical Reports Server (NTRS)

Ajmani, Kumud; Liou, Meng-Sing; Dyson, Rodger W.

1994-01-01

The GMRES method is parallelized, and combined with local preconditioning to construct an implicit parallel solver to obtain steady-state solutions for the Navier-Stokes equations of fluid flow on distributed-memory machines. The new implicit parallel solver is designed to preserve the convergence rate of the equivalent 'serial' solver. A static domain-decomposition is used to partition the computational domain amongst the available processing nodes of the parallel machine. The SPMD (Single-Program Multiple-Data) programming model is combined with message-passing tools to develop the parallel code on a 32-node Intel Hypercube and a 512-node Intel Delta machine. The implicit parallel solver is validated for internal and external flow problems, and is found to compare identically with flow solutions obtained on a Cray Y-MP/8. A peak computational speed of 2300 MFlops/sec has been achieved on 512 nodes of the Intel Delta machine,k for a problem size of 1024 K equations (256 K grid points).
Lightweight scheduling of elastic analysis containers in a competitive cloud environment: a Docked Analysis Facility for ALICE

NASA Astrophysics Data System (ADS)

Berzano, D.; Blomer, J.; Buncic, P.; Charalampidis, I.; Ganis, G.; Meusel, R.

2015-12-01

During the last years, several Grid computing centres chose virtualization as a better way to manage diverse use cases with self-consistent environments on the same bare infrastructure. The maturity of control interfaces (such as OpenNebula and OpenStack) opened the possibility to easily change the amount of resources assigned to each use case by simply turning on and off virtual machines. Some of those private clouds use, in production, copies of the Virtual Analysis Facility, a fully virtualized and self-contained batch analysis cluster capable of expanding and shrinking automatically upon need: however, resources starvation occurs frequently as expansion has to compete with other virtual machines running long-living batch jobs. Such batch nodes cannot relinquish their resources in a timely fashion: the more jobs they run, the longer it takes to drain them and shut off, and making one-job virtual machines introduces a non-negligible virtualization overhead. By improving several components of the Virtual Analysis Facility we have realized an experimental “Docked” Analysis Facility for ALICE, which leverages containers instead of virtual machines for providing performance and security isolation. We will present the techniques we have used to address practical problems, such as software provisioning through CVMFS, as well as our considerations on the maturity of containers for High Performance Computing. As the abstraction layer is thinner, our Docked Analysis Facilities may feature a more fine-grained sizing, down to single-job node containers: we will show how this approach will positively impact automatic cluster resizing by deploying lightweight pilot containers instead of replacing central queue polls.
Data communications in a parallel active messaging interface of a parallel computer

DOEpatents

Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

2014-09-02

Eager send data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints that specify a client, a context, and a task, including receiving an eager send data communications instruction with transfer data disposed in a send buffer characterized by a read/write send buffer memory address in a read/write virtual address space of the origin endpoint; determining for the send buffer a read-only send buffer memory address in a read-only virtual address space, the read-only virtual address space shared by both the origin endpoint and the target endpoint, with all frames of physical memory mapped to pages of virtual memory in the read-only virtual address space; and communicating by the origin endpoint to the target endpoint an eager send message header that includes the read-only send buffer memory address.
Data communications in a parallel active messaging interface of a parallel computer

DOEpatents

Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

2014-09-16

Eager send data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints that specify a client, a context, and a task, including receiving an eager send data communications instruction with transfer data disposed in a send buffer characterized by a read/write send buffer memory address in a read/write virtual address space of the origin endpoint; determining for the send buffer a read-only send buffer memory address in a read-only virtual address space, the read-only virtual address space shared by both the origin endpoint and the target endpoint, with all frames of physical memory mapped to pages of virtual memory in the read-only virtual address space; and communicating by the origin endpoint to the target endpoint an eager send message header that includes the read-only send buffer memory address.
Efficient Checkpointing of Virtual Machines using Virtual Machine Introspection

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aderholdt, Ferrol; Han, Fang; Scott, Stephen L

Cloud Computing environments rely heavily on system-level virtualization. This is due to the inherent benefits of virtualization including fault tolerance through checkpoint/restart (C/R) mechanisms. Because clouds are the abstraction of large data centers and large data centers have a higher potential for failure, it is imperative that a C/R mechanism for such an environment provide minimal latency as well as a small checkpoint file size. Recently, there has been much research into C/R with respect to virtual machines (VM) providing excellent solutions to reduce either checkpoint latency or checkpoint file size. However, these approaches do not provide both. This papermore » presents a method of checkpointing VMs by utilizing virtual machine introspection (VMI). Through the usage of VMI, we are able to determine which pages of memory within the guest are used or free and are better able to reduce the amount of pages written to disk during a checkpoint. We have validated this work by using various benchmarks to measure the latency along with the checkpoint size. With respect to checkpoint file size, our approach results in file sizes within 24% or less of the actual used memory within the guest. Additionally, the checkpoint latency of our approach is up to 52% faster than KVM s default method.« less
MISR Center Block Time Tool

Atmospheric Science Data Center

2013-04-01

MISR Center Block Time Tool The misr_time tool calculates the block center times for MISR Level 1B2 files. This is ... version of the IDL package or by using the IDL Virtual Machine application. The IDL Virtual Machine is bundled with IDL and is ...
System-Level Virtualization for High Performance Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vallee, Geoffroy R; Naughton, III, Thomas J; Engelmann, Christian

2008-01-01

System-level virtualization has been a research topic since the 70's but regained popularity during the past few years because of the availability of efficient solution such as Xen and the implementation of hardware support in commodity processors (e.g. Intel-VT, AMD-V). However, a majority of system-level virtualization projects is guided by the server consolidation market. As a result, current virtualization solutions appear to not be suitable for high performance computing (HPC) which is typically based on large-scale systems. On another hand there is significant interest in exploiting virtual machines (VMs) within HPC for a number of other reasons. By virtualizing themore » machine, one is able to run a variety of operating systems and environments as needed by the applications. Virtualization allows users to isolate workloads, improving security and reliability. It is also possible to support non-native environments and/or legacy operating environments through virtualization. In addition, it is possible to balance work loads, use migration techniques to relocate applications from failing machines, and isolate fault systems for repair. This document presents the challenges for the implementation of a system-level virtualization solution for HPC. It also presents a brief survey of the different approaches and techniques to address these challenges.« less
The Machine / Job Features Mechanism

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alef, M.; Cass, T.; Keijser, J. J.

Within the HEPiX virtualization group and the Worldwide LHC Computing Grid’s Machine/Job Features Task Force, a mechanism has been developed which provides access to detailed information about the current host and the current job to the job itself. This allows user payloads to access meta information, independent of the current batch system or virtual machine model. The information can be accessed either locally via the filesystem on a worker node, or remotely via HTTP(S) from a webserver. This paper describes the final version of the specification from 2016 which was published as an HEP Software Foundation technical note, and themore » design of the implementations of this version for batch and virtual machine platforms. We discuss early experiences with these implementations and how they can be exploited by experiment frameworks.« less
The machine/job features mechanism

NASA Astrophysics Data System (ADS)

Alef, M.; Cass, T.; Keijser, J. J.; McNab, A.; Roiser, S.; Schwickerath, U.; Sfiligoi, I.

2017-10-01

Within the HEPiX virtualization group and the Worldwide LHC Computing Grid’s Machine/Job Features Task Force, a mechanism has been developed which provides access to detailed information about the current host and the current job to the job itself. This allows user payloads to access meta information, independent of the current batch system or virtual machine model. The information can be accessed either locally via the filesystem on a worker node, or remotely via HTTP(S) from a webserver. This paper describes the final version of the specification from 2016 which was published as an HEP Software Foundation technical note, and the design of the implementations of this version for batch and virtual machine platforms. We discuss early experiences with these implementations and how they can be exploited by experiment frameworks.
Running accuracy analysis of a 3-RRR parallel kinematic machine considering the deformations of the links

NASA Astrophysics Data System (ADS)

Wang, Liping; Jiang, Yao; Li, Tiemin

2014-09-01

Parallel kinematic machines have drawn considerable attention and have been widely used in some special fields. However, high precision is still one of the challenges when they are used for advanced machine tools. One of the main reasons is that the kinematic chains of parallel kinematic machines are composed of elongated links that can easily suffer deformations, especially at high speeds and under heavy loads. A 3-RRR parallel kinematic machine is taken as a study object for investigating its accuracy with the consideration of the deformations of its links during the motion process. Based on the dynamic model constructed by the Newton-Euler method, all the inertia loads and constraint forces of the links are computed and their deformations are derived. Then the kinematic errors of the machine are derived with the consideration of the deformations of the links. Through further derivation, the accuracy of the machine is given in a simple explicit expression, which will be helpful to increase the calculating speed. The accuracy of this machine when following a selected circle path is simulated. The influences of magnitude of the maximum acceleration and external loads on the running accuracy of the machine are investigated. The results show that the external loads will deteriorate the accuracy of the machine tremendously when their direction coincides with the direction of the worst stiffness of the machine. The proposed method provides a solution for predicting the running accuracy of the parallel kinematic machines and can also be used in their design optimization as well as selection of suitable running parameters.
A three phase optimization method for precopy based VM live migration.

PubMed

Sharma, Sangeeta; Chawla, Meenu

2016-01-01

Virtual machine live migration is a method of moving virtual machine across hosts within a virtualized datacenter. It provides significant benefits for administrator to manage datacenter efficiently. It reduces service interruption by transferring the virtual machine without stopping at source. Transfer of large number of virtual machine memory pages results in long migration time as well as downtime, which also affects the overall system performance. This situation becomes unbearable when migration takes place over slower network or a long distance migration within a cloud. In this paper, precopy based virtual machine live migration method is thoroughly analyzed to trace out the issues responsible for its performance drops. In order to address these issues, this paper proposes three phase optimization (TPO) method. It works in three phases as follows: (i) reduce the transfer of memory pages in first phase, (ii) reduce the transfer of duplicate pages by classifying frequently and non-frequently updated pages, and (iii) reduce the data sent in last iteration of migration by applying the simple RLE compression technique. As a result, each phase significantly reduces total pages transferred, total migration time and downtime respectively. The proposed TPO method is evaluated using different representative workloads on a Xen virtualized environment. Experimental results show that TPO method reduces total pages transferred by 71 %, total migration time by 70 %, downtime by 3 % for higher workload, and it does not impose significant overhead as compared to traditional precopy method. Comparison of TPO method with other methods is also done for supporting and showing its effectiveness. TPO method and precopy methods are also tested at different number of iterations. The TPO method gives better performance even with less number of iterations.
Optimizing Virtual Network Functions Placement in Virtual Data Center Infrastructure Using Machine Learning

NASA Astrophysics Data System (ADS)

Bolodurina, I. P.; Parfenov, D. I.

2018-01-01

We have elaborated a neural network model of virtual network flow identification based on the statistical properties of flows circulating in the network of the data center and characteristics that describe the content of packets transmitted through network objects. This enabled us to establish the optimal set of attributes to identify virtual network functions. We have established an algorithm for optimizing the placement of virtual data functions using the data obtained in our research. Our approach uses a hybrid method of visualization using virtual machines and containers, which enables to reduce the infrastructure load and the response time in the network of the virtual data center. The algorithmic solution is based on neural networks, which enables to scale it at any number of the network function copies.
Protection of Mission-Critical Applications from Untrusted Execution Environment: Resource Efficient Replication and Migration of Virtual Machines

DTIC Science & Technology

2015-09-28

the performance of log-and- replay can degrade significantly for VMs configured with multiple virtual CPUs, since the shared memory communication...whether based on checkpoint replication or log-and- replay , existing HA ap- proaches use in- memory backups. The backup VM sits in the memory of a...efficiently. 15. SUBJECT TERMS High-availability virtual machines, live migration, memory and traffic overheads, application suspension, Java

RHE: A JVM Courseware

ERIC Educational Resources Information Center

Liu, S.; Tang, J.; Deng, C.; Li, X.-F.; Gaudiot, J.-L.

2011-01-01

Java Virtual Machine (JVM) education has become essential in training embedded software engineers as well as virtual machine researchers and practitioners. However, due to the lack of suitable instructional tools, it is difficult for students to obtain any kind of hands-on experience and to attain any deep understanding of JVM design. To address…
The Tera Multithreaded Architecture and Unstructured Meshes

NASA Technical Reports Server (NTRS)

Bokhari, Shahid H.; Mavriplis, Dimitri J.

1998-01-01

The Tera Multithreaded Architecture (MTA) is a new parallel supercomputer currently being installed at San Diego Supercomputing Center (SDSC). This machine has an architecture quite different from contemporary parallel machines. The computational processor is a custom design and the machine uses hardware to support very fine grained multithreading. The main memory is shared, hardware randomized and flat. These features make the machine highly suited to the execution of unstructured mesh problems, which are difficult to parallelize on other architectures. We report the results of a study carried out during July-August 1998 to evaluate the execution of EUL3D, a code that solves the Euler equations on an unstructured mesh, on the 2 processor Tera MTA at SDSC. Our investigation shows that parallelization of an unstructured code is extremely easy on the Tera. We were able to get an existing parallel code (designed for a shared memory machine), running on the Tera by changing only the compiler directives. Furthermore, a serial version of this code was compiled to run in parallel on the Tera by judicious use of directives to invoke the "full/empty" tag bits of the machine to obtain synchronization. This version achieves 212 and 406 Mflop/s on one and two processors respectively, and requires no attention to partitioning or placement of data issues that would be of paramount importance in other parallel architectures.
Distributed multitasking ITS with PVM

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fan, W.C.; Halbleib, J.A. Sr.

1995-12-31

Advances in computer hardware and communication software have made it possible to perform parallel-processing computing on a collection of desktop workstations. For many applications, multitasking on a cluster of high-performance workstations has achieved performance comparable to or better than that on a traditional supercomputer. From the point of view of cost-effectiveness, it also allows users to exploit available but unused computational resources and thus achieve a higher performance-to-cost ratio. Monte Carlo calculations are inherently parallelizable because the individual particle trajectories can be generated independently with minimum need for interprocessor communication. Furthermore, the number of particle histories that can be generatedmore » in a given amount of wall-clock time is nearly proportional to the number of processors in the cluster. This is an important fact because the inherent statistical uncertainty in any Monte Carlo result decreases as the number of histories increases. For these reasons, researchers have expended considerable effort to take advantage of different parallel architectures for a variety of Monte Carlo radiation transport codes, often with excellent results. The initial interest in this work was sparked by the multitasking capability of the MCNP code on a cluster of workstations using the Parallel Virtual Machine (PVM) software. On a 16-machine IBM RS/6000 cluster, it has been demonstrated that MCNP runs ten times as fast as on a single-processor CRAY YMP. In this paper, we summarize the implementation of a similar multitasking capability for the coupled electronphoton transport code system, the Integrated TIGER Series (ITS), and the evaluation of two load-balancing schemes for homogeneous and heterogeneous networks.« less
Distributed multitasking ITS with PVM

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fan, W.C.; Halbleib, J.A. Sr.

1995-02-01

Advances of computer hardware and communication software have made it possible to perform parallel-processing computing on a collection of desktop workstations. For many applications, multitasking on a cluster of high-performance workstations has achieved performance comparable or better than that on a traditional supercomputer. From the point of view of cost-effectiveness, it also allows users to exploit available but unused computational resources, and thus achieve a higher performance-to-cost ratio. Monte Carlo calculations are inherently parallelizable because the individual particle trajectories can be generated independently with minimum need for interprocessor communication. Furthermore, the number of particle histories that can be generated inmore » a given amount of wall-clock time is nearly proportional to the number of processors in the cluster. This is an important fact because the inherent statistical uncertainty in any Monte Carlo result decreases as the number of histories increases. For these reasons, researchers have expended considerable effort to take advantage of different parallel architectures for a variety of Monte Carlo radiation transport codes, often with excellent results. The initial interest in this work was sparked by the multitasking capability of MCNP on a cluster of workstations using the Parallel Virtual Machine (PVM) software. On a 16-machine IBM RS/6000 cluster, it has been demonstrated that MCNP runs ten times as fast as on a single-processor CRAY YMP. In this paper, we summarize the implementation of a similar multitasking capability for the coupled electron/photon transport code system, the Integrated TIGER Series (ITS), and the evaluation of two load balancing schemes for homogeneous and heterogeneous networks.« less
CFCC: A Covert Flows Confinement Mechanism for Virtual Machine Coalitions

NASA Astrophysics Data System (ADS)

Cheng, Ge; Jin, Hai; Zou, Deqing; Shi, Lei; Ohoussou, Alex K.

Normally, virtualization technology is adopted to construct the infrastructure of cloud computing environment. Resources are managed and organized dynamically through virtual machine (VM) coalitions in accordance with the requirements of applications. Enforcing mandatory access control (MAC) on the VM coalitions will greatly improve the security of VM-based cloud computing. However, the existing MAC models lack the mechanism to confine the covert flows and are hard to eliminate the convert channels. In this paper, we propose a covert flows confinement mechanism for virtual machine coalitions (CFCC), which introduces dynamic conflicts of interest based on the activity history of VMs, each of which is attached with a label. The proposed mechanism can be used to confine the covert flows between VMs in different coalitions. We implement a prototype system, evaluate its performance, and show that our mechanism is practical.
A comparative analysis of dynamic grids vs. virtual grids using the A3pviGrid framework.

PubMed

Shankaranarayanan, Avinas; Amaldas, Christine

2010-11-01

With the proliferation of Quad/Multi-core micro-processors in mainstream platforms such as desktops and workstations; a large number of unused CPU cycles can be utilized for running virtual machines (VMs) as dynamic nodes in distributed environments. Grid services and its service oriented business broker now termed cloud computing could deploy image based virtualization platforms enabling agent based resource management and dynamic fault management. In this paper we present an efficient way of utilizing heterogeneous virtual machines on idle desktops as an environment for consumption of high performance grid services. Spurious and exponential increases in the size of the datasets are constant concerns in medical and pharmaceutical industries due to the constant discovery and publication of large sequence databases. Traditional algorithms are not modeled at handing large data sizes under sudden and dynamic changes in the execution environment as previously discussed. This research was undertaken to compare our previous results with running the same test dataset with that of a virtual Grid platform using virtual machines (Virtualization). The implemented architecture, A3pviGrid utilizes game theoretic optimization and agent based team formation (Coalition) algorithms to improve upon scalability with respect to team formation. Due to the dynamic nature of distributed systems (as discussed in our previous work) all interactions were made local within a team transparently. This paper is a proof of concept of an experimental mini-Grid test-bed compared to running the platform on local virtual machines on a local test cluster. This was done to give every agent its own execution platform enabling anonymity and better control of the dynamic environmental parameters. We also analyze performance and scalability of Blast in a multiple virtual node setup and present our findings. This paper is an extension of our previous research on improving the BLAST application framework using dynamic Grids on virtualization platforms such as the virtual box.
Development and Application of ANN Model for Worker Assignment into Virtual Cells of Large Sized Configurations

NASA Astrophysics Data System (ADS)

Murali, R. V.; Puri, A. B.; Fathi, Khalid

2010-10-01

This paper presents an extended version of study already undertaken on development of an artificial neural networks (ANNs) model for assigning workforce into virtual cells under virtual cellular manufacturing systems (VCMS) environments. Previously, the same authors have introduced this concept and applied it to virtual cells of two-cell configuration and the results demonstrated that ANNs could be a worth applying tool for carrying out workforce assignments. In this attempt, three-cell configurations problems are considered for worker assignment task. Virtual cells are formed under dual resource constraint (DRC) context in which the number of available workers is less than the total number of machines available. Since worker assignment tasks are quite non-linear and highly dynamic in nature under varying inputs & conditions and, in parallel, ANNs have the ability to model complex relationships between inputs and outputs and find similar patterns effectively, an attempt was earlier made to employ ANNs into the above task. In this paper, the multilayered perceptron with feed forward (MLP-FF) neural network model has been reused for worker assignment tasks of three-cell configurations under DRC context and its performance at different time periods has been analyzed. The previously proposed worker assignment model has been reconfigured and cell formation solutions available for three-cell configuration in the literature are used in combination to generate datasets for training ANNs framework. Finally, results of the study have been presented and discussed.
Reverse time migration: A seismic processing application on the connection machine

NASA Technical Reports Server (NTRS)

Fiebrich, Rolf-Dieter

1987-01-01

The implementation of a reverse time migration algorithm on the Connection Machine, a massively parallel computer is described. Essential architectural features of this machine as well as programming concepts are presented. The data structures and parallel operations for the implementation of the reverse time migration algorithm are described. The algorithm matches the Connection Machine architecture closely and executes almost at the peak performance of this machine.
Impact of Machine Virtualization on Timing Precision for Performance-critical Tasks

NASA Astrophysics Data System (ADS)

Karpov, Kirill; Fedotova, Irina; Siemens, Eduard

2017-07-01

In this paper we present a measurement study to characterize the impact of hardware virtualization on basic software timing, as well as on precise sleep operations of an operating system. We investigated how timer hardware is shared among heavily CPU-, I/O- and Network-bound tasks on a virtual machine as well as on the host machine. VMware ESXi and QEMU/KVM have been chosen as commonly used examples of hypervisor- and host-based models. Based on statistical parameters of retrieved distributions, our results provide a very good estimation of timing behavior. It is essential for real-time and performance-critical applications such as image processing or real-time control.
Can Multiple "Spatial" Virtual Timelines Convey the Relatedness of Chronological Knowledge across Parallel Domains?

ERIC Educational Resources Information Center

Korallo, Liliya; Foreman, Nigel; Boyd-Davis, Stephen; Moar, Magnus; Coulson, Mark

2012-01-01

Single linear virtual timelines have been used effectively with undergraduates and primary school children to convey the chronological ordering of historical items, improving on PowerPoint and paper/textual displays. In the present study, a virtual environment (VE) consisting of three parallel related timelines (world history and the histories of…
ATLAS TDAQ System Administration: evolution and re-design

NASA Astrophysics Data System (ADS)

Ballestrero, S.; Bogdanchikov, A.; Brasolin, F.; Contescu, C.; Dubrov, S.; Fazio, D.; Korol, A.; Lee, C. J.; Scannicchio, D. A.; Twomey, M. S.

2015-12-01

The ATLAS Trigger and Data Acquisition system is responsible for the online processing of live data, streaming from the ATLAS experiment at the Large Hadron Collider at CERN. The online farm is composed of ∼3000 servers, processing the data read out from ∼100 million detector channels through multiple trigger levels. During the two years of the first Long Shutdown there has been a tremendous amount of work done by the ATLAS Trigger and Data Acquisition System Administrators, implementing numerous new software applications, upgrading the OS and the hardware, changing some design philosophies and exploiting the High- Level Trigger farm with different purposes. The OS version has been upgraded to SLC6; for the largest part of the farm, which is composed of net booted nodes, this required a completely new design of the net booting system. In parallel, the migration to Puppet of the Configuration Management systems has been completed for both net booted and local booted hosts; the Post-Boot Scripts system and Quattor have been consequently dismissed. Virtual Machine usage has been investigated and tested and many of the core servers are now running on Virtual Machines. Virtualisation has also been used to adapt the High-Level Trigger farm as a batch system, which has been used for running Monte Carlo production jobs that are mostly CPU and not I/O bound. Finally, monitoring the health and the status of ∼3000 machines in the experimental area is obviously of the utmost importance, so the obsolete Nagios v2 has been replaced with Icinga, complemented by Ganglia as a performance data provider. This paper serves for reporting of the actions taken by the Systems Administrators in order to improve and produce a system capable of performing for the next three years of ATLAS data taking.
Computation of Coupled Thermal-Fluid Problems in Distributed Memory Environment

NASA Technical Reports Server (NTRS)

Wei, H.; Shang, H. M.; Chen, Y. S.

2001-01-01

The thermal-fluid coupling problems are very important to aerospace and engineering applications. Instead of analyzing heat transfer and fluid flow separately, this study merged two well-accepted engineering solution methods, SINDA for thermal analysis and FDNS for fluid flow simulation, into a unified multi-disciplinary thermal fluid prediction method. A fully conservative patched grid interface algorithm for arbitrary two-dimensional and three-dimensional geometry has been developed. The state-of-the-art parallel computing concept was used to couple SINDA and FDNS for the communication of boundary conditions through PVM (Parallel Virtual Machine) libraries. Therefore, the thermal analysis performed by SINDA and the fluid flow calculated by FDNS are fully coupled to obtain steady state or transient solutions. The natural convection between two thick-walled eccentric tubes was calculated and the predicted results match the experiment data perfectly. A 3-D rocket engine model and a real 3-D SSME geometry were used to test the current model, and the reasonable temperature field was obtained.
Software architecture standard for simulation virtual machine, version 2.0

NASA Technical Reports Server (NTRS)

Sturtevant, Robert; Wessale, William

1994-01-01

The Simulation Virtual Machine (SBM) is an Ada architecture which eases the effort involved in the real-time software maintenance and sustaining engineering. The Software Architecture Standard defines the infrastructure which all the simulation models are built from. SVM was developed for and used in the Space Station Verification and Training Facility.
Cloud services for the Fermilab scientific stakeholders

DOE PAGES

Timm, S.; Garzoglio, G.; Mhashilkar, P.; ...

2015-12-23

As part of the Fermilab/KISTI cooperative research project, Fermilab has successfully run an experimental simulation workflow at scale on a federation of Amazon Web Services (AWS), FermiCloud, and local FermiGrid resources. We used the CernVM-FS (CVMFS) file system to deliver the application software. We established Squid caching servers in AWS as well, using the Shoal system to let each individual virtual machine find the closest squid server. We also developed an automatic virtual machine conversion system so that we could transition virtual machines made on FermiCloud to Amazon Web Services. We used this system to successfully run a cosmic raymore » simulation of the NOvA detector at Fermilab, making use of both AWS spot pricing and network bandwidth discounts to minimize the cost. On FermiCloud we also were able to run the workflow at the scale of 1000 virtual machines, using a private network routable inside of Fermilab. As a result, we present in detail the technological improvements that were used to make this work a reality.« less
Cloud services for the Fermilab scientific stakeholders

DOE Office of Scientific and Technical Information (OSTI.GOV)

Timm, S.; Garzoglio, G.; Mhashilkar, P.

As part of the Fermilab/KISTI cooperative research project, Fermilab has successfully run an experimental simulation workflow at scale on a federation of Amazon Web Services (AWS), FermiCloud, and local FermiGrid resources. We used the CernVM-FS (CVMFS) file system to deliver the application software. We established Squid caching servers in AWS as well, using the Shoal system to let each individual virtual machine find the closest squid server. We also developed an automatic virtual machine conversion system so that we could transition virtual machines made on FermiCloud to Amazon Web Services. We used this system to successfully run a cosmic raymore » simulation of the NOvA detector at Fermilab, making use of both AWS spot pricing and network bandwidth discounts to minimize the cost. On FermiCloud we also were able to run the workflow at the scale of 1000 virtual machines, using a private network routable inside of Fermilab. As a result, we present in detail the technological improvements that were used to make this work a reality.« less
Volunteered Cloud Computing for Disaster Management

NASA Astrophysics Data System (ADS)

Evans, J. D.; Hao, W.; Chettri, S. R.

2014-12-01

Disaster management relies increasingly on interpreting earth observations and running numerical models; which require significant computing capacity - usually on short notice and at irregular intervals. Peak computing demand during event detection, hazard assessment, or incident response may exceed agency budgets; however some of it can be met through volunteered computing, which distributes subtasks to participating computers via the Internet. This approach has enabled large projects in mathematics, basic science, and climate research to harness the slack computing capacity of thousands of desktop computers. This capacity is likely to diminish as desktops give way to battery-powered mobile devices (laptops, smartphones, tablets) in the consumer market; but as cloud computing becomes commonplace, it may offer significant slack capacity -- if its users are given an easy, trustworthy mechanism for participating. Such a "volunteered cloud computing" mechanism would also offer several advantages over traditional volunteered computing: tasks distributed within a cloud have fewer bandwidth limitations; granular billing mechanisms allow small slices of "interstitial" computing at no marginal cost; and virtual storage volumes allow in-depth, reversible machine reconfiguration. Volunteered cloud computing is especially suitable for "embarrassingly parallel" tasks, including ones requiring large data volumes: examples in disaster management include near-real-time image interpretation, pattern / trend detection, or large model ensembles. In the context of a major disaster, we estimate that cloud users (if suitably informed) might volunteer hundreds to thousands of CPU cores across a large provider such as Amazon Web Services. To explore this potential, we are building a volunteered cloud computing platform and targeting it to a disaster management context. Using a lightweight, fault-tolerant network protocol, this platform helps cloud users join parallel computing projects; automates reconfiguration of their virtual machines; ensures accountability for donated computing; and optimizes the use of "interstitial" computing. Initial applications include fire detection from multispectral satellite imagery and flood risk mapping through hydrological simulations.
Line-drawing algorithms for parallel machines

NASA Technical Reports Server (NTRS)

Pang, Alex T.

1990-01-01

The fact that conventional line-drawing algorithms, when applied directly on parallel machines, can lead to very inefficient codes is addressed. It is suggested that instead of modifying an existing algorithm for a parallel machine, a more efficient implementation can be produced by going back to the invariants in the definition. Popular line-drawing algorithms are compared with two alternatives; distance to a line (a point is on the line if sufficiently close to it) and intersection with a line (a point on the line if an intersection point). For massively parallel single-instruction-multiple-data (SIMD) machines (with thousands of processors and up), the alternatives provide viable line-drawing algorithms. Because of the pixel-per-processor mapping, their performance is independent of the line length and orientation.
Runtime Performance and Virtual Network Control Alternatives in VM-Based High-Fidelity Network Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yoginath, Srikanth B; Perumalla, Kalyan S; Henz, Brian J

2012-01-01

In prior work (Yoginath and Perumalla, 2011; Yoginath, Perumalla and Henz, 2012), the motivation, challenges and issues were articulated in favor of virtual time ordering of Virtual Machines (VMs) in network simulations hosted on multi-core machines. Two major components in the overall virtualization challenge are (1) virtual timeline establishment and scheduling of VMs, and (2) virtualization of inter-VM communication. Here, we extend prior work by presenting scaling results for the first component, with experiment results on up to 128 VMs scheduled in virtual time order on a single 12-core host. We also explore the solution space of design alternatives formore » the second component, and present performance results from a multi-threaded, multi-queue implementation of inter-VM network control for synchronized execution with VM scheduling, incorporated in our NetWarp simulation system.« less
KNBD: A Remote Kernel Block Server for Linux

NASA Technical Reports Server (NTRS)

Becker, Jeff

1999-01-01

I am developing a prototype of a Linux remote disk block server whose purpose is to serve as a lower level component of a parallel file system. Parallel file systems are an important component of high performance supercomputers and clusters. Although supercomputer vendors such as SGI and IBM have their own custom solutions, there has been a void and hence a demand for such a system on Beowulf-type PC Clusters. Recently, the Parallel Virtual File System (PVFS) project at Clemson University has begun to address this need (1). Although their system provides much of the functionality of (and indeed was inspired by) the equivalent file systems in the commercial supercomputer market, their system is all in user-space. Migrating their 10 services to the kernel could provide a performance boost, by obviating the need for expensive system calls. Thanks to Pavel Machek, the Linux kernel has provided the network block device (2) with kernels 2.1.101 and later. You can configure this block device to redirect reads and writes to a remote machine's disk. This can be used as a building block for constructing a striped file system across several nodes.
Proceedings: Sisal `93

DOE Office of Scientific and Technical Information (OSTI.GOV)

Feo, J.T.

1993-10-01

This report contain papers on: Programmability and performance issues; The case of an iterative partial differential equation solver; Implementing the kernal of the Australian Region Weather Prediction Model in Sisal; Even and quarter-even prime length symmetric FFTs and their Sisal Implementations; Top-down thread generation for Sisal; Overlapping communications and computations on NUMA architechtures; Compiling technique based on dataflow analysis for funtional programming language Valid; Copy elimination for true multidimensional arrays in Sisal 2.0; Increasing parallelism for an optimization that reduces copying in IF2 graphs; Caching in on Sisal; Cache performance of Sisal Vs. FORTRAN; FFT algorithms on a shared-memory multiprocessor;more » A parallel implementation of nonnumeric search problems in Sisal; Computer vision algorithms in Sisal; Compilation of Sisal for a high-performance data driven vector processor; Sisal on distributed memory machines; A virtual shared addressing system for distributed memory Sisal; Developing a high-performance FFT algorithm in Sisal for a vector supercomputer; Implementation issues for IF2 on a static data-flow architechture; and Systematic control of parallelism in array-based data-flow computation. Selected papers have been indexed separately for inclusion in the Energy Science and Technology Database.« less

minimega

DOE Office of Scientific and Technical Information (OSTI.GOV)

David Fritz, John Floren

2013-08-27

Minimega is a simple emulytics platform for creating testbeds of networked devices. The platform consists of easily deployable tools to facilitate bringing up large networks of virtual machines including Windows, Linux, and Android. Minimega attempts to allow experiments to be brought up quickly with nearly no configuration. Minimega also includes tools for simple cluster management, as well as tools for creating Linux based virtual machine images.
Parallel Computational Fluid Dynamics: Current Status and Future Requirements

NASA Technical Reports Server (NTRS)

Simon, Horst D.; VanDalsem, William R.; Dagum, Leonardo; Kutler, Paul (Technical Monitor)

1994-01-01

One or the key objectives of the Applied Research Branch in the Numerical Aerodynamic Simulation (NAS) Systems Division at NASA Allies Research Center is the accelerated introduction of highly parallel machines into a full operational environment. In this report we discuss the performance results obtained from the implementation of some computational fluid dynamics (CFD) applications on the Connection Machine CM-2 and the Intel iPSC/860. We summarize some of the experiences made so far with the parallel testbed machines at the NAS Applied Research Branch. Then we discuss the long term computational requirements for accomplishing some of the grand challenge problems in computational aerosciences. We argue that only massively parallel machines will be able to meet these grand challenge requirements, and we outline the computer science and algorithm research challenges ahead.
[Parallel virtual reality visualization of extreme large medical datasets].

PubMed

Tang, Min

2010-04-01

On the basis of a brief description of grid computing, the essence and critical techniques of parallel visualization of extreme large medical datasets are discussed in connection with Intranet and common-configuration computers of hospitals. In this paper are introduced several kernel techniques, including the hardware structure, software framework, load balance and virtual reality visualization. The Maximum Intensity Projection algorithm is realized in parallel using common PC cluster. In virtual reality world, three-dimensional models can be rotated, zoomed, translated and cut interactively and conveniently through the control panel built on virtual reality modeling language (VRML). Experimental results demonstrate that this method provides promising and real-time results for playing the role in of a good assistant in making clinical diagnosis.
Quantum information, cognition, and music.

PubMed

Dalla Chiara, Maria L; Giuntini, Roberto; Leporini, Roberto; Negri, Eleonora; Sergioli, Giuseppe

2015-01-01

Parallelism represents an essential aspect of human mind/brain activities. One can recognize some common features between psychological parallelism and the characteristic parallel structures that arise in quantum theory and in quantum computation. The article is devoted to a discussion of the following questions: a comparison between classical probabilistic Turing machines and quantum Turing machines.possible applications of the quantum computational semantics to cognitive problems.parallelism in music.
Quantum information, cognition, and music

PubMed Central

Dalla Chiara, Maria L.; Giuntini, Roberto; Leporini, Roberto; Negri, Eleonora; Sergioli, Giuseppe

2015-01-01

Parallelism represents an essential aspect of human mind/brain activities. One can recognize some common features between psychological parallelism and the characteristic parallel structures that arise in quantum theory and in quantum computation. The article is devoted to a discussion of the following questions: a comparison between classical probabilistic Turing machines and quantum Turing machines.possible applications of the quantum computational semantics to cognitive problems.parallelism in music. PMID:26539139
Application of a distributed network in computational fluid dynamic simulations

NASA Technical Reports Server (NTRS)

Deshpande, Manish; Feng, Jinzhang; Merkle, Charles L.; Deshpande, Ashish

1994-01-01

A general-purpose 3-D, incompressible Navier-Stokes algorithm is implemented on a network of concurrently operating workstations using parallel virtual machine (PVM) and compared with its performance on a CRAY Y-MP and on an Intel iPSC/860. The problem is relatively computationally intensive, and has a communication structure based primarily on nearest-neighbor communication, making it ideally suited to message passing. Such problems are frequently encountered in computational fluid dynamics (CDF), and their solution is increasingly in demand. The communication structure is explicitly coded in the implementation to fully exploit the regularity in message passing in order to produce a near-optimal solution. Results are presented for various grid sizes using up to eight processors.
The Language Grid: supporting intercultural collaboration

NASA Astrophysics Data System (ADS)

Ishida, T.

2018-03-01

A variety of language resources already exist online. Unfortunately, since many language resources have usage restrictions, it is virtually impossible for each user to negotiate with every language resource provider when combining several resources to achieve the intended purpose. To increase the accessibility and usability of language resources (dictionaries, parallel texts, part-of-speech taggers, machine translators, etc.), we proposed the Language Grid [1]; it wraps existing language resources as atomic services and enables users to create new services by combining the atomic services, and reduces the negotiation costs related to intellectual property rights [4]. Our slogan is “language services from language resources.” We believe that modularization with recombination is the key to creating a full range of customized language environments for various user communities.
Web Service Distributed Management Framework for Autonomic Server Virtualization

NASA Astrophysics Data System (ADS)

Solomon, Bogdan; Ionescu, Dan; Litoiu, Marin; Mihaescu, Mircea

Virtualization for the x86 platform has imposed itself recently as a new technology that can improve the usage of machines in data centers and decrease the cost and energy of running a high number of servers. Similar to virtualization, autonomic computing and more specifically self-optimization, aims to improve server farm usage through provisioning and deprovisioning of instances as needed by the system. Autonomic systems are able to determine the optimal number of server machines - real or virtual - to use at a given time, and add or remove servers from a cluster in order to achieve optimal usage. While provisioning and deprovisioning of servers is very important, the way the autonomic system is built is also very important, as a robust and open framework is needed. One such management framework is the Web Service Distributed Management (WSDM) system, which is an open standard of the Organization for the Advancement of Structured Information Standards (OASIS). This paper presents an open framework built on top of the WSDM specification, which aims to provide self-optimization for applications servers residing on virtual machines.
Visual Computing Environment

NASA Technical Reports Server (NTRS)

Lawrence, Charles; Putt, Charles W.

1997-01-01

The Visual Computing Environment (VCE) is a NASA Lewis Research Center project to develop a framework for intercomponent and multidisciplinary computational simulations. Many current engineering analysis codes simulate various aspects of aircraft engine operation. For example, existing computational fluid dynamics (CFD) codes can model the airflow through individual engine components such as the inlet, compressor, combustor, turbine, or nozzle. Currently, these codes are run in isolation, making intercomponent and complete system simulations very difficult to perform. In addition, management and utilization of these engineering codes for coupled component simulations is a complex, laborious task, requiring substantial experience and effort. To facilitate multicomponent aircraft engine analysis, the CFD Research Corporation (CFDRC) is developing the VCE system. This system, which is part of NASA's Numerical Propulsion Simulation System (NPSS) program, can couple various engineering disciplines, such as CFD, structural analysis, and thermal analysis. The objectives of VCE are to (1) develop a visual computing environment for controlling the execution of individual simulation codes that are running in parallel and are distributed on heterogeneous host machines in a networked environment, (2) develop numerical coupling algorithms for interchanging boundary conditions between codes with arbitrary grid matching and different levels of dimensionality, (3) provide a graphical interface for simulation setup and control, and (4) provide tools for online visualization and plotting. VCE was designed to provide a distributed, object-oriented environment. Mechanisms are provided for creating and manipulating objects, such as grids, boundary conditions, and solution data. This environment includes parallel virtual machine (PVM) for distributed processing. Users can interactively select and couple any set of codes that have been modified to run in a parallel distributed fashion on a cluster of heterogeneous workstations. A scripting facility allows users to dictate the sequence of events that make up the particular simulation.
Virtual Machine Language Controls Remote Devices

NASA Technical Reports Server (NTRS)

2014-01-01

Kennedy Space Center worked with Blue Sun Enterprises, based in Boulder, Colorado, to enhance the company's virtual machine language (VML) to control the instruments on the Regolith and Environment Science and Oxygen and Lunar Volatiles Extraction mission. Now the NASA-improved VML is available for crewed and uncrewed spacecraft, and has potential applications on remote systems such as weather balloons, unmanned aerial vehicles, and submarines.
A Virtual Astronomical Research Machine in No Time (VARMiNT)

NASA Astrophysics Data System (ADS)

Beaver, John

2012-05-01

We present early results of using virtual machine software to help make astronomical research computing accessible to a wider range of individuals. Our Virtual Astronomical Research Machine in No Time (VARMiNT) is an Ubuntu Linux virtual machine with free, open-source software already installed and configured (and in many cases documented). The purpose of VARMiNT is to provide a ready-to-go astronomical research computing environment that can be freely shared between researchers, or between amateur and professional, teacher and student, etc., and to circumvent the often-difficult task of configuring a suitable computing environment from scratch. Thus we hope that VARMiNT will make it easier for individuals to engage in research computing even if they have no ready access to the facilities of a research institution. We describe our current version of VARMiNT and some of the ways it is being used at the University of Wisconsin - Fox Valley, a two-year teaching campus of the University of Wisconsin System, as a means to enhance student independent study research projects and to facilitate collaborations with researchers at other locations. We also outline some future plans and prospects.
Human-machine interface for a VR-based medical imaging environment

NASA Astrophysics Data System (ADS)

Krapichler, Christian; Haubner, Michael; Loesch, Andreas; Lang, Manfred K.; Englmeier, Karl-Hans

1997-05-01

Modern 3D scanning techniques like magnetic resonance imaging (MRI) or computed tomography (CT) produce high- quality images of the human anatomy. Virtual environments open new ways to display and to analyze those tomograms. Compared with today's inspection of 2D image sequences, physicians are empowered to recognize spatial coherencies and examine pathological regions more facile, diagnosis and therapy planning can be accelerated. For that purpose a powerful human-machine interface is required, which offers a variety of tools and features to enable both exploration and manipulation of the 3D data. Man-machine communication has to be intuitive and efficacious to avoid long accustoming times and to enhance familiarity with and acceptance of the interface. Hence, interaction capabilities in virtual worlds should be comparable to those in the real work to allow utilization of our natural experiences. In this paper the integration of hand gestures and visual focus, two important aspects in modern human-computer interaction, into a medical imaging environment is shown. With the presented human- machine interface, including virtual reality displaying and interaction techniques, radiologists can be supported in their work. Further, virtual environments can even alleviate communication between specialists from different fields or in educational and training applications.
Dynamically allocated virtual clustering management system

NASA Astrophysics Data System (ADS)

Marcus, Kelvin; Cannata, Jess

2013-05-01

The U.S Army Research Laboratory (ARL) has built a "Wireless Emulation Lab" to support research in wireless mobile networks. In our current experimentation environment, our researchers need the capability to run clusters of heterogeneous nodes to model emulated wireless tactical networks where each node could contain a different operating system, application set, and physical hardware. To complicate matters, most experiments require the researcher to have root privileges. Our previous solution of using a single shared cluster of statically deployed virtual machines did not sufficiently separate each user's experiment due to undesirable network crosstalk, thus only one experiment could be run at a time. In addition, the cluster did not make efficient use of our servers and physical networks. To address these concerns, we created the Dynamically Allocated Virtual Clustering management system (DAVC). This system leverages existing open-source software to create private clusters of nodes that are either virtual or physical machines. These clusters can be utilized for software development, experimentation, and integration with existing hardware and software. The system uses the Grid Engine job scheduler to efficiently allocate virtual machines to idle systems and networks. The system deploys stateless nodes via network booting. The system uses 802.1Q Virtual LANs (VLANs) to prevent experimentation crosstalk and to allow for complex, private networks eliminating the need to map each virtual machine to a specific switch port. The system monitors the health of the clusters and the underlying physical servers and it maintains cluster usage statistics for historical trends. Users can start private clusters of heterogeneous nodes with root privileges for the duration of the experiment. Users also control when to shutdown their clusters.
Open multi-agent control architecture to support virtual-reality-based man-machine interfaces

NASA Astrophysics Data System (ADS)

Freund, Eckhard; Rossmann, Juergen; Brasch, Marcel

2001-10-01

Projective Virtual Reality is a new and promising approach to intuitively operable man machine interfaces for the commanding and supervision of complex automation systems. The user interface part of Projective Virtual Reality heavily builds on latest Virtual Reality techniques, a task deduction component and automatic action planning capabilities. In order to realize man machine interfaces for complex applications, not only the Virtual Reality part has to be considered but also the capabilities of the underlying robot and automation controller are of great importance. This paper presents a control architecture that has proved to be an ideal basis for the realization of complex robotic and automation systems that are controlled by Virtual Reality based man machine interfaces. The architecture does not just provide a well suited framework for the real-time control of a multi robot system but also supports Virtual Reality metaphors and augmentations which facilitate the user's job to command and supervise a complex system. The developed control architecture has already been used for a number of applications. Its capability to integrate sensor information from sensors of different levels of abstraction in real-time helps to make the realized automation system very responsive to real world changes. In this paper, the architecture will be described comprehensively, its main building blocks will be discussed and one realization that is built based on an open source real-time operating system will be presented. The software design and the features of the architecture which make it generally applicable to the distributed control of automation agents in real world applications will be explained. Furthermore its application to the commanding and control of experiments in the Columbus space laboratory, the European contribution to the International Space Station (ISS), is only one example which will be described.
Staghorn: An Automated Large-Scale Distributed System Analysis Platform

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gabert, Kasimir; Burns, Ian; Elliott, Steven

2016-09-01

Conducting experiments on large-scale distributed computing systems is becoming significantly easier with the assistance of emulation. Researchers can now create a model of a distributed computing environment and then generate a virtual, laboratory copy of the entire system composed of potentially thousands of virtual machines, switches, and software. The use of real software, running at clock rate in full virtual machines, allows experiments to produce meaningful results without necessitating a full understanding of all model components. However, the ability to inspect and modify elements within these models is bound by the limitation that such modifications must compete with the model,more » either running in or alongside it. This inhibits entire classes of analyses from being conducted upon these models. We developed a mechanism to snapshot an entire emulation-based model as it is running. This allows us to \\freeze time" and subsequently fork execution, replay execution, modify arbitrary parts of the model, or deeply explore the model. This snapshot includes capturing packets in transit and other input/output state along with the running virtual machines. We were able to build this system in Linux using Open vSwitch and Kernel Virtual Machines on top of Sandia's emulation platform Firewheel. This primitive opens the door to numerous subsequent analyses on models, including state space exploration, debugging distributed systems, performance optimizations, improved training environments, and improved experiment repeatability.« less
The paradigm compiler: Mapping a functional language for the connection machine

NASA Technical Reports Server (NTRS)

Dennis, Jack B.

1989-01-01

The Paradigm Compiler implements a new approach to compiling programs written in high level languages for execution on highly parallel computers. The general approach is to identify the principal data structures constructed by the program and to map these structures onto the processing elements of the target machine. The mapping is chosen to maximize performance as determined through compile time global analysis of the source program. The source language is Sisal, a functional language designed for scientific computations, and the target language is Paris, the published low level interface to the Connection Machine. The data structures considered are multidimensional arrays whose dimensions are known at compile time. Computations that build such arrays usually offer opportunities for highly parallel execution; they are data parallel. The Connection Machine is an attractive target for these computations, and the parallel for construct of the Sisal language is a convenient high level notation for data parallel algorithms. The principles and organization of the Paradigm Compiler are discussed.
An M-step preconditioned conjugate gradient method for parallel computation

NASA Technical Reports Server (NTRS)

Adams, L.

1983-01-01

This paper describes a preconditioned conjugate gradient method that can be effectively implemented on both vector machines and parallel arrays to solve sparse symmetric and positive definite systems of linear equations. The implementation on the CYBER 203/205 and on the Finite Element Machine is discussed and results obtained using the method on these machines are given.
Virtual Oscillator Controls | Grid Modernization | NREL

Science.gov Websites

Virtual Oscillator Controls Virtual Oscillator Controls NREL is developing virtual oscillator Santa-Barbara, and SunPower. Publications Synthesizing Virtual Oscillators To Control Islanded Inverters Synchronization of Parallel Single-Phase Inverters Using Virtual Oscillator Control, IEEE Transactions on Power
Approximation algorithms for scheduling unrelated parallel machines with release dates

NASA Astrophysics Data System (ADS)

Avdeenko, T. V.; Mesentsev, Y. A.; Estraykh, I. V.

2017-01-01

In this paper we propose approaches to optimal scheduling of unrelated parallel machines with release dates. One approach is based on the scheme of dynamic programming modified with adaptive narrowing of search domain ensuring its computational effectiveness. We discussed complexity of the exact schedules synthesis and compared it with approximate, close to optimal, solutions. Also we explain how the algorithm works for the example of two unrelated parallel machines and five jobs with release dates. Performance results that show the efficiency of the proposed approach have been given.
The Implications of Virtual Machine Introspection for Digital Forensics on Nonquiescent Virtual Machines

DTIC Science & Technology

2011-06-01

metacity [ 2788] gnome-panel [ 2790] nautilus [ 2794] bonobo -activati [ 2797] gnome-vfs-daemo [ 2799] eggcups [ 2800] gnome-volume-ma [ 2809] bt...xrdb [ 2784] metacity [ 2788] gnome-panel [ 2790] nautilus [ 2794] bonobo -activati [ 2797] gnome-vfs-daemo [ 2799] eggcups [ 2800] gnome-volume...gnome-keyring-d [ 2764] gnome-settings- [ 2780] xrdb [ 2784] metacity [ 2788] gnome-panel [ 2790] nautilus [ 2794] bonobo -activati [ 2797] gnome

Slot Machines: Pursuing Responsible Gaming Practices for Virtual Reels and Near Misses

ERIC Educational Resources Information Center

Harrigan, Kevin A.

2009-01-01

Since 1983, slot machines in North America have used a computer and virtual reels to determine the odds. Since at least 1988, a technique called clustering has been used to create a high number of near misses, failures that are close to wins. The result is that what the player sees does not represent the underlying probabilities and randomness,…
Virtual screening by a new Clustering-based Weighted Similarity Extreme Learning Machine approach

PubMed Central

Kudisthalert, Wasu

2018-01-01

Machine learning techniques are becoming popular in virtual screening tasks. One of the powerful machine learning algorithms is Extreme Learning Machine (ELM) which has been applied to many applications and has recently been applied to virtual screening. We propose the Weighted Similarity ELM (WS-ELM) which is based on a single layer feed-forward neural network in a conjunction of 16 different similarity coefficients as activation function in the hidden layer. It is known that the performance of conventional ELM is not robust due to random weight selection in the hidden layer. Thus, we propose a Clustering-based WS-ELM (CWS-ELM) that deterministically assigns weights by utilising clustering algorithms i.e. k-means clustering and support vector clustering. The experiments were conducted on one of the most challenging datasets–Maximum Unbiased Validation Dataset–which contains 17 activity classes carefully selected from PubChem. The proposed algorithms were then compared with other machine learning techniques such as support vector machine, random forest, and similarity searching. The results show that CWS-ELM in conjunction with support vector clustering yields the best performance when utilised together with Sokal/Sneath(1) coefficient. Furthermore, ECFP_6 fingerprint presents the best results in our framework compared to the other types of fingerprints, namely ECFP_4, FCFP_4, and FCFP_6. PMID:29652912
Impact of equalizing currents on losses and torque ripples in electrical machines with fractional slot concentrated windings

NASA Astrophysics Data System (ADS)

Toporkov, D. M.; Vialcev, G. B.

2017-10-01

The implementation of parallel branches is a commonly used manufacturing method of the realizing of fractional slot concentrated windings in electrical machines. If the rotor eccentricity is enabled in a machine with parallel branches, the equalizing currents can arise. The simulation approach of the equalizing currents in parallel branches of an electrical machine winding based on magnetic field calculation by using Finite Elements Method is discussed in the paper. The high accuracy of the model is provided by the dynamic improvement of the inductances in the differential equation system describing a machine. The pre-computed table flux linkage functions are used for that. The functions are the dependences of the flux linkage of parallel branches on the branches currents and rotor position angle. The functions permit to calculate self-inductances and mutual inductances by partial derivative. The calculated results obtained for the electric machine specimen are presented. The results received show that the adverse combination of design solutions and the rotor eccentricity leads to a high value of the equalizing currents and windings heating. Additional torque ripples also arise. The additional ripples harmonic content is not similar to the cogging torque or ripples caused by the rotor eccentricity.
GPURFSCREEN: a GPU based virtual screening tool using random forest classifier.

PubMed

Jayaraj, P B; Ajay, Mathias K; Nufail, M; Gopakumar, G; Jaleel, U C A

2016-01-01

In-silico methods are an integral part of modern drug discovery paradigm. Virtual screening, an in-silico method, is used to refine data models and reduce the chemical space on which wet lab experiments need to be performed. Virtual screening of a ligand data model requires large scale computations, making it a highly time consuming task. This process can be speeded up by implementing parallelized algorithms on a Graphical Processing Unit (GPU). Random Forest is a robust classification algorithm that can be employed in the virtual screening. A ligand based virtual screening tool (GPURFSCREEN) that uses random forests on GPU systems has been proposed and evaluated in this paper. This tool produces optimized results at a lower execution time for large bioassay data sets. The quality of results produced by our tool on GPU is same as that on a regular serial environment. Considering the magnitude of data to be screened, the parallelized virtual screening has a significantly lower running time at high throughput. The proposed parallel tool outperforms its serial counterpart by successfully screening billions of molecules in training and prediction phases.
sRNAtoolboxVM: Small RNA Analysis in a Virtual Machine.

PubMed

Gómez-Martín, Cristina; Lebrón, Ricardo; Rueda, Antonio; Oliver, José L; Hackenberg, Michael

2017-01-01

High-throughput sequencing (HTS) data for small RNAs (noncoding RNA molecules that are 20-250 nucleotides in length) can now be routinely generated by minimally equipped wet laboratories; however, the bottleneck in HTS-based research has shifted now to the analysis of such huge amount of data. One of the reasons is that many analysis types require a Linux environment but computers, system administrators, and bioinformaticians suppose additional costs that often cannot be afforded by small to mid-sized groups or laboratories. Web servers are an alternative that can be used if the data is not subjected to privacy issues (what very often is an important issue with medical data). However, in any case they are less flexible than stand-alone programs limiting the number of workflows and analysis types that can be carried out.We show in this protocol how virtual machines can be used to overcome those problems and limitations. sRNAtoolboxVM is a virtual machine that can be executed on all common operating systems through virtualization programs like VirtualBox or VMware, providing the user with a high number of preinstalled programs like sRNAbench for small RNA analysis without the need to maintain additional servers and/or operating systems.
RTNN: The New Parallel Machine in Zaragoza

NASA Astrophysics Data System (ADS)

Sijs, A. J. V. D.

I report on the development of RTNN, a parallel computer designed as a 4^4 hypercube of 256 T9000 transputer nodes, each with 8 MB memory. The peak performance of the machine is expected to be 2.5 Gflops.
Six Years of Parallel Computing at NAS (1987 - 1993): What Have we Learned?

NASA Technical Reports Server (NTRS)

Simon, Horst D.; Cooper, D. M. (Technical Monitor)

1994-01-01

In the fall of 1987 the age of parallelism at NAS began with the installation of a 32K processor CM-2 from Thinking Machines. In 1987 this was described as an "experiment" in parallel processing. In the six years since, NAS acquired a series of parallel machines, and conducted an active research and development effort focused on the use of highly parallel machines for applications in the computational aerosciences. In this time period parallel processing for scientific applications evolved from a fringe research topic into the one of main activities at NAS. In this presentation I will review the history of parallel computing at NAS in the context of the major progress, which has been made in the field in general. I will attempt to summarize the lessons we have learned so far, and the contributions NAS has made to the state of the art. Based on these insights I will comment on the current state of parallel computing (including the HPCC effort) and try to predict some trends for the next six years.
An Analysis of Hardware-Assisted Virtual Machine Based Rootkits

DTIC Science & Technology

2014-06-01

certain aspects of TPM implementation just to name a few. HyperWall is an architecture proposed by Szefer and Lee to protect guest VMs from...DISTRIBUTION CODE 13. ABSTRACT (maximum 200 words) The use of virtual machine (VM) technology has expanded rapidly since AMD and Intel implemented ...Intel VT-x implementations of Blue Pill to identify commonalities in the respective versions’ attack methodologies from both a functional and technical
Active tactile exploration using a brain-machine-brain interface.

PubMed

O'Doherty, Joseph E; Lebedev, Mikhail A; Ifft, Peter J; Zhuang, Katie Z; Shokur, Solaiman; Bleuler, Hannes; Nicolelis, Miguel A L

2011-10-05

Brain-machine interfaces use neuronal activity recorded from the brain to establish direct communication with external actuators, such as prosthetic arms. It is hoped that brain-machine interfaces can be used to restore the normal sensorimotor functions of the limbs, but so far they have lacked tactile sensation. Here we report the operation of a brain-machine-brain interface (BMBI) that both controls the exploratory reaching movements of an actuator and allows signalling of artificial tactile feedback through intracortical microstimulation (ICMS) of the primary somatosensory cortex. Monkeys performed an active exploration task in which an actuator (a computer cursor or a virtual-reality arm) was moved using a BMBI that derived motor commands from neuronal ensemble activity recorded in the primary motor cortex. ICMS feedback occurred whenever the actuator touched virtual objects. Temporal patterns of ICMS encoded the artificial tactile properties of each object. Neuronal recordings and ICMS epochs were temporally multiplexed to avoid interference. Two monkeys operated this BMBI to search for and distinguish one of three visually identical objects, using the virtual-reality arm to identify the unique artificial texture associated with each. These results suggest that clinical motor neuroprostheses might benefit from the addition of ICMS feedback to generate artificial somatic perceptions associated with mechanical, robotic or even virtual prostheses.
NGScloud: RNA-seq analysis of non-model species using cloud computing.

PubMed

Mora-Márquez, Fernando; Vázquez-Poletti, José Luis; López de Heredia, Unai

2018-05-03

RNA-seq analysis usually requires large computing infrastructures. NGScloud is a bioinformatic system developed to analyze RNA-seq data using the cloud computing services of Amazon that permit the access to ad hoc computing infrastructure scaled according to the complexity of the experiment, so its costs and times can be optimized. The application provides a user-friendly front-end to operate Amazon's hardware resources, and to control a workflow of RNA-seq analysis oriented to non-model species, incorporating the cluster concept, which allows parallel runs of common RNA-seq analysis programs in several virtual machines for faster analysis. NGScloud is freely available at https://github.com/GGFHF/NGScloud/. A manual detailing installation and how-to-use instructions is available with the distribution. unai.lopezdeheredia@upm.es.
The performance of low-cost commercial cloud computing as an alternative in computational chemistry.

PubMed

Thackston, Russell; Fortenberry, Ryan C

2015-05-05

The growth of commercial cloud computing (CCC) as a viable means of computational infrastructure is largely unexplored for the purposes of quantum chemistry. In this work, the PSI4 suite of computational chemistry programs is installed on five different types of Amazon World Services CCC platforms. The performance for a set of electronically excited state single-point energies is compared between these CCC platforms and typical, "in-house" physical machines. Further considerations are made for the number of cores or virtual CPUs (vCPUs, for the CCC platforms), but no considerations are made for full parallelization of the program (even though parallelization of the BLAS library is implemented), complete high-performance computing cluster utilization, or steal time. Even with this most pessimistic view of the computations, CCC resources are shown to be more cost effective for significant numbers of typical quantum chemistry computations. Large numbers of large computations are still best utilized by more traditional means, but smaller-scale research may be more effectively undertaken through CCC services. © 2015 Wiley Periodicals, Inc.
Demonstration of a full volume 3D pre-stack depth migration in the Garden Banks area using massively parallel processor (MPP) technology

DOE Office of Scientific and Technical Information (OSTI.GOV)

Solano, M.; Chang, H.; VanDyke, J.

1996-12-31

This paper describes the implementation and results of portable, production-scale 3D Pre-stack Kirchhoff depth migration software. Full volume pre-stack imaging was applied to a six million trace (46.9 Gigabyte) data set from a subsalt play in the Garden Banks area in the Gulf of Mexico. The velocity model building and updating, were derived using image depth gathers and an image-driven strategy. After three velocity iterations, depth migrated sections revealed drilling targets that were not visible in the conventional 3D post-stack time migrated data set. As expected from the implementation of the migration algorithm, it was found that amplitudes are wellmore » preserved and anomalies associated with known reservoirs, conform to petrophysical predictions. Image gathers for velocity analysis and the final depth migrated volume, were generated on an 1824 node Intel Paragon at Sandia National Laboratories. The code has been successfully ported to a CRAY (T3D) and Unix workstation Parallel Virtual Machine environments (PVM).« less
PVM Wrapper

NASA Technical Reports Server (NTRS)

Katz, Daniel

2004-01-01

PVM Wrapper is a software library that makes it possible for code that utilizes the Parallel Virtual Machine (PVM) software library to run using the message-passing interface (MPI) software library, without needing to rewrite the entire code. PVM and MPI are the two most common software libraries used for applications that involve passing of messages among parallel computers. Since about 1996, MPI has been the de facto standard. Codes written when PVM was popular often feature patterns of {"initsend," "pack," "send"} and {"receive," "unpack"} calls. In many cases, these calls are not contiguous and one set of calls may even exist over multiple subroutines. These characteristics make it difficult to obtain equivalent functionality via a single MPI "send" call. Because PVM Wrapper is written to run with MPI- 1.2, some PVM functions are not permitted and must be replaced - a task that requires some programming expertise. The "pvm_spawn" and "pvm_parent" function calls are not replaced, but a programmer can use "mpirun" and knowledge of the ranks of parent and child tasks with supplied macroinstructions to enable execution of codes that use "pvm_spawn" and "pvm_parent."
Will Anything Useful Come Out of Virtual Reality? Examination of a Naval Application

DTIC Science & Technology

1993-05-01

The term virtual reality can encompass varying meanings, but some generally accepted attributes of a virtual environment are that it is immersive...technology, but at present there are few practical applications which are utilizing the broad range of virtual reality technology. This paper will discuss an...Operability, operator functions, Virtual reality , Man-machine interface, Decision aids/decision making, Decision support. ASW.
Portable programming on parallel/networked computers using the Application Portable Parallel Library (APPL)

NASA Technical Reports Server (NTRS)

Quealy, Angela; Cole, Gary L.; Blech, Richard A.

1993-01-01

The Application Portable Parallel Library (APPL) is a subroutine-based library of communication primitives that is callable from applications written in FORTRAN or C. APPL provides a consistent programmer interface to a variety of distributed and shared-memory multiprocessor MIMD machines. The objective of APPL is to minimize the effort required to move parallel applications from one machine to another, or to a network of homogeneous machines. APPL encompasses many of the message-passing primitives that are currently available on commercial multiprocessor systems. This paper describes APPL (version 2.3.1) and its usage, reports the status of the APPL project, and indicates possible directions for the future. Several applications using APPL are discussed, as well as performance and overhead results.
Migrating EO/IR sensors to cloud-based infrastructure as service architectures

NASA Astrophysics Data System (ADS)

Berglie, Stephen T.; Webster, Steven; May, Christopher M.

2014-06-01

The Night Vision Image Generator (NVIG), a product of US Army RDECOM CERDEC NVESD, is a visualization tool used widely throughout Army simulation environments to provide fully attributed synthesized, full motion video using physics-based sensor and environmental effects. The NVIG relies heavily on contemporary hardware-based acceleration and GPU processing techniques, which push the envelope of both enterprise and commodity-level hypervisor support for providing virtual machines with direct access to hardware resources. The NVIG has successfully been integrated into fully virtual environments where system architectures leverage cloudbased technologies to various extents in order to streamline infrastructure and service management. This paper details the challenges presented to engineers seeking to migrate GPU-bound processes, such as the NVIG, to virtual machines and, ultimately, Cloud-Based IAS architectures. In addition, it presents the path that led to success for the NVIG. A brief overview of Cloud-Based infrastructure management tool sets is provided, and several virtual desktop solutions are outlined. A discrimination is made between general purpose virtual desktop technologies compared to technologies that expose GPU-specific capabilities, including direct rendering and hard ware-based video encoding. Candidate hypervisor/virtual machine configurations that nominally satisfy the virtualized hardware-level GPU requirements of the NVIG are presented , and each is subsequently reviewed in light of its implications on higher-level Cloud management techniques. Implementation details are included from the hardware level, through the operating system, to the 3D graphics APls required by the NVIG and similar GPU-bound tools.
A Virtual Sensor for Online Fault Detection of Multitooth-Tools

PubMed Central

Bustillo, Andres; Correa, Maritza; Reñones, Anibal

2011-01-01

The installation of suitable sensors close to the tool tip on milling centres is not possible in industrial environments. It is therefore necessary to design virtual sensors for these machines to perform online fault detection in many industrial tasks. This paper presents a virtual sensor for online fault detection of multitooth tools based on a Bayesian classifier. The device that performs this task applies mathematical models that function in conjunction with physical sensors. Only two experimental variables are collected from the milling centre that performs the machining operations: the electrical power consumption of the feed drive and the time required for machining each workpiece. The task of achieving reliable signals from a milling process is especially complex when multitooth tools are used, because each kind of cutting insert in the milling centre only works on each workpiece during a certain time window. Great effort has gone into designing a robust virtual sensor that can avoid re-calibration due to, e.g., maintenance operations. The virtual sensor developed as a result of this research is successfully validated under real conditions on a milling centre used for the mass production of automobile engine crankshafts. Recognition accuracy, calculated with a k-fold cross validation, had on average 0.957 of true positives and 0.986 of true negatives. Moreover, measured accuracy was 98%, which suggests that the virtual sensor correctly identifies new cases. PMID:22163766
Modeling and Analysis Compute Environments, Utilizing Virtualization Technology in the Climate and Earth Systems Science domain

NASA Astrophysics Data System (ADS)

Michaelis, A.; Nemani, R. R.; Wang, W.; Votava, P.; Hashimoto, H.

2010-12-01

Given the increasing complexity of climate modeling and analysis tools, it is often difficult and expensive to build or recreate an exact replica of the software compute environment used in past experiments. With the recent development of new technologies for hardware virtualization, an opportunity exists to create full modeling, analysis and compute environments that are “archiveable”, transferable and may be easily shared amongst a scientific community or presented to a bureaucratic body if the need arises. By encapsulating and entire modeling and analysis environment in a virtual machine image, others may quickly gain access to the fully built system used in past experiments, potentially easing the task and reducing the costs of reproducing and verify past results produced by other researchers. Moreover, these virtual machine images may be used as a pedagogical tool for others that are interested in performing an academic exercise but don't yet possess the broad expertise required. We built two virtual machine images, one with the Community Earth System Model (CESM) and one with Weather Research Forecast Model (WRF), then ran several small experiments to assess the feasibility, performance overheads costs, reusability, and transferability. We present a list of the pros and cons as well as lessoned learned from utilizing virtualization technology in the climate and earth systems modeling domain.
A virtual sensor for online fault detection of multitooth-tools.

PubMed

Bustillo, Andres; Correa, Maritza; Reñones, Anibal

2011-01-01

The installation of suitable sensors close to the tool tip on milling centres is not possible in industrial environments. It is therefore necessary to design virtual sensors for these machines to perform online fault detection in many industrial tasks. This paper presents a virtual sensor for online fault detection of multitooth tools based on a bayesian classifier. The device that performs this task applies mathematical models that function in conjunction with physical sensors. Only two experimental variables are collected from the milling centre that performs the machining operations: the electrical power consumption of the feed drive and the time required for machining each workpiece. The task of achieving reliable signals from a milling process is especially complex when multitooth tools are used, because each kind of cutting insert in the milling centre only works on each workpiece during a certain time window. Great effort has gone into designing a robust virtual sensor that can avoid re-calibration due to, e.g., maintenance operations. The virtual sensor developed as a result of this research is successfully validated under real conditions on a milling centre used for the mass production of automobile engine crankshafts. Recognition accuracy, calculated with a k-fold cross validation, had on average 0.957 of true positives and 0.986 of true negatives. Moreover, measured accuracy was 98%, which suggests that the virtual sensor correctly identifies new cases.
A Comprehensive Availability Modeling and Analysis of a Virtualized Servers System Using Stochastic Reward Nets

PubMed Central

Kim, Dong Seong; Park, Jong Sou

2014-01-01

It is important to assess availability of virtualized systems in IT business infrastructures. Previous work on availability modeling and analysis of the virtualized systems used a simplified configuration and assumption in which only one virtual machine (VM) runs on a virtual machine monitor (VMM) hosted on a physical server. In this paper, we show a comprehensive availability model using stochastic reward nets (SRN). The model takes into account (i) the detailed failures and recovery behaviors of multiple VMs, (ii) various other failure modes and corresponding recovery behaviors (e.g., hardware faults, failure and recovery due to Mandelbugs and aging-related bugs), and (iii) dependency between different subcomponents (e.g., between physical host failure and VMM, etc.) in a virtualized servers system. We also show numerical analysis on steady state availability, downtime in hours per year, transaction loss, and sensitivity analysis. This model provides a new finding on how to increase system availability by combining both software rejuvenations at VM and VMM in a wise manner. PMID:25165732

Cloud-based opportunities in scientific computing: insights from processing Suomi National Polar-Orbiting Partnership (S-NPP) Direct Broadcast data

NASA Astrophysics Data System (ADS)

Evans, J. D.; Hao, W.; Chettri, S.

2013-12-01

The cloud is proving to be a uniquely promising platform for scientific computing. Our experience with processing satellite data using Amazon Web Services highlights several opportunities for enhanced performance, flexibility, and cost effectiveness in the cloud relative to traditional computing -- for example: - Direct readout from a polar-orbiting satellite such as the Suomi National Polar-Orbiting Partnership (S-NPP) requires bursts of processing a few times a day, separated by quiet periods when the satellite is out of receiving range. In the cloud, by starting and stopping virtual machines in minutes, we can marshal significant computing resources quickly when needed, but not pay for them when not needed. To take advantage of this capability, we are automating a data-driven approach to the management of cloud computing resources, in which new data availability triggers the creation of new virtual machines (of variable size and processing power) which last only until the processing workflow is complete. - 'Spot instances' are virtual machines that run as long as one's asking price is higher than the provider's variable spot price. Spot instances can greatly reduce the cost of computing -- for software systems that are engineered to withstand unpredictable interruptions in service (as occurs when a spot price exceeds the asking price). We are implementing an approach to workflow management that allows data processing workflows to resume with minimal delays after temporary spot price spikes. This will allow systems to take full advantage of variably-priced 'utility computing.' - Thanks to virtual machine images, we can easily launch multiple, identical machines differentiated only by 'user data' containing individualized instructions (e.g., to fetch particular datasets or to perform certain workflows or algorithms) This is particularly useful when (as is the case with S-NPP data) we need to launch many very similar machines to process an unpredictable number of data files concurrently. Our experience shows the viability and flexibility of this approach to workflow management for scientific data processing. - Finally, cloud computing is a promising platform for distributed volunteer ('interstitial') computing, via mechanisms such as the Berkeley Open Infrastructure for Network Computing (BOINC) popularized with the SETI@Home project and others such as ClimatePrediction.net and NASA's Climate@Home. Interstitial computing faces significant challenges as commodity computing shifts from (always on) desktop computers towards smartphones and tablets (untethered and running on scarce battery power); but cloud computing offers significant slack capacity. This capacity includes virtual machines with unused RAM or underused CPUs; virtual storage volumes allocated (& paid for) but not full; and virtual machines that are paid up for the current hour but whose work is complete. We are devising ways to facilitate the reuse of these resources (i.e., cloud-based interstitial computing) for satellite data processing and related analyses. We will present our findings and research directions on these and related topics.
A discrete Fourier transform for virtual memory machines

NASA Technical Reports Server (NTRS)

Galant, David C.

1992-01-01

An algebraic theory of the Discrete Fourier Transform is developed in great detail. Examination of the details of the theory leads to a computationally efficient fast Fourier transform for the use on computers with virtual memory. Such an algorithm is of great use on modern desktop machines. A FORTRAN coded version of the algorithm is given for the case when the sequence of numbers to be transformed is a power of two.
Robust Airborne Networking Extensions (RANGE)

DTIC Science & Technology

2008-02-01

IMUNES [13] project, which provides an entire network stack virtualization and topology control inside a single FreeBSD machine . The emulated topology...Multicast versus broadcast in a manet.” in ADHOC-NOW, 2004, pp. 14–27. [9] J. Mukherjee, R. Atwood , “ Rendezvous point relocation in protocol independent...computer with an Ethernet connection, or a Linux virtual machine on some other (e.g., Windows) operating system, should work. 2.1 Patching the source code
Lifelong personal health data and application software via virtual machines in the cloud.

PubMed

Van Gorp, Pieter; Comuzzi, Marco

2014-01-01

Personal Health Records (PHRs) should remain the lifelong property of patients, who should be able to show them conveniently and securely to selected caregivers and institutions. In this paper, we present MyPHRMachines, a cloud-based PHR system taking a radically new architectural solution to health record portability. In MyPHRMachines, health-related data and the application software to view and/or analyze it are separately deployed in the PHR system. After uploading their medical data to MyPHRMachines, patients can access them again from remote virtual machines that contain the right software to visualize and analyze them without any need for conversion. Patients can share their remote virtual machine session with selected caregivers, who will need only a Web browser to access the pre-loaded fragments of their lifelong PHR. We discuss a prototype of MyPHRMachines applied to two use cases, i.e., radiology image sharing and personalized medicine.
Concurrent computation of attribute filters on shared memory parallel machines.

PubMed

Wilkinson, Michael H F; Gao, Hui; Hesselink, Wim H; Jonker, Jan-Eppo; Meijster, Arnold

2008-10-01

Morphological attribute filters have not previously been parallelized, mainly because they are both global and non-separable. We propose a parallel algorithm that achieves efficient parallelism for a large class of attribute filters, including attribute openings, closings, thinnings and thickenings, based on Salembier's Max-Trees and Min-trees. The image or volume is first partitioned in multiple slices. We then compute the Max-trees of each slice using any sequential Max-Tree algorithm. Subsequently, the Max-trees of the slices can be merged to obtain the Max-tree of the image. A C-implementation yielded good speed-ups on both a 16-processor MIPS 14000 parallel machine, and a dual-core Opteron-based machine. It is shown that the speed-up of the parallel algorithm is a direct measure of the gain with respect to the sequential algorithm used. Furthermore, the concurrent algorithm shows a speed gain of up to 72 percent on a single-core processor, due to reduced cache thrashing.
A real-time MPEG software decoder using a portable message-passing library

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kwong, Man Kam; Tang, P.T. Peter; Lin, Biquan

1995-12-31

We present a real-time MPEG software decoder that uses message-passing libraries such as MPL, p4 and MPI. The parallel MPEG decoder currently runs on the IBM SP system but can be easil ported to other parallel machines. This paper discusses our parallel MPEG decoding algorithm as well as the parallel programming environment under which it uses. Several technical issues are discussed, including balancing of decoding speed, memory limitation, 1/0 capacities, and optimization of MPEG decoding components. This project shows that a real-time portable software MPEG decoder is feasible in a general-purpose parallel machine.
Solution of a tridiagonal system of equations on the finite element machine

NASA Technical Reports Server (NTRS)

Bostic, S. W.

1984-01-01

Two parallel algorithms for the solution of tridiagonal systems of equations were implemented on the Finite Element Machine. The Accelerated Parallel Gauss method, an iterative method, and the Buneman algorithm, a direct method, are discussed and execution statistics are presented.
Scheduling Jobs with Variable Job Processing Times on Unrelated Parallel Machines

PubMed Central

Zhang, Guang-Qian; Wang, Jian-Jun; Liu, Ya-Jing

2014-01-01

m unrelated parallel machines scheduling problems with variable job processing times are considered, where the processing time of a job is a function of its position in a sequence, its starting time, and its resource allocation. The objective is to determine the optimal resource allocation and the optimal schedule to minimize a total cost function that dependents on the total completion (waiting) time, the total machine load, the total absolute differences in completion (waiting) times on all machines, and total resource cost. If the number of machines is a given constant number, we propose a polynomial time algorithm to solve the problem. PMID:24982933
Integrating multiple scientific computing needs via a Private Cloud infrastructure

NASA Astrophysics Data System (ADS)

Bagnasco, S.; Berzano, D.; Brunetti, R.; Lusso, S.; Vallero, S.

2014-06-01

In a typical scientific computing centre, diverse applications coexist and share a single physical infrastructure. An underlying Private Cloud facility eases the management and maintenance of heterogeneous use cases such as multipurpose or application-specific batch farms, Grid sites catering to different communities, parallel interactive data analysis facilities and others. It allows to dynamically and efficiently allocate resources to any application and to tailor the virtual machines according to the applications' requirements. Furthermore, the maintenance of large deployments of complex and rapidly evolving middleware and application software is eased by the use of virtual images and contextualization techniques; for example, rolling updates can be performed easily and minimizing the downtime. In this contribution we describe the Private Cloud infrastructure at the INFN-Torino Computer Centre, that hosts a full-fledged WLCG Tier-2 site and a dynamically expandable PROOF-based Interactive Analysis Facility for the ALICE experiment at the CERN LHC and several smaller scientific computing applications. The Private Cloud building blocks include the OpenNebula software stack, the GlusterFS filesystem (used in two different configurations for worker- and service-class hypervisors) and the OpenWRT Linux distribution (used for network virtualization). A future integration into a federated higher-level infrastructure is made possible by exposing commonly used APIs like EC2 and by using mainstream contextualization tools like CloudInit.
Implementation of a parallel unstructured Euler solver on the CM-5

NASA Technical Reports Server (NTRS)

Morano, Eric; Mavriplis, D. J.

1995-01-01

An efficient unstructured 3D Euler solver is parallelized on a Thinking Machine Corporation Connection Machine 5, distributed memory computer with vectoring capability. In this paper, the single instruction multiple data (SIMD) strategy is employed through the use of the CM Fortran language and the CMSSL scientific library. The performance of the CMSSL mesh partitioner is evaluated and the overall efficiency of the parallel flow solver is discussed.
Performance of a plasma fluid code on the Intel parallel computers

NASA Technical Reports Server (NTRS)

Lynch, V. E.; Carreras, B. A.; Drake, J. B.; Leboeuf, J. N.; Liewer, P.

1992-01-01

One approach to improving the real-time efficiency of plasma turbulence calculations is to use a parallel algorithm. A parallel algorithm for plasma turbulence calculations was tested on the Intel iPSC/860 hypercube and the Touchtone Delta machine. Using the 128 processors of the Intel iPSC/860 hypercube, a factor of 5 improvement over a single-processor CRAY-2 is obtained. For the Touchtone Delta machine, the corresponding improvement factor is 16. For plasma edge turbulence calculations, an extrapolation of the present results to the Intel (sigma) machine gives an improvement factor close to 64 over the single-processor CRAY-2.
On the suitability of the connection machine for direct particle simulation

NASA Technical Reports Server (NTRS)

Dagum, Leonard

1990-01-01

The algorithmic structure was examined of the vectorizable Stanford particle simulation (SPS) method and the structure is reformulated in data parallel form. Some of the SPS algorithms can be directly translated to data parallel, but several of the vectorizable algorithms have no direct data parallel equivalent. This requires the development of new, strictly data parallel algorithms. In particular, a new sorting algorithm is developed to identify collision candidates in the simulation and a master/slave algorithm is developed to minimize communication cost in large table look up. Validation of the method is undertaken through test calculations for thermal relaxation of a gas, shock wave profiles, and shock reflection from a stationary wall. A qualitative measure is provided of the performance of the Connection Machine for direct particle simulation. The massively parallel architecture of the Connection Machine is found quite suitable for this type of calculation. However, there are difficulties in taking full advantage of this architecture because of lack of a broad based tradition of data parallel programming. An important outcome of this work has been new data parallel algorithms specifically of use for direct particle simulation but which also expand the data parallel diction.
Virtualization for Cost-Effective Teaching of Assembly Language Programming

ERIC Educational Resources Information Center

Cadenas, José O.; Sherratt, R. Simon; Howlett, Des; Guy, Chris G.; Lundqvist, Karsten O.

2015-01-01

This paper describes a virtual system that emulates an ARM-based processor machine, created to replace a traditional hardware-based system for teaching assembly language. The virtual system proposed here integrates, in a single environment, all the development tools necessary to deliver introductory or advanced courses on modern assembly language…
Proposal of Modification Strategy of NC Program in the Virtual Manufacturing Environment

NASA Astrophysics Data System (ADS)

Narita, Hirohisa; Chen, Lian-Yi; Fujimoto, Hideo; Shirase, Keiichi; Arai, Eiji

Virtual manufacturing will be a key technology in process planning, because there are no evaluation tools for cutting conditions. Therefore, virtual machining simulator (VMSim), which can predict end milling processes, has been developed. The modification strategy of NC program using VMSim is proposed in this paper.
Active Gaming: Is "Virtual" Reality Right for Your Physical Education Program?

ERIC Educational Resources Information Center

Hansen, Lisa; Sanders, Stephen W.

2012-01-01

Active gaming is growing in popularity and the idea of increasing children's physical activity by using technology is largely accepted by physical educators. Teachers nationwide have been providing active gaming equipment such as virtual bikes, rhythmic dance machines, virtual sporting games, martial arts simulators, balance boards, and other…
Simulation Platform: a cloud-based online simulation environment.

PubMed

Yamazaki, Tadashi; Ikeno, Hidetoshi; Okumura, Yoshihiro; Satoh, Shunji; Kamiyama, Yoshimi; Hirata, Yutaka; Inagaki, Keiichiro; Ishihara, Akito; Kannon, Takayuki; Usui, Shiro

2011-09-01

For multi-scale and multi-modal neural modeling, it is needed to handle multiple neural models described at different levels seamlessly. Database technology will become more important for these studies, specifically for downloading and handling the neural models seamlessly and effortlessly. To date, conventional neuroinformatics databases have solely been designed to archive model files, but the databases should provide a chance for users to validate the models before downloading them. In this paper, we report our on-going project to develop a cloud-based web service for online simulation called "Simulation Platform". Simulation Platform is a cloud of virtual machines running GNU/Linux. On a virtual machine, various software including developer tools such as compilers and libraries, popular neural simulators such as GENESIS, NEURON and NEST, and scientific software such as Gnuplot, R and Octave, are pre-installed. When a user posts a request, a virtual machine is assigned to the user, and the simulation starts on that machine. The user remotely accesses to the machine through a web browser and carries out the simulation, without the need to install any software but a web browser on the user's own computer. Therefore, Simulation Platform is expected to eliminate impediments to handle multiple neural models that require multiple software. Copyright © 2011 Elsevier Ltd. All rights reserved.
Reprint of: Simulation Platform: a cloud-based online simulation environment.

PubMed

Yamazaki, Tadashi; Ikeno, Hidetoshi; Okumura, Yoshihiro; Satoh, Shunji; Kamiyama, Yoshimi; Hirata, Yutaka; Inagaki, Keiichiro; Ishihara, Akito; Kannon, Takayuki; Usui, Shiro

2011-11-01

For multi-scale and multi-modal neural modeling, it is needed to handle multiple neural models described at different levels seamlessly. Database technology will become more important for these studies, specifically for downloading and handling the neural models seamlessly and effortlessly. To date, conventional neuroinformatics databases have solely been designed to archive model files, but the databases should provide a chance for users to validate the models before downloading them. In this paper, we report our on-going project to develop a cloud-based web service for online simulation called "Simulation Platform". Simulation Platform is a cloud of virtual machines running GNU/Linux. On a virtual machine, various software including developer tools such as compilers and libraries, popular neural simulators such as GENESIS, NEURON and NEST, and scientific software such as Gnuplot, R and Octave, are pre-installed. When a user posts a request, a virtual machine is assigned to the user, and the simulation starts on that machine. The user remotely accesses to the machine through a web browser and carries out the simulation, without the need to install any software but a web browser on the user's own computer. Therefore, Simulation Platform is expected to eliminate impediments to handle multiple neural models that require multiple software. Copyright © 2011 Elsevier Ltd. All rights reserved.
Comparative study of state-of-the-art myoelectric controllers for multigrasp prosthetic hands.

PubMed

Segil, Jacob L; Controzzi, Marco; Weir, Richard F ff; Cipriani, Christian

2014-01-01

A myoelectric controller should provide an intuitive and effective human-machine interface that deciphers user intent in real-time and is robust enough to operate in daily life. Many myoelectric control architectures have been developed, including pattern recognition systems, finite state machines, and more recently, postural control schemes. Here, we present a comparative study of two types of finite state machines and a postural control scheme using both virtual and physical assessment procedures with seven nondisabled subjects. The Southampton Hand Assessment Procedure (SHAP) was used in order to compare the effectiveness of the controllers during activities of daily living using a multigrasp artificial hand. Also, a virtual hand posture matching task was used to compare the controllers when reproducing six target postures. The performance when using the postural control scheme was significantly better (p < 0.05) than the finite state machines during the physical assessment when comparing within-subject averages using the SHAP percent difference metric. The virtual assessment results described significantly greater completion rates (97% and 99%) for the finite state machines, but the movement time tended to be faster (2.7 s) for the postural control scheme. Our results substantiate that postural control schemes rival other state-of-the-art myoelectric controllers.
Prospects for Evidence -Based Software Assurance: Models and Analysis

DTIC Science & Technology

2015-09-01

virtual machine is much lighter than the workstation. The virtual machine doesn’t need to run anti- virus , firewalls, intrusion preven- tion systems...34] Maiorca, D., Corona , I., and Giacinto, G. Looking at the bag is not enough to find the bomb: An evasion of structural methods for malicious PDF...CCS ’13, ACM, pp. 119–130. [35] Maiorca, D., Giacinto, G., and Corona , I. A pattern recognition system for malicious PDF files detection. In
Characterization of robotics parallel algorithms and mapping onto a reconfigurable SIMD machine

NASA Technical Reports Server (NTRS)

Lee, C. S. G.; Lin, C. T.

1989-01-01

The kinematics, dynamics, Jacobian, and their corresponding inverse computations are six essential problems in the control of robot manipulators. Efficient parallel algorithms for these computations are discussed and analyzed. Their characteristics are identified and a scheme on the mapping of these algorithms to a reconfigurable parallel architecture is presented. Based on the characteristics including type of parallelism, degree of parallelism, uniformity of the operations, fundamental operations, data dependencies, and communication requirement, it is shown that most of the algorithms for robotic computations possess highly regular properties and some common structures, especially the linear recursive structure. Moreover, they are well-suited to be implemented on a single-instruction-stream multiple-data-stream (SIMD) computer with reconfigurable interconnection network. The model of a reconfigurable dual network SIMD machine with internal direct feedback is introduced. A systematic procedure internal direct feedback is introduced. A systematic procedure to map these computations to the proposed machine is presented. A new scheduling problem for SIMD machines is investigated and a heuristic algorithm, called neighborhood scheduling, that reorders the processing sequence of subtasks to reduce the communication time is described. Mapping results of a benchmark algorithm are illustrated and discussed.

NAS Parallel Benchmark. Results 11-96: Performance Comparison of HPF and MPI Based NAS Parallel Benchmarks. 1.0

NASA Technical Reports Server (NTRS)

Saini, Subash; Bailey, David; Chancellor, Marisa K. (Technical Monitor)

1997-01-01

High Performance Fortran (HPF), the high-level language for parallel Fortran programming, is based on Fortran 90. HALF was defined by an informal standards committee known as the High Performance Fortran Forum (HPFF) in 1993, and modeled on TMC's CM Fortran language. Several HPF features have since been incorporated into the draft ANSI/ISO Fortran 95, the next formal revision of the Fortran standard. HPF allows users to write a single parallel program that can execute on a serial machine, a shared-memory parallel machine, or a distributed-memory parallel machine. HPF eliminates the complex, error-prone task of explicitly specifying how, where, and when to pass messages between processors on distributed-memory machines, or when to synchronize processors on shared-memory machines. HPF is designed in a way that allows the programmer to code an application at a high level, and then selectively optimize portions of the code by dropping into message-passing or calling tuned library routines as 'extrinsics'. Compilers supporting High Performance Fortran features first appeared in late 1994 and early 1995 from Applied Parallel Research (APR) Digital Equipment Corporation, and The Portland Group (PGI). IBM introduced an HPF compiler for the IBM RS/6000 SP/2 in April of 1996. Over the past two years, these implementations have shown steady improvement in terms of both features and performance. The performance of various hardware/ programming model (HPF and MPI (message passing interface)) combinations will be compared, based on latest NAS (NASA Advanced Supercomputing) Parallel Benchmark (NPB) results, thus providing a cross-machine and cross-model comparison. Specifically, HPF based NPB results will be compared with MPI based NPB results to provide perspective on performance currently obtainable using HPF versus MPI or versus hand-tuned implementations such as those supplied by the hardware vendors. In addition we would also present NPB (Version 1.0) performance results for the following systems: DEC Alpha Server 8400 5/440, Fujitsu VPP Series (VX, VPP300, and VPP700), HP/Convex Exemplar SPP2000, IBM RS/6000 SP P2SC node (120 MHz) NEC SX-4/32, SGI/CRAY T3E, SGI Origin2000.
The FORCE - A highly portable parallel programming language

NASA Technical Reports Server (NTRS)

Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

1989-01-01

This paper explains why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.
The FORCE: A highly portable parallel programming language

NASA Technical Reports Server (NTRS)

Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

1989-01-01

Here, it is explained why the FORCE parallel programming language is easily portable among six different shared-memory microprocessors, and how a two-level macro preprocessor makes it possible to hide low level machine dependencies and to build machine-independent high level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared memory multiprocessor executing them.
Identification of Interesting Objects in Large Spectral Surveys Using Highly Parallelized Machine Learning

NASA Astrophysics Data System (ADS)

Škoda, Petr; Palička, Andrej; Koza, Jakub; Shakurova, Ksenia

2017-06-01

The current archives of LAMOST multi-object spectrograph contain millions of fully reduced spectra, from which the automatic pipelines have produced catalogues of many parameters of individual objects, including their approximate spectral classification. This is, however, mostly based on the global shape of the whole spectrum and on integral properties of spectra in given bandpasses, namely presence and equivalent width of prominent spectral lines, while for identification of some interesting object types (e.g. Be stars or quasars) the detailed shape of only a few lines is crucial. Here the machine learning is bringing a new methodology capable of improving the reliability of classification of such objects even in boundary cases. We present results of Spark-based semi-supervised machine learning of LAMOST spectra attempting to automatically identify the single and double-peak emission of Hα line typical for Be and B[e] stars. The labelled sample was obtained from archive of 2m Perek telescope at Ondřejov observatory. A simple physical model of spectrograph resolution was used in domain adaptation to LAMOST training domain. The resulting list of candidates contains dozens of Be stars (some are likely yet unknown), but also a bunch of interesting objects resembling spectra of quasars and even blazars, as well as many instrumental artefacts. The verification of a nature of interesting candidates benefited considerably from cross-matching and visualisation in the Virtual Observatory environment.
Dockres: a computer program that analyzes the output of virtual screening of small molecules

PubMed Central

2010-01-01

Background This paper describes a computer program named Dockres that is designed to analyze and summarize results of virtual screening of small molecules. The program is supplemented with utilities that support the screening process. Foremost among these utilities are scripts that run the virtual screening of a chemical library on a large number of processors in parallel. Methods Dockres and some of its supporting utilities are written Fortran-77; other utilities are written as C-shell scripts. They support the parallel execution of the screening. The current implementation of the program handles virtual screening with Autodock-3 and Autodock-4, but can be extended to work with the output of other programs. Results Analysis of virtual screening by Dockres led to both active and selective lead compounds. Conclusions Analysis of virtual screening was facilitated and enhanced by Dockres in both the authors' laboratories as well as laboratories elsewhere. PMID:20205801
Study on Parallel 2-DOF Rotation Machanism in Radar

NASA Astrophysics Data System (ADS)

Jiang, Ming; Hu, Xuelong; Liu, Lei; Yu, Yunfei

The spherical parallel machine has become the world's academic and industrial focus of the field in recent years due to its simple and economical manufacture as well as its structural compactness especially suitable for areas where space gesture changes. This paper dwells upon its present research and development home and abroad. The newer machine (RGRR-II) can rotate around the axis z within 360° and the axis y1 from -90° to +90°. It has the advantages such as less moving parts (only 3 parts), larger ratio of work space to machine size, zero mechanic coupling, no singularity. Constructing rotation machine with spherical parallel 2-DOF rotation join (RGRR-II) may realize semispherical movement with zero dead point and extent the range. Control card (PA8000NT Series CNC) is installed in the computer. The card can run the corresponding software which realizes radar movement control. The machine meets the need of radars in plane and satellite which require larger detection range, lighter weight and compacter structure.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, Ang; Song, Shuaiwen; Brugel, Eric

To continuously comply with Moore’s Law, modern parallel machines become increasingly complex. Effectively tuning application performance for these machines therefore becomes a daunting task. Moreover, identifying performance bottlenecks at application and architecture level, as well as evaluating various optimization strategies, are becoming extremely difficult when the entanglement of numerous correlated factors is being presented. To tackle these challenges, we present a visual analytical model named “X”. It is intuitive and sufficiently flexible to track all the typical features of a parallel machine.
Final Report: Enabling Exascale Hardware and Software Design through Scalable System Virtualization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bridges, Patrick G.

2015-02-01

In this grant, we enhanced the Palacios virtual machine monitor to increase its scalability and suitability for addressing exascale system software design issues. This included a wide range of research on core Palacios features, large-scale system emulation, fault injection, perfomrance monitoring, and VMM extensibility. This research resulted in large number of high-impact publications in well-known venues, the support of a number of students, and the graduation of two Ph.D. students and one M.S. student. In addition, our enhanced version of the Palacios virtual machine monitor has been adopted as a core element of the Hobbes operating system under active DOE-fundedmore » research and development.« less
Automated Instrumentation, Monitoring and Visualization of PVM Programs Using AIMS

NASA Technical Reports Server (NTRS)

Mehra, Pankaj; VanVoorst, Brian; Yan, Jerry; Tucker, Deanne (Technical Monitor)

1994-01-01

We present views and analysis of the execution of several PVM codes for Computational Fluid Dynamics on a network of Sparcstations, including (a) NAS Parallel benchmarks CG and MG (White, Alund and Sunderam 1993); (b) a multi-partitioning algorithm for NAS Parallel Benchmark SP (Wijngaart 1993); and (c) an overset grid flowsolver (Smith 1993). These views and analysis were obtained using our Automated Instrumentation and Monitoring System (AIMS) version 3.0, a toolkit for debugging the performance of PVM programs. We will describe the architecture, operation and application of AIMS. The AIMS toolkit contains (a) Xinstrument, which can automatically instrument various computational and communication constructs in message-passing parallel programs; (b) Monitor, a library of run-time trace-collection routines; (c) VK (Visual Kernel), an execution-animation tool with source-code clickback; and (d) Tally, a tool for statistical analysis of execution profiles. Currently, Xinstrument can handle C and Fortran77 programs using PVM 3.2.x; Monitor has been implemented and tested on Sun 4 systems running SunOS 4.1.2; and VK uses X11R5 and Motif 1.2. Data and views obtained using AIMS clearly illustrate several characteristic features of executing parallel programs on networked workstations: (a) the impact of long message latencies; (b) the impact of multiprogramming overheads and associated load imbalance; (c) cache and virtual-memory effects; and (4significant skews between workstation clocks. Interestingly, AIMS can compensate for constant skew (zero drift) by calibrating the skew between a parent and its spawned children. In addition, AIMS' skew-compensation algorithm can adjust timestamps in a way that eliminates physically impossible communications (e.g., messages going backwards in time). Our current efforts are directed toward creating new views to explain the observed performance of PVM programs. Some of the features planned for the near future include: (a) ConfigView, showing the physical topology of the virtual machine, inferred using specially formatted IP (Internet Protocol) packets; and (b) LoadView, synchronous animation of PVM-program execution and resource-utilization patterns.
GATECloud.net: a platform for large-scale, open-source text processing on the cloud.

PubMed

Tablan, Valentin; Roberts, Ian; Cunningham, Hamish; Bontcheva, Kalina

2013-01-28

Cloud computing is increasingly being regarded as a key enabler of the 'democratization of science', because on-demand, highly scalable cloud computing facilities enable researchers anywhere to carry out data-intensive experiments. In the context of natural language processing (NLP), algorithms tend to be complex, which makes their parallelization and deployment on cloud platforms a non-trivial task. This study presents a new, unique, cloud-based platform for large-scale NLP research--GATECloud. net. It enables researchers to carry out data-intensive NLP experiments by harnessing the vast, on-demand compute power of the Amazon cloud. Important infrastructural issues are dealt with by the platform, completely transparently for the researcher: load balancing, efficient data upload and storage, deployment on the virtual machines, security and fault tolerance. We also include a cost-benefit analysis and usage evaluation.
A heterogeneous computing environment for simulating astrophysical fluid flows

NASA Technical Reports Server (NTRS)

Cazes, J.

1994-01-01

In the Concurrent Computing Laboratory in the Department of Physics and Astronomy at Louisiana State University we have constructed a heterogeneous computing environment that permits us to routinely simulate complicated three-dimensional fluid flows and to readily visualize the results of each simulation via three-dimensional animation sequences. An 8192-node MasPar MP-1 computer with 0.5 GBytes of RAM provides 250 MFlops of execution speed for our fluid flow simulations. Utilizing the parallel virtual machine (PVM) language, at periodic intervals data is automatically transferred from the MP-1 to a cluster of workstations where individual three-dimensional images are rendered for inclusion in a single animation sequence. Work is underway to replace executions on the MP-1 with simulations performed on the 512-node CM-5 at NCSA and to simultaneously gain access to more potent volume rendering workstations.
minimega v. 3.0

DOE Office of Scientific and Technical Information (OSTI.GOV)

Crussell, Jonathan; Erickson, Jeremy; Fritz, David

minimega is an emulytics platform for creating testbeds of networked devices. The platoform consists of easily deployable tools to facilitate bringing up large networks of virtual machines including Windows, Linux, and Android. minimega allows experiments to be brought up quickly with almost no configuration. minimega also includes tools for simple cluster, management, as well as tools for creating Linux-based virtual machines. This release of minimega includes new emulated sensors for Android devices to improve the fidelity of testbeds that include mobile devices. Emulated sensors include GPS and
Phenomenology tools on cloud infrastructures using OpenStack

NASA Astrophysics Data System (ADS)

Campos, I.; Fernández-del-Castillo, E.; Heinemeyer, S.; Lopez-Garcia, A.; Pahlen, F.; Borges, G.

2013-04-01

We present a new environment for computations in particle physics phenomenology employing recent developments in cloud computing. On this environment users can create and manage "virtual" machines on which the phenomenology codes/tools can be deployed easily in an automated way. We analyze the performance of this environment based on "virtual" machines versus the utilization of physical hardware. In this way we provide a qualitative result for the influence of the host operating system on the performance of a representative set of applications for phenomenology calculations.
An integrated runtime and compile-time approach for parallelizing structured and block structured applications

NASA Technical Reports Server (NTRS)

Agrawal, Gagan; Sussman, Alan; Saltz, Joel

1993-01-01

Scientific and engineering applications often involve structured meshes. These meshes may be nested (for multigrid codes) and/or irregularly coupled (called multiblock or irregularly coupled regular mesh problems). A combined runtime and compile-time approach for parallelizing these applications on distributed memory parallel machines in an efficient and machine-independent fashion was described. A runtime library which can be used to port these applications on distributed memory machines was designed and implemented. The library is currently implemented on several different systems. To further ease the task of application programmers, methods were developed for integrating this runtime library with compilers for HPK-like parallel programming languages. How this runtime library was integrated with the Fortran 90D compiler being developed at Syracuse University is discussed. Experimental results to demonstrate the efficacy of our approach are presented. A multiblock Navier-Stokes solver template and a multigrid code were experimented with. Our experimental results show that our primitives have low runtime communication overheads. Further, the compiler parallelized codes perform within 20 percent of the code parallelized by manually inserting calls to the runtime library.
Enhanced networked server management with random remote backups

NASA Astrophysics Data System (ADS)

Kim, Song-Kyoo

2003-08-01

In this paper, the model is focused on available server management in network environments. The (remote) backup servers are hooked up by VPN (Virtual Private Network) and replace broken main severs immediately. A virtual private network (VPN) is a way to use a public network infrastructure and hooks up long-distance servers within a single network infrastructure. The servers can be represent as "machines" and then the system deals with main unreliable and random auxiliary spare (remote backup) machines. When the system performs a mandatory routine maintenance, auxiliary machines are being used for backups during idle periods. Unlike other existing models, the availability of auxiliary machines is changed for each activation in this enhanced model. Analytically tractable results are obtained by using several mathematical techniques and the results are demonstrated in the framework of optimized networked server allocation problems.
Establishing a group of endpoints in a parallel computer

DOEpatents

Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.; Xue, Hanhong

2016-02-02

A parallel computer executes a number of tasks, each task includes a number of endpoints and the endpoints are configured to support collective operations. In such a parallel computer, establishing a group of endpoints receiving a user specification of a set of endpoints included in a global collection of endpoints, where the user specification defines the set in accordance with a predefined virtual representation of the endpoints, the predefined virtual representation is a data structure setting forth an organization of tasks and endpoints included in the global collection of endpoints and the user specification defines the set of endpoints without a user specification of a particular endpoint; and defining a group of endpoints in dependence upon the predefined virtual representation of the endpoints and the user specification.
A Cooperative Approach to Virtual Machine Based Fault Injection

DOE Office of Scientific and Technical Information (OSTI.GOV)

Naughton III, Thomas J; Engelmann, Christian; Vallee, Geoffroy R

Resilience investigations often employ fault injection (FI) tools to study the effects of simulated errors on a target system. It is important to keep the target system under test (SUT) isolated from the controlling environment in order to maintain control of the experiement. Virtual machines (VMs) have been used to aid these investigations due to the strong isolation properties of system-level virtualization. A key challenge in fault injection tools is to gain proper insight and context about the SUT. In VM-based FI tools, this challenge of target con- text is increased due to the separation between host and guest (VM).more » We discuss an approach to VM-based FI that leverages virtual machine introspection (VMI) methods to gain insight into the target s context running within the VM. The key to this environment is the ability to provide basic information to the FI system that can be used to create a map of the target environment. We describe a proof- of-concept implementation and a demonstration of its use to introduce simulated soft errors into an iterative solver benchmark running in user-space of a guest VM.« less
Multiplexing Low and High QoS Workloads in Virtual Environments

NASA Astrophysics Data System (ADS)

Verboven, Sam; Vanmechelen, Kurt; Broeckhove, Jan

Virtualization technology has introduced new ways for managing IT infrastructure. The flexible deployment of applications through self-contained virtual machine images has removed the barriers for multiplexing, suspending and migrating applications with their entire execution environment, allowing for a more efficient use of the infrastructure. These developments have given rise to an important challenge regarding the optimal scheduling of virtual machine workloads. In this paper, we specifically address the VM scheduling problem in which workloads that require guaranteed levels of CPU performance are mixed with workloads that do not require such guarantees. We introduce a framework to analyze this scheduling problem and evaluate to what extent such mixed service delivery is beneficial for a provider of virtualized IT infrastructure. Traditionally providers offer IT resources under a guaranteed and fixed performance profile, which can lead to underutilization. The findings of our simulation study show that through proper tuning of a limited set of parameters, the proposed scheduling algorithm allows for a significant increase in utilization without sacrificing on performance dependability.
Automated Solar Module Assembly Line

NASA Technical Reports Server (NTRS)

Bycer, M.

1979-01-01

The gathering of information that led to the design approach of the machine, and a summary of the findings in the areas of study along with a description of each station of the machine are discussed. The machine is a cell stringing and string applique machine which is flexible in design, capable of handling a variety of cells and assembling strings of cells which can then be placed in a matrix up to 4 ft x 2 ft. in series or parallel arrangement. The target machine cycle is to be 5 seconds per cell. This machine is primarily adapted to 100 MM round cells with one or two tabs between cells. It places finished strings of up to twelve cells in a matrix of up to six such strings arranged in series or in parallel.
Simulation Exploration through Immersive Parallel Planes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brunhart-Lupo, Nicholas J; Bush, Brian W; Gruchalla, Kenny M

We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinate's mapping the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, eachmore » individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest: a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used to both explore existing data as well as to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.« less

Simulation Exploration through Immersive Parallel Planes: Preprint

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brunhart-Lupo, Nicholas; Bush, Brian W.; Gruchalla, Kenny

We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinate's mapping the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, eachmore » individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest: a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used to both explore existing data as well as to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.« less
Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT

DOE Office of Scientific and Technical Information (OSTI.GOV)

Secchi, Simone; Tumeo, Antonino; Villa, Oreste

Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems where a large virtually-shared address space is mapped on a network of physically distributed memories. High memory latency and network contention are two of the main factors that limit performance scaling of such architectures. Modern high-performance computing DSM systems have evolved toward exploitation of massive hardware multi-threading and fine-grained memory hashing to tolerate irregular latencies, avoid network hot-spots and enable high scaling. In order to model the performance of such large-scale machines, parallel simulation has been proved to be a promising approach to achieve good accuracy inmore » reasonable times. One of the most critical factors in solving the simulation speed-accuracy trade-off is network modeling. The Cray XMT is a massively multi-threaded supercomputing architecture that belongs to the DSM class, since it implements a globally-shared address space abstraction on top of a physically distributed memory substrate. In this paper, we discuss the development of a contention-aware network model intended to be integrated in a full-system XMT simulator. We start by measuring the effects of network contention in a 128-processor XMT machine and then investigate the trade-off that exists between simulation accuracy and speed, by comparing three network models which operate at different levels of accuracy. The comparison and model validation is performed by executing a string-matching algorithm on the full-system simulator and on the XMT, using three datasets that generate noticeably different contention patterns.« less
Virtual reality for intelligent and interactive operating, training, and visualization systems

NASA Astrophysics Data System (ADS)

Freund, Eckhard; Rossmann, Juergen; Schluse, Michael

2000-10-01

Virtual Reality Methods allow a new and intuitive way of communication between man and machine. The basic idea of Virtual Reality (VR) is the generation of artificial computer simulated worlds, which the user not only can look at but also can interact with actively using data glove and data helmet. The main emphasis for the use of such techniques at the IRF is the development of a new generation of operator interfaces for the control of robots and other automation components and for intelligent training systems for complex tasks. The basic idea of the methods developed at the IRF for the realization of Projective Virtual Reality is to let the user work in the virtual world as he would act in reality. The user actions are recognized by the Virtual reality System and by means of new and intelligent control software projected onto the automation components like robots which afterwards perform the necessary actions in reality to execute the users task. In this operation mode the user no longer has to be a robot expert to generate tasks for robots or to program them, because intelligent control software recognizes the users intention and generated automatically the commands for nearly every automation component. Now, Virtual Reality Methods are ideally suited for universal man-machine-interfaces for the control and supervision of a big class of automation components, interactive training and visualization systems. The Virtual Reality System of the IRF-COSIMIR/VR- forms the basis for different projects starting with the control of space automation systems in the projects CIROS, VITAL and GETEX, the realization of a comprehensive development tool for the International Space Station and last but not least with the realistic simulation fire extinguishing, forest machines and excavators which will be presented in the final paper in addition to the key ideas of this Virtual Reality System.
Dynamic provisioning of local and remote compute resources with OpenStack

NASA Astrophysics Data System (ADS)

Giffels, M.; Hauth, T.; Polgart, F.; Quast, G.

2015-12-01

Modern high-energy physics experiments rely on the extensive usage of computing resources, both for the reconstruction of measured events as well as for Monte-Carlo simulation. The Institut fur Experimentelle Kernphysik (EKP) at KIT is participating in both the CMS and Belle experiments with computing and storage resources. In the upcoming years, these requirements are expected to increase due to growing amount of recorded data and the rise in complexity of the simulated events. It is therefore essential to increase the available computing capabilities by tapping into all resource pools. At the EKP institute, powerful desktop machines are available to users. Due to the multi-core nature of modern CPUs, vast amounts of CPU time are not utilized by common desktop usage patterns. Other important providers of compute capabilities are classical HPC data centers at universities or national research centers. Due to the shared nature of these installations, the standardized software stack required by HEP applications cannot be installed. A viable way to overcome this constraint and offer a standardized software environment in a transparent manner is the usage of virtualization technologies. The OpenStack project has become a widely adopted solution to virtualize hardware and offer additional services like storage and virtual machine management. This contribution will report on the incorporation of the institute's desktop machines into a private OpenStack Cloud. The additional compute resources provisioned via the virtual machines have been used for Monte-Carlo simulation and data analysis. Furthermore, a concept to integrate shared, remote HPC centers into regular HEP job workflows will be presented. In this approach, local and remote resources are merged to form a uniform, virtual compute cluster with a single point-of-entry for the user. Evaluations of the performance and stability of this setup and operational experiences will be discussed.
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

NASA Technical Reports Server (NTRS)

Sun, Xian-He

1997-01-01

Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2OO, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is 1) developing highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, 3) incorporate newly developed algorithms into actual simulation packages. The work plan has well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) Adopting a mathematical geometry which has a better capacity to describe the fluid, (2) Using compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.
Virtualization for the LHCb Online system

NASA Astrophysics Data System (ADS)

Bonaccorsi, Enrico; Brarda, Loic; Moine, Gary; Neufeld, Niko

2011-12-01

Virtualization has long been advertised by the IT-industry as a way to cut down cost, optimise resource usage and manage the complexity in large data-centers. The great number and the huge heterogeneity of hardware, both industrial and custom-made, has up to now led to reluctance in the adoption of virtualization in the IT infrastructure of large experiment installations. Our experience in the LHCb experiment has shown that virtualization improves the availability and the manageability of the whole system. We have done an evaluation of available hypervisors / virtualization solutions and find that the Microsoft HV technology provides a high level of maturity and flexibility for our purpose. We present the results of these comparison tests, describing in detail, the architecture of our virtualization infrastructure with a special emphasis on the security for services visible to the outside world. Security is achieved by a sophisticated combination of VLANs, firewalls and virtual routing - the cost and benefits of this solution are analysed. We have adapted our cluster management tools, notably Quattor, for the needs of virtual machines and this allows us to migrate smoothly services on physical machines to the virtualized infrastructure. The procedures for migration will also be described. In the final part of the document we describe our recent R&D activities aiming to replacing the SAN-backend for the virtualization by a cheaper iSCSI solution - this will allow to move all servers and related services to the virtualized infrastructure, excepting the ones doing hardware control via non-commodity PCI plugin cards.
Evaluating open-source cloud computing solutions for geosciences

NASA Astrophysics Data System (ADS)

Huang, Qunying; Yang, Chaowei; Liu, Kai; Xia, Jizhe; Xu, Chen; Li, Jing; Gui, Zhipeng; Sun, Min; Li, Zhenglong

2013-09-01

Many organizations start to adopt cloud computing for better utilizing computing resources by taking advantage of its scalability, cost reduction, and easy to access characteristics. Many private or community cloud computing platforms are being built using open-source cloud solutions. However, little has been done to systematically compare and evaluate the features and performance of open-source solutions in supporting Geosciences. This paper provides a comprehensive study of three open-source cloud solutions, including OpenNebula, Eucalyptus, and CloudStack. We compared a variety of features, capabilities, technologies and performances including: (1) general features and supported services for cloud resource creation and management, (2) advanced capabilities for networking and security, and (3) the performance of the cloud solutions in provisioning and operating the cloud resources as well as the performance of virtual machines initiated and managed by the cloud solutions in supporting selected geoscience applications. Our study found that: (1) no significant performance differences in central processing unit (CPU), memory and I/O of virtual machines created and managed by different solutions, (2) OpenNebula has the fastest internal network while both Eucalyptus and CloudStack have better virtual machine isolation and security strategies, (3) Cloudstack has the fastest operations in handling virtual machines, images, snapshots, volumes and networking, followed by OpenNebula, and (4) the selected cloud computing solutions are capable for supporting concurrent intensive web applications, computing intensive applications, and small-scale model simulations without intensive data communication.
A performance study of live VM migration technologies: VMotion vs XenMotion

NASA Astrophysics Data System (ADS)

Feng, Xiujie; Tang, Jianxiong; Luo, Xuan; Jin, Yaohui

2011-12-01

Due to the growing demand of flexible resource management for cloud computing services, researches on live virtual machine migration have attained more and more attention. Live migration of virtual machine across different hosts has been a powerful tool to facilitate system maintenance, load balancing, fault tolerance and so on. In this paper, we use a measurement-based approach to compare the performance of two major live migration technologies under certain network conditions, i.e., VMotion and XenMotion. The results show that VMotion generates much less data transferred than XenMotion when migrating identical VMs. However, in network with moderate packet loss and delay, which are typical in a VPN (virtual private network) scenario used to connect the data centers, XenMotion outperforms VMotion in total migration time. We hope that this study can be helpful in choosing suitable virtualization environments for data center administrators and optimizing existing live migration mechanisms.
Three-Dimensional High-Lift Analysis Using a Parallel Unstructured Multigrid Solver

NASA Technical Reports Server (NTRS)

Mavriplis, Dimitri J.

1998-01-01

A directional implicit unstructured agglomeration multigrid solver is ported to shared and distributed memory massively parallel machines using the explicit domain-decomposition and message-passing approach. Because the algorithm operates on local implicit lines in the unstructured mesh, special care is required in partitioning the problem for parallel computing. A weighted partitioning strategy is described which avoids breaking the implicit lines across processor boundaries, while incurring minimal additional communication overhead. Good scalability is demonstrated on a 128 processor SGI Origin 2000 machine and on a 512 processor CRAY T3E machine for reasonably fine grids. The feasibility of performing large-scale unstructured grid calculations with the parallel multigrid algorithm is demonstrated by computing the flow over a partial-span flap wing high-lift geometry on a highly resolved grid of 13.5 million points in approximately 4 hours of wall clock time on the CRAY T3E.
Clock Agreement Among Parallel Supercomputer Nodes

DOE Data Explorer

Jones, Terry R.; Koenig, Gregory A.

2014-04-30

This dataset presents measurements that quantify the clock synchronization time-agreement characteristics among several high performance computers including the current world's most powerful machine for open science, the U.S. Department of Energy's Titan machine sited at Oak Ridge National Laboratory. These ultra-fast machines derive much of their computational capability from extreme node counts (over 18000 nodes in the case of the Titan machine). Time-agreement is commonly utilized by parallel programming applications and tools, distributed programming application and tools, and system software. Our time-agreement measurements detail the degree of time variance between nodes and how that variance changes over time. The dataset includes empirical measurements and the accompanying spreadsheets.
Charliecloud

DOE Office of Scientific and Technical Information (OSTI.GOV)

Priedhorsky, Reid; Randles, Tim

Charliecloud is a set of scripts to let users run a virtual cluster of virtual machines (VMs) on a desktop or supercomputer. Key functions include: 1. Creating (typically by installing an operating system from vendor media) and updating VM images; 2. Running a single VM; 3. Running multiple VMs in a virtual cluster. The virtual machines can talk to one another over the network and (in some cases) the outside world. This is accomplished by calling external programs such as QEMU and the Virtual Distributed Ethernet (VDE) suite. The goal is to let users have a virtual cluster containing nodesmore » where they have privileged access, while isolating that privilege within the virtual cluster so it cannot affect the physical compute resources. Host configuration enforces security; this is not included in Charliecloud, though security guidelines are included in its documentation and Charliecloud is designed to facilitate such configuration. Charliecloud manages passing information from host computers into and out of the virtual machines, such as parameters of the virtual cluster, input data specified by the user, output data from virtual compute jobs, VM console display, and network connections (e.g., SSH or X11). Parameters for the virtual cluster (number of VMs, RAM and disk per VM, etc.) are specified by the user or gathered from the environment (e.g., SLURM environment variables). Example job scripts are included. These include computation examples (such as a "hello world" MPI job) as well as performance tests. They also include a security test script to verify that the virtual cluster is appropriately sandboxed. Tests include: 1. Pinging hosts inside and outside the virtual cluster to explore connectivity; 2. Port scans (again inside and outside) to see what services are available; 3. Sniffing tests to see what traffic is visible to running VMs; 4. IP address spoofing to test network functionality in this case; 5. File access tests to make sure host access permissions are enforced. This test script is not a comprehensive scanner and does not test for specific vulnerabilities. Importantly, no information about physical hosts or network topology is included in this script (or any of Charliecloud); while part of a sensible test, such information is specified by the user when the test is run. That is, one cannot learn anything about the LANL network or computing infrastructure by examining Charliecloud code.« less
Characterizing parallel file-access patterns on a large-scale multiprocessor

NASA Technical Reports Server (NTRS)

Purakayastha, A.; Ellis, Carla; Kotz, David; Nieuwejaar, Nils; Best, Michael L.

1995-01-01

High-performance parallel file systems are needed to satisfy tremendous I/O requirements of parallel scientific applications. The design of such high-performance parallel file systems depends on a comprehensive understanding of the expected workload, but so far there have been very few usage studies of multiprocessor file systems. This paper is part of the CHARISMA project, which intends to fill this void by measuring real file-system workloads on various production parallel machines. In particular, we present results from the CM-5 at the National Center for Supercomputing Applications. Our results are unique because we collect information about nearly every individual I/O request from the mix of jobs running on the machine. Analysis of the traces leads to various recommendations for parallel file-system design.
Parallel Computing Using Web Servers and "Servlets".

ERIC Educational Resources Information Center

Lo, Alfred; Bloor, Chris; Choi, Y. K.

2000-01-01

Describes parallel computing and presents inexpensive ways to implement a virtual parallel computer with multiple Web servers. Highlights include performance measurement of parallel systems; models for using Java and intranet technology including single server, multiple clients and multiple servers, single client; and a comparison of CGI (common…
Virtual Mission Operations of Remote Sensors With Rapid Access To and From Space

NASA Technical Reports Server (NTRS)

Ivancic, William D.; Stewart, Dave; Walke, Jon; Dikeman, Larry; Sage, Steven; Miller, Eric; Northam, James; Jackson, Chris; Taylor, John; Lynch, Scott;

2010-01-01

This paper describes network-centric operations, where a virtual mission operations center autonomously receives sensor triggers, and schedules space and ground assets using Internet-based technologies and service-oriented architectures. For proof-of-concept purposes, sensor triggers are received from the United States Geological Survey (USGS) to determine targets for space-based sensors. The Surrey Satellite Technology Limited (SSTL) Disaster Monitoring Constellation satellite, the United Kingdom Disaster Monitoring Constellation (UK-DMC), is used as the space-based sensor. The UK-DMC s availability is determined via machine-to-machine communications using SSTL s mission planning system. Access to/from the UK-DMC for tasking and sensor data is via SSTL s and Universal Space Network s (USN) ground assets. The availability and scheduling of USN s assets can also be performed autonomously via machine-to-machine communications. All communication, both on the ground and between ground and space, uses open Internet standards.

Application of high-performance computing to numerical simulation of human movement

NASA Technical Reports Server (NTRS)

Anderson, F. C.; Ziegler, J. M.; Pandy, M. G.; Whalen, R. T.

1995-01-01

We have examined the feasibility of using massively-parallel and vector-processing supercomputers to solve large-scale optimization problems for human movement. Specifically, we compared the computational expense of determining the optimal controls for the single support phase of gait using a conventional serial machine (SGI Iris 4D25), a MIMD parallel machine (Intel iPSC/860), and a parallel-vector-processing machine (Cray Y-MP 8/864). With the human body modeled as a 14 degree-of-freedom linkage actuated by 46 musculotendinous units, computation of the optimal controls for gait could take up to 3 months of CPU time on the Iris. Both the Cray and the Intel are able to reduce this time to practical levels. The optimal solution for gait can be found with about 77 hours of CPU on the Cray and with about 88 hours of CPU on the Intel. Although the overall speeds of the Cray and the Intel were found to be similar, the unique capabilities of each machine are better suited to different portions of the computational algorithm used. The Intel was best suited to computing the derivatives of the performance criterion and the constraints whereas the Cray was best suited to parameter optimization of the controls. These results suggest that the ideal computer architecture for solving very large-scale optimal control problems is a hybrid system in which a vector-processing machine is integrated into the communication network of a MIMD parallel machine.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Barrett, Brian; Brightwell, Ronald B.; Grant, Ryan

This report presents a specification for the Portals 4 networ k programming interface. Portals 4 is intended to allow scalable, high-performance network communication betwee n nodes of a parallel computing system. Portals 4 is well suited to massively parallel processing and embedded syste ms. Portals 4 represents an adaption of the data movement layer developed for massively parallel processing platfor ms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4 is tarmore » geted to the next generation of machines employing advanced network interface architectures that support enh anced offload capabilities.« less
The Portals 4.0 network programming interface.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barrett, Brian W.; Brightwell, Ronald Brian; Pedretti, Kevin

2012-11-01

This report presents a specification for the Portals 4.0 network programming interface. Portals 4.0 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4.0 is well suited to massively parallel processing and embedded systems. Portals 4.0 represents an adaption of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandias Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4.0 is targeted to the next generationmore » of machines employing advanced network interface architectures that support enhanced offload capabilities.« less
Computing with Beowulf

NASA Technical Reports Server (NTRS)

Cohen, Jarrett

1999-01-01

Parallel computers built out of mass-market parts are cost-effectively performing data processing and simulation tasks. The Supercomputing (now known as "SC") series of conferences celebrated its 10th anniversary last November. While vendors have come and gone, the dominant paradigm for tackling big problems still is a shared-resource, commercial supercomputer. Growing numbers of users needing a cheaper or dedicated-access alternative are building their own supercomputers out of mass-market parts. Such machines are generally called Beowulf-class systems after the 11th century epic. This modern-day Beowulf story began in 1994 at NASA's Goddard Space Flight Center. A laboratory for the Earth and space sciences, computing managers there threw down a gauntlet to develop a $50,000 gigaFLOPS workstation for processing satellite data sets. Soon, Thomas Sterling and Don Becker were working on the Beowulf concept at the University Space Research Association (USRA)-run Center of Excellence in Space Data and Information Sciences (CESDIS). Beowulf clusters mix three primary ingredients: commodity personal computers or workstations, low-cost Ethernet networks, and the open-source Linux operating system. One of the larger Beowulfs is Goddard's Highly-parallel Integrated Virtual Environment, or HIVE for short.
A Concept for Optimizing Behavioural Effectiveness & Efficiency

NASA Astrophysics Data System (ADS)

Barca, Jan Carlo; Rumantir, Grace; Li, Raymond

Both humans and machines exhibit strengths and weaknesses that can be enhanced by merging the two entities. This research aims to provide a broader understanding of how closer interactions between these two entities can facilitate more optimal goal-directed performance through the use of artificial extensions of the human body. Such extensions may assist us in adapting to and manipulating our environments in a more effective way than any system known today. To demonstrate this concept, we have developed a simulation where a semi interactive virtual spider can be navigated through an environment consisting of several obstacles and a virtual predator capable of killing the spider. The virtual spider can be navigated through the use of three different control systems that can be used to assist in optimising overall goal directed performance. The first two control systems use, an onscreen button interface and a touch sensor, respectively to facilitate human navigation of the spider. The third control system is an autonomous navigation system through the use of machine intelligence embedded in the spider. This system enables the spider to navigate and react to changes in its local environment. The results of this study indicate that machines should be allowed to override human control in order to maximise the benefits of collaboration between man and machine. This research further indicates that the development of strong machine intelligence, sensor systems that engage all human senses, extra sensory input systems, physical remote manipulators, multiple intelligent extensions of the human body, as well as a tighter symbiosis between man and machine, can support an upgrade of the human form.
The influence of negative training set size on machine learning-based virtual screening.

PubMed

Kurczab, Rafał; Smusz, Sabina; Bojarski, Andrzej J

2014-01-01

The paper presents a thorough analysis of the influence of the number of negative training examples on the performance of machine learning methods. The impact of this rather neglected aspect of machine learning methods application was examined for sets containing a fixed number of positive and a varying number of negative examples randomly selected from the ZINC database. An increase in the ratio of positive to negative training instances was found to greatly influence most of the investigated evaluating parameters of ML methods in simulated virtual screening experiments. In a majority of cases, substantial increases in precision and MCC were observed in conjunction with some decreases in hit recall. The analysis of dynamics of those variations let us recommend an optimal composition of training data. The study was performed on several protein targets, 5 machine learning algorithms (SMO, Naïve Bayes, Ibk, J48 and Random Forest) and 2 types of molecular fingerprints (MACCS and CDK FP). The most effective classification was provided by the combination of CDK FP with SMO or Random Forest algorithms. The Naïve Bayes models appeared to be hardly sensitive to changes in the number of negative instances in the training set. In conclusion, the ratio of positive to negative training instances should be taken into account during the preparation of machine learning experiments, as it might significantly influence the performance of particular classifier. What is more, the optimization of negative training set size can be applied as a boosting-like approach in machine learning-based virtual screening.

The influence of negative training set size on machine learning-based virtual screening

PubMed Central

2014-01-01

Background The paper presents a thorough analysis of the influence of the number of negative training examples on the performance of machine learning methods. Results The impact of this rather neglected aspect of machine learning methods application was examined for sets containing a fixed number of positive and a varying number of negative examples randomly selected from the ZINC database. An increase in the ratio of positive to negative training instances was found to greatly influence most of the investigated evaluating parameters of ML methods in simulated virtual screening experiments. In a majority of cases, substantial increases in precision and MCC were observed in conjunction with some decreases in hit recall. The analysis of dynamics of those variations let us recommend an optimal composition of training data. The study was performed on several protein targets, 5 machine learning algorithms (SMO, Naïve Bayes, Ibk, J48 and Random Forest) and 2 types of molecular fingerprints (MACCS and CDK FP). The most effective classification was provided by the combination of CDK FP with SMO or Random Forest algorithms. The Naïve Bayes models appeared to be hardly sensitive to changes in the number of negative instances in the training set. Conclusions In conclusion, the ratio of positive to negative training instances should be taken into account during the preparation of machine learning experiments, as it might significantly influence the performance of particular classifier. What is more, the optimization of negative training set size can be applied as a boosting-like approach in machine learning-based virtual screening. PMID:24976867
Enterprise Cloud Architecture for Chinese Ministry of Railway

NASA Astrophysics Data System (ADS)

Shan, Xumei; Liu, Hefeng

Enterprise like PRC Ministry of Railways (MOR), is facing various challenges ranging from highly distributed computing environment and low legacy system utilization, Cloud Computing is increasingly regarded as one workable solution to address this. This article describes full scale cloud solution with Intel Tashi as virtual machine infrastructure layer, Hadoop HDFS as computing platform, and self developed SaaS interface, gluing virtual machine and HDFS with Xen hypervisor. As a result, on demand computing task application and deployment have been tackled per MOR real working scenarios at the end of article.
Applications of Parallel Computation in Micro-Mechanics and Finite Element Method

NASA Technical Reports Server (NTRS)

Tan, Hui-Qian

1996-01-01

This project discusses the application of parallel computations related with respect to material analyses. Briefly speaking, we analyze some kind of material by elements computations. We call an element a cell here. A cell is divided into a number of subelements called subcells and all subcells in a cell have the identical structure. The detailed structure will be given later in this paper. It is obvious that the problem is "well-structured". SIMD machine would be a better choice. In this paper we try to look into the potentials of SIMD machine in dealing with finite element computation by developing appropriate algorithms on MasPar, a SIMD parallel machine. In section 2, the architecture of MasPar will be discussed. A brief review of the parallel programming language MPL also is given in that section. In section 3, some general parallel algorithms which might be useful to the project will be proposed. And, combining with the algorithms, some features of MPL will be discussed in more detail. In section 4, the computational structure of cell/subcell model will be given. The idea of designing the parallel algorithm for the model will be demonstrated. Finally in section 5, a summary will be given.
The R package "sperrorest" : Parallelized spatial error estimation and variable importance assessment for geospatial machine learning

NASA Astrophysics Data System (ADS)

Schratz, Patrick; Herrmann, Tobias; Brenning, Alexander

2017-04-01

Computational and statistical prediction methods such as the support vector machine have gained popularity in remote-sensing applications in recent years and are often compared to more traditional approaches like maximum-likelihood classification. However, the accuracy assessment of such predictive models in a spatial context needs to account for the presence of spatial autocorrelation in geospatial data by using spatial cross-validation and bootstrap strategies instead of their now more widely used non-spatial equivalent. The R package sperrorest by A. Brenning [IEEE International Geoscience and Remote Sensing Symposium, 1, 374 (2012)] provides a generic interface for performing (spatial) cross-validation of any statistical or machine-learning technique available in R. Since spatial statistical models as well as flexible machine-learning algorithms can be computationally expensive, parallel computing strategies are required to perform cross-validation efficiently. The most recent major release of sperrorest therefore comes with two new features (aside from improved documentation): The first one is the parallelized version of sperrorest(), parsperrorest(). This function features two parallel modes to greatly speed up cross-validation runs. Both parallel modes are platform independent and provide progress information. par.mode = 1 relies on the pbapply package and calls interactively (depending on the platform) parallel::mclapply() or parallel::parApply() in the background. While forking is used on Unix-Systems, Windows systems use a cluster approach for parallel execution. par.mode = 2 uses the foreach package to perform parallelization. This method uses a different way of cluster parallelization than the parallel package does. In summary, the robustness of parsperrorest() is increased with the implementation of two independent parallel modes. A new way of partitioning the data in sperrorest is provided by partition.factor.cv(). This function gives the user the possibility to perform cross-validation at the level of some grouping structure. As an example, in remote sensing of agricultural land uses, pixels from the same field contain nearly identical information and will thus be jointly placed in either the test set or the training set. Other spatial sampling resampling strategies are already available and can be extended by the user.
System software for the finite element machine

NASA Technical Reports Server (NTRS)

Crockett, T. W.; Knott, J. D.

1985-01-01

The Finite Element Machine is an experimental parallel computer developed at Langley Research Center to investigate the application of concurrent processing to structural engineering analysis. This report describes system-level software which has been developed to facilitate use of the machine by applications researchers. The overall software design is outlined, and several important parallel processing issues are discussed in detail, including processor management, communication, synchronization, and input/output. Based on experience using the system, the hardware architecture and software design are critiqued, and areas for further work are suggested.
Virtual Reality Enhanced Instructional Learning

ERIC Educational Resources Information Center

Nachimuthu, K.; Vijayakumari, G.

2009-01-01

Virtual Reality (VR) is a creation of virtual 3D world in which one can feel and sense the world as if it is real. It is allowing engineers to design machines and Educationists to design AV [audiovisual] equipment in real time but in 3-dimensional hologram as if the actual material is being made and worked upon. VR allows a least-cost (energy…
Solving the Cauchy-Riemann equations on parallel computers

NASA Technical Reports Server (NTRS)

Fatoohi, Raad A.; Grosch, Chester E.

1987-01-01

Discussed is the implementation of a single algorithm on three parallel-vector computers. The algorithm is a relaxation scheme for the solution of the Cauchy-Riemann equations; a set of coupled first order partial differential equations. The computers were chosen so as to encompass a variety of architectures. They are: the MPP, and SIMD machine with 16K bit serial processors; FLEX/32, an MIMD machine with 20 processors; and CRAY/2, an MIMD machine with four vector processors. The machine architectures are briefly described. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Conclusions are presented.
Virtual manufacturing work cell for engineering

NASA Astrophysics Data System (ADS)

Watanabe, Hideo; Ohashi, Kazushi; Takahashi, Nobuyuki; Kato, Kiyotaka; Fujita, Satoru

1997-12-01

The life cycles of products have been getting shorter. To meet this rapid turnover, manufacturing systems must be frequently changed as well. In engineering to develop manufacturing systems, there are several tasks such as process planning, layout design, programming, and final testing using actual machines. This development of manufacturing systems takes a long time and is expensive. To aid the above engineering process, we have developed the virtual manufacturing workcell (VMW). This paper describes a concept of VMW and design method through computer aided manufacturing engineering using VMW (CAME-VMW) related to the above engineering tasks. The VMW has all design data, and realizes a behavior of equipment and devices using a simulator. The simulator has logical and physical functionality. The one simulates a sequence control and the other simulates motion control, shape movement in 3D space. The simulator can execute the same control software made for actual machines. Therefore we can verify the behavior precisely before the manufacturing workcell will be constructed. The VMW creates engineering work space for several engineers and offers debugging tools such as virtual equipment and virtual controllers. We applied this VMW to development of a transfer workcell for vaporization machine in actual manufacturing system to produce plasma display panel (PDP) workcell and confirmed its effectiveness.
CloVR: a virtual machine for automated and portable sequence analysis from the desktop using cloud computing.

PubMed

Angiuoli, Samuel V; Matalka, Malcolm; Gussman, Aaron; Galens, Kevin; Vangala, Mahesh; Riley, David R; Arze, Cesar; White, James R; White, Owen; Fricke, W Florian

2011-08-30

Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom built virtual machines to distribute pre-packaged with pre-configured software. We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and associated architecture lowers the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high throughput data processing.
Identification of Program Signatures from Cloud Computing System Telemetry Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nichols, Nicole M.; Greaves, Mark T.; Smith, William P.

Malicious cloud computing activity can take many forms, including running unauthorized programs in a virtual environment. Detection of these malicious activities while preserving the privacy of the user is an important research challenge. Prior work has shown the potential viability of using cloud service billing metrics as a mechanism for proxy identification of malicious programs. Previously this novel detection method has been evaluated in a synthetic and isolated computational environment. In this paper we demonstrate the ability of billing metrics to identify programs, in an active cloud computing environment, including multiple virtual machines running on the same hypervisor. The openmore » source cloud computing platform OpenStack, is used for private cloud management at Pacific Northwest National Laboratory. OpenStack provides a billing tool (Ceilometer) to collect system telemetry measurements. We identify four different programs running on four virtual machines under the same cloud user account. Programs were identified with up to 95% accuracy. This accuracy is dependent on the distinctiveness of telemetry measurements for the specific programs we tested. Future work will examine the scalability of this approach for a larger selection of programs to better understand the uniqueness needed to identify a program. Additionally, future work should address the separation of signatures when multiple programs are running on the same virtual machine.« less
Dynamic provisioning of a HEP computing infrastructure on a shared hybrid HPC system

NASA Astrophysics Data System (ADS)

Meier, Konrad; Fleig, Georg; Hauth, Thomas; Janczyk, Michael; Quast, Günter; von Suchodoletz, Dirk; Wiebelt, Bernd

2016-10-01

Experiments in high-energy physics (HEP) rely on elaborate hardware, software and computing systems to sustain the high data rates necessary to study rare physics processes. The Institut fr Experimentelle Kernphysik (EKP) at KIT is a member of the CMS and Belle II experiments, located at the LHC and the Super-KEKB accelerators, respectively. These detectors share the requirement, that enormous amounts of measurement data must be processed and analyzed and a comparable amount of simulated events is required to compare experimental results with theoretical predictions. Classical HEP computing centers are dedicated sites which support multiple experiments and have the required software pre-installed. Nowadays, funding agencies encourage research groups to participate in shared HPC cluster models, where scientist from different domains use the same hardware to increase synergies. This shared usage proves to be challenging for HEP groups, due to their specialized software setup which includes a custom OS (often Scientific Linux), libraries and applications. To overcome this hurdle, the EKP and data center team of the University of Freiburg have developed a system to enable the HEP use case on a shared HPC cluster. To achieve this, an OpenStack-based virtualization layer is installed on top of a bare-metal cluster. While other user groups can run their batch jobs via the Moab workload manager directly on bare-metal, HEP users can request virtual machines with a specialized machine image which contains a dedicated operating system and software stack. In contrast to similar installations, in this hybrid setup, no static partitioning of the cluster into a physical and virtualized segment is required. As a unique feature, the placement of the virtual machine on the cluster nodes is scheduled by Moab and the job lifetime is coupled to the lifetime of the virtual machine. This allows for a seamless integration with the jobs sent by other user groups and honors the fairshare policies of the cluster. The developed thin integration layer between OpenStack and Moab can be adapted to other batch servers and virtualization systems, making the concept also applicable for other cluster operators. This contribution will report on the concept and implementation of an OpenStack-virtualized cluster used for HEP workflows. While the full cluster will be installed in spring 2016, a test-bed setup with 800 cores has been used to study the overall system performance and dedicated HEP jobs were run in a virtualized environment over many weeks. Furthermore, the dynamic integration of the virtualized worker nodes, depending on the workload at the institute's computing system, will be described.
Solution of task related to control of swiss-type automatic lathe to get planes parallel to part axis

NASA Astrophysics Data System (ADS)

Tabekina, N. A.; Chepchurov, M. S.; Evtushenko, E. I.; Dmitrievsky, B. S.

2018-05-01

The work solves the problem of automation of machining process namely turning to produce parts having the planes parallel to an axis of rotation of part without using special tools. According to the results, the availability of the equipment of a high speed electromechanical drive to control the operative movements of lathe machine will enable one to get the planes parallel to the part axis. The method of getting planes parallel to the part axis is based on the mathematical model, which is presented as functional dependency between the conveying velocity of the driven element and the time. It describes the operative movements of lathe machine all over the tool path. Using the model of movement of the tool, it has been found that the conveying velocity varies from the maximum to zero value. It will allow one to carry out the reverse of the drive. The scheme of tool placement regarding the workpiece has been proposed for unidirectional movement of the driven element at high conveying velocity. The control method of CNC machines can be used for getting geometrically complex parts on the lathe without using special milling tools.
Wave scheduling - Decentralized scheduling of task forces in multicomputers

NASA Technical Reports Server (NTRS)

Van Tilborg, A. M.; Wittie, L. D.

1984-01-01

Decentralized operating systems that control large multicomputers need techniques to schedule competing parallel programs called task forces. Wave scheduling is a probabilistic technique that uses a hierarchical distributed virtual machine to schedule task forces by recursively subdividing and issuing wavefront-like commands to processing elements capable of executing individual tasks. Wave scheduling is highly resistant to processing element failures because it uses many distributed schedulers that dynamically assign scheduling responsibilities among themselves. The scheduling technique is trivially extensible as more processing elements join the host multicomputer. A simple model of scheduling cost is used by every scheduler node to distribute scheduling activity and minimize wasted processing capacity by using perceived workload to vary decentralized scheduling rules. At low to moderate levels of network activity, wave scheduling is only slightly less efficient than a central scheduler in its ability to direct processing elements to accomplish useful work.
A Latency-Tolerant Partitioner for Distributed Computing on the Information Power Grid

NASA Technical Reports Server (NTRS)

Das, Sajal K.; Harvey, Daniel J.; Biwas, Rupak; Kwak, Dochan (Technical Monitor)

2001-01-01

NASA's Information Power Grid (IPG) is an infrastructure designed to harness the power of graphically distributed computers, databases, and human expertise, in order to solve large-scale realistic computational problems. This type of a meta-computing environment is necessary to present a unified virtual machine to application developers that hides the intricacies of a highly heterogeneous environment and yet maintains adequate security. In this paper, we present a novel partitioning scheme. called MinEX, that dynamically balances processor workloads while minimizing data movement and runtime communication, for applications that are executed in a parallel distributed fashion on the IPG. We also analyze the conditions that are required for the IPG to be an effective tool for such distributed computations. Our results show that MinEX is a viable load balancer provided the nodes of the IPG are connected by a high-speed asynchronous interconnection network.
The Virtual Xenbase: transitioning an online bioinformatics resource to a private cloud

PubMed Central

Karimi, Kamran; Vize, Peter D.

2014-01-01

As a model organism database, Xenbase has been providing informatics and genomic data on Xenopus (Silurana) tropicalis and Xenopus laevis frogs for more than a decade. The Xenbase database contains curated, as well as community-contributed and automatically harvested literature, gene and genomic data. A GBrowse genome browser, a BLAST+ server and stock center support are available on the site. When this resource was first built, all software services and components in Xenbase ran on a single physical server, with inherent reliability, scalability and inter-dependence issues. Recent advances in networking and virtualization techniques allowed us to move Xenbase to a virtual environment, and more specifically to a private cloud. To do so we decoupled the different software services and components, such that each would run on a different virtual machine. In the process, we also upgraded many of the components. The resulting system is faster and more reliable. System maintenance is easier, as individual virtual machines can now be updated, backed up and changed independently. We are also experiencing more effective resource allocation and utilization. Database URL: www.xenbase.org PMID:25380782
Naval Applications of Virtual Reality,

DTIC Science & Technology

1993-01-01

Expert Virtual Reality Special Report 󈨡, pp. 67- 72. 14. SUBJECT TERMS 15 NUMBER o0 PAGES man-machine interface virtual reality decision support...collective and individual performance. -" Virtual reality projects could help *y by Mark Gembicki Av-t-abilty CodesA Avafllat Idt Iofe and David Rousseau...alt- 67 VIRTUAL . REALITY SPECIAl, REPORT r-OPY avcriaikxb to DD)C qg .- 154,41X~~~~~~~~~~~~j 1411 iI..:41 T a].’ 1,1 4 1111 I 4 1 * .11 ~ 4 l.~w111511 I
The portals 4.0.1 network programming interface.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barrett, Brian W.; Brightwell, Ronald Brian; Pedretti, Kevin

2013-04-01

This report presents a specification for the Portals 4.0 network programming interface. Portals 4.0 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4.0 is well suited to massively parallel processing and embedded systems. Portals 4.0 represents an adaption of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandias Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4.0 is targeted to the next generationmore » of machines employing advanced network interface architectures that support enhanced offload capabilities. 3« less
Execution models for mapping programs onto distributed memory parallel computers

NASA Technical Reports Server (NTRS)

Sussman, Alan

1992-01-01

The problem of exploiting the parallelism available in a program to efficiently employ the resources of the target machine is addressed. The problem is discussed in the context of building a mapping compiler for a distributed memory parallel machine. The paper describes using execution models to drive the process of mapping a program in the most efficient way onto a particular machine. Through analysis of the execution models for several mapping techniques for one class of programs, we show that the selection of the best technique for a particular program instance can make a significant difference in performance. On the other hand, the results of benchmarks from an implementation of a mapping compiler show that our execution models are accurate enough to select the best mapping technique for a given program.
Conception et mise au point d'un emulateur de machine Synchrone trapezoidale a aimants permanents

NASA Astrophysics Data System (ADS)

Lessard, Francois

The development of technology leads inevitably to higher systems' complexity faced by engineers. Over time, tools are often developed in parallel with the main systems to ensure their sustainability. The work presented in this document provides a new tool for testing motor drives. In general, this project refers to active loads, which are complex dynamic loads emulated electronically with a static converter. Specifically, this document proposes and implements a system whose purpose is to recreate the behaviour of a trapezoidal permanent magnets synchronous machine. The ultimate goal is to connect a motor drive to the three terminal of the motor emulator, as it would with a real motor. The emulator's response then obtained, when subjected to disturbances of the motor drive, is ideally identical to the one of a real motor. The motor emulator led to a significant versatility of a test bench because the electrical and mechanical parameters of the application can be easily modified. The work is divided into two main parts: the static converter and real-rime. Overall, these two entities form a PHIL (Power Hardware-in-the-loop) real-time simulation. The static converter enables the exchange of real power between the drive motor and the real-time simulation. The latter gives the application the intelligence needed to interact with the motor drive in a way which the desired behaviour is recreated. The main partner of this project, Opal-RT, ensures this development. Keywords: virtual machine, PHIL, real-time simulation, electronic load
BioImg.org: A Catalog of Virtual Machine Images for the Life Sciences

PubMed Central

Dahlö, Martin; Haziza, Frédéric; Kallio, Aleksi; Korpelainen, Eija; Bongcam-Rudloff, Erik; Spjuth, Ola

2015-01-01

Virtualization is becoming increasingly important in bioscience, enabling assembly and provisioning of complete computer setups, including operating system, data, software, and services packaged as virtual machine images (VMIs). We present an open catalog of VMIs for the life sciences, where scientists can share information about images and optionally upload them to a server equipped with a large file system and fast Internet connection. Other scientists can then search for and download images that can be run on the local computer or in a cloud computing environment, providing easy access to bioinformatics environments. We also describe applications where VMIs aid life science research, including distributing tools and data, supporting reproducible analysis, and facilitating education. BioImg.org is freely available at: https://bioimg.org. PMID:26401099

BioImg.org: A Catalog of Virtual Machine Images for the Life Sciences.

PubMed

Dahlö, Martin; Haziza, Frédéric; Kallio, Aleksi; Korpelainen, Eija; Bongcam-Rudloff, Erik; Spjuth, Ola

2015-01-01

Virtualization is becoming increasingly important in bioscience, enabling assembly and provisioning of complete computer setups, including operating system, data, software, and services packaged as virtual machine images (VMIs). We present an open catalog of VMIs for the life sciences, where scientists can share information about images and optionally upload them to a server equipped with a large file system and fast Internet connection. Other scientists can then search for and download images that can be run on the local computer or in a cloud computing environment, providing easy access to bioinformatics environments. We also describe applications where VMIs aid life science research, including distributing tools and data, supporting reproducible analysis, and facilitating education. BioImg.org is freely available at: https://bioimg.org.
Access, Equity, and Opportunity. Women in Machining: A Model Program.

ERIC Educational Resources Information Center

Warner, Heather

The Women in Machining (WIM) program is a Machine Action Project (MAP) initiative that was developed in response to a local skilled metalworking labor shortage, despite a virtual absence of women and people of color from area shops. The project identified post-war stereotypes and other barriers that must be addressed if women are to have an equal…
A genetic algorithm for a bi-objective mathematical model for dynamic virtual cell formation problem

NASA Astrophysics Data System (ADS)

Moradgholi, Mostafa; Paydar, Mohammad Mahdi; Mahdavi, Iraj; Jouzdani, Javid

2016-09-01

Nowadays, with the increasing pressure of the competitive business environment and demand for diverse products, manufacturers are force to seek for solutions that reduce production costs and rise product quality. Cellular manufacturing system (CMS), as a means to this end, has been a point of attraction to both researchers and practitioners. Limitations of cell formation problem (CFP), as one of important topics in CMS, have led to the introduction of virtual CMS (VCMS). This research addresses a bi-objective dynamic virtual cell formation problem (DVCFP) with the objective of finding the optimal formation of cells, considering the material handling costs, fixed machine installation costs and variable production costs of machines and workforce. Furthermore, we consider different skills on different machines in workforce assignment in a multi-period planning horizon. The bi-objective model is transformed to a single-objective fuzzy goal programming model and to show its performance; numerical examples are solved using the LINGO software. In addition, genetic algorithm (GA) is customized to tackle large-scale instances of the problems to show the performance of the solution method.
Integration of virtualized worker nodes in standard batch systems

NASA Astrophysics Data System (ADS)

Büge, Volker; Hessling, Hermann; Kemp, Yves; Kunze, Marcel; Oberst, Oliver; Quast, Günter; Scheurer, Armin; Synge, Owen

2010-04-01

Current experiments in HEP only use a limited number of operating system flavours. Their software might only be validated on one single OS platform. Resource providers might have other operating systems of choice for the installation of the batch infrastructure. This is especially the case if a cluster is shared with other communities, or communities that have stricter security requirements. One solution would be to statically divide the cluster into separated sub-clusters. In such a scenario, no opportunistic distribution of the load can be achieved, resulting in a poor overall utilization efficiency. Another approach is to make the batch system aware of virtualization, and to provide each community with its favoured operating system in a virtual machine. Here, the scheduler has full flexibility, resulting in a better overall efficiency of the resources. In our contribution, we present a lightweight concept for the integration of virtual worker nodes into standard batch systems. The virtual machines are started on the worker nodes just before jobs are executed there. No meta-scheduling is introduced. We demonstrate two prototype implementations, one based on the Sun Grid Engine (SGE), the other using Maui/Torque as a batch system. Both solutions support local job as well as Grid job submission. The hypervisors currently used are Xen and KVM, a port to another system is easily envisageable. To better handle different virtual machines on the physical host, the management solution VmImageManager is developed. We will present first experience from running the two prototype implementations. In a last part, we will show the potential future use of this lightweight concept when integrated into high-level (i.e. Grid) work-flows.
Portable parallel stochastic optimization for the design of aeropropulsion components

NASA Technical Reports Server (NTRS)

Sues, Robert H.; Rhodes, G. S.

1994-01-01

This report presents the results of Phase 1 research to develop a methodology for performing large-scale Multi-disciplinary Stochastic Optimization (MSO) for the design of aerospace systems ranging from aeropropulsion components to complete aircraft configurations. The current research recognizes that such design optimization problems are computationally expensive, and require the use of either massively parallel or multiple-processor computers. The methodology also recognizes that many operational and performance parameters are uncertain, and that uncertainty must be considered explicitly to achieve optimum performance and cost. The objective of this Phase 1 research was to initialize the development of an MSO methodology that is portable to a wide variety of hardware platforms, while achieving efficient, large-scale parallelism when multiple processors are available. The first effort in the project was a literature review of available computer hardware, as well as review of portable, parallel programming environments. The first effort was to implement the MSO methodology for a problem using the portable parallel programming language, Parallel Virtual Machine (PVM). The third and final effort was to demonstrate the example on a variety of computers, including a distributed-memory multiprocessor, a distributed-memory network of workstations, and a single-processor workstation. Results indicate the MSO methodology can be well-applied towards large-scale aerospace design problems. Nearly perfect linear speedup was demonstrated for computation of optimization sensitivity coefficients on both a 128-node distributed-memory multiprocessor (the Intel iPSC/860) and a network of workstations (speedups of almost 19 times achieved for 20 workstations). Very high parallel efficiencies (75 percent for 31 processors and 60 percent for 50 processors) were also achieved for computation of aerodynamic influence coefficients on the Intel. Finally, the multi-level parallelization strategy that will be needed for large-scale MSO problems was demonstrated to be highly efficient. The same parallel code instructions were used on both platforms, demonstrating portability. There are many applications for which MSO can be applied, including NASA's High-Speed-Civil Transport, and advanced propulsion systems. The use of MSO will reduce design and development time and testing costs dramatically.
Scheduling for Parallel Supercomputing: A Historical Perspective of Achievable Utilization

NASA Technical Reports Server (NTRS)

Jones, James Patton; Nitzberg, Bill

1999-01-01

The NAS facility has operated parallel supercomputers for the past 11 years, including the Intel iPSC/860, Intel Paragon, Thinking Machines CM-5, IBM SP-2, and Cray Origin 2000. Across this wide variety of machine architectures, across a span of 10 years, across a large number of different users, and through thousands of minor configuration and policy changes, the utilization of these machines shows three general trends: (1) scheduling using a naive FIFO first-fit policy results in 40-60% utilization, (2) switching to the more sophisticated dynamic backfilling scheduling algorithm improves utilization by about 15 percentage points (yielding about 70% utilization), and (3) reducing the maximum allowable job size further increases utilization. Most surprising is the consistency of these trends. Over the lifetime of the NAS parallel systems, we made hundreds, perhaps thousands, of small changes to hardware, software, and policy, yet, utilization was affected little. In particular these results show that the goal of achieving near 100% utilization while supporting a real parallel supercomputing workload is unrealistic.
Grid heterogeneity in in-silico experiments: an exploration of drug screening using DOCK on cloud environments.

PubMed

Yim, Wen-Wai; Chien, Shu; Kusumoto, Yasuyuki; Date, Susumu; Haga, Jason

2010-01-01

Large-scale in-silico screening is a necessary part of drug discovery and Grid computing is one answer to this demand. A disadvantage of using Grid computing is the heterogeneous computational environments characteristic of a Grid. In our study, we have found that for the molecular docking simulation program DOCK, different clusters within a Grid organization can yield inconsistent results. Because DOCK in-silico virtual screening (VS) is currently used to help select chemical compounds to test with in-vitro experiments, such differences have little effect on the validity of using virtual screening before subsequent steps in the drug discovery process. However, it is difficult to predict whether the accumulation of these discrepancies over sequentially repeated VS experiments will significantly alter the results if VS is used as the primary means for identifying potential drugs. Moreover, such discrepancies may be unacceptable for other applications requiring more stringent thresholds. This highlights the need for establishing a more complete solution to provide the best scientific accuracy when executing an application across Grids. One possible solution to platform heterogeneity in DOCK performance explored in our study involved the use of virtual machines as a layer of abstraction. This study investigated the feasibility and practicality of using virtual machine and recent cloud computing technologies in a biological research application. We examined the differences and variations of DOCK VS variables, across a Grid environment composed of different clusters, with and without virtualization. The uniform computer environment provided by virtual machines eliminated inconsistent DOCK VS results caused by heterogeneous clusters, however, the execution time for the DOCK VS increased. In our particular experiments, overhead costs were found to be an average of 41% and 2% in execution time for two different clusters, while the actual magnitudes of the execution time costs were minimal. Despite the increase in overhead, virtual clusters are an ideal solution for Grid heterogeneity. With greater development of virtual cluster technology in Grid environments, the problem of platform heterogeneity may be eliminated through virtualization, allowing greater usage of VS, and will benefit all Grid applications in general.
Specification and Analysis of Parallel Machine Architecture

DTIC Science & Technology

1990-03-17

Parallel Machine Architeture C.V. Ramamoorthy Computer Science Division Dept. of Electrical Engineering and Computer Science University of California...capacity. (4) Adaptive: The overhead in resolution of deadlocks, etc. should be in proportion to their frequency. (5) Avoid rollbacks: Rollbacks can be...snapshots of system state graphically at a rate proportional to simulation time. Some of the examples are as follow: (1) When the simulation clock of
Scaling Support Vector Machines On Modern HPC Platforms

DOE Office of Scientific and Technical Information (OSTI.GOV)

You, Yang; Fu, Haohuan; Song, Shuaiwen

2015-02-01

We designed and implemented MIC-SVM, a highly efficient parallel SVM for x86 based multicore and many-core architectures, such as the Intel Ivy Bridge CPUs and Intel Xeon Phi co-processor (MIC). We propose various novel analysis methods and optimization techniques to fully utilize the multilevel parallelism provided by these architectures and serve as general optimization methods for other machine learning tools.
An Argument for Partial Admissibility of Polygraph Results in Trials by Courts-Martial

DTIC Science & Technology

1990-04-01

FUNDAMENTALS OF THE POLYGRAPH TECHNIQUE 6 A. THE POLYGRAPH MACHINE 7 1. THE CARDIOSPHYMOGRAPH 8 2. THE PNEUMOGRAPH 9 3. THE GALVANOMETER 10 4. THE... Machine Anyone observing a polygraph machine for the first time could easily conclude it is a survivor of the Spanish Inquisition. The lengths of...wire and coils get the immediate attention of the subject. However, the various polygraph machines in use today cause virtually no discomfort. Several
Teaching Cybersecurity Using the Cloud

ERIC Educational Resources Information Center

Salah, Khaled; Hammoud, Mohammad; Zeadally, Sherali

2015-01-01

Cloud computing platforms can be highly attractive to conduct course assignments and empower students with valuable and indispensable hands-on experience. In particular, the cloud can offer teaching staff and students (whether local or remote) on-demand, elastic, dedicated, isolated, (virtually) unlimited, and easily configurable virtual machines.…
Ant-Based Cyber Defense (also known as

DOE Office of Scientific and Technical Information (OSTI.GOV)

Glenn Fink, PNNL

2015-09-29

ABCD is a four-level hierarchy with human supervisors at the top, a top-level agent called a Sergeant controlling each enclave, Sentinel agents located at each monitored host, and mobile Sensor agents that swarm through the enclaves to detect cyber malice and misconfigurations. The code comprises four parts: (1) the core agent framework, (2) the user interface and visualization, (3) test-range software to create a network of virtual machines including a simulated Internet and user and host activity emulation scripts, and (4) a test harness to allow the safe running of adversarial code within the framework of monitored virtual machines.
Prototyping Faithful Execution in a Java virtual machine.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tarman, Thomas David; Campbell, Philip LaRoche; Pierson, Lyndon George

2003-09-01

This report presents the implementation of a stateless scheme for Faithful Execution, the design for which is presented in a companion report, ''Principles of Faithful Execution in the Implementation of Trusted Objects'' (SAND 2003-2328). We added a simple cryptographic capability to an already simplified class loader and its associated Java Virtual Machine (JVM) to provide a byte-level implementation of Faithful Execution. The extended class loader and JVM we refer to collectively as the Sandia Faithfully Executing Java architecture (or JavaFE for short). This prototype is intended to enable exploration of more sophisticated techniques which we intend to implement in hardware.
Architectures for reasoning in parallel

NASA Technical Reports Server (NTRS)

Hall, Lawrence O.

1989-01-01

The research conducted has dealt with rule-based expert systems. The algorithms that may lead to effective parallelization of them were investigated. Both the forward and backward chained control paradigms were investigated in the course of this work. The best computer architecture for the developed and investigated algorithms has been researched. Two experimental vehicles were developed to facilitate this research. They are Backpac, a parallel backward chained rule-based reasoning system and Datapac, a parallel forward chained rule-based reasoning system. Both systems have been written in Multilisp, a version of Lisp which contains the parallel construct, future. Applying the future function to a function causes the function to become a task parallel to the spawning task. Additionally, Backpac and Datapac have been run on several disparate parallel processors. The machines are an Encore Multimax with 10 processors, the Concert Multiprocessor with 64 processors, and a 32 processor BBN GP1000. Both the Concert and the GP1000 are switch-based machines. The Multimax has all its processors hung off a common bus. All are shared memory machines, but have different schemes for sharing the memory and different locales for the shared memory. The main results of the investigations come from experiments on the 10 processor Encore and the Concert with partitions of 32 or less processors. Additionally, experiments have been run with a stripped down version of EMYCIN.
A parallel-machine scheduling problem with two competing agents

NASA Astrophysics Data System (ADS)

Lee, Wen-Chiung; Chung, Yu-Hsiang; Wang, Jen-Ya

2017-06-01

Scheduling with two competing agents has become popular in recent years. Most of the research has focused on single-machine problems. This article considers a parallel-machine problem, the objective of which is to minimize the total completion time of jobs from the first agent given that the maximum tardiness of jobs from the second agent cannot exceed an upper bound. The NP-hardness of this problem is also examined. A genetic algorithm equipped with local search is proposed to search for the near-optimal solution. Computational experiments are conducted to evaluate the proposed genetic algorithm.
On the relationship between parallel computation and graph embedding

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gupta, A.K.

1989-01-01

The problem of efficiently simulating an algorithm designed for an n-processor parallel machine G on an m-processor parallel machine H with n > m arises when parallel algorithms designed for an ideal size machine are simulated on existing machines which are of a fixed size. The author studies this problem when every processor of H takes over the function of a number of processors in G, and he phrases the simulation problem as a graph embedding problem. New embeddings presented address relevant issues arising from the parallel computation environment. The main focus centers around embedding complete binary trees into smaller-sizedmore » binary trees, butterflies, and hypercubes. He also considers simultaneous embeddings of r source machines into a single hypercube. Constant factors play a crucial role in his embeddings since they are not only important in practice but also lead to interesting theoretical problems. All of his embeddings minimize dilation and load, which are the conventional cost measures in graph embeddings and determine the maximum amount of time required to simulate one step of G on H. His embeddings also optimize a new cost measure called ({alpha},{beta})-utilization which characterizes how evenly the processors of H are used by the processors of G. Ideally, the utilization should be balanced (i.e., every processor of H simulates at most (n/m) processors of G) and the ({alpha},{beta})-utilization measures how far off from a balanced utilization the embedding is. He presents embeddings for the situation when some processors of G have different capabilities (e.g. memory or I/O) than others and the processors with different capabilities are to be distributed uniformly among the processors of H. Placing such conditions on an embedding results in an increase in some of the cost measures.« less
Parallel Preconditioning for CFD Problems on the CM-5

NASA Technical Reports Server (NTRS)

Simon, Horst D.; Kremenetsky, Mark D.; Richardson, John; Lasinski, T. A. (Technical Monitor)

1994-01-01

Up to today, preconditioning methods on massively parallel systems have faced a major difficulty. The most successful preconditioning methods in terms of accelerating the convergence of the iterative solver such as incomplete LU factorizations are notoriously difficult to implement on parallel machines for two reasons: (1) the actual computation of the preconditioner is not very floating-point intensive, but requires a large amount of unstructured communication, and (2) the application of the preconditioning matrix in the iteration phase (i.e. triangular solves) are difficult to parallelize because of the recursive nature of the computation. Here we present a new approach to preconditioning for very large, sparse, unsymmetric, linear systems, which avoids both difficulties. We explicitly compute an approximate inverse to our original matrix. This new preconditioning matrix can be applied most efficiently for iterative methods on massively parallel machines, since the preconditioning phase involves only a matrix-vector multiplication, with possibly a dense matrix. Furthermore the actual computation of the preconditioning matrix has natural parallelism. For a problem of size n, the preconditioning matrix can be computed by solving n independent small least squares problems. The algorithm and its implementation on the Connection Machine CM-5 are discussed in detail and supported by extensive timings obtained from real problem data.
Prediction based proactive thermal virtual machine scheduling in green clouds.

PubMed

Kinger, Supriya; Kumar, Rajesh; Sharma, Anju

2014-01-01

Cloud computing has rapidly emerged as a widely accepted computing paradigm, but the research on Cloud computing is still at an early stage. Cloud computing provides many advanced features but it still has some shortcomings such as relatively high operating cost and environmental hazards like increasing carbon footprints. These hazards can be reduced up to some extent by efficient scheduling of Cloud resources. Working temperature on which a machine is currently running can be taken as a criterion for Virtual Machine (VM) scheduling. This paper proposes a new proactive technique that considers current and maximum threshold temperature of Server Machines (SMs) before making scheduling decisions with the help of a temperature predictor, so that maximum temperature is never reached. Different workload scenarios have been taken into consideration. The results obtained show that the proposed system is better than existing systems of VM scheduling, which does not consider current temperature of nodes before making scheduling decisions. Thus, a reduction in need of cooling systems for a Cloud environment has been obtained and validated.
Virtual Environment Training: Auxiliary Machinery Room (AMR) Watchstation Trainer.

ERIC Educational Resources Information Center

Hriber, Dennis C.; And Others

1993-01-01

Describes a project implemented at Newport News Shipbuilding that used Virtual Environment Training to improve the performance of submarine crewmen. Highlights include development of the Auxiliary Machine Room (AMR) Watchstation Trainer; Digital Video Interactive (DVI); screen layout; test design and evaluation; user reactions; authoring language;…
The virtual machine (VM) scaler: an infrastructure manager supporting environmental modeling on IaaS clouds

USDA-ARS?s Scientific Manuscript database

Infrastructure-as-a-service (IaaS) clouds provide a new medium for deployment of environmental modeling applications. Harnessing advancements in virtualization, IaaS clouds can provide dynamic scalable infrastructure to better support scientific modeling computational demands. Providing scientific m...

Multi-objective problem of the modified distributed parallel machine and assembly scheduling problem (MDPMASP) with eligibility constraints

NASA Astrophysics Data System (ADS)

Amallynda, I.; Santosa, B.

2017-11-01

This paper proposes a new generalization of the distributed parallel machine and assembly scheduling problem (DPMASP) with eligibility constraints referred to as the modified distributed parallel machine and assembly scheduling problem (MDPMASP) with eligibility constraints. Within this generalization, we assume that there are a set non-identical factories or production lines, each one with a set unrelated parallel machine with different speeds in processing them disposed to a single assembly machine in series. A set of different products that are manufactured through an assembly program of a set of components (jobs) according to the requested demand. Each product requires several kinds of jobs with different sizes. Beside that we also consider to the multi-objective problem (MOP) of minimizing mean flow time and the number of tardy products simultaneously. This is known to be NP-Hard problem, is important to practice, as the former criterions to reflect the customer's demand and manufacturer's perspective. This is a realistic and complex problem with wide range of possible solutions, we propose four simple heuristics and two metaheuristics to solve it. Various parameters of the proposed metaheuristic algorithms are discussed and calibrated by means of Taguchi technique. All proposed algorithms are tested by Matlab software. Our computational experiments indicate that the proposed problem and fourth proposed algorithms are able to be implemented and can be used to solve moderately-sized instances, and giving efficient solutions, which are close to optimum in most cases.
From planes to brains: parallels between military development of virtual reality environments and virtual neurological surgery.

PubMed

Schmitt, Paul J; Agarwal, Nitin; Prestigiacomo, Charles J

2012-01-01

Military explorations of the practical role of simulators have served as a driving force for much of the virtual reality technology that we have today. The evolution of 3-dimensional and virtual environments from the early flight simulators used during World War II to the sophisticated training simulators in the modern military followed a path that virtual surgical and neurosurgical devices have already begun to parallel. By understanding the evolution of military simulators as well as comparing and contrasting that evolution with current and future surgical simulators, it may be possible to expedite the development of appropriate devices and establish their validity as effective training tools. As such, this article presents a historical perspective examining the progression of neurosurgical simulators, the establishment of effective and appropriate curricula for using them, and the contributions that the military has made during the ongoing maturation of this exciting treatment and training modality. Copyright © 2012. Published by Elsevier Inc.
Computer-Aided Parallelizer and Optimizer

NASA Technical Reports Server (NTRS)

Jin, Haoqiang

2011-01-01

The Computer-Aided Parallelizer and Optimizer (CAPO) automates the insertion of compiler directives (see figure) to facilitate parallel processing on Shared Memory Parallel (SMP) machines. While CAPO currently is integrated seamlessly into CAPTools (developed at the University of Greenwich, now marketed as ParaWise), CAPO was independently developed at Ames Research Center as one of the components for the Legacy Code Modernization (LCM) project. The current version takes serial FORTRAN programs, performs interprocedural data dependence analysis, and generates OpenMP directives. Due to the widely supported OpenMP standard, the generated OpenMP codes have the potential to run on a wide range of SMP machines. CAPO relies on accurate interprocedural data dependence information currently provided by CAPTools. Compiler directives are generated through identification of parallel loops in the outermost level, construction of parallel regions around parallel loops and optimization of parallel regions, and insertion of directives with automatic identification of private, reduction, induction, and shared variables. Attempts also have been made to identify potential pipeline parallelism (implemented with point-to-point synchronization). Although directives are generated automatically, user interaction with the tool is still important for producing good parallel codes. A comprehensive graphical user interface is included for users to interact with the parallelization process.
Long-Term Training with a Brain-Machine Interface-Based Gait Protocol Induces Partial Neurological Recovery in Paraplegic Patients.

PubMed

Donati, Ana R C; Shokur, Solaiman; Morya, Edgard; Campos, Debora S F; Moioli, Renan C; Gitti, Claudia M; Augusto, Patricia B; Tripodi, Sandra; Pires, Cristhiane G; Pereira, Gislaine A; Brasil, Fabricio L; Gallo, Simone; Lin, Anthony A; Takigami, Angelo K; Aratanha, Maria A; Joshi, Sanjay; Bleuler, Hannes; Cheng, Gordon; Rudolph, Alan; Nicolelis, Miguel A L

2016-08-11

Brain-machine interfaces (BMIs) provide a new assistive strategy aimed at restoring mobility in severely paralyzed patients. Yet, no study in animals or in human subjects has indicated that long-term BMI training could induce any type of clinical recovery. Eight chronic (3-13 years) spinal cord injury (SCI) paraplegics were subjected to long-term training (12 months) with a multi-stage BMI-based gait neurorehabilitation paradigm aimed at restoring locomotion. This paradigm combined intense immersive virtual reality training, enriched visual-tactile feedback, and walking with two EEG-controlled robotic actuators, including a custom-designed lower limb exoskeleton capable of delivering tactile feedback to subjects. Following 12 months of training with this paradigm, all eight patients experienced neurological improvements in somatic sensation (pain localization, fine/crude touch, and proprioceptive sensing) in multiple dermatomes. Patients also regained voluntary motor control in key muscles below the SCI level, as measured by EMGs, resulting in marked improvement in their walking index. As a result, 50% of these patients were upgraded to an incomplete paraplegia classification. Neurological recovery was paralleled by the reemergence of lower limb motor imagery at cortical level. We hypothesize that this unprecedented neurological recovery results from both cortical and spinal cord plasticity triggered by long-term BMI usage.
Long-Term Training with a Brain-Machine Interface-Based Gait Protocol Induces Partial Neurological Recovery in Paraplegic Patients

PubMed Central

Donati, Ana R. C.; Shokur, Solaiman; Morya, Edgard; Campos, Debora S. F.; Moioli, Renan C.; Gitti, Claudia M.; Augusto, Patricia B.; Tripodi, Sandra; Pires, Cristhiane G.; Pereira, Gislaine A.; Brasil, Fabricio L.; Gallo, Simone; Lin, Anthony A.; Takigami, Angelo K.; Aratanha, Maria A.; Joshi, Sanjay; Bleuler, Hannes; Cheng, Gordon; Rudolph, Alan; Nicolelis, Miguel A. L.

2016-01-01

Brain-machine interfaces (BMIs) provide a new assistive strategy aimed at restoring mobility in severely paralyzed patients. Yet, no study in animals or in human subjects has indicated that long-term BMI training could induce any type of clinical recovery. Eight chronic (3–13 years) spinal cord injury (SCI) paraplegics were subjected to long-term training (12 months) with a multi-stage BMI-based gait neurorehabilitation paradigm aimed at restoring locomotion. This paradigm combined intense immersive virtual reality training, enriched visual-tactile feedback, and walking with two EEG-controlled robotic actuators, including a custom-designed lower limb exoskeleton capable of delivering tactile feedback to subjects. Following 12 months of training with this paradigm, all eight patients experienced neurological improvements in somatic sensation (pain localization, fine/crude touch, and proprioceptive sensing) in multiple dermatomes. Patients also regained voluntary motor control in key muscles below the SCI level, as measured by EMGs, resulting in marked improvement in their walking index. As a result, 50% of these patients were upgraded to an incomplete paraplegia classification. Neurological recovery was paralleled by the reemergence of lower limb motor imagery at cortical level. We hypothesize that this unprecedented neurological recovery results from both cortical and spinal cord plasticity triggered by long-term BMI usage. PMID:27513629
CloVR: A virtual machine for automated and portable sequence analysis from the desktop using cloud computing

PubMed Central

2011-01-01

Background Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom built virtual machines to distribute pre-packaged with pre-configured software. Results We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. Conclusion The CloVR VM and associated architecture lowers the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high throughput data processing. PMID:21878105
Design and fabrication of complete dentures using CAD/CAM technology

PubMed Central

Han, Weili; Li, Yanfeng; Zhang, Yue; lv, Yuan; Zhang, Ying; Hu, Ping; Liu, Huanyue; Ma, Zheng; Shen, Yi

2017-01-01

Abstract The aim of the study was to test the feasibility of using commercially available computer-aided design and computer-aided manufacturing (CAD/CAM) technology including 3Shape Dental System 2013 trial version, WIELAND V2.0.049 and WIELAND ZENOTEC T1 milling machine to design and fabricate complete dentures. The modeling process of full denture available in the trial version of 3Shape Dental System 2013 was used to design virtual complete dentures on the basis of 3-dimensional (3D) digital edentulous models generated from the physical models. The virtual complete dentures designed were exported to CAM software of WIELAND V2.0.049. A WIELAND ZENOTEC T1 milling machine controlled by the CAM software was used to fabricate physical dentitions and baseplates by milling acrylic resin composite plates. The physical dentitions were bonded to the corresponding baseplates to form the maxillary and mandibular complete dentures. Virtual complete dentures were successfully designed using the software through several steps including generation of 3D digital edentulous models, model analysis, arrangement of artificial teeth, trimming relief area, and occlusal adjustment. Physical dentitions and baseplates were successfully fabricated according to the designed virtual complete dentures using milling machine controlled by a CAM software. Bonding physical dentitions to the corresponding baseplates generated the final physical complete dentures. Our study demonstrated that complete dentures could be successfully designed and fabricated by using CAD/CAM. PMID:28072686
Interactive Parallel Data Analysis within Data-Centric Cluster Facilities using the IPython Notebook

NASA Astrophysics Data System (ADS)

Pascoe, S.; Lansdowne, J.; Iwi, A.; Stephens, A.; Kershaw, P.

2012-12-01

The data deluge is making traditional analysis workflows for many researchers obsolete. Support for parallelism within popular tools such as matlab, IDL and NCO is not well developed and rarely used. However parallelism is necessary for processing modern data volumes on a timescale conducive to curiosity-driven analysis. Furthermore, for peta-scale datasets such as the CMIP5 archive, it is no longer practical to bring an entire dataset to a researcher's workstation for analysis, or even to their institutional cluster. Therefore, there is an increasing need to develop new analysis platforms which both enable processing at the point of data storage and which provides parallelism. Such an environment should, where possible, maintain the convenience and familiarity of our current analysis environments to encourage curiosity-driven research. We describe how we are combining the interactive python shell (IPython) with our JASMIN data-cluster infrastructure. IPython has been specifically designed to bridge the gap between the HPC-style parallel workflows and the opportunistic curiosity-driven analysis usually carried out using domain specific languages and scriptable tools. IPython offers a web-based interactive environment, the IPython notebook, and a cluster engine for parallelism all underpinned by the well-respected Python/Scipy scientific programming stack. JASMIN is designed to support the data analysis requirements of the UK and European climate and earth system modeling community. JASMIN, with its sister facility CEMS focusing the earth observation community, has 4.5 PB of fast parallel disk storage alongside over 370 computing cores provide local computation. Through the IPython interface to JASMIN, users can make efficient use of JASMIN's multi-core virtual machines to perform interactive analysis on all cores simultaneously or can configure IPython clusters across multiple VMs. Larger-scale clusters can be provisioned through JASMIN's batch scheduling system. Outputs can be summarised and visualised using the full power of Python's many scientific tools, including Scipy, Matplotlib, Pandas and CDAT. This rich user experience is delivered through the user's web browser; maintaining the interactive feel of a workstation-based environment with the parallel power of a remote data-centric processing facility.
A fast sorting algorithm for a hypersonic rarefied flow particle simulation on the connection machine

NASA Technical Reports Server (NTRS)

Dagum, Leonardo

1989-01-01

The data parallel implementation of a particle simulation for hypersonic rarefied flow described by Dagum associates a single parallel data element with each particle in the simulation. The simulated space is divided into discrete regions called cells containing a variable and constantly changing number of particles. The implementation requires a global sort of the parallel data elements so as to arrange them in an order that allows immediate access to the information associated with cells in the simulation. Described here is a very fast algorithm for performing the necessary ranking of the parallel data elements. The performance of the new algorithm is compared with that of the microcoded instruction for ranking on the Connection Machine.
Performance Comparison of a Set of Periodic and Non-Periodic Tridiagonal Solvers on SP2 and Paragon Parallel Computers

NASA Technical Reports Server (NTRS)

Sun, Xian-He; Moitra, Stuti

1996-01-01

Various tridiagonal solvers have been proposed in recent years for different parallel platforms. In this paper, the performance of three tridiagonal solvers, namely, the parallel partition LU algorithm, the parallel diagonal dominant algorithm, and the reduced diagonal dominant algorithm, is studied. These algorithms are designed for distributed-memory machines and are tested on an Intel Paragon and an IBM SP2 machines. Measured results are reported in terms of execution time and speedup. Analytical study are conducted for different communication topologies and for different tridiagonal systems. The measured results match the analytical results closely. In addition to address implementation issues, performance considerations such as problem sizes and models of speedup are also discussed.
Single product lot-sizing on unrelated parallel machines with non-decreasing processing times

NASA Astrophysics Data System (ADS)

Eremeev, A.; Kovalyov, M.; Kuznetsov, P.

2018-01-01

We consider a problem in which at least a given quantity of a single product has to be partitioned into lots, and lots have to be assigned to unrelated parallel machines for processing. In one version of the problem, the maximum machine completion time should be minimized, in another version of the problem, the sum of machine completion times is to be minimized. Machine-dependent lower and upper bounds on the lot size are given. The product is either assumed to be continuously divisible or discrete. The processing time of each machine is defined by an increasing function of the lot volume, given as an oracle. Setup times and costs are assumed to be negligibly small, and therefore, they are not considered. We derive optimal polynomial time algorithms for several special cases of the problem. An NP-hard case is shown to admit a fully polynomial time approximation scheme. An application of the problem in energy efficient processors scheduling is considered.
Early experiences in developing and managing the neuroscience gateway.

PubMed

Sivagnanam, Subhashini; Majumdar, Amit; Yoshimoto, Kenneth; Astakhov, Vadim; Bandrowski, Anita; Martone, MaryAnn; Carnevale, Nicholas T

2015-02-01

The last few decades have seen the emergence of computational neuroscience as a mature field where researchers are interested in modeling complex and large neuronal systems and require access to high performance computing machines and associated cyber infrastructure to manage computational workflow and data. The neuronal simulation tools, used in this research field, are also implemented for parallel computers and suitable for high performance computing machines. But using these tools on complex high performance computing machines remains a challenge because of issues with acquiring computer time on these machines located at national supercomputer centers, dealing with complex user interface of these machines, dealing with data management and retrieval. The Neuroscience Gateway is being developed to alleviate and/or hide these barriers to entry for computational neuroscientists. It hides or eliminates, from the point of view of the users, all the administrative and technical barriers and makes parallel neuronal simulation tools easily available and accessible on complex high performance computing machines. It handles the running of jobs and data management and retrieval. This paper shares the early experiences in bringing up this gateway and describes the software architecture it is based on, how it is implemented, and how users can use this for computational neuroscience research using high performance computing at the back end. We also look at parallel scaling of some publicly available neuronal models and analyze the recent usage data of the neuroscience gateway.
Early experiences in developing and managing the neuroscience gateway

PubMed Central

Sivagnanam, Subhashini; Majumdar, Amit; Yoshimoto, Kenneth; Astakhov, Vadim; Bandrowski, Anita; Martone, MaryAnn; Carnevale, Nicholas. T.

2015-01-01

SUMMARY The last few decades have seen the emergence of computational neuroscience as a mature field where researchers are interested in modeling complex and large neuronal systems and require access to high performance computing machines and associated cyber infrastructure to manage computational workflow and data. The neuronal simulation tools, used in this research field, are also implemented for parallel computers and suitable for high performance computing machines. But using these tools on complex high performance computing machines remains a challenge because of issues with acquiring computer time on these machines located at national supercomputer centers, dealing with complex user interface of these machines, dealing with data management and retrieval. The Neuroscience Gateway is being developed to alleviate and/or hide these barriers to entry for computational neuroscientists. It hides or eliminates, from the point of view of the users, all the administrative and technical barriers and makes parallel neuronal simulation tools easily available and accessible on complex high performance computing machines. It handles the running of jobs and data management and retrieval. This paper shares the early experiences in bringing up this gateway and describes the software architecture it is based on, how it is implemented, and how users can use this for computational neuroscience research using high performance computing at the back end. We also look at parallel scaling of some publicly available neuronal models and analyze the recent usage data of the neuroscience gateway. PMID:26523124
The Virtual Xenbase: transitioning an online bioinformatics resource to a private cloud.

PubMed

Karimi, Kamran; Vize, Peter D

2014-01-01

As a model organism database, Xenbase has been providing informatics and genomic data on Xenopus (Silurana) tropicalis and Xenopus laevis frogs for more than a decade. The Xenbase database contains curated, as well as community-contributed and automatically harvested literature, gene and genomic data. A GBrowse genome browser, a BLAST+ server and stock center support are available on the site. When this resource was first built, all software services and components in Xenbase ran on a single physical server, with inherent reliability, scalability and inter-dependence issues. Recent advances in networking and virtualization techniques allowed us to move Xenbase to a virtual environment, and more specifically to a private cloud. To do so we decoupled the different software services and components, such that each would run on a different virtual machine. In the process, we also upgraded many of the components. The resulting system is faster and more reliable. System maintenance is easier, as individual virtual machines can now be updated, backed up and changed independently. We are also experiencing more effective resource allocation and utilization. Database URL: www.xenbase.org. © The Author(s) 2014. Published by Oxford University Press.
Implementation and Assessment of a Virtual Laboratory of Parallel Robots Developed for Engineering Students

ERIC Educational Resources Information Center

Gil, Arturo; Peidró, Adrián; Reinoso, Óscar; Marín, José María

2017-01-01

This paper presents a tool, LABEL, oriented to the teaching of parallel robotics. The application, organized as a set of tools developed using Easy Java Simulations, enables the study of the kinematics of parallel robotics. A set of classical parallel structures was implemented such that LABEL can solve the inverse and direct kinematic problem of…
Research on axisymmetric aspheric surface numerical design and manufacturing technology

NASA Astrophysics Data System (ADS)

Wang, Zhen-zhong; Guo, Yin-biao; Lin, Zheng

2006-02-01

The key technology for aspheric machining offers exact machining path and machining aspheric lens with high accuracy and efficiency, in spite of the development of traditional manual manufacturing into nowadays numerical control (NC) machining. This paper presents a mathematical model between virtual cone and aspheric surface equations, and discusses the technology of uniform wear of grinding wheel and error compensation in aspheric machining. Finally, a software system for high precision aspheric surface manufacturing is designed and realized, based on the mentioned above. This software system can work out grinding wheel path according to input parameters and generate machining NC programs of aspheric surfaces.
Implementation and Characterization of Three-Dimensional Particle-in-Cell Codes on Multiple-Instruction-Multiple-Data Massively Parallel Supercomputers

NASA Technical Reports Server (NTRS)

Lyster, P. M.; Liewer, P. C.; Decyk, V. K.; Ferraro, R. D.

1995-01-01

A three-dimensional electrostatic particle-in-cell (PIC) plasma simulation code has been developed on coarse-grain distributed-memory massively parallel computers with message passing communications. Our implementation is the generalization to three-dimensions of the general concurrent particle-in-cell (GCPIC) algorithm. In the GCPIC algorithm, the particle computation is divided among the processors using a domain decomposition of the simulation domain. In a three-dimensional simulation, the domain can be partitioned into one-, two-, or three-dimensional subdomains ("slabs," "rods," or "cubes") and we investigate the efficiency of the parallel implementation of the push for all three choices. The present implementation runs on the Intel Touchstone Delta machine at Caltech; a multiple-instruction-multiple-data (MIMD) parallel computer with 512 nodes. We find that the parallel efficiency of the push is very high, with the ratio of communication to computation time in the range 0.3%-10.0%. The highest efficiency (> 99%) occurs for a large, scaled problem with 64(sup 3) particles per processing node (approximately 134 million particles of 512 nodes) which has a push time of about 250 ns per particle per time step. We have also developed expressions for the timing of the code which are a function of both code parameters (number of grid points, particles, etc.) and machine-dependent parameters (effective FLOP rate, and the effective interprocessor bandwidths for the communication of particles and grid points). These expressions can be used to estimate the performance of scaled problems--including those with inhomogeneous plasmas--to other parallel machines once the machine-dependent parameters are known.
Optimizing the way kinematical feed chains with great distance between slides are chosen for CNC machine tools

NASA Astrophysics Data System (ADS)

Lucian, P.; Gheorghe, S.

2017-08-01

This paper presents a new method, based on FRISCO formula, for optimizing the choice of the best control system for kinematical feed chains with great distance between slides used in computer numerical controlled machine tools. Such machines are usually, but not limited to, used for machining large and complex parts (mostly in the aviation industry) or complex casting molds. For such machine tools the kinematic feed chains are arranged in a dual-parallel drive structure that allows the mobile element to be moved by the two kinematical branches and their related control systems. Such an arrangement allows for high speed and high rigidity (a critical requirement for precision machining) during the machining process. A significant issue for such an arrangement it’s the ability of the two parallel control systems to follow the same trajectory accurately in order to address this issue it is necessary to achieve synchronous motion control for the two kinematical branches ensuring that the correct perpendicular position it’s kept by the mobile element during its motion on the two slides.
Communication Studies of DMP and SMP Machines

NASA Technical Reports Server (NTRS)

Sohn, Andrew; Biswas, Rupak; Chancellor, Marisa K. (Technical Monitor)

1997-01-01

Understanding the interplay between machines and problems is key to obtaining high performance on parallel machines. This paper investigates the interplay between programming paradigms and communication capabilities of parallel machines. In particular, we explicate the communication capabilities of the IBM SP-2 distributed-memory multiprocessor and the SGI PowerCHALLENGEarray symmetric multiprocessor. Two benchmark problems of bitonic sorting and Fast Fourier Transform are selected for experiments. Communication-efficient algorithms are developed to exploit the overlapping capabilities of the machines. Programs are written in Message-Passing Interface for portability and identical codes are used for both machines. Various data sizes and message sizes are used to test the machines' communication capabilities. Experimental results indicate that the communication performance of the multiprocessors are consistent with the size of messages. The SP-2 is sensitive to message size but yields a much higher communication overlapping because of the communication co-processor. The PowerCHALLENGEarray is not highly sensitive to message size and yields a low communication overlapping. Bitonic sorting yields lower performance compared to FFT due to a smaller computation-to-communication ratio.
Virtual Planning, Control, and Machining for a Modular-Based Automated Factory Operation in an Augmented Reality Environment

PubMed Central

Pai, Yun Suen; Yap, Hwa Jen; Md Dawal, Siti Zawiah; Ramesh, S.; Phoon, Sin Ye

2016-01-01

This study presents a modular-based implementation of augmented reality to provide an immersive experience in learning or teaching the planning phase, control system, and machining parameters of a fully automated work cell. The architecture of the system consists of three code modules that can operate independently or combined to create a complete system that is able to guide engineers from the layout planning phase to the prototyping of the final product. The layout planning module determines the best possible arrangement in a layout for the placement of various machines, in this case a conveyor belt for transportation, a robot arm for pick-and-place operations, and a computer numerical control milling machine to generate the final prototype. The robotic arm module simulates the pick-and-place operation offline from the conveyor belt to a computer numerical control (CNC) machine utilising collision detection and inverse kinematics. Finally, the CNC module performs virtual machining based on the Uniform Space Decomposition method and axis aligned bounding box collision detection. The conducted case study revealed that given the situation, a semi-circle shaped arrangement is desirable, whereas the pick-and-place system and the final generated G-code produced the highest deviation of 3.83 mm and 5.8 mm respectively. PMID:27271840

Virtual Planning, Control, and Machining for a Modular-Based Automated Factory Operation in an Augmented Reality Environment.

PubMed

Pai, Yun Suen; Yap, Hwa Jen; Md Dawal, Siti Zawiah; Ramesh, S; Phoon, Sin Ye

2016-06-07

This study presents a modular-based implementation of augmented reality to provide an immersive experience in learning or teaching the planning phase, control system, and machining parameters of a fully automated work cell. The architecture of the system consists of three code modules that can operate independently or combined to create a complete system that is able to guide engineers from the layout planning phase to the prototyping of the final product. The layout planning module determines the best possible arrangement in a layout for the placement of various machines, in this case a conveyor belt for transportation, a robot arm for pick-and-place operations, and a computer numerical control milling machine to generate the final prototype. The robotic arm module simulates the pick-and-place operation offline from the conveyor belt to a computer numerical control (CNC) machine utilising collision detection and inverse kinematics. Finally, the CNC module performs virtual machining based on the Uniform Space Decomposition method and axis aligned bounding box collision detection. The conducted case study revealed that given the situation, a semi-circle shaped arrangement is desirable, whereas the pick-and-place system and the final generated G-code produced the highest deviation of 3.83 mm and 5.8 mm respectively.
Virtual Planning, Control, and Machining for a Modular-Based Automated Factory Operation in an Augmented Reality Environment

NASA Astrophysics Data System (ADS)

Pai, Yun Suen; Yap, Hwa Jen; Md Dawal, Siti Zawiah; Ramesh, S.; Phoon, Sin Ye

2016-06-01

This study presents a modular-based implementation of augmented reality to provide an immersive experience in learning or teaching the planning phase, control system, and machining parameters of a fully automated work cell. The architecture of the system consists of three code modules that can operate independently or combined to create a complete system that is able to guide engineers from the layout planning phase to the prototyping of the final product. The layout planning module determines the best possible arrangement in a layout for the placement of various machines, in this case a conveyor belt for transportation, a robot arm for pick-and-place operations, and a computer numerical control milling machine to generate the final prototype. The robotic arm module simulates the pick-and-place operation offline from the conveyor belt to a computer numerical control (CNC) machine utilising collision detection and inverse kinematics. Finally, the CNC module performs virtual machining based on the Uniform Space Decomposition method and axis aligned bounding box collision detection. The conducted case study revealed that given the situation, a semi-circle shaped arrangement is desirable, whereas the pick-and-place system and the final generated G-code produced the highest deviation of 3.83 mm and 5.8 mm respectively.
A performance study of sparse Cholesky factorization on INTEL iPSC/860

NASA Technical Reports Server (NTRS)

Zubair, M.; Ghose, M.

1992-01-01

The problem of Cholesky factorization of a sparse matrix has been very well investigated on sequential machines. A number of efficient codes exist for factorizing large unstructured sparse matrices. However, there is a lack of such efficient codes on parallel machines in general, and distributed machines in particular. Some of the issues that are critical to the implementation of sparse Cholesky factorization on a distributed memory parallel machine are ordering, partitioning and mapping, load balancing, and ordering of various tasks within a processor. Here, we focus on the effect of various partitioning schemes on the performance of sparse Cholesky factorization on the Intel iPSC/860. Also, a new partitioning heuristic for structured as well as unstructured sparse matrices is proposed, and its performance is compared with other schemes.
FEM analysis of an single stator dual PM rotors axial synchronous machine

NASA Astrophysics Data System (ADS)

Tutelea, L. N.; Deaconu, S. I.; Popa, G. N.

2017-01-01

The actual e - continuously variable transmission (e-CVT) solution for the parallel Hybrid Electric Vehicle (HEV) requires two electric machines, two inverters, and a planetary gear. A distinct electric generator and a propulsion electric motor, both with full power converters, are typical for a series HEV. In an effort to simplify the planetary-geared e-CVT for the parallel HEV or the series HEV we hereby propose to replace the basically two electric machines and their two power converters by a single, axial-air-gap, electric machine central stator, fed from a single PWM converter with dual frequency voltage output and two independent PM rotors. The proposed topologies, the magneto-motive force analysis and quasi 3D-FEM analysis are the core of the paper.
Machining heavy plastic sections

NASA Technical Reports Server (NTRS)

Stalkup, O. M.

1967-01-01

Machining technique produces consistently satisfactory plane-parallel optical surfaces for pressure windows, made of plexiglass, required to support a photographic study of liquid rocket combustion processes. The surfaces are machined and polished to the required tolerances and show no degradation from stress relaxation over periods as long as 6 months.
Parallel processors and nonlinear structural dynamics algorithms and software

NASA Technical Reports Server (NTRS)

Belytschko, Ted; Gilbertsen, Noreen D.; Neal, Mark O.; Plaskacz, Edward J.

1989-01-01

The adaptation of a finite element program with explicit time integration to a massively parallel SIMD (single instruction multiple data) computer, the CONNECTION Machine is described. The adaptation required the development of a new algorithm, called the exchange algorithm, in which all nodal variables are allocated to the element with an exchange of nodal forces at each time step. The architectural and C* programming language features of the CONNECTION Machine are also summarized. Various alternate data structures and associated algorithms for nonlinear finite element analysis are discussed and compared. Results are presented which demonstrate that the CONNECTION Machine is capable of outperforming the CRAY XMP/14.
Virtual reality in surgical training.

PubMed

Lange, T; Indelicato, D J; Rosen, J M

2000-01-01

Virtual reality in surgery and, more specifically, in surgical training, faces a number of challenges in the future. These challenges are building realistic models of the human body, creating interface tools to view, hear, touch, feel, and manipulate these human body models, and integrating virtual reality systems into medical education and treatment. A final system would encompass simulators specifically for surgery, performance machines, telemedicine, and telesurgery. Each of these areas will need significant improvement for virtual reality to impact medicine successfully in the next century. This article gives an overview of, and the challenges faced by, current systems in the fast-changing field of virtual reality technology, and provides a set of specific milestones for a truly realistic virtual human body.
Social energy: mining energy from the society

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Jun Jason; Gao, David Wenzhong; Zhang, Yingchen

The inherent nature of energy, i.e., physicality, sociality and informatization, implies the inevitable and intensive interaction between energy systems and social systems. From this perspective, we define 'social energy' as a complex sociotechnical system of energy systems, social systems and the derived artificial virtual systems which characterize the intense intersystem and intra-system interactions. The recent advancement in intelligent technology, including artificial intelligence and machine learning technologies, sensing and communication in Internet of Things technologies, and massive high performance computing and extreme-scale data analytics technologies, enables the possibility of substantial advancement in socio-technical system optimization, scheduling, control and management. In thismore » paper, we provide a discussion on the nature of energy, and then propose the concept and intention of social energy systems for electrical power. A general methodology of establishing and investigating social energy is proposed, which is based on the ACP approach, i.e., 'artificial systems' (A), 'computational experiments' (C) and 'parallel execution' (P), and parallel system methodology. A case study on the University of Denver (DU) campus grid is provided and studied to demonstrate the social energy concept. In the concluding remarks, we discuss the technical pathway, in both social and nature sciences, to social energy, and our vision on its future.« less
Implementation and analysis of a Navier-Stokes algorithm on parallel computers

NASA Technical Reports Server (NTRS)

Fatoohi, Raad A.; Grosch, Chester E.

1988-01-01

The results of the implementation of a Navier-Stokes algorithm on three parallel/vector computers are presented. The object of this research is to determine how well, or poorly, a single numerical algorithm would map onto three different architectures. The algorithm is a compact difference scheme for the solution of the incompressible, two-dimensional, time-dependent Navier-Stokes equations. The computers were chosen so as to encompass a variety of architectures. They are the following: the MPP, an SIMD machine with 16K bit serial processors; Flex/32, an MIMD machine with 20 processors; and Cray/2. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. The basic comparison is among SIMD instruction parallelism on the MPP, MIMD process parallelism on the Flex/32, and vectorization of a serial code on the Cray/2. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally, conclusions are presented.
AstroCloud, a Cyber-Infrastructure for Astronomy Research: Data Access and Interoperability

NASA Astrophysics Data System (ADS)

Fan, D.; He, B.; Xiao, J.; Li, S.; Li, C.; Cui, C.; Yu, C.; Hong, Z.; Yin, S.; Wang, C.; Cao, Z.; Fan, Y.; Mi, L.; Wan, W.; Wang, J.

2015-09-01

Data access and interoperability module connects the observation proposals, data, virtual machines and software. According to the unique identifier of PI (principal investigator), an email address or an internal ID, data can be collected by PI's proposals, or by the search interfaces, e.g. conesearch. Files associated with the searched results could be easily transported to cloud storages, including the storage with virtual machines, or several commercial platforms like Dropbox. Benefitted from the standards of IVOA (International Observatories Alliance), VOTable formatted searching result could be sent to kinds of VO software. Latter endeavor will try to integrate more data and connect archives and some other astronomical resources.
An efficient approach for improving virtual machine placement in cloud computing environment

NASA Astrophysics Data System (ADS)

Ghobaei-Arani, Mostafa; Shamsi, Mahboubeh; Rahmanian, Ali A.

2017-11-01

The ever increasing demand for the cloud services requires more data centres. The power consumption in the data centres is a challenging problem for cloud computing, which has not been considered properly by the data centre developer companies. Especially, large data centres struggle with the power cost and the Greenhouse gases production. Hence, employing the power efficient mechanisms are necessary to optimise the mentioned effects. Moreover, virtual machine (VM) placement can be used as an effective method to reduce the power consumption in data centres. In this paper by grouping both virtual and physical machines, and taking into account the maximum absolute deviation during the VM placement, the power consumption as well as the service level agreement (SLA) deviation in data centres are reduced. To this end, the best-fit decreasing algorithm is utilised in the simulation to reduce the power consumption by about 5% compared to the modified best-fit decreasing algorithm, and at the same time, the SLA violation is improved by 6%. Finally, the learning automata are used to a trade-off between power consumption reduction from one side, and SLA violation percentage from the other side.
Distribution Locational Real-Time Pricing Based Smart Building Control and Management

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hao, Jun; Dai, Xiaoxiao; Zhang, Yingchen

This paper proposes an real-virtual parallel computing scheme for smart building operations aiming at augmenting overall social welfare. The University of Denver's campus power grid and Ritchie fitness center is used for demonstrating the proposed approach. An artificial virtual system is built in parallel to the real physical system to evaluate the overall social cost of the building operation based on the social science based working productivity model, numerical experiment based building energy consumption model and the power system based real-time pricing mechanism. Through interactive feedback exchanged between the real and virtual system, enlarged social welfare, including monetary cost reductionmore » and energy saving, as well as working productivity improvements, can be achieved.« less
MOLA: a bootable, self-configuring system for virtual screening using AutoDock4/Vina on computer clusters.

PubMed

Abreu, Rui Mv; Froufe, Hugo Jc; Queiroz, Maria João Rp; Ferreira, Isabel Cfr

2010-10-28

Virtual screening of small molecules using molecular docking has become an important tool in drug discovery. However, large scale virtual screening is time demanding and usually requires dedicated computer clusters. There are a number of software tools that perform virtual screening using AutoDock4 but they require access to dedicated Linux computer clusters. Also no software is available for performing virtual screening with Vina using computer clusters. In this paper we present MOLA, an easy-to-use graphical user interface tool that automates parallel virtual screening using AutoDock4 and/or Vina in bootable non-dedicated computer clusters. MOLA automates several tasks including: ligand preparation, parallel AutoDock4/Vina jobs distribution and result analysis. When the virtual screening project finishes, an open-office spreadsheet file opens with the ligands ranked by binding energy and distance to the active site. All results files can automatically be recorded on an USB-flash drive or on the hard-disk drive using VirtualBox. MOLA works inside a customized Live CD GNU/Linux operating system, developed by us, that bypass the original operating system installed on the computers used in the cluster. This operating system boots from a CD on the master node and then clusters other computers as slave nodes via ethernet connections. MOLA is an ideal virtual screening tool for non-experienced users, with a limited number of multi-platform heterogeneous computers available and no access to dedicated Linux computer clusters. When a virtual screening project finishes, the computers can just be restarted to their original operating system. The originality of MOLA lies on the fact that, any platform-independent computer available can he added to the cluster, without ever using the computer hard-disk drive and without interfering with the installed operating system. With a cluster of 10 processors, and a potential maximum speed-up of 10x, the parallel algorithm of MOLA performed with a speed-up of 8,64× using AutoDock4 and 8,60× using Vina.
Architecture Adaptive Computing Environment

NASA Technical Reports Server (NTRS)

Dorband, John E.

2006-01-01

Architecture Adaptive Computing Environment (aCe) is a software system that includes a language, compiler, and run-time library for parallel computing. aCe was developed to enable programmers to write programs, more easily than was previously possible, for a variety of parallel computing architectures. Heretofore, it has been perceived to be difficult to write parallel programs for parallel computers and more difficult to port the programs to different parallel computing architectures. In contrast, aCe is supportable on all high-performance computing architectures. Currently, it is supported on LINUX clusters. aCe uses parallel programming constructs that facilitate writing of parallel programs. Such constructs were used in single-instruction/multiple-data (SIMD) programming languages of the 1980s, including Parallel Pascal, Parallel Forth, C*, *LISP, and MasPar MPL. In aCe, these constructs are extended and implemented for both SIMD and multiple- instruction/multiple-data (MIMD) architectures. Two new constructs incorporated in aCe are those of (1) scalar and virtual variables and (2) pre-computed paths. The scalar-and-virtual-variables construct increases flexibility in optimizing memory utilization in various architectures. The pre-computed-paths construct enables the compiler to pre-compute part of a communication operation once, rather than computing it every time the communication operation is performed.
Data parallel sorting for particle simulation

NASA Technical Reports Server (NTRS)

Dagum, Leonardo

1992-01-01

Sorting on a parallel architecture is a communications intensive event which can incur a high penalty in applications where it is required. In the case of particle simulation, only integer sorting is necessary, and sequential implementations easily attain the minimum performance bound of O (N) for N particles. Parallel implementations, however, have to cope with the parallel sorting problem which, in addition to incurring a heavy communications cost, can make the minimun performance bound difficult to attain. This paper demonstrates how the sorting problem in a particle simulation can be reduced to a merging problem, and describes an efficient data parallel algorithm to solve this merging problem in a particle simulation. The new algorithm is shown to be optimal under conditions usual for particle simulation, and its fieldwise implementation on the Connection Machine is analyzed in detail. The new algorithm is about four times faster than a fieldwise implementation of radix sort on the Connection Machine.
Self-replicating machines in continuous space with virtual physics.

PubMed

Smith, Arnold; Turney, Peter; Ewaschuk, Robert

2003-01-01

JohnnyVon is an implementation of self-replicating machines in continuous two-dimensional space. Two types of particles drift about in a virtual liquid. The particles are automata with discrete internal states but continuous external relationships. Their internal states are governed by finite state machines, but their external relationships are governed by a simulated physics that includes Brownian motion, viscosity, and springlike attractive and repulsive forces. The particles can be assembled into patterns that can encode arbitrary strings of bits. We demonstrate that, if an arbitrary seed pattern is put in a soup of separate individual particles, the pattern will replicate by assembling the individual particles into copies of itself. We also show that, given sufficient time, a soup of separate individual particles will eventually spontaneously form self-replicating patterns. We discuss the implications of JohnnyVon for research in nanotechnology, theoretical biology, and artificial life.
Noise and Vibration Risk Prevention Virtual Web for Ubiquitous Training

ERIC Educational Resources Information Center

Redel-Macías, María Dolores; Cubero-Atienza, Antonio J.; Martínez-Valle, José Miguel; Pedrós-Pérez, Gerardo; del Pilar Martínez-Jiménez, María

2015-01-01

This paper describes a new Web portal offering experimental labs for ubiquitous training of university engineering students in work-related risk prevention. The Web-accessible computer program simulates the noise and machine vibrations met in the work environment, in a series of virtual laboratories that mimic an actual laboratory and provide the…
Using shadow page cache to improve isolated drivers performance.

PubMed

Zheng, Hao; Dong, Xiaoshe; Wang, Endong; Chen, Baoke; Zhu, Zhengdong; Liu, Chengzhe

2015-01-01

With the advantage of the reusability property of the virtualization technology, users can reuse various types and versions of existing operating systems and drivers in a virtual machine, so as to customize their application environment. In order to prevent users' virtualization environments being impacted by driver faults in virtual machine, Chariot examines the correctness of driver's write operations by the method of combining a driver's write operation capture and a driver's private access control table. However, this method needs to keep the write permission of shadow page table as read-only, so as to capture isolated driver's write operations through page faults, which adversely affect the performance of the driver. Based on delaying setting frequently used shadow pages' write permissions to read-only, this paper proposes an algorithm using shadow page cache to improve the performance of isolated drivers and carefully study the relationship between the performance of drivers and the size of shadow page cache. Experimental results show that, through the shadow page cache, the performance of isolated drivers can be greatly improved without impacting Chariot's reliability too much.
Using Shadow Page Cache to Improve Isolated Drivers Performance

PubMed Central

Dong, Xiaoshe; Wang, Endong; Chen, Baoke; Zhu, Zhengdong; Liu, Chengzhe

2015-01-01

With the advantage of the reusability property of the virtualization technology, users can reuse various types and versions of existing operating systems and drivers in a virtual machine, so as to customize their application environment. In order to prevent users' virtualization environments being impacted by driver faults in virtual machine, Chariot examines the correctness of driver's write operations by the method of combining a driver's write operation capture and a driver's private access control table. However, this method needs to keep the write permission of shadow page table as read-only, so as to capture isolated driver's write operations through page faults, which adversely affect the performance of the driver. Based on delaying setting frequently used shadow pages' write permissions to read-only, this paper proposes an algorithm using shadow page cache to improve the performance of isolated drivers and carefully study the relationship between the performance of drivers and the size of shadow page cache. Experimental results show that, through the shadow page cache, the performance of isolated drivers can be greatly improved without impacting Chariot's reliability too much. PMID:25815373
A high performance parallel algorithm for 1-D FFT

DOE Office of Scientific and Technical Information (OSTI.GOV)

Agarwal, R.C.; Gustavson, F.G.; Zubair, M.

1994-12-31

In this paper the authors propose a parallel high performance FFT algorithm based on a multi-dimensional formulation. They use this to solve a commonly encountered FFT based kernel on a distributed memory parallel machine, the IBM scalable parallel system, SP1. The kernel requires a forward FFT computation of an input sequence, multiplication of the transformed data by a coefficient array, and finally an inverse FFT computation of the resultant data. They show that the multi-dimensional formulation helps in reducing the communication costs and also improves the single node performance by effectively utilizing the memory system of the node. They implementedmore » this kernel on the IBM SP1 and observed a performance of 1.25 GFLOPS on a 64-node machine.« less

A Pervasive Parallel Processing Framework for Data Visualization and Analysis at Extreme Scale

DOE Office of Scientific and Technical Information (OSTI.GOV)

Moreland, Kenneth; Geveci, Berk

2014-11-01

The evolution of the computing world from teraflop to petaflop has been relatively effortless, with several of the existing programming models scaling effectively to the petascale. The migration to exascale, however, poses considerable challenges. All industry trends infer that the exascale machine will be built using processors containing hundreds to thousands of cores per chip. It can be inferred that efficient concurrency on exascale machines requires a massive amount of concurrent threads, each performing many operations on a localized piece of data. Currently, visualization libraries and applications are based off what is known as the visualization pipeline. In the pipelinemore » model, algorithms are encapsulated as filters with inputs and outputs. These filters are connected by setting the output of one component to the input of another. Parallelism in the visualization pipeline is achieved by replicating the pipeline for each processing thread. This works well for today’s distributed memory parallel computers but cannot be sustained when operating on processors with thousands of cores. Our project investigates a new visualization framework designed to exhibit the pervasive parallelism necessary for extreme scale machines. Our framework achieves this by defining algorithms in terms of worklets, which are localized stateless operations. Worklets are atomic operations that execute when invoked unlike filters, which execute when a pipeline request occurs. The worklet design allows execution on a massive amount of lightweight threads with minimal overhead. Only with such fine-grained parallelism can we hope to fill the billions of threads we expect will be necessary for efficient computation on an exascale machine.« less
A Data Parallel Multizone Navier-Stokes Code

NASA Technical Reports Server (NTRS)

Jespersen, Dennis C.; Levit, Creon; Kwak, Dochan (Technical Monitor)

1995-01-01

We have developed a data parallel multizone compressible Navier-Stokes code on the Connection Machine CM-5. The code is set up for implicit time-stepping on single or multiple structured grids. For multiple grids and geometrically complex problems, we follow the "chimera" approach, where flow data on one zone is interpolated onto another in the region of overlap. We will describe our design philosophy and give some timing results for the current code. The design choices can be summarized as: 1. finite differences on structured grids; 2. implicit time-stepping with either distributed solves or data motion and local solves; 3. sequential stepping through multiple zones with interzone data transfer via a distributed data structure. We have implemented these ideas on the CM-5 using CMF (Connection Machine Fortran), a data parallel language which combines elements of Fortran 90 and certain extensions, and which bears a strong similarity to High Performance Fortran (HPF). One interesting feature is the issue of turbulence modeling, where the architecture of a parallel machine makes the use of an algebraic turbulence model awkward, whereas models based on transport equations are more natural. We will present some performance figures for the code on the CM-5, and consider the issues involved in transitioning the code to HPF for portability to other parallel platforms.
Ordered fast fourier transforms on a massively parallel hypercube multiprocessor

NASA Technical Reports Server (NTRS)

Tong, Charles; Swarztrauber, Paul N.

1989-01-01

Design alternatives for ordered Fast Fourier Transformation (FFT) algorithms were examined on massively parallel hypercube multiprocessors such as the Connection Machine. Particular emphasis is placed on reducing communication which is known to dominate the overall computing time. To this end, the order and computational phases of the FFT were combined, and the sequence to processor maps that reduce communication were used. The class of ordered transforms is expanded to include any FFT in which the order of the transform is the same as that of the input sequence. Two such orderings are examined, namely, standard-order and A-order which can be implemented with equal ease on the Connection Machine where orderings are determined by geometries and priorities. If the sequence has N = 2 exp r elements and the hypercube has P = 2 exp d processors, then a standard-order FFT can be implemented with d + r/2 + 1 parallel transmissions. An A-order sequence can be transformed with 2d - r/2 parallel transmissions which is r - d + 1 fewer than the standard order. A parallel method for computing the trigonometric coefficients is presented that does not use trigonometric functions or interprocessor communication. A performance of 0.9 GFLOPS was obtained for an A-order transform on the Connection Machine.
VirtualSpace: A vision of a machine-learned virtual space environment

NASA Astrophysics Data System (ADS)

Bortnik, J.; Sarno-Smith, L. K.; Chu, X.; Li, W.; Ma, Q.; Angelopoulos, V.; Thorne, R. M.

2017-12-01

Space borne instrumentation tends to come and go. A typical instrument will go through a phase of design and construction, be deployed on a spacecraft for several years while it collects data, and then be decommissioned and fade into obscurity. The data collected from that instrument will typically receive much attention while it is being collected, perhaps in the form of event studies, conjunctions with other instruments, or a few statistical surveys, but once the instrument or spacecraft is decommissioned, the data will be archived and receive progressively less attention with every passing year. This is the fate of all historical data, and will be the fate of data being collected by instruments even at the present time. But what if those instruments could come alive, and all be simultaneously present at any and every point in time and space? Imagine the scientific insights, and societal gains that could be achieved with a grand (virtual) heliophysical observatory that consists of every current and historical mission ever deployed? We propose that this is not just fantasy but is imminently doable with the data currently available, with the present computational resources, and with currently available algorithms. This project revitalizes existing data resources and lays the groundwork for incorporating data from every future mission to expand the scope and refine the resolution of the virtual observatory. We call this project VirtualSpace: a machine-learned virtual space environment.
Fast parallel 3D profilometer with DMD technology

NASA Astrophysics Data System (ADS)

Hou, Wenmei; Zhang, Yunbo

2011-12-01

Confocal microscope has been a powerful tool for three-dimensional profile analysis. Single mode confocal microscope is limited by scanning speed. This paper presents a 3D profilometer prototype of parallel confocal microscope based on DMD (Digital Micromirror Device). In this system the DMD takes the place of Nipkow Disk which is a classical parallel scanning scheme to realize parallel lateral scanning technique. Operated with certain pattern, the DMD generates a virtual pinholes array which separates the light into multi-beams. The key parameters that affect the measurement (pinhole size and the lateral scanning distance) can be configured conveniently by different patterns sent to DMD chip. To avoid disturbance between two virtual pinholes working at the same time, a scanning strategy is adopted. Depth response curve both axial and abaxial were extract. Measurement experiments have been carried out on silicon structured sample, and axial resolution of 55nm is achieved.
Conceptual Study of Permanent Magnet Machine Ship Propulsion Systems

DTIC Science & Technology

1977-12-01

cycloconverter subsystem is designed using advanced thyristors and can be either water or air cooled. The machine-cycloconverter, many-phase or parallel...Turnb, Phase, Poles, Air Gap ................................. 3-9 3-5 Machine Characteristics Versus Number of Poles (large machine, 40 000 hp). Poles...cylindrical permanent magnet generator forces the power conditioner to provide for both frequency change and voltage control. The complexity of this dual
Pre-resistance-welding resistance check

DOEpatents

Destefan, Dennis E.; Stompro, David A.

1991-01-01

A preweld resistance check for resistance welding machines uses an open circuited measurement to determine the welding machine resistance, a closed circuit measurement to determine the parallel resistance of a workpiece set and the machine, and a calculation to determine the resistance of the workpiece set. Any variation in workpiece set or machine resistance is an indication that the weld may be different from a control weld.
Advances in Parallelization for Large Scale Oct-Tree Mesh Generation

NASA Technical Reports Server (NTRS)

O'Connell, Matthew; Karman, Steve L.

2015-01-01

Despite great advancements in the parallelization of numerical simulation codes over the last 20 years, it is still common to perform grid generation in serial. Generating large scale grids in serial often requires using special "grid generation" compute machines that can have more than ten times the memory of average machines. While some parallel mesh generation techniques have been proposed, generating very large meshes for LES or aeroacoustic simulations is still a challenging problem. An automated method for the parallel generation of very large scale off-body hierarchical meshes is presented here. This work enables large scale parallel generation of off-body meshes by using a novel combination of parallel grid generation techniques and a hybrid "top down" and "bottom up" oct-tree method. Meshes are generated using hardware commonly found in parallel compute clusters. The capability to generate very large meshes is demonstrated by the generation of off-body meshes surrounding complex aerospace geometries. Results are shown including a one billion cell mesh generated around a Predator Unmanned Aerial Vehicle geometry, which was generated on 64 processors in under 45 minutes.
Virtual pools for interactive analysis and software development through an integrated Cloud environment

NASA Astrophysics Data System (ADS)

Grandi, C.; Italiano, A.; Salomoni, D.; Calabrese Melcarne, A. K.

2011-12-01

WNoDeS, an acronym for Worker Nodes on Demand Service, is software developed at CNAF-Tier1, the National Computing Centre of the Italian Institute for Nuclear Physics (INFN) located in Bologna. WNoDeS provides on demand, integrated access to both Grid and Cloud resources through virtualization technologies. Besides the traditional use of computing resources in batch mode, users need to have interactive and local access to a number of systems. WNoDeS can dynamically select these computers instantiating Virtual Machines, according to the requirements (computing, storage and network resources) of users through either the Open Cloud Computing Interface API, or through a web console. An interactive use is usually limited to activities in user space, i.e. where the machine configuration is not modified. In some other instances the activity concerns development and testing of services and thus implies the modification of the system configuration (and, therefore, root-access to the resource). The former use case is a simple extension of the WNoDeS approach, where the resource is provided in interactive mode. The latter implies saving the virtual image at the end of each user session so that it can be presented to the user at subsequent requests. This work describes how the LHC experiments at INFN-Bologna are testing and making use of these dynamically created ad-hoc machines via WNoDeS to support flexible, interactive analysis and software development at the INFN Tier-1 Computing Centre.
Towards a Better Distributed Framework for Learning Big Data

DTIC Science & Technology

2017-06-14

UNLIMITED: PB Public Release 13. SUPPLEMENTARY NOTES 14. ABSTRACT This work aimed at solving issues in distributed machine learning. The PI’s team proposed...communication load. Finally, the team proposed the parallel least-squares policy iteration (parallel LSPI) to parallelize a reinforcement policy learning. 15
Electrical Machines Laminations Magnetic Properties: A Virtual Instrument Laboratory

ERIC Educational Resources Information Center

Martinez-Roman, Javier; Perez-Cruz, Juan; Pineda-Sanchez, Manuel; Puche-Panadero, Ruben; Roger-Folch, Jose; Riera-Guasp, Martin; Sapena-Baño, Angel

2015-01-01

Undergraduate courses in electrical machines often include an introduction to their magnetic circuits and to the various magnetic materials used in their construction and their properties. The students must learn to be able to recognize and compare the permeability, saturation, and losses of these magnetic materials, relate each material to its…
MLViS: A Web Tool for Machine Learning-Based Virtual Screening in Early-Phase of Drug Discovery and Development

PubMed Central

Korkmaz, Selcuk; Zararsiz, Gokmen; Goksuluk, Dincer

2015-01-01

Virtual screening is an important step in early-phase of drug discovery process. Since there are thousands of compounds, this step should be both fast and effective in order to distinguish drug-like and nondrug-like molecules. Statistical machine learning methods are widely used in drug discovery studies for classification purpose. Here, we aim to develop a new tool, which can classify molecules as drug-like and nondrug-like based on various machine learning methods, including discriminant, tree-based, kernel-based, ensemble and other algorithms. To construct this tool, first, performances of twenty-three different machine learning algorithms are compared by ten different measures, then, ten best performing algorithms have been selected based on principal component and hierarchical cluster analysis results. Besides classification, this application has also ability to create heat map and dendrogram for visual inspection of the molecules through hierarchical cluster analysis. Moreover, users can connect the PubChem database to download molecular information and to create two-dimensional structures of compounds. This application is freely available through www.biosoft.hacettepe.edu.tr/MLViS/. PMID:25928885
Prediction Based Proactive Thermal Virtual Machine Scheduling in Green Clouds

PubMed Central

Kinger, Supriya; Kumar, Rajesh; Sharma, Anju

2014-01-01

Cloud computing has rapidly emerged as a widely accepted computing paradigm, but the research on Cloud computing is still at an early stage. Cloud computing provides many advanced features but it still has some shortcomings such as relatively high operating cost and environmental hazards like increasing carbon footprints. These hazards can be reduced up to some extent by efficient scheduling of Cloud resources. Working temperature on which a machine is currently running can be taken as a criterion for Virtual Machine (VM) scheduling. This paper proposes a new proactive technique that considers current and maximum threshold temperature of Server Machines (SMs) before making scheduling decisions with the help of a temperature predictor, so that maximum temperature is never reached. Different workload scenarios have been taken into consideration. The results obtained show that the proposed system is better than existing systems of VM scheduling, which does not consider current temperature of nodes before making scheduling decisions. Thus, a reduction in need of cooling systems for a Cloud environment has been obtained and validated. PMID:24737962
Parallel-distributed mobile robot simulator

NASA Astrophysics Data System (ADS)

Okada, Hiroyuki; Sekiguchi, Minoru; Watanabe, Nobuo

1996-06-01

The aim of this project is to achieve an autonomous learning and growth function based on active interaction with the real world. It should also be able to autonomically acquire knowledge about the context in which jobs take place, and how the jobs are executed. This article describes a parallel distributed movable robot system simulator with an autonomous learning and growth function. The autonomous learning and growth function which we are proposing is characterized by its ability to learn and grow through interaction with the real world. When the movable robot interacts with the real world, the system compares the virtual environment simulation with the interaction result in the real world. The system then improves the virtual environment to match the real-world result more closely. This the system learns and grows. It is very important that such a simulation is time- realistic. The parallel distributed movable robot simulator was developed to simulate the space of a movable robot system with an autonomous learning and growth function. The simulator constructs a virtual space faithful to the real world and also integrates the interfaces between the user, the actual movable robot and the virtual movable robot. Using an ultrafast CG (computer graphics) system (FUJITSU AG series), time-realistic 3D CG is displayed.
Parallelizing serial code for a distributed processing environment with an application to high frequency electromagnetic scattering

NASA Astrophysics Data System (ADS)

Work, Paul R.

1991-12-01

This thesis investigates the parallelization of existing serial programs in computational electromagnetics for use in a parallel environment. Existing algorithms for calculating the radar cross section of an object are covered, and a ray-tracing code is chosen for implementation on a parallel machine. Current parallel architectures are introduced and a suitable parallel machine is selected for the implementation of the chosen ray-tracing algorithm. The standard techniques for the parallelization of serial codes are discussed, including load balancing and decomposition considerations, and appropriate methods for the parallelization effort are selected. A load balancing algorithm is modified to increase the efficiency of the application, and a high level design of the structure of the serial program is presented. A detailed design of the modifications for the parallel implementation is also included, with both the high level and the detailed design specified in a high level design language called UNITY. The correctness of the design is proven using UNITY and standard logic operations. The theoretical and empirical results show that it is possible to achieve an efficient parallel application for a serial computational electromagnetic program where the characteristics of the algorithm and the target architecture critically influence the development of such an implementation.
Virtual reality hardware and graphic display options for brain-machine interfaces

PubMed Central

Marathe, Amar R.; Carey, Holle L.; Taylor, Dawn M.

2009-01-01

Virtual reality hardware and graphic displays are reviewed here as a development environment for brain-machine interfaces (BMIs). Two desktop stereoscopic monitors and one 2D monitor were compared in a visual depth discrimination task and in a 3D target-matching task where able-bodied individuals used actual hand movements to match a virtual hand to different target hands. Three graphic representations of the hand were compared: a plain sphere, a sphere attached to the fingertip of a realistic hand and arm, and a stylized pacman-like hand. Several subjects had great difficulty using either stereo monitor for depth perception when perspective size cues were removed. A mismatch in stereo and size cues generated inappropriate depth illusions. This phenomenon has implications for choosing target and virtual hand sizes in BMI experiments. Target matching accuracy was about as good with the 2D monitor as with either 3D monitor. However, users achieved this accuracy by exploring the boundaries of the hand in the target with carefully controlled movements. This method of determining relative depth may not be possible in BMI experiments if movement control is more limited. Intuitive depth cues, such as including a virtual arm, can significantly improve depth perception accuracy with or without stereo viewing. PMID:18006069
Establishing a group of endpoints to support collective operations without specifying unique identifiers for any endpoints

DOEpatents

Archer, Charles J.; Blocksom, Michael A.; Ratterman, Joseph D.; Smith, Brian E.; Xue, Hanghon

2016-02-02

A parallel computer executes a number of tasks, each task includes a number of endpoints and the endpoints are configured to support collective operations. In such a parallel computer, establishing a group of endpoints receiving a user specification of a set of endpoints included in a global collection of endpoints, where the user specification defines the set in accordance with a predefined virtual representation of the endpoints, the predefined virtual representation is a data structure setting forth an organization of tasks and endpoints included in the global collection of endpoints and the user specification defines the set of endpoints without a user specification of a particular endpoint; and defining a group of endpoints in dependence upon the predefined virtual representation of the endpoints and the user specification.
Towards the Teraflop CFD

NASA Technical Reports Server (NTRS)

Schreiber, Robert; Simon, Horst D.

1992-01-01

We are surveying current projects in the area of parallel supercomputers. The machines considered here will become commercially available in the 1990 - 1992 time frame. All are suitable for exploring the critical issues in applying parallel processors to large scale scientific computations, in particular CFD calculations. This chapter presents an overview of the surveyed machines, and a detailed analysis of the various architectural and technology approaches taken. Particular emphasis is placed on the feasibility of a Teraflops capability following the paths proposed by various developers.
Research on precision grinding technology of large scale and ultra thin optics

NASA Astrophysics Data System (ADS)

Zhou, Lian; Wei, Qiancai; Li, Jie; Chen, Xianhua; Zhang, Qinghua

2018-03-01

The flatness and parallelism error of large scale and ultra thin optics have an important influence on the subsequent polishing efficiency and accuracy. In order to realize the high precision grinding of those ductile elements, the low deformation vacuum chuck was designed first, which was used for clamping the optics with high supporting rigidity in the full aperture. Then the optics was planar grinded under vacuum adsorption. After machining, the vacuum system was turned off. The form error of optics was on-machine measured using displacement sensor after elastic restitution. The flatness would be convergenced with high accuracy by compensation machining, whose trajectories were integrated with the measurement result. For purpose of getting high parallelism, the optics was turned over and compensation grinded using the form error of vacuum chuck. Finally, the grinding experiment of large scale and ultra thin fused silica optics with aperture of 430mm×430mm×10mm was performed. The best P-V flatness of optics was below 3 μm, and parallelism was below 3 ″. This machining technique has applied in batch grinding of large scale and ultra thin optics.
myChEMBL: a virtual machine implementation of open data and cheminformatics tools.

PubMed

Ochoa, Rodrigo; Davies, Mark; Papadatos, George; Atkinson, Francis; Overington, John P

2014-01-15

myChEMBL is a completely open platform, which combines public domain bioactivity data with open source database and cheminformatics technologies. myChEMBL consists of a Linux (Ubuntu) Virtual Machine featuring a PostgreSQL schema with the latest version of the ChEMBL database, as well as the latest RDKit cheminformatics libraries. In addition, a self-contained web interface is available, which can be modified and improved according to user specifications. The VM is available at: ftp://ftp.ebi.ac.uk/pub/databases/chembl/VM/myChEMBL/current. The web interface and web services code is available at: https://github.com/rochoa85/myChEMBL.

General-Purpose Front End for Real-Time Data Processing

NASA Technical Reports Server (NTRS)

James, Mark

2007-01-01

FRONTIER is a computer program that functions as a front end for any of a variety of other software of both the artificial intelligence (AI) and conventional data-processing types. As used here, front end signifies interface software needed for acquiring and preprocessing data and making the data available for analysis by the other software. FRONTIER is reusable in that it can be rapidly tailored to any such other software with minimum effort. Each component of FRONTIER is programmable and is executed in an embedded virtual machine. Each component can be reconfigured during execution. The virtual-machine implementation making FRONTIER independent of the type of computing hardware on which it is executed.
Examining Effects of Virtual Machine Settings on Voice over Internet Protocol in a Private Cloud Environment

ERIC Educational Resources Information Center

Liao, Yuan

2011-01-01

The virtualization of computing resources, as represented by the sustained growth of cloud computing, continues to thrive. Information Technology departments are building their private clouds due to the perception of significant cost savings by managing all physical computing resources from a single point and assigning them to applications or…
New Virtual Field Trips. Revised Edition.

ERIC Educational Resources Information Center

Cooper, Gail; Cooper, Garry

This book is an annotated guidebook, arranged by subject matter, of World Wide Web sites for K-12 students. The following chapters are included: (1) Virtual Time Machine (i.e., sites that cover topics in world history); (2) Tour the World (i.e., sites that include information about countries); (3) Outer Space; (4) The Great Outdoors; (5) Aquatic…
Parallel algorithms for boundary value problems

NASA Technical Reports Server (NTRS)

Lin, Avi

1990-01-01

A general approach to solve boundary value problems numerically in a parallel environment is discussed. The basic algorithm consists of two steps: the local step where all the P available processors work in parallel, and the global step where one processor solves a tridiagonal linear system of the order P. The main advantages of this approach are two fold. First, this suggested approach is very flexible, especially in the local step and thus the algorithm can be used with any number of processors and with any of the SIMD or MIMD machines. Secondly, the communication complexity is very small and thus can be used as easily with shared memory machines. Several examples for using this strategy are discussed.
Vascular system modeling in parallel environment - distributed and shared memory approaches

PubMed Central

Jurczuk, Krzysztof; Kretowski, Marek; Bezy-Wendling, Johanne

2011-01-01

The paper presents two approaches in parallel modeling of vascular system development in internal organs. In the first approach, new parts of tissue are distributed among processors and each processor is responsible for perfusing its assigned parts of tissue to all vascular trees. Communication between processors is accomplished by passing messages and therefore this algorithm is perfectly suited for distributed memory architectures. The second approach is designed for shared memory machines. It parallelizes the perfusion process during which individual processing units perform calculations concerning different vascular trees. The experimental results, performed on a computing cluster and multi-core machines, show that both algorithms provide a significant speedup. PMID:21550891
Parallel matrix multiplication on the Connection Machine

NASA Technical Reports Server (NTRS)

Tichy, Walter F.

1988-01-01

Matrix multiplication is a computation and communication intensive problem. Six parallel algorithms for matrix multiplication on the Connection Machine are presented and compared with respect to their performance and processor usage. For n by n matrices, the algorithms have theoretical running times of O(n to the 2nd power log n), O(n log n), O(n), and O(log n), and require n, n to the 2nd power, n to the 2nd power, and n to the 3rd power processors, respectively. With careful attention to communication patterns, the theoretically predicted runtimes can indeed be achieved in practice. The parallel algorithms illustrate the tradeoffs between performance, communication cost, and processor usage.
PCLIPS: Parallel CLIPS

NASA Technical Reports Server (NTRS)

Gryphon, Coranth D.; Miller, Mark D.

1991-01-01

PCLIPS (Parallel CLIPS) is a set of extensions to the C Language Integrated Production System (CLIPS) expert system language. PCLIPS is intended to provide an environment for the development of more complex, extensive expert systems. Multiple CLIPS expert systems are now capable of running simultaneously on separate processors, or separate machines, thus dramatically increasing the scope of solvable tasks within the expert systems. As a tool for parallel processing, PCLIPS allows for an expert system to add to its fact-base information generated by other expert systems, thus allowing systems to assist each other in solving a complex problem. This allows individual expert systems to be more compact and efficient, and thus run faster or on smaller machines.
Review of Enabling Technologies to Facilitate Secure Compute Customization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aderholdt, Ferrol; Caldwell, Blake A; Hicks, Susan Elaine

High performance computing environments are often used for a wide variety of workloads ranging from simulation, data transformation and analysis, and complex workflows to name just a few. These systems may process data for a variety of users, often requiring strong separation between job allocations. There are many challenges to establishing these secure enclaves within the shared infrastructure of high-performance computing (HPC) environments. The isolation mechanisms in the system software are the basic building blocks for enabling secure compute enclaves. There are a variety of approaches and the focus of this report is to review the different virtualization technologies thatmore » facilitate the creation of secure compute enclaves. The report reviews current operating system (OS) protection mechanisms and modern virtualization technologies to better understand the performance/isolation properties. We also examine the feasibility of running ``virtualized'' computing resources as non-privileged users, and providing controlled administrative permissions for standard users running within a virtualized context. Our examination includes technologies such as Linux containers (LXC [32], Docker [15]) and full virtualization (KVM [26], Xen [5]). We categorize these different approaches to virtualization into two broad groups: OS-level virtualization and system-level virtualization. The OS-level virtualization uses containers to allow a single OS kernel to be partitioned to create Virtual Environments (VE), e.g., LXC. The resources within the host's kernel are only virtualized in the sense of separate namespaces. In contrast, system-level virtualization uses hypervisors to manage multiple OS kernels and virtualize the physical resources (hardware) to create Virtual Machines (VM), e.g., Xen, KVM. This terminology of VE and VM, detailed in Section 2, is used throughout the report to distinguish between the two different approaches to providing virtualized execution environments. As part of our technology review we analyzed several current virtualization solutions to assess their vulnerabilities. This included a review of common vulnerabilities and exposures (CVEs) for Xen, KVM, LXC and Docker to gauge their susceptibility to different attacks. The complete details are provided in Section 5 on page 33. Based on this review we concluded that system-level virtualization solutions have many more vulnerabilities than OS level virtualization solutions. As such, security mechanisms like sVirt (Section 3.3) should be considered when using system-level virtualization solutions in order to protect the host against exploits. The majority of vulnerabilities related to KVM, LXC, and Docker are in specific regions of the system. Therefore, future "zero day attacks" are likely to be in the same regions, which suggests that protecting these areas can simplify the protection of the host and maintain the isolation between users. The evaluations of virtualization technologies done thus far are discussed in Section 4. This includes experiments with 'user' namespaces in VEs, which provides the ability to isolate user privileges and allow a user to run with different UIDs within the container while mapping them to non-privileged UIDs in the host. We have identified Linux namespaces as a promising mechanism to isolate shared resources, while maintaining good performance. In Section 4.1 we describe our tests with LXC as a non-root user and leveraging namespaces to control UID/GID mappings and support controlled sharing of parallel file-systems. We highlight several of these namespace capabilities in Section 6.2.3. The other evaluations that were performed during this initial phase of work provide baseline performance data for comparing VEs and VMs to purely native execution. In Section 4.2 we performed tests using the High-Performance Computing Conjugate Gradient (HPCCG) benchmark to establish baseline performance for a scientific application when run on the Native (host) machine in contrast with execution under Docker and KVM. Our tests verified prior studies showing roughly 2-4% overheads in application execution time & MFlops when running in hypervisor-base environments (VMs) as compared to near native performance with VEs. For more details, see Figures 4.5 (page 28), 4.6 (page 28), and 4.7 (page 29). Additionally, in Section 4.3 we include network measurements for TCP bandwidth performance over the 10GigE interface in our testbed. The Native and Docker based tests achieved >= ~9Gbits/sec, while the KVM configuration only achieved 2.5Gbits/sec (Table 4.6 on page 32). This may be a configuration issue with our KVM installation, and is a point for further testing as we refine the network settings in the testbed. The initial network tests were done using a bridged networking configuration. The report outline is as follows: - Section 1 introduces the report and clarifies the scope of the proj...« less
Virtual Factory Framework for Supporting Production Planning and Control.

PubMed

Kibira, Deogratias; Shao, Guodong

2017-01-01

Developing optimal production plans for smart manufacturing systems is challenging because shop floor events change dynamically. A virtual factory incorporating engineering tools, simulation, and optimization generates and communicates performance data to guide wise decision making for different control levels. This paper describes such a platform specifically for production planning. We also discuss verification and validation of the constituent models. A case study of a machine shop is used to demonstrate data generation for production planning in a virtual factory.
Parallel Algorithms for Computer Vision

DTIC Science & Technology

1990-04-01

NA86-1, Thinking Machines Corporation, Cambridge, MA, December 1986. [43] J. Little, G. Blelloch, and T. Cass. How to program the connection machine for... to program the connection machine for computer vision. In Proc. Workshop on Comp. Architecture for Pattern Analysis and Machine Intell., 1987. [92] J...In Proceedings of SPIE Conf. on Advances in Intelligent Robotics Systems, Bellingham, VA, 1987. SPIE. [91] J. Little, G. Blelloch, and T. Cass. How
Sputnik: ad hoc distributed computation.

PubMed

Völkel, Gunnar; Lausser, Ludwig; Schmid, Florian; Kraus, Johann M; Kestler, Hans A

2015-04-15

In bioinformatic applications, computationally demanding algorithms are often parallelized to speed up computation. Nevertheless, setting up computational environments for distributed computation is often tedious. Aim of this project were the lightweight ad hoc set up and fault-tolerant computation requiring only a Java runtime, no administrator rights, while utilizing all CPU cores most effectively. The Sputnik framework provides ad hoc distributed computation on the Java Virtual Machine which uses all supplied CPU cores fully. It provides a graphical user interface for deployment setup and a web user interface displaying the current status of current computation jobs. Neither a permanent setup nor administrator privileges are required. We demonstrate the utility of our approach on feature selection of microarray data. The Sputnik framework is available on Github http://github.com/sysbio-bioinf/sputnik under the Eclipse Public License. hkestler@fli-leibniz.de or hans.kestler@uni-ulm.de Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Design and manufacture of the integrated field unit for the NIRSpec spectrometer on JWST

NASA Astrophysics Data System (ADS)

Lobb, Daniel; Robertson, David; Closs, Martin; Barnes, Andy

2008-09-01

The NIRSpec imaging spectrometer, which forms part of the James Webb Space Telescope instrumentation, will include an integrated field unit (IFU). The IFU will be tasked specifically with efficient analysis of extended objects, including galaxies; it will accept a square image area at the spectrometer entrance field, dissect this area into 30 parallel sub-slits, and image the sub-slits end-to-end, forming a single virtual entrance slit. The IFU, uses all-mirror optics to operate over the spectral range 700nm to 5000nm. 95 mirrors and the main support structure are made in a common aluminium alloy, to achieve athermal performance down to an operating temperature of around 30K. Relatively complex mirror surface shapes are produced by diamond machining. The IFU has been designed and constructed by SSTL, with optics produced by CfAI; the unit is currently undergoing performance tests. This paper describes the IFU optical design and performance, and outlines the mirror manufacturing methods and alignment procedures.
Searching for patterns in remote sensing image databases using neural networks

NASA Technical Reports Server (NTRS)

Paola, Justin D.; Schowengerdt, Robert A.

1995-01-01

We have investigated a method, based on a successful neural network multispectral image classification system, of searching for single patterns in remote sensing databases. While defining the pattern to search for and the feature to be used for that search (spectral, spatial, temporal, etc.) is challenging, a more difficult task is selecting competing patterns to train against the desired pattern. Schemes for competing pattern selection, including random selection and human interpreted selection, are discussed in the context of an example detection of dense urban areas in Landsat Thematic Mapper imagery. When applying the search to multiple images, a simple normalization method can alleviate the problem of inconsistent image calibration. Another potential problem, that of highly compressed data, was found to have a minimal effect on the ability to detect the desired pattern. The neural network algorithm has been implemented using the PVM (Parallel Virtual Machine) library and nearly-optimal speedups have been obtained that help alleviate the long process of searching through imagery.
Analysis of multigrid methods on massively parallel computers: Architectural implications

NASA Technical Reports Server (NTRS)

Matheson, Lesley R.; Tarjan, Robert E.

1993-01-01

We study the potential performance of multigrid algorithms running on massively parallel computers with the intent of discovering whether presently envisioned machines will provide an efficient platform for such algorithms. We consider the domain parallel version of the standard V cycle algorithm on model problems, discretized using finite difference techniques in two and three dimensions on block structured grids of size 10(exp 6) and 10(exp 9), respectively. Our models of parallel computation were developed to reflect the computing characteristics of the current generation of massively parallel multicomputers. These models are based on an interconnection network of 256 to 16,384 message passing, 'workstation size' processors executing in an SPMD mode. The first model accomplishes interprocessor communications through a multistage permutation network. The communication cost is a logarithmic function which is similar to the costs in a variety of different topologies. The second model allows single stage communication costs only. Both models were designed with information provided by machine developers and utilize implementation derived parameters. With the medium grain parallelism of the current generation and the high fixed cost of an interprocessor communication, our analysis suggests an efficient implementation requires the machine to support the efficient transmission of long messages, (up to 1000 words) or the high initiation cost of a communication must be significantly reduced through an alternative optimization technique. Furthermore, with variable length message capability, our analysis suggests the low diameter multistage networks provide little or no advantage over a simple single stage communications network.
Dynamic Load Balancing for Grid Partitioning on a SP-2 Multiprocessor: A Framework

NASA Technical Reports Server (NTRS)

Sohn, Andrew; Simon, Horst; Lasinski, T. A. (Technical Monitor)

1994-01-01

Computational requirements of full scale computational fluid dynamics change as computation progresses on a parallel machine. The change in computational intensity causes workload imbalance of processors, which in turn requires a large amount of data movement at runtime. If parallel CFD is to be successful on a parallel or massively parallel machine, balancing of the runtime load is indispensable. Here a framework is presented for dynamic load balancing for CFD applications, called Jove. One processor is designated as a decision maker Jove while others are assigned to computational fluid dynamics. Processors running CFD send flags to Jove in a predetermined number of iterations to initiate load balancing. Jove starts working on load balancing while other processors continue working with the current data and load distribution. Jove goes through several steps to decide if the new data should be taken, including preliminary evaluate, partition, processor reassignment, cost evaluation, and decision. Jove running on a single EBM SP2 node has been completely implemented. Preliminary experimental results show that the Jove approach to dynamic load balancing can be effective for full scale grid partitioning on the target machine IBM SP2.
Dynamic Load Balancing For Grid Partitioning on a SP-2 Multiprocessor: A Framework

NASA Technical Reports Server (NTRS)

Sohn, Andrew; Simon, Horst; Lasinski, T. A. (Technical Monitor)

1994-01-01

Computational requirements of full scale computational fluid dynamics change as computation progresses on a parallel machine. The change in computational intensity causes workload imbalance of processors, which in turn requires a large amount of data movement at runtime. If parallel CFD is to be successful on a parallel or massively parallel machine, balancing of the runtime load is indispensable. Here a framework is presented for dynamic load balancing for CFD applications, called Jove. One processor is designated as a decision maker Jove while others are assigned to computational fluid dynamics. Processors running CFD send flags to Jove in a predetermined number of iterations to initiate load balancing. Jove starts working on load balancing while other processors continue working with the current data and load distribution. Jove goes through several steps to decide if the new data should be taken, including preliminary evaluate, partition, processor reassignment, cost evaluation, and decision. Jove running on a single IBM SP2 node has been completely implemented. Preliminary experimental results show that the Jove approach to dynamic load balancing can be effective for full scale grid partitioning on the target machine IBM SP2.
The influence of foot position on scrum kinetics during machine scrummaging.

PubMed

Bayne, Helen; Kat, Cor-Jacques

2018-05-23

The purpose of this study was to investigate the effect of variations in the alignment of the feet on scrum kinetics during machine scrummaging. Twenty nine rugby forwards from amateur-level teams completed maximal scrum efforts against an instrumented scrum machine, with the feet in parallel and non-parallel positions. Three-dimensional forces, the moment about the vertical axis and sagittal plane joint angles were measured during the sustained pushing phase. There was a decrease in the magnitude of the resultant force and compression force in both of the non-parallel conditions compared to parallel and larger compression forces were associated with more extended hip and knee angles. Scrummaging with the left foot forward resulted in the lateral force being directed more towards the left and the turning moment becoming more clockwise. These directional changes were reversed when scrummaging with the right foot forward. Scrummaging with the right foot positioned ahead of the left may serve to counteract the natural clockwise wheel of the live scrum and could be used to achieve an anti-clockwise rotation of the scrum for tactical reasons. However, this would be associated with lower resultant forces and a greater lateral shear force component directed towards the right.
On-Line Scheduling of Parallel Machines

DTIC Science & Technology

1990-11-01

machine without losing any work; this is referred to as the preemptive model. In contrast to the nonpreemptive model which we have considered in this paper...that there exists no schedule of length d. The 2-relaxed decision procedure is as follows. Put each job into the queue of the slowest machine Mk such...in their queues . If a machine’s queue is empty it takes jobs to process from the queue of the first machine that is slower than it and that has a
Design, Results, Evolution and Status of the ATLAS Simulation at Point1 Project

NASA Astrophysics Data System (ADS)

Ballestrero, S.; Batraneanu, S. M.; Brasolin, F.; Contescu, C.; Fazio, D.; Di Girolamo, A.; Lee, C. J.; Pozo Astigarraga, M. E.; Scannicchio, D. A.; Sedov, A.; Twomey, M. S.; Wang, F.; Zaytsev, A.

2015-12-01

During the LHC Long Shutdown 1 (LSI) period, that started in 2013, the Simulation at Point1 (Sim@P1) project takes advantage, in an opportunistic way, of the TDAQ (Trigger and Data Acquisition) HLT (High-Level Trigger) farm of the ATLAS experiment. This farm provides more than 1300 compute nodes, which are particularly suited for running event generation and Monte Carlo production jobs that are mostly CPU and not I/O bound. It is capable of running up to 2700 Virtual Machines (VMs) each with 8 CPU cores, for a total of up to 22000 parallel jobs. This contribution gives a review of the design, the results, and the evolution of the Sim@P1 project, operating a large scale OpenStack based virtualized platform deployed on top of the ATLAS TDAQ HLT farm computing resources. During LS1, Sim@P1 was one of the most productive ATLAS sites: it delivered more than 33 million CPU-hours and it generated more than 1.1 billion Monte Carlo events. The design aspects are presented: the virtualization platform exploited by Sim@P1 avoids interferences with TDAQ operations and it guarantees the security and the usability of the ATLAS private network. The cloud mechanism allows the separation of the needed support on both infrastructural (hardware, virtualization layer) and logical (Grid site support) levels. This paper focuses on the operational aspects of such a large system during the upcoming LHC Run 2 period: simple, reliable, and efficient tools are needed to quickly switch from Sim@P1 to TDAQ mode and back, to exploit the resources when they are not used for the data acquisition, even for short periods. The evolution of the central OpenStack infrastructure is described, as it was upgraded from Folsom to the Icehouse release, including the scalability issues addressed.
Exploring Infiniband Hardware Virtualization in OpenNebula towards Efficient High-Performance Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pais Pitta de Lacerda Ruivo, Tiago; Bernabeu Altayo, Gerard; Garzoglio, Gabriele

2014-11-11

has been widely accepted that software virtualization has a big negative impact on high-performance computing (HPC) application performance. This work explores the potential use of Infiniband hardware virtualization in an OpenNebula cloud towards the efficient support of MPI-based workloads. We have implemented, deployed, and tested an Infiniband network on the FermiCloud private Infrastructure-as-a-Service (IaaS) cloud. To avoid software virtualization towards minimizing the virtualization overhead, we employed a technique called Single Root Input/Output Virtualization (SRIOV). Our solution spanned modifications to the Linux’s Hypervisor as well as the OpenNebula manager. We evaluated the performance of the hardware virtualization on up to 56more » virtual machines connected by up to 8 DDR Infiniband network links, with micro-benchmarks (latency and bandwidth) as well as w a MPI-intensive application (the HPL Linpack benchmark).« less

Implementation of an ADI method on parallel computers

NASA Technical Reports Server (NTRS)

Fatoohi, Raad A.; Grosch, Chester E.

1987-01-01

The implementation of an ADI method for solving the diffusion equation on three parallel/vector computers is discussed. The computers were chosen so as to encompass a variety of architectures. They are: the MPP, an SIMD machine with 16K bit serial processors; FLEX/32, an MIMD machine with 20 processors; and CRAY/2, an MIMD machine with four vector processors. The Gaussian elimination algorithm is used to solve a set of tridiagonal systems on the FLEX/32 and CRAY/2 while the cyclic elimination algorithm is used to solve these systems on the MPP. The implementation of the method is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally, conclusions are presented.
Implementation of an ADI method on parallel computers

NASA Technical Reports Server (NTRS)

Fatoohi, Raad A.; Grosch, Chester E.

1987-01-01

In this paper the implementation of an ADI method for solving the diffusion equation on three parallel/vector computers is discussed. The computers were chosen so as to encompass a variety of architectures. They are the MPP, an SIMD machine with 16-Kbit serial processors; Flex/32, an MIMD machine with 20 processors; and Cray/2, an MIMD machine with four vector processors. The Gaussian elimination algorithm is used to solve a set of tridiagonal systems on the Flex/32 and Cray/2 while the cyclic elimination algorithm is used to solve these systems on the MPP. The implementation of the method is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally conclusions are presented.
Empirical Evaluation of Conservative and Optimistic Discrete Event Execution on Cloud and VM Platforms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yoginath, Srikanth B; Perumalla, Kalyan S

2013-01-01

Virtual machine (VM) technologies, especially those offered via Cloud platforms, present new dimensions with respect to performance and cost in executing parallel discrete event simulation (PDES) applications. Due to the introduction of overall cost as a metric, the choice of the highest-end computing configuration is no longer the most economical one. Moreover, runtime dynamics unique to VM platforms introduce new performance characteristics, and the variety of possible VM configurations give rise to a range of choices for hosting a PDES run. Here, an empirical study of these issues is undertaken to guide an understanding of the dynamics, trends and trade-offsmore » in executing PDES on VM/Cloud platforms. Performance results and cost measures are obtained from actual execution of a range of scenarios in two PDES benchmark applications on the Amazon Cloud offerings and on a high-end VM host machine. The data reveals interesting insights into the new VM-PDES dynamics that come into play and also leads to counter-intuitive guidelines with respect to choosing the best and second-best configurations when overall cost of execution is considered. In particular, it is found that choosing the highest-end VM configuration guarantees neither the best runtime nor the least cost. Interestingly, choosing a (suitably scaled) low-end VM configuration provides the least overall cost without adversely affecting the total runtime.« less
Testing an Open Source installation and server provisioning tool for the INFN CNAF Tierl Storage system

NASA Astrophysics Data System (ADS)

Pezzi, M.; Favaro, M.; Gregori, D.; Ricci, P. P.; Sapunenko, V.

2014-06-01

In large computing centers, such as the INFN CNAF Tier1 [1], is essential to be able to configure all the machines, depending on use, in an automated way. For several years at the Tier1 has been used Quattor[2], a server provisioning tool, which is currently used in production. Nevertheless we have recently started a comparison study involving other tools able to provide specific server installation and configuration features and also offer a proper full customizable solution as an alternative to Quattor. Our choice at the moment fell on integration between two tools: Cobbler [3] for the installation phase and Puppet [4] for the server provisioning and management operation. The tool should provide the following properties in order to replicate and gradually improve the current system features: implement a system check for storage specific constraints such as kernel modules black list at boot time to avoid undesired SAN (Storage Area Network) access during disk partitioning; a simple and effective mechanism for kernel upgrade and downgrade; the ability of setting package provider using yum, rpm or apt; easy to use Virtual Machine installation support including bonding and specific Ethernet configuration; scalability for managing thousands of nodes and parallel installations. This paper describes the results of the comparison and the tests carried out to verify the requirements and the new system suitability in the INFN-T1 environment.
Virtual Network Configuration Management System for Data Center Operations and Management

NASA Astrophysics Data System (ADS)

Okita, Hideki; Yoshizawa, Masahiro; Uehara, Keitaro; Mizuno, Kazuhiko; Tarui, Toshiaki; Naono, Ken

Virtualization technologies are widely deployed in data centers to improve system utilization. However, they increase the workload for operators, who have to manage the structure of virtual networks in data centers. A virtual-network management system which automates the integration of the configurations of the virtual networks is provided. The proposed system collects the configurations from server virtualization platforms and VLAN-supported switches, and integrates these configurations according to a newly developed XML-based management information model for virtual-network configurations. Preliminary evaluations show that the proposed system helps operators by reducing the time to acquire the configurations from devices and correct the inconsistency of operators' configuration management database by about 40 percent. Further, they also show that the proposed system has excellent scalability; the system takes less than 20 minutes to acquire the virtual-network configurations from a large scale network that includes 300 virtual machines. These results imply that the proposed system is effective for improving the configuration management process for virtual networks in data centers.
Virtual Employment Test Bed Operational Research and Systems Analysis to Test Armaments Designs Early in the Life Cycle

DTIC Science & Technology

2014-06-01

motion capture data used to determine position and orientation of a Soldier’s head, turret and the M2 machine gun • Controlling and acquiring user/weapon...data from the M2 simulation machine gun • Controlling paintball guns used to fire at the GPK during an experimental run • Sending and receiving TCP...Mounted, Armor/Cavalry, Combat Engineers, Field Artillery Cannon Crewmember, or MP duty assignment – Currently M2 .50 Caliber Machine Gun qualified
Performance verification of network function virtualization in software defined optical transport networks

NASA Astrophysics Data System (ADS)

Zhao, Yongli; Hu, Liyazhou; Wang, Wei; Li, Yajie; Zhang, Jie

2017-01-01

With the continuous opening of resource acquisition and application, there are a large variety of network hardware appliances deployed as the communication infrastructure. To lunch a new network application always implies to replace the obsolete devices and needs the related space and power to accommodate it, which will increase the energy and capital investment. Network function virtualization1 (NFV) aims to address these problems by consolidating many network equipment onto industry standard elements such as servers, switches and storage. Many types of IT resources have been deployed to run Virtual Network Functions (vNFs), such as virtual switches and routers. Then how to deploy NFV in optical transport networks is a of great importance problem. This paper focuses on this problem, and gives an implementation architecture of NFV-enabled optical transport networks based on Software Defined Optical Networking (SDON) with the procedure of vNFs call and return. Especially, an implementation solution of NFV-enabled optical transport node is designed, and a parallel processing method for NFV-enabled OTN nodes is proposed. To verify the performance of NFV-enabled SDON, the protocol interaction procedures of control function virtualization and node function virtualization are demonstrated on SDON testbed. Finally, the benefits and challenges of the parallel processing method for NFV-enabled OTN nodes are simulated and analyzed.
Heavy Analysis and Light Virtualization of Water Use Data with Python

NASA Astrophysics Data System (ADS)

Kim, H.; Bijoor, N.; Famiglietti, J. S.

2014-12-01

Water utilities possess a large amount of water data that could be used to inform urban ecohydrology, management decisions, and conservation policies, but such data are rarely analyzed owing to difficulty in analyzation, visualization, and interpretion. We have developed a high performance computing resource for this purpose. We partnered with 6 water agencies in Orange County who provided 10 years of parcel-level monthly water use billing data for a pilot study. The first challenge that we overcame was to refine all human errors and unify the many different formats of data over all agencies. Second, we tested and applied experimental approaches to the data, including complex calculations, with high efficiency. Third, we developed a method to refine the data so it can be browsed along a time series index and/or geo-spatial queries with high efficiency, no matter how large the data. Python scientific libraries were the best match to handle arbitrary data sets in our environment. Further milestones include agency entry, sets of formulae, and maintaining 15M rows X 70 columns of data with high performance of cpu-bound processes. To deal with billions of rows, we performed an analysis virtualization stack by leveraging iPython parallel computing. With this architecture, one agency could be considered one computing node or virtual machine that maintains its own data sets respectively. For example, a big agency could use a large node, and a small agency could use a micro node. Under the minimum required raw data specs, more agencies could be analyzed. The program developed in this study simplifies data analysis, visualization, and interpretation of large water datasets, and can be used to analyze large data volumes from water agencies nationally or worldwide.
Interfacing HTCondor-CE with OpenStack

NASA Astrophysics Data System (ADS)

Bockelman, B.; Caballero Bejar, J.; Hover, J.

2017-10-01

Over the past few years, Grid Computing technologies have reached a high level of maturity. One key aspect of this success has been the development and adoption of newer Compute Elements to interface the external Grid users with local batch systems. These new Compute Elements allow for better handling of jobs requirements and a more precise management of diverse local resources. However, despite this level of maturity, the Grid Computing world is lacking diversity in local execution platforms. As Grid Computing technologies have historically been driven by the needs of the High Energy Physics community, most resource providers run the platform (operating system version and architecture) that best suits the needs of their particular users. In parallel, the development of virtualization and cloud technologies has accelerated recently, making available a variety of solutions, both commercial and academic, proprietary and open source. Virtualization facilitates performing computational tasks on platforms not available at most computing sites. This work attempts to join the technologies, allowing users to interact with computing sites through one of the standard Computing Elements, HTCondor-CE, but running their jobs within VMs on a local cloud platform, OpenStack, when needed. The system will re-route, in a transparent way, end user jobs into dynamically-launched VM worker nodes when they have requirements that cannot be satisfied by the static local batch system nodes. Also, once the automated mechanisms are in place, it becomes straightforward to allow an end user to invoke a custom Virtual Machine at the site. This will allow cloud resources to be used without requiring the user to establish a separate account. Both scenarios are described in this work.
Software architecture and design of the web services facilitating climate model diagnostic analysis

NASA Astrophysics Data System (ADS)

Pan, L.; Lee, S.; Zhang, J.; Tang, B.; Zhai, C.; Jiang, J. H.; Wang, W.; Bao, Q.; Qi, M.; Kubar, T. L.; Teixeira, J.

2015-12-01

Climate model diagnostic analysis is a computationally- and data-intensive task because it involves multiple numerical model outputs and satellite observation data that can both be high resolution. We have built an online tool that facilitates this process. The tool is called Climate Model Diagnostic Analyzer (CMDA). It employs the web service technology and provides a web-based user interface. The benefits of these choices include: (1) No installation of any software other than a browser, hence it is platform compatable; (2) Co-location of computation and big data on the server side, and small results and plots to be downloaded on the client side, hence high data efficiency; (3) multi-threaded implementation to achieve parallel performance on multi-core servers; and (4) cloud deployment so each user has a dedicated virtual machine. In this presentation, we will focus on the computer science aspects of this tool, namely the architectural design, the infrastructure of the web services, the implementation of the web-based user interface, the mechanism of provenance collection, the approach to virtualization, and the Amazon Cloud deployment. As an example, We will describe our methodology to transform an existing science application code into a web service using a Python wrapper interface and Python web service frameworks (i.e., Flask, Gunicorn, and Tornado). Another example is the use of Docker, a light-weight virtualization container, to distribute and deploy CMDA onto an Amazon EC2 instance. Our tool of CMDA has been successfully used in the 2014 Summer School hosted by the JPL Center for Climate Science. Students had positive feedbacks in general and we will report their comments. An enhanced version of CMDA with several new features, some requested by the 2014 students, will be used in the 2015 Summer School soon.
Concurrent extensions to the FORTRAN language for parallel programming of computational fluid dynamics algorithms

NASA Technical Reports Server (NTRS)

Weeks, Cindy Lou

1986-01-01

Experiments were conducted at NASA Ames Research Center to define multi-tasking software requirements for multiple-instruction, multiple-data stream (MIMD) computer architectures. The focus was on specifying solutions for algorithms in the field of computational fluid dynamics (CFD). The program objectives were to allow researchers to produce usable parallel application software as soon as possible after acquiring MIMD computer equipment, to provide researchers with an easy-to-learn and easy-to-use parallel software language which could be implemented on several different MIMD machines, and to enable researchers to list preferred design specifications for future MIMD computer architectures. Analysis of CFD algorithms indicated that extensions of an existing programming language, adaptable to new computer architectures, provided the best solution to meeting program objectives. The CoFORTRAN Language was written in response to these objectives and to provide researchers a means to experiment with parallel software solutions to CFD algorithms on machines with parallel architectures.
An Introduction to 3-D Sound

NASA Technical Reports Server (NTRS)

Begault, Durand R.; Null, Cynthia H. (Technical Monitor)

1997-01-01

This talk will overview the basic technologies related to the creation of virtual acoustic images, and the potential of including spatial auditory displays in human-machine interfaces. Research into the perceptual error inherent in both natural and virtual spatial hearing is reviewed, since the formation of improved technologies is tied to psychoacoustic research. This includes a discussion of Head Related Transfer Function (HRTF) measurement techniques (the HRTF provides important perceptual cues within a virtual acoustic display). Many commercial applications of virtual acoustics have so far focused on games and entertainment ; in this review, other types of applications are examined, including aeronautic safety, voice communications, virtual reality, and room acoustic simulation. In particular, the notion that realistic simulation is optimized within a virtual acoustic display when head motion and reverberation cues are included within a perceptual model.
Large-Scale Linear Optimization through Machine Learning: From Theory to Practical System Design and Implementation

DTIC Science & Technology

2016-08-10

AFRL-AFOSR-JP-TR-2016-0073 Large-scale Linear Optimization through Machine Learning: From Theory to Practical System Design and Implementation ...2016 4. TITLE AND SUBTITLE Large-scale Linear Optimization through Machine Learning: From Theory to Practical System Design and Implementation 5a...performances on various machine learning tasks and it naturally lends itself to fast parallel implementations . Despite this, very little work has been
Efficiently Scheduling Multi-core Guest Virtual Machines on Multi-core Hosts in Network Simulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yoginath, Srikanth B; Perumalla, Kalyan S

2011-01-01

Virtual machine (VM)-based simulation is a method used by network simulators to incorporate realistic application behaviors by executing actual VMs as high-fidelity surrogates for simulated end-hosts. A critical requirement in such a method is the simulation time-ordered scheduling and execution of the VMs. Prior approaches such as time dilation are less efficient due to the high degree of multiplexing possible when multiple multi-core VMs are simulated on multi-core host systems. We present a new simulation time-ordered scheduler to efficiently schedule multi-core VMs on multi-core real hosts, with a virtual clock realized on each virtual core. The distinguishing features of ourmore » approach are: (1) customizable granularity of the VM scheduling time unit on the simulation time axis, (2) ability to take arbitrary leaps in virtual time by VMs to maximize the utilization of host (real) cores when guest virtual cores idle, and (3) empirically determinable optimality in the tradeoff between total execution (real) time and time-ordering accuracy levels. Experiments show that it is possible to get nearly perfect time-ordered execution, with a slight cost in total run time, relative to optimized non-simulation VM schedulers. Interestingly, with our time-ordered scheduler, it is also possible to reduce the time-ordering error from over 50% of non-simulation scheduler to less than 1% realized by our scheduler, with almost the same run time efficiency as that of the highly efficient non-simulation VM schedulers.« less
Decomposition method for fast computation of gigapixel-sized Fresnel holograms on a graphics processing unit cluster.

PubMed

Jackin, Boaz Jessie; Watanabe, Shinpei; Ootsu, Kanemitsu; Ohkawa, Takeshi; Yokota, Takashi; Hayasaki, Yoshio; Yatagai, Toyohiko; Baba, Takanobu

2018-04-20

A parallel computation method for large-size Fresnel computer-generated hologram (CGH) is reported. The method was introduced by us in an earlier report as a technique for calculating Fourier CGH from 2D object data. In this paper we extend the method to compute Fresnel CGH from 3D object data. The scale of the computation problem is also expanded to 2 gigapixels, making it closer to real application requirements. The significant feature of the reported method is its ability to avoid communication overhead and thereby fully utilize the computing power of parallel devices. The method exhibits three layers of parallelism that favor small to large scale parallel computing machines. Simulation and optical experiments were conducted to demonstrate the workability and to evaluate the efficiency of the proposed technique. A two-times improvement in computation speed has been achieved compared to the conventional method, on a 16-node cluster (one GPU per node) utilizing only one layer of parallelism. A 20-times improvement in computation speed has been estimated utilizing two layers of parallelism on a very large-scale parallel machine with 16 nodes, where each node has 16 GPUs.
Open access for ALICE analysis based on virtualization technology

NASA Astrophysics Data System (ADS)

Buncic, P.; Gheata, M.; Schutz, Y.

2015-12-01

Open access is one of the important leverages for long-term data preservation for a HEP experiment. To guarantee the usability of data analysis tools beyond the experiment lifetime it is crucial that third party users from the scientific community have access to the data and associated software. The ALICE Collaboration has developed a layer of lightweight components built on top of virtualization technology to hide the complexity and details of the experiment-specific software. Users can perform basic analysis tasks within CernVM, a lightweight generic virtual machine, paired with an ALICE specific contextualization. Once the virtual machine is launched, a graphical user interface is automatically started without any additional configuration. This interface allows downloading the base ALICE analysis software and running a set of ALICE analysis modules. Currently the available tools include fully documented tutorials for ALICE analysis, such as the measurement of strange particle production or the nuclear modification factor in Pb-Pb collisions. The interface can be easily extended to include an arbitrary number of additional analysis modules. We present the current status of the tools used by ALICE through the CERN open access portal, and the plans for future extensions of this system.
Enhanced emotional responses during social coordination with a virtual partner

PubMed Central

Dumas, Guillaume; Kelso, J.A. Scott; Tognoli, Emmanuelle

2016-01-01

Emotion and motion, though seldom studied in tandem, are complementary aspects of social experience. This study investigates variations in emotional responses during movement coordination between a human and a Virtual Partner (VP), an agent whose virtual finger movements are driven by the Haken-Kelso-Bunz (HKB) equations of Coordination Dynamics. Twenty-one subjects were instructed to coordinate finger movements with the VP in either inphase or antiphase patterns. By adjusting model parameters, we manipulated the ‘intention’ of VP as cooperative or competitive with the human's instructed goal. Skin potential responses (SPR) were recorded to quantify the intensity of emotional response. At the end of each trial, subjects rated the VP's intention and whether they thought their partner was another human being or a machine. We found greater emotional responses when subjects reported that their partner was human and when coordination was stable. That emotional responses are strongly influenced by dynamic features of the VP's behavior, has implications for mental health, brain disorders and the design of socially cooperative machines. PMID:27094374
Introduction of Virtualization Technology to Multi-Process Model Checking

NASA Technical Reports Server (NTRS)

Leungwattanakit, Watcharin; Artho, Cyrille; Hagiya, Masami; Tanabe, Yoshinori; Yamamoto, Mitsuharu

2009-01-01

Model checkers find failures in software by exploring every possible execution schedule. Java PathFinder (JPF), a Java model checker, has been extended recently to cover networked applications by caching data transferred in a communication channel. A target process is executed by JPF, whereas its peer process runs on a regular virtual machine outside. However, non-deterministic target programs may produce different output data in each schedule, causing the cache to restart the peer process to handle the different set of data. Virtualization tools could help us restore previous states of peers, eliminating peer restart. This paper proposes the application of virtualization technology to networked model checking, concentrating on JPF.
Achieving High Resolution Timer Events in Virtualized Environment.

PubMed

Adamczyk, Blazej; Chydzinski, Andrzej

2015-01-01

Virtual Machine Monitors (VMM) have become popular in different application areas. Some applications may require to generate the timer events with high resolution and precision. This however may be challenging due to the complexity of VMMs. In this paper we focus on the timer functionality provided by five different VMMs-Xen, KVM, Qemu, VirtualBox and VMWare. Firstly, we evaluate resolutions and precisions of their timer events. Apparently, provided resolutions and precisions are far too low for some applications (e.g. networking applications with the quality of service). Then, using Xen virtualization we demonstrate the improved timer design that greatly enhances both the resolution and precision of achieved timer events.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Reed, D.A.; Grunwald, D.C.

The spectrum of parallel processor designs can be divided into three sections according to the number and complexity of the processors. At one end there are simple, bit-serial processors. Any one of thee processors is of little value, but when it is coupled with many others, the aggregate computing power can be large. This approach to parallel processing can be likened to a colony of termites devouring a log. The most notable examples of this approach are the NASA/Goodyear Massively Parallel Processor, which has 16K one-bit processors, and the Thinking Machines Connection Machine, which has 64K one-bit processors. At themore » other end of the spectrum, a small number of processors, each built using the fastest available technology and the most sophisticated architecture, are combined. An example of this approach is the Cray X-MP. This type of parallel processing is akin to four woodmen attacking the log with chainsaws.« less

Large-scale virtual screening on public cloud resources with Apache Spark.

PubMed

Capuccini, Marco; Ahmed, Laeeq; Schaal, Wesley; Laure, Erwin; Spjuth, Ola

2017-01-01

Structure-based virtual screening is an in-silico method to screen a target receptor against a virtual molecular library. Applying docking-based screening to large molecular libraries can be computationally expensive, however it constitutes a trivially parallelizable task. Most of the available parallel implementations are based on message passing interface, relying on low failure rate hardware and fast network connection. Google's MapReduce revolutionized large-scale analysis, enabling the processing of massive datasets on commodity hardware and cloud resources, providing transparent scalability and fault tolerance at the software level. Open source implementations of MapReduce include Apache Hadoop and the more recent Apache Spark. We developed a method to run existing docking-based screening software on distributed cloud resources, utilizing the MapReduce approach. We benchmarked our method, which is implemented in Apache Spark, docking a publicly available target receptor against [Formula: see text]2.2 M compounds. The performance experiments show a good parallel efficiency (87%) when running in a public cloud environment. Our method enables parallel Structure-based virtual screening on public cloud resources or commodity computer clusters. The degree of scalability that we achieve allows for trying out our method on relatively small libraries first and then to scale to larger libraries. Our implementation is named Spark-VS and it is freely available as open source from GitHub (https://github.com/mcapuccini/spark-vs).Graphical abstract.
Scan line graphics generation on the massively parallel processor

NASA Technical Reports Server (NTRS)

Dorband, John E.

1988-01-01

Described here is how researchers implemented a scan line graphics generation algorithm on the Massively Parallel Processor (MPP). Pixels are computed in parallel and their results are applied to the Z buffer in large groups. To perform pixel value calculations, facilitate load balancing across the processors and apply the results to the Z buffer efficiently in parallel requires special virtual routing (sort computation) techniques developed by the author especially for use on single-instruction multiple-data (SIMD) architectures.
A video, text, and speech-driven realistic 3-d virtual head for human-machine interface.

PubMed

Yu, Jun; Wang, Zeng-Fu

2015-05-01

A multiple inputs-driven realistic facial animation system based on 3-D virtual head for human-machine interface is proposed. The system can be driven independently by video, text, and speech, thus can interact with humans through diverse interfaces. The combination of parameterized model and muscular model is used to obtain a tradeoff between computational efficiency and high realism of 3-D facial animation. The online appearance model is used to track 3-D facial motion from video in the framework of particle filtering, and multiple measurements, i.e., pixel color value of input image and Gabor wavelet coefficient of illumination ratio image, are infused to reduce the influence of lighting and person dependence for the construction of online appearance model. The tri-phone model is used to reduce the computational consumption of visual co-articulation in speech synchronized viseme synthesis without sacrificing any performance. The objective and subjective experiments show that the system is suitable for human-machine interaction.
Strength computation of forged parts taking into account strain hardening and damage

NASA Astrophysics Data System (ADS)

Cristescu, Michel L.

2004-06-01

Modern non-linear simulation software, such as FORGE 3 (registered trade mark of TRANSVALOR), are able to compute the residual stresses, the strain hardening and the damage during the forging process. A thermally dependent elasto-visco-plastic law is used to simulate the behavior of the material of the hot forged piece. A modified Lemaitre law coupled with elasticiy, plasticity and thermic is used to simulate the damage. After the simulation of the different steps of the forging process, the part is cooled and then virtually machined, in order to obtain the finished part. An elastic computation is then performed to equilibrate the residual stresses, so that we obtain the true geometry of the finished part after machining. The response of the part to the loadings it will sustain during it's life is then computed, taking into account the residual stresses, the strain hardening and the damage that occur during forging. This process is illustrated by the forging, virtual machining and stress analysis of an aluminium wheel hub.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, C.

Almost every computer architect dreams of achieving high system performance with low implementation costs. A multigauge machine can reconfigure its data-path width, provide parallelism, achieve better resource utilization, and sometimes can trade computational precision for increased speed. A simple experimental method is used here to capture the main characteristics of multigauging. The measurements indicate evidence of near-optimal speedups. Adapting these ideas in designing parallel processors incurs low costs and provides flexibility. Several operational aspects of designing a multigauge machine are discussed as well. Thus, this research reports the technical, economical, and operational feasibility studies of multigauging.
Progressive Vector Quantization on a massively parallel SIMD machine with application to multispectral image data

NASA Technical Reports Server (NTRS)

Manohar, Mareboyana; Tilton, James C.

1994-01-01

A progressive vector quantization (VQ) compression approach is discussed which decomposes image data into a number of levels using full search VQ. The final level is losslessly compressed, enabling lossless reconstruction. The computational difficulties are addressed by implementation on a massively parallel SIMD machine. We demonstrate progressive VQ on multispectral imagery obtained from the Advanced Very High Resolution Radiometer instrument and other Earth observation image data, and investigate the trade-offs in selecting the number of decomposition levels and codebook training method.
GRIDVIEW: Recent Improvements in Research and Education Software for Exploring Mars Topography

NASA Technical Reports Server (NTRS)

Roark, J. H.; Masuoka, C. M.; Frey, H. V.

2004-01-01

GRIDVIEW is being developed by the GEODYNAMICS Branch at NASA's Goddard Space Flight Center and can be downloaded on the web at http://geodynamics.gsfc.nasa.gov/gridview/. The program is very mature and has been successfully used for more than four years, but is still under development as we add new features for data analysis and visualization. The software can run on any computer supported by the IDL virtual machine application supplied by RSI. The virtual machine application is currently available for recent versions of MS Windows, MacOS X, Red Hat Linux and UNIX. Minimum system memory requirement is 32 MB, however loading large data sets may require larger amounts of RAM to function adequately.
Telescoping magnetic ball bar test gage

DOEpatents

Bryan, J.B.

1982-03-15

A telescoping magnetic ball bar test gage for determining the accuracy of machine tools, including robots, and those measuring machines having non-disengagable servo drives which cannot be clutched out. Two gage balls are held and separated from one another by a telescoping fixture which allows them relative radial motional freedom but not relative lateral motional freedom. The telescoping fixture comprises a parallel reed flexure unit and a rigid member. One gage ball is secured by a magnetic socket knuckle assembly which fixes its center with respect to the machine being tested. The other gage ball is secured by another magnetic socket knuckle assembly which is engaged or held by the machine in such manner that the center of that ball is directed to execute a prescribed trajectory, all points of which are equidistant from the center of the fixed gage ball. As the moving ball executes its trajectory, changes in the radial distance between the centers of the two balls caused by inaccuracies in the machine are determined or measured by a linear variable differential transformer (LVDT) assembly actuated by the parallel reed flexure unit. Measurements can be quickly and easily taken for multiple trajectories about several different fixed ball locations, thereby determining the accuracy of the machine.
A compositional reservoir simulator on distributed memory parallel computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rame, M.; Delshad, M.

1995-12-31

This paper presents the application of distributed memory parallel computes to field scale reservoir simulations using a parallel version of UTCHEM, The University of Texas Chemical Flooding Simulator. The model is a general purpose highly vectorized chemical compositional simulator that can simulate a wide range of displacement processes at both field and laboratory scales. The original simulator was modified to run on both distributed memory parallel machines (Intel iPSC/960 and Delta, Connection Machine 5, Kendall Square 1 and 2, and CRAY T3D) and a cluster of workstations. A domain decomposition approach has been taken towards parallelization of the code. Amore » portion of the discrete reservoir model is assigned to each processor by a set-up routine that attempts a data layout as even as possible from the load-balance standpoint. Each of these subdomains is extended so that data can be shared between adjacent processors for stencil computation. The added routines that make parallel execution possible are written in a modular fashion that makes the porting to new parallel platforms straight forward. Results of the distributed memory computing performance of Parallel simulator are presented for field scale applications such as tracer flood and polymer flood. A comparison of the wall-clock times for same problems on a vector supercomputer is also presented.« less
Agreements in Virtual Organizations

NASA Astrophysics Data System (ADS)

Pankowska, Malgorzata

This chapter is an attempt to explain the important impact that contract theory delivers with respect to the concept of virtual organization. The author believes that not enough research has been conducted in order to transfer theoretical foundations for networking to the phenomena of virtual organizations and open autonomic computing environment to ensure the controllability and management of them. The main research problem of this chapter is to explain the significance of agreements for virtual organizations governance. The first part of this chapter comprises explanations of differences among virtual machines and virtual organizations for further descriptions of the significance of the first ones to the development of the second. Next, the virtual organization development tendencies are presented and problems of IT governance in highly distributed organizational environment are discussed. The last part of this chapter covers analysis of contracts and agreements management for governance in open computing environments.
High Performance Distributed Computing in a Supercomputer Environment: Computational Services and Applications Issues

NASA Technical Reports Server (NTRS)

Kramer, Williams T. C.; Simon, Horst D.

1994-01-01

This tutorial proposes to be a practical guide for the uninitiated to the main topics and themes of high-performance computing (HPC), with particular emphasis to distributed computing. The intent is first to provide some guidance and directions in the rapidly increasing field of scientific computing using both massively parallel and traditional supercomputers. Because of their considerable potential computational power, loosely or tightly coupled clusters of workstations are increasingly considered as a third alternative to both the more conventional supercomputers based on a small number of powerful vector processors, as well as high massively parallel processors. Even though many research issues concerning the effective use of workstation clusters and their integration into a large scale production facility are still unresolved, such clusters are already used for production computing. In this tutorial we will utilize the unique experience made at the NAS facility at NASA Ames Research Center. Over the last five years at NAS massively parallel supercomputers such as the Connection Machines CM-2 and CM-5 from Thinking Machines Corporation and the iPSC/860 (Touchstone Gamma Machine) and Paragon Machines from Intel were used in a production supercomputer center alongside with traditional vector supercomputers such as the Cray Y-MP and C90.
ProperCAD: A portable object-oriented parallel environment for VLSI CAD

NASA Technical Reports Server (NTRS)

Ramkumar, Balkrishna; Banerjee, Prithviraj

1993-01-01

Most parallel algorithms for VLSI CAD proposed to date have one important drawback: they work efficiently only on machines that they were designed for. As a result, algorithms designed to date are dependent on the architecture for which they are developed and do not port easily to other parallel architectures. A new project under way to address this problem is described. A Portable object-oriented parallel environment for CAD algorithms (ProperCAD) is being developed. The objectives of this research are (1) to develop new parallel algorithms that run in a portable object-oriented environment (CAD algorithms using a general purpose platform for portable parallel programming called CARM is being developed and a C++ environment that is truly object-oriented and specialized for CAD applications is also being developed); and (2) to design the parallel algorithms around a good sequential algorithm with a well-defined parallel-sequential interface (permitting the parallel algorithm to benefit from future developments in sequential algorithms). One CAD application that has been implemented as part of the ProperCAD project, flat VLSI circuit extraction, is described. The algorithm, its implementation, and its performance on a range of parallel machines are discussed in detail. It currently runs on an Encore Multimax, a Sequent Symmetry, Intel iPSC/2 and i860 hypercubes, a NCUBE 2 hypercube, and a network of Sun Sparc workstations. Performance data for other applications that were developed are provided: namely test pattern generation for sequential circuits, parallel logic synthesis, and standard cell placement.
The development of a scalable parallel 3-D CFD algorithm for turbomachinery. M.S. Thesis Final Report

NASA Technical Reports Server (NTRS)

Luke, Edward Allen

1993-01-01

Two algorithms capable of computing a transonic 3-D inviscid flow field about rotating machines are considered for parallel implementation. During the study of these algorithms, a significant new method of measuring the performance of parallel algorithms is developed. The theory that supports this new method creates an empirical definition of scalable parallel algorithms that is used to produce quantifiable evidence that a scalable parallel application was developed. The implementation of the parallel application and an automated domain decomposition tool are also discussed.
The OpenMP Implementation of NAS Parallel Benchmarks and its Performance

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Frumkin, Michael; Yan, Jerry

1999-01-01

As the new ccNUMA architecture became popular in recent years, parallel programming with compiler directives on these machines has evolved to accommodate new needs. In this study, we examine the effectiveness of OpenMP directives for parallelizing the NAS Parallel Benchmarks. Implementation details will be discussed and performance will be compared with the MPI implementation. We have demonstrated that OpenMP can achieve very good results for parallelization on a shared memory system, but effective use of memory and cache is very important.
Parallelization and automatic data distribution for nuclear reactor simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liebrock, L.M.

1997-07-01

Detailed attempts at realistic nuclear reactor simulations currently take many times real time to execute on high performance workstations. Even the fastest sequential machine can not run these simulations fast enough to ensure that the best corrective measure is used during a nuclear accident to prevent a minor malfunction from becoming a major catastrophe. Since sequential computers have nearly reached the speed of light barrier, these simulations will have to be run in parallel to make significant improvements in speed. In physical reactor plants, parallelism abounds. Fluids flow, controls change, and reactions occur in parallel with only adjacent components directlymore » affecting each other. These do not occur in the sequentialized manner, with global instantaneous effects, that is often used in simulators. Development of parallel algorithms that more closely approximate the real-world operation of a reactor may, in addition to speeding up the simulations, actually improve the accuracy and reliability of the predictions generated. Three types of parallel architecture (shared memory machines, distributed memory multicomputers, and distributed networks) are briefly reviewed as targets for parallelization of nuclear reactor simulation. Various parallelization models (loop-based model, shared memory model, functional model, data parallel model, and a combined functional and data parallel model) are discussed along with their advantages and disadvantages for nuclear reactor simulation. A variety of tools are introduced for each of the models. Emphasis is placed on the data parallel model as the primary focus for two-phase flow simulation. Tools to support data parallel programming for multiple component applications and special parallelization considerations are also discussed.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)

McCaskey, Alexander J.

There is a lack of state-of-the-art HPC simulation tools for simulating general quantum computing. Furthermore, there are no real software tools that integrate current quantum computers into existing classical HPC workflows. This product, the Quantum Virtual Machine (QVM), solves this problem by providing an extensible framework for pluggable virtual, or physical, quantum processing units (QPUs). It enables the execution of low level quantum assembly codes and returns the results of such executions.
The Fluke Security Project

DTIC Science & Technology

2000-04-01

be an extension of Utah’s nascent Quarks system, oriented to closely coupled cluster environments. However, the grant did not actually begin until... Intel x86, implemented ten virtual machine monitors and servers, including a virtual memory manager, a checkpointer, a process manager, a file server...Fluke, we developed a novel hierarchical processor scheduling frame- work called CPU inheritance scheduling [5]. This is a framework for scheduling
Abstract quantum computing machines and quantum computational logics

NASA Astrophysics Data System (ADS)

Chiara, Maria Luisa Dalla; Giuntini, Roberto; Sergioli, Giuseppe; Leporini, Roberto

2016-06-01

Classical and quantum parallelism are deeply different, although it is sometimes claimed that quantum Turing machines are nothing but special examples of classical probabilistic machines. We introduce the concepts of deterministic state machine, classical probabilistic state machine and quantum state machine. On this basis, we discuss the question: To what extent can quantum state machines be simulated by classical probabilistic state machines? Each state machine is devoted to a single task determined by its program. Real computers, however, behave differently, being able to solve different kinds of problems. This capacity can be modeled, in the quantum case, by the mathematical notion of abstract quantum computing machine, whose different programs determine different quantum state machines. The computations of abstract quantum computing machines can be linguistically described by the formulas of a particular form of quantum logic, termed quantum computational logic.
Design, development and use of the finite element machine

NASA Technical Reports Server (NTRS)

Adams, L. M.; Voigt, R. C.

1983-01-01

Some of the considerations that went into the design of the Finite Element Machine, a research asynchronous parallel computer are described. The present status of the system is also discussed along with some indication of the type of results that were obtained.
The Integration of CloudStack and OCCI/OpenNebula with DIRAC

NASA Astrophysics Data System (ADS)

Méndez Muñoz, Víctor; Fernández Albor, Víctor; Graciani Diaz, Ricardo; Casajús Ramo, Adriàn; Fernández Pena, Tomás; Merino Arévalo, Gonzalo; José Saborido Silva, Juan

2012-12-01

The increasing availability of Cloud resources is arising as a realistic alternative to the Grid as a paradigm for enabling scientific communities to access large distributed computing resources. The DIRAC framework for distributed computing is an easy way to efficiently access to resources from both systems. This paper explains the integration of DIRAC with two open-source Cloud Managers: OpenNebula (taking advantage of the OCCI standard) and CloudStack. These are computing tools to manage the complexity and heterogeneity of distributed data center infrastructures, allowing to create virtual clusters on demand, including public, private and hybrid clouds. This approach has required to develop an extension to the previous DIRAC Virtual Machine engine, which was developed for Amazon EC2, allowing the connection with these new cloud managers. In the OpenNebula case, the development has been based on the CernVM Virtual Software Appliance with appropriate contextualization, while in the case of CloudStack, the infrastructure has been kept more general, which permits other Virtual Machine sources and operating systems being used. In both cases, CernVM File System has been used to facilitate software distribution to the computing nodes. With the resulting infrastructure, the cloud resources are transparent to the users through a friendly interface, like the DIRAC Web Portal. The main purpose of this integration is to get a system that can manage cloud and grid resources at the same time. This particular feature pushes DIRAC to a new conceptual denomination as interware, integrating different middleware. Users from different communities do not need to care about the installation of the standard software that is available at the nodes, nor the operating system of the host machine which is transparent to the user. This paper presents an analysis of the overhead of the virtual layer, doing some tests to compare the proposed approach with the existing Grid solution. License Notice: Published under licence in Journal of Physics: Conference Series by IOP Publishing Ltd.

Generating Contextual Descriptions of Virtual Reality (VR) Spaces

NASA Astrophysics Data System (ADS)

Olson, D. M.; Zaman, C. H.; Sutherland, A.

2017-12-01

Virtual reality holds great potential for science communication, education, and research. However, interfaces for manipulating data and environments in virtual worlds are limited and idiosyncratic. Furthermore, speech and vision are the primary modalities by which humans collect information about the world, but the linking of visual and natural language domains is a relatively new pursuit in computer vision. Machine learning techniques have been shown to be effective at image and speech classification, as well as at describing images with language (Karpathy 2016), but have not yet been used to describe potential actions. We propose a technique for creating a library of possible context-specific actions associated with 3D objects in immersive virtual worlds based on a novel dataset generated natively in virtual reality containing speech, image, gaze, and acceleration data. We will discuss the design and execution of a user study in virtual reality that enabled the collection and the development of this dataset. We will also discuss the development of a hybrid machine learning algorithm linking vision data with environmental affordances in natural language. Our findings demonstrate that it is possible to develop a model which can generate interpretable verbal descriptions of possible actions associated with recognized 3D objects within immersive VR environments. This suggests promising applications for more intuitive user interfaces through voice interaction within 3D environments. It also demonstrates the potential to apply vast bodies of embodied and semantic knowledge to enrich user interaction within VR environments. This technology would allow for applications such as expert knowledge annotation of 3D environments, complex verbal data querying and object manipulation in virtual spaces, and computer-generated, dynamic 3D object affordances and functionality during simulations.
The perception of spatial layout in real and virtual worlds.

PubMed

Arthur, E J; Hancock, P A; Chrysler, S T

1997-01-01

As human-machine interfaces grow more immersive and graphically-oriented, virtual environment systems become more prominent as the medium for human-machine communication. Often, virtual environments (VE) are built to provide exact metrical representations of existing or proposed physical spaces. However, it is not known how individuals develop representational models of these spaces in which they are immersed and how those models may be distorted with respect to both the virtual and real-world equivalents. To evaluate the process of model development, the present experiment examined participant's ability to reproduce a complex spatial layout of objects having experienced them previously under different viewing conditions. The layout consisted of nine common objects arranged on a flat plane. These objects could be viewed in a free binocular virtual condition, a free binocular real-world condition, and in a static monocular view of the real world. The first two allowed active exploration of the environment while the latter condition allowed the participant only a passive opportunity to observe from a single viewpoint. Viewing conditions were a between-subject variable with 10 participants randomly assigned to each condition. Performance was assessed using mapping accuracy and triadic comparisons of relative inter-object distances. Mapping results showed a significant effect of viewing condition where, interestingly, the static monocular condition was superior to both the active virtual and real binocular conditions. Results for the triadic comparisons showed a significant interaction for gender by viewing condition in which males were more accurate than females. These results suggest that the situation model resulting from interaction with a virtual environment was indistinguishable from interaction with real objects at least within the constraints of the present procedure.
Direct Machining of Low-Loss THz Waveguide Components With an RF Choke.

PubMed

Lewis, Samantha M; Nanni, Emilio A; Temkin, Richard J

2014-12-01

We present results for the successful fabrication of low-loss THz metallic waveguide components using direct machining with a CNC end mill. The approach uses a split-block machining process with the addition of an RF choke running parallel to the waveguide. The choke greatly reduces coupling to the parasitic mode of the parallel-plate waveguide produced by the split-block. This method has demonstrated loss as low as 0.2 dB/cm at 280 GHz for a copper WR-3 waveguide. It has also been used in the fabrication of 3 and 10 dB directional couplers in brass, demonstrating excellent agreement with design simulations from 240-260 GHz. The method may be adapted to structures with features on the order of 200 μm.
Efficient bounding schemes for the two-center hybrid flow shop scheduling problem with removal times.

PubMed

Hidri, Lotfi; Gharbi, Anis; Louly, Mohamed Aly

2014-01-01

We focus on the two-center hybrid flow shop scheduling problem with identical parallel machines and removal times. The job removal time is the required duration to remove it from a machine after its processing. The objective is to minimize the maximum completion time (makespan). A heuristic and a lower bound are proposed for this NP-Hard problem. These procedures are based on the optimal solution of the parallel machine scheduling problem with release dates and delivery times. The heuristic is composed of two phases. The first one is a constructive phase in which an initial feasible solution is provided, while the second phase is an improvement one. Intensive computational experiments have been conducted to confirm the good performance of the proposed procedures.
Efficient Bounding Schemes for the Two-Center Hybrid Flow Shop Scheduling Problem with Removal Times

PubMed Central

2014-01-01

We focus on the two-center hybrid flow shop scheduling problem with identical parallel machines and removal times. The job removal time is the required duration to remove it from a machine after its processing. The objective is to minimize the maximum completion time (makespan). A heuristic and a lower bound are proposed for this NP-Hard problem. These procedures are based on the optimal solution of the parallel machine scheduling problem with release dates and delivery times. The heuristic is composed of two phases. The first one is a constructive phase in which an initial feasible solution is provided, while the second phase is an improvement one. Intensive computational experiments have been conducted to confirm the good performance of the proposed procedures. PMID:25610911
Automatic recognition of vector and parallel operations in a higher level language

NASA Technical Reports Server (NTRS)

Schneck, P. B.

1971-01-01

A compiler for recognizing statements of a FORTRAN program which are suited for fast execution on a parallel or pipeline machine such as Illiac-4, Star or ASC is described. The technique employs interval analysis to provide flow information to the vector/parallel recognizer. Where profitable the compiler changes scalar variables to subscripted variables. The output of the compiler is an extension to FORTRAN which shows parallel and vector operations explicitly.
The QCDSP project —a status report

NASA Astrophysics Data System (ADS)

Chen, Dong; Chen, Ping; Christ, Norman; Edwards, Robert; Fleming, George; Gara, Alan; Hansen, Sten; Jung, Chulwoo; Kaehler, Adrian; Kasow, Steven; Kennedy, Anthony; Kilcup, Gregory; Luo, Yubin; Malureanu, Catalin; Mawhinney, Robert; Parsons, John; Sexton, James; Sui, Chengzhong; Vranas, Pavlos

1998-01-01

We give a brief overview of the massively parallel computer project underway for nearly the past four years, centered at Columbia University. A 6 Gflops and a 50 Gflops machine are presently being debugged for installation at OSU and SCRI respectively, while a 0.4 Tflops machine is under construction for Columbia and a 0.6 Tflops machine is planned for the new RIKEN Brookhaven Research Center.
Virtual Sensor for Kinematic Estimation of Flexible Links in Parallel Robots

PubMed Central

Cabanes, Itziar; Mancisidor, Aitziber; Pinto, Charles

2017-01-01

The control of flexible link parallel manipulators is still an open area of research, endpoint trajectory tracking being one of the main challenges in this type of robot. The flexibility and deformations of the limbs make the estimation of the Tool Centre Point (TCP) position a challenging one. Authors have proposed different approaches to estimate this deformation and deduce the location of the TCP. However, most of these approaches require expensive measurement systems or the use of high computational cost integration methods. This work presents a novel approach based on a virtual sensor which can not only precisely estimate the deformation of the flexible links in control applications (less than 2% error), but also its derivatives (less than 6% error in velocity and 13% error in acceleration) according to simulation results. The validity of the proposed Virtual Sensor is tested in a Delta Robot, where the position of the TCP is estimated based on the Virtual Sensor measurements with less than a 0.03% of error in comparison with the flexible approach developed in ADAMS Multibody Software. PMID:28832510
Performance of machine-learning scoring functions in structure-based virtual screening.

PubMed

Wójcikowski, Maciej; Ballester, Pedro J; Siedlecki, Pawel

2017-04-25

Classical scoring functions have reached a plateau in their performance in virtual screening and binding affinity prediction. Recently, machine-learning scoring functions trained on protein-ligand complexes have shown great promise in small tailored studies. They have also raised controversy, specifically concerning model overfitting and applicability to novel targets. Here we provide a new ready-to-use scoring function (RF-Score-VS) trained on 15 426 active and 893 897 inactive molecules docked to a set of 102 targets. We use the full DUD-E data sets along with three docking tools, five classical and three machine-learning scoring functions for model building and performance assessment. Our results show RF-Score-VS can substantially improve virtual screening performance: RF-Score-VS top 1% provides 55.6% hit rate, whereas that of Vina only 16.2% (for smaller percent the difference is even more encouraging: RF-Score-VS top 0.1% achieves 88.6% hit rate for 27.5% using Vina). In addition, RF-Score-VS provides much better prediction of measured binding affinity than Vina (Pearson correlation of 0.56 and -0.18, respectively). Lastly, we test RF-Score-VS on an independent test set from the DEKOIS benchmark and observed comparable results. We provide full data sets to facilitate further research in this area (http://github.com/oddt/rfscorevs) as well as ready-to-use RF-Score-VS (http://github.com/oddt/rfscorevs_binary).
Multiphase complete exchange on Paragon, SP2 and CS-2

NASA Technical Reports Server (NTRS)

Bokhari, Shahid H.

1995-01-01

The overhead of interprocessor communication is a major factor in limiting the performance of parallel computer systems. The complete exchange is the severest communication pattern in that it requires each processor to send a distinct message to every other processor. This pattern is at the heart of many important parallel applications. On hypercubes, multiphase complete exchange has been developed and shown to provide optimal performance over varying message sizes. Most commercial multicomputer systems do not have a hypercube interconnect. However, they use special purpose hardware and dedicated communication processors to achieve very high performance communication and can be made to emulate the hypercube quite well. Multiphase complete exchange has been implemented on three contemporary parallel architectures: the Intel Paragon, IBM SP2 and Meiko CS-2. The essential features of these machines are described and their basic interprocessor communication overheads are discussed. The performance of multiphase complete exchange is evaluated on each machine. It is shown that the theoretical ideas developed for hypercubes are also applicable in practice to these machines and that multiphase complete exchange can lead to major savings in execution time over traditional solutions.
Bridging Realty to Virtual Reality: Investigating Gender Effect and Student Engagement on Learning through Video Game Play in an Elementary School Classroom

ERIC Educational Resources Information Center

Annetta, Leonard; Mangrum, Jennifer; Holmes, Shawn; Collazo, Kimberly; Cheng, Meng-Tzu

2009-01-01

The purpose of this study was to examine students' learning of simple machines, a fifth-grade (ages 10-11) forces and motion unit, and student engagement using a teacher-created Multiplayer Educational Gaming Application. This mixed-method study collected pre-test/post-test results to determine student knowledge about simple machines. A survey…
Buffered coscheduling for parallel programming and enhanced fault tolerance

DOEpatents

Petrini, Fabrizio [Los Alamos, NM; Feng, Wu-chun [Los Alamos, NM

2006-01-31

A computer implemented method schedules processor jobs on a network of parallel machine processors or distributed system processors. Control information communications generated by each process performed by each processor during a defined time interval is accumulated in buffers, where adjacent time intervals are separated by strobe intervals for a global exchange of control information. A global exchange of the control information communications at the end of each defined time interval is performed during an intervening strobe interval so that each processor is informed by all of the other processors of the number of incoming jobs to be received by each processor in a subsequent time interval. The buffered coscheduling method of this invention also enhances the fault tolerance of a network of parallel machine processors or distributed system processors
Architecture and Key Techniques of Augmented Reality Maintenance Guiding System for Civil Aircrafts

NASA Astrophysics Data System (ADS)

hong, Zhou; Wenhua, Lu

2017-01-01

Augmented reality technology is introduced into the maintenance related field for strengthened information in real-world scenarios through integration of virtual assistant maintenance information with real-world scenarios. This can lower the difficulty of maintenance, reduce maintenance errors, and improve the maintenance efficiency and quality of civil aviation crews. Architecture of augmented reality virtual maintenance guiding system is proposed on the basis of introducing the definition of augmented reality and analyzing the characteristics of augmented reality virtual maintenance. Key techniques involved, such as standardization and organization of maintenance data, 3D registration, modeling of maintenance guidance information and virtual maintenance man-machine interaction, are elaborated emphatically, and solutions are given.
Achieving High Resolution Timer Events in Virtualized Environment

PubMed Central

Adamczyk, Blazej; Chydzinski, Andrzej

2015-01-01

Virtual Machine Monitors (VMM) have become popular in different application areas. Some applications may require to generate the timer events with high resolution and precision. This however may be challenging due to the complexity of VMMs. In this paper we focus on the timer functionality provided by five different VMMs—Xen, KVM, Qemu, VirtualBox and VMWare. Firstly, we evaluate resolutions and precisions of their timer events. Apparently, provided resolutions and precisions are far too low for some applications (e.g. networking applications with the quality of service). Then, using Xen virtualization we demonstrate the improved timer design that greatly enhances both the resolution and precision of achieved timer events. PMID:26177366
A computer-based training system combining virtual reality and multimedia

NASA Technical Reports Server (NTRS)

Stansfield, Sharon A.

1993-01-01

Training new users of complex machines is often an expensive and time-consuming process. This is particularly true for special purpose systems, such as those frequently encountered in DOE applications. This paper presents a computer-based training system intended as a partial solution to this problem. The system extends the basic virtual reality (VR) training paradigm by adding a multimedia component which may be accessed during interaction with the virtual environment. The 3D model used to create the virtual reality is also used as the primary navigation tool through the associated multimedia. This method exploits the natural mapping between a virtual world and the real world that it represents to provide a more intuitive way for the student to interact with all forms of information about the system.
DIRAC universal pilots

NASA Astrophysics Data System (ADS)

Stagni, F.; McNab, A.; Luzzi, C.; Krzemien, W.; Consortium, DIRAC

2017-10-01

In the last few years, new types of computing models, such as IAAS (Infrastructure as a Service) and IAAC (Infrastructure as a Client), gained popularity. New resources may come as part of pledged resources, while others are in the form of opportunistic ones. Most but not all of these new infrastructures are based on virtualization techniques. In addition, some of them, present opportunities for multi-processor computing slots to the users. Virtual Organizations are therefore facing heterogeneity of the available resources and the use of an Interware software like DIRAC to provide the transparent, uniform interface has become essential. The transparent access to the underlying resources is realized by implementing the pilot model. DIRAC’s newest generation of generic pilots (the so-called Pilots 2.0) are the “pilots for all the skies”, and have been successfully released in production more than a year ago. They use a plugin mechanism that makes them easily adaptable. Pilots 2.0 have been used for fetching and running jobs on every type of resource, being it a Worker Node (WN) behind a CREAM/ARC/HTCondor/DIRAC Computing element, a Virtual Machine running on IaaC infrastructures like Vac or BOINC, on IaaS cloud resources managed by Vcycle, the LHCb High Level Trigger farm nodes, and any type of opportunistic computing resource. Make a machine a “Pilot Machine”, and all diversities between them will disappear. This contribution describes how pilots are made suitable for different resources, and the recent steps taken towards a fully unified framework, including monitoring. Also, the cases of multi-processor computing slots either on real or virtual machines, with the whole node or a partition of it, is discussed.
A virtual simulator designed for collision prevention in proton therapy.

PubMed

Jung, Hyunuk; Kum, Oyeon; Han, Youngyih; Park, Hee Chul; Kim, Jin Sung; Choi, Doo Ho

2015-10-01

In proton therapy, collisions between the patient and nozzle potentially occur because of the large nozzle structure and efforts to minimize the air gap. Thus, software was developed to predict such collisions between the nozzle and patient using treatment virtual simulation. Three-dimensional (3D) modeling of a gantry inner-floor, nozzle, and robotic-couch was performed using SolidWorks based on the manufacturer's machine data. To obtain patient body information, a 3D-scanner was utilized right before CT scanning. Using the acquired images, a 3D-image of the patient's body contour was reconstructed. The accuracy of the image was confirmed against the CT image of a humanoid phantom. The machine components and the virtual patient were combined on the treatment-room coordinate system, resulting in a virtual simulator. The simulator simulated the motion of its components such as rotation and translation of the gantry, nozzle, and couch in real scale. A collision, if any, was examined both in static and dynamic modes. The static mode assessed collisions only at fixed positions of the machine's components, while the dynamic mode operated any time a component was in motion. A collision was identified if any voxels of two components, e.g., the nozzle and the patient or couch, overlapped when calculating volume locations. The event and collision point were visualized, and collision volumes were reported. All components were successfully assembled, and the motions were accurately controlled. The 3D-shape of the phantom agreed with CT images within a deviation of 2 mm. Collision situations were simulated within minutes, and the results were displayed and reported. The developed software will be useful in improving patient safety and clinical efficiency of proton therapy.
Optical Symbolic Computing

NASA Astrophysics Data System (ADS)

Neff, John A.

1989-12-01

Experiments originating from Gestalt psychology have shown that representing information in a symbolic form provides a more effective means to understanding. Computer scientists have been struggling for the last two decades to determine how best to create, manipulate, and store collections of symbolic structures. In the past, much of this struggling led to software innovations because that was the path of least resistance. For example, the development of heuristics for organizing the searching through knowledge bases was much less expensive than building massively parallel machines that could search in parallel. That is now beginning to change with the emergence of parallel architectures which are showing the potential for handling symbolic structures. This paper will review the relationships between symbolic computing and parallel computing architectures, and will identify opportunities for optics to significantly impact the performance of such computing machines. Although neural networks are an exciting subset of massively parallel computing structures, this paper will not touch on this area since it is receiving a great deal of attention in the literature. That is, the concepts presented herein do not consider the distributed representation of knowledge.
File-access characteristics of parallel scientific workloads

NASA Technical Reports Server (NTRS)

Nieuwejaar, Nils; Kotz, David; Purakayastha, Apratim; Best, Michael; Ellis, Carla Schlatter

1995-01-01

Phenomenal improvements in the computational performance of multiprocessors have not been matched by comparable gains in I/O system performance. This imbalance has resulted in I/O becoming a significant bottleneck for many scientific applications. One key to overcoming this bottleneck is improving the performance of parallel file systems. The design of a high-performance parallel file system requires a comprehensive understanding of the expected workload. Unfortunately, until recently, no general workload studies of parallel file systems have been conducted. The goal of the CHARISMA project was to remedy this problem by characterizing the behavior of several production workloads, on different machines, at the level of individual reads and writes. The first set of results from the CHARISMA project describe the workloads observed on an Intel iPSC/860 and a Thinking Machines CM-5. This paper is intended to compare and contrast these two workloads for an understanding of their essential similarities and differences, isolating common trends and platform-dependent variances. Using this comparison, we are able to gain more insight into the general principles that should guide parallel file-system design.
How can machine-learning methods assist in virtual screening for hyperuricemia? A healthcare machine-learning approach.

PubMed

Ichikawa, Daisuke; Saito, Toki; Ujita, Waka; Oyama, Hiroshi

2016-12-01

Our purpose was to develop a new machine-learning approach (a virtual health check-up) toward identification of those at high risk of hyperuricemia. Applying the system to general health check-ups is expected to reduce medical costs compared with administering an additional test. Data were collected during annual health check-ups performed in Japan between 2011 and 2013 (inclusive). We prepared training and test datasets from the health check-up data to build prediction models; these were composed of 43,524 and 17,789 persons, respectively. Gradient-boosting decision tree (GBDT), random forest (RF), and logistic regression (LR) approaches were trained using the training dataset and were then used to predict hyperuricemia in the test dataset. Undersampling was applied to build the prediction models to deal with the imbalanced class dataset. The results showed that the RF and GBDT approaches afforded the best performances in terms of sensitivity and specificity, respectively. The area under the curve (AUC) values of the models, which reflected the total discriminative ability of the classification, were 0.796 [95% confidence interval (CI): 0.766-0.825] for the GBDT, 0.784 [95% CI: 0.752-0.815] for the RF, and 0.785 [95% CI: 0.752-0.819] for the LR approaches. No significant differences were observed between pairs of each approach. Small changes occurred in the AUCs after applying undersampling to build the models. We developed a virtual health check-up that predicted the development of hyperuricemia using machine-learning methods. The GBDT, RF, and LR methods had similar predictive capability. Undersampling did not remarkably improve predictive power. Copyright Â© 2016 Elsevier Inc. All rights reserved.

The HARNESS Workbench: Unified and Adaptive Access to Diverse HPC Platforms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sunderam, Vaidy S.

2012-03-20

The primary goal of the Harness WorkBench (HWB) project is to investigate innovative software environments that will help enhance the overall productivity of applications science on diverse HPC platforms. Two complementary frameworks were designed: one, a virtualized command toolkit for application building, deployment, and execution, that provides a common view across diverse HPC systems, in particular the DOE leadership computing platforms (Cray, IBM, SGI, and clusters); and two, a unified runtime environment that consolidates access to runtime services via an adaptive framework for execution-time and post processing activities. A prototype of the first was developed based on the concept ofmore » a 'system-call virtual machine' (SCVM), to enhance portability of the HPC application deployment process across heterogeneous high-end machines. The SCVM approach to portable builds is based on the insertion of toolkit-interpretable directives into original application build scripts. Modifications resulting from these directives preserve the semantics of the original build instruction flow. The execution of the build script is controlled by our toolkit that intercepts build script commands in a manner transparent to the end-user. We have applied this approach to a scientific production code (Gamess-US) on the Cray-XT5 machine. The second facet, termed Unibus, aims to facilitate provisioning and aggregation of multifaceted resources from resource providers and end-users perspectives. To achieve that, Unibus proposes a Capability Model and mediators (resource drivers) to virtualize access to diverse resources, and soft and successive conditioning to enable automatic and user-transparent resource provisioning. A proof of concept implementation has demonstrated the viability of this approach on high end machines, grid systems and computing clouds.« less
Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach.

PubMed

Gómez-Bombarelli, Rafael; Aguilera-Iparraguirre, Jorge; Hirzel, Timothy D; Duvenaud, David; Maclaurin, Dougal; Blood-Forsythe, Martin A; Chae, Hyun Sik; Einzinger, Markus; Ha, Dong-Gwang; Wu, Tony; Markopoulos, Georgios; Jeon, Soonok; Kang, Hosuk; Miyazaki, Hiroshi; Numata, Masaki; Kim, Sunghan; Huang, Wenliang; Hong, Seong Ik; Baldo, Marc; Adams, Ryan P; Aspuru-Guzik, Alán

2016-10-01

Virtual screening is becoming a ground-breaking tool for molecular discovery due to the exponential growth of available computer time and constant improvement of simulation and machine learning techniques. We report an integrated organic functional material design process that incorporates theoretical insight, quantum chemistry, cheminformatics, machine learning, industrial expertise, organic synthesis, molecular characterization, device fabrication and optoelectronic testing. After exploring a search space of 1.6 million molecules and screening over 400,000 of them using time-dependent density functional theory, we identified thousands of promising novel organic light-emitting diode molecules across the visible spectrum. Our team collaboratively selected the best candidates from this set. The experimentally determined external quantum efficiencies for these synthesized candidates were as large as 22%.
Using a Virtual Tablet Machine to Improve Student Understanding of the Complex Processes Involved in Tablet Manufacturing.

PubMed

Mattsson, Sofia; Sjöström, Hans-Erik; Englund, Claire

2016-06-25

Objective. To develop and implement a virtual tablet machine simulation to aid distance students' understanding of the processes involved in tablet production. Design. A tablet simulation was created enabling students to study the effects different parameters have on the properties of the tablet. Once results were generated, students interpreted and explained them on the basis of current theory. Assessment. The simulation was evaluated using written questionnaires and focus group interviews. Students appreciated the exercise and considered it to be motivational. Students commented that they found the simulation, together with the online seminar and the writing of the report, was beneficial for their learning process. Conclusion. According to students' perceptions, the use of the tablet simulation contributed to their understanding of the compaction process.
Using a Virtual Tablet Machine to Improve Student Understanding of the Complex Processes Involved in Tablet Manufacturing

PubMed Central

Sjöström, Hans-Erik; Englund, Claire

2016-01-01

Objective. To develop and implement a virtual tablet machine simulation to aid distance students’ understanding of the processes involved in tablet production. Design. A tablet simulation was created enabling students to study the effects different parameters have on the properties of the tablet. Once results were generated, students interpreted and explained them on the basis of current theory. Assessment. The simulation was evaluated using written questionnaires and focus group interviews. Students appreciated the exercise and considered it to be motivational. Students commented that they found the simulation, together with the online seminar and the writing of the report, was beneficial for their learning process. Conclusion. According to students’ perceptions, the use of the tablet simulation contributed to their understanding of the compaction process. PMID:27402990
Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach

NASA Astrophysics Data System (ADS)

Gómez-Bombarelli, Rafael; Aguilera-Iparraguirre, Jorge; Hirzel, Timothy D.; Duvenaud, David; MacLaurin, Dougal; Blood-Forsythe, Martin A.; Chae, Hyun Sik; Einzinger, Markus; Ha, Dong-Gwang; Wu, Tony; Markopoulos, Georgios; Jeon, Soonok; Kang, Hosuk; Miyazaki, Hiroshi; Numata, Masaki; Kim, Sunghan; Huang, Wenliang; Hong, Seong Ik; Baldo, Marc; Adams, Ryan P.; Aspuru-Guzik, Alán

2016-10-01

Virtual screening is becoming a ground-breaking tool for molecular discovery due to the exponential growth of available computer time and constant improvement of simulation and machine learning techniques. We report an integrated organic functional material design process that incorporates theoretical insight, quantum chemistry, cheminformatics, machine learning, industrial expertise, organic synthesis, molecular characterization, device fabrication and optoelectronic testing. After exploring a search space of 1.6 million molecules and screening over 400,000 of them using time-dependent density functional theory, we identified thousands of promising novel organic light-emitting diode molecules across the visible spectrum. Our team collaboratively selected the best candidates from this set. The experimentally determined external quantum efficiencies for these synthesized candidates were as large as 22%.
Interleaved EPI diffusion imaging using SPIRiT-based reconstruction with virtual coil compression.

PubMed

Dong, Zijing; Wang, Fuyixue; Ma, Xiaodong; Zhang, Zhe; Dai, Erpeng; Yuan, Chun; Guo, Hua

2018-03-01

To develop a novel diffusion imaging reconstruction framework based on iterative self-consistent parallel imaging reconstruction (SPIRiT) for multishot interleaved echo planar imaging (iEPI), with computation acceleration by virtual coil compression. As a general approach for autocalibrating parallel imaging, SPIRiT improves the performance of traditional generalized autocalibrating partially parallel acquisitions (GRAPPA) methods in that the formulation with self-consistency is better conditioned, suggesting SPIRiT to be a better candidate in k-space-based reconstruction. In this study, a general SPIRiT framework is adopted to incorporate both coil sensitivity and phase variation information as virtual coils and then is applied to 2D navigated iEPI diffusion imaging. To reduce the reconstruction time when using a large number of coils and shots, a novel shot-coil compression method is proposed for computation acceleration in Cartesian sampling. Simulations and in vivo experiments were conducted to evaluate the performance of the proposed method. Compared with the conventional coil compression, the shot-coil compression achieved higher compression rates with reduced errors. The simulation and in vivo experiments demonstrate that the SPIRiT-based reconstruction outperformed the existing method, realigned GRAPPA, and provided superior images with reduced artifacts. The SPIRiT-based reconstruction with virtual coil compression is a reliable method for high-resolution iEPI diffusion imaging. Magn Reson Med 79:1525-1531, 2018. © 2017 International Society for Magnetic Resonance in Medicine. © 2017 International Society for Magnetic Resonance in Medicine.
A One-Year Case Study: Understanding the Rich Potential of Project-Based Learning in a Virtual Reality Class for High School Students

ERIC Educational Resources Information Center

Morales, Teresa M.; Bang, EunJin; Andre, Thomas

2013-01-01

This paper presents a qualitative case analysis of a new and unique, high school, student-directed, project-based learning (PBL), virtual reality (VR) class. In order to create projects, students learned, on an independent basis, how to program an industrial-level VR machine. A constraint was that students were required to produce at least one…
Can Science Education Research Give an Answer to Questions Posed by History of Science and Technology? The Case of Steam Engine's Measurement

ERIC Educational Resources Information Center

Kanderakis, Nikos E.

2009-01-01

According to the principle of virtual velocities, if on a simple machine in equilibrium we suppose a slight virtual movement, then the ratio of weights or forces equals the inverse ratio of velocities or displacements. The product of the weight raised or force applied multiplied by the height or displacement plays a central role there. British…
Methods For Self-Organizing Software

DOEpatents

Bouchard, Ann M.; Osbourn, Gordon C.

2005-10-18

A method for dynamically self-assembling and executing software is provided, containing machines that self-assemble execution sequences and data structures. In addition to ordered functions calls (found commonly in other software methods), mutual selective bonding between bonding sites of machines actuates one or more of the bonding machines. Two or more machines can be virtually isolated by a construct, called an encapsulant, containing a population of machines and potentially other encapsulants that can only bond with each other. A hierarchical software structure can be created using nested encapsulants. Multi-threading is implemented by populations of machines in different encapsulants that are interacting concurrently. Machines and encapsulants can move in and out of other encapsulants, thereby changing the functionality. Bonding between machines' sites can be deterministic or stochastic with bonding triggering a sequence of actions that can be implemented by each machine. A self-assembled execution sequence occurs as a sequence of stochastic binding between machines followed by their deterministic actuation. It is the sequence of bonding of machines that determines the execution sequence, so that the sequence of instructions need not be contiguous in memory.
Telescoping magnetic ball bar test gage

DOEpatents

Bryan, J.B.

1984-03-13

A telescoping magnetic ball bar test gage for determining the accuracy of machine tools, including robots, and those measuring machines having non-disengageable servo drives which cannot be clutched out is disclosed. Two gage balls are held and separated from one another by a telescoping fixture which allows them relative radial motional freedom but not relative lateral motional freedom. The telescoping fixture comprises a parallel reed flexure unit and a rigid member. One gage ball is secured by a magnetic socket knuckle assembly which fixes its center with respect to the machine being tested. The other gage ball is secured by another magnetic socket knuckle assembly which is engaged or held by the machine in such manner that the center of that ball is directed to execute a prescribed trajectory, all points of which are equidistant from the center of the fixed gage ball. As the moving ball executes its trajectory, changes in the radial distance between the centers of the two balls caused by inaccuracies in the machine are determined or measured by a linear variable differential transformer (LVDT) assembly actuated by the parallel reed flexure unit. Measurements can be quickly and easily taken for multiple trajectories about several different fixed ball locations, thereby determining the accuracy of the machine. 3 figs.
[Virtual microscopy in pathology teaching and postgraduate training (continuing education)].

PubMed

Sinn, H P; Andrulis, M; Mogler, C; Schirmacher, P

2008-11-01

As with conventional microscopy, virtual microscopy permits histological tissue sections to be viewed on a computer screen with a free choice of viewing areas and a wide range of magnifications. This, combined with the possibility of linking virtual microscopy to E-Learning courses, make virtual microscopy an ideal tool for teaching and postgraduate training in pathology. Uses of virtual microscopy in pathology teaching include blended learning with the presentation of digital teaching slides in the internet parallel to presentation in the histology lab, extending student access to histology slides beyond the lab. Other uses are student self-learning in the Internet, as well as the presentation of virtual slides in the classroom with or without replacing real microscopes. Successful integration of virtual microscopy depends on its embedding in the virtual classroom and the creation of interactive E-learning content. Applications derived from this include the use of virtual microscopy in video clips, podcasts, SCORM modules and the presentation of virtual microscopy using interactive whiteboards in the classroom.
Bit-parallel arithmetic in a massively-parallel associative processor

NASA Technical Reports Server (NTRS)

Scherson, Isaac D.; Kramer, David A.; Alleyne, Brian D.

1992-01-01

A simple but powerful new architecture based on a classical associative processor model is presented. Algorithms for performing the four basic arithmetic operations both for integer and floating point operands are described. For m-bit operands, the proposed architecture makes it possible to execute complex operations in O(m) cycles as opposed to O(m exp 2) for bit-serial machines. A word-parallel, bit-parallel, massively-parallel computing system can be constructed using this architecture with VLSI technology. The operation of this system is demonstrated for the fast Fourier transform and matrix multiplication.
Next Generation Parallelization Systems for Processing and Control of PDS Image Node Assets

NASA Astrophysics Data System (ADS)

Verma, R.

2017-06-01

We present next-generation parallelization tools to help Planetary Data System (PDS) Imaging Node (IMG) better monitor, process, and control changes to nearly 650 million file assets and over a dozen machines on which they are referenced or stored.
Block-Parallel Data Analysis with DIY2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morozov, Dmitriy; Peterka, Tom

DIY2 is a programming model and runtime for block-parallel analytics on distributed-memory machines. Its main abstraction is block-structured data parallelism: data are decomposed into blocks; blocks are assigned to processing elements (processes or threads); computation is described as iterations over these blocks, and communication between blocks is defined by reusable patterns. By expressing computation in this general form, the DIY2 runtime is free to optimize the movement of blocks between slow and fast memories (disk and flash vs. DRAM) and to concurrently execute blocks residing in memory with multiple threads. This enables the same program to execute in-core, out-of-core, serial,more » parallel, single-threaded, multithreaded, or combinations thereof. This paper describes the implementation of the main features of the DIY2 programming model and optimizations to improve performance. DIY2 is evaluated on benchmark test cases to establish baseline performance for several common patterns and on larger complete analysis codes running on large-scale HPC machines.« less
Statistical Analysis of NAS Parallel Benchmarks and LINPACK Results

NASA Technical Reports Server (NTRS)

Meuer, Hans-Werner; Simon, Horst D.; Strohmeier, Erich; Lasinski, T. A. (Technical Monitor)

1994-01-01

In the last three years extensive performance data have been reported for parallel machines both based on the NAS Parallel Benchmarks, and on LINPACK. In this study we have used the reported benchmark results and performed a number of statistical experiments using factor, cluster, and regression analyses. In addition to the performance results of LINPACK and the eight NAS parallel benchmarks, we have also included peak performance of the machine, and the LINPACK n and n(sub 1/2) values. Some of the results and observations can be summarized as follows: 1) All benchmarks are strongly correlated with peak performance. 2) LINPACK and EP have each a unique signature. 3) The remaining NPB can grouped into three groups as follows: (CG and IS), (LU and SP), and (MG, FT, and BT). Hence three (or four with EP) benchmarks are sufficient to characterize the overall NPB performance. Our poster presentation will follow a standard poster format, and will present the data of our statistical analysis in detail.
High Performance Active Database Management on a Shared-Nothing Parallel Processor

DTIC Science & Technology

1998-05-01

either stored or virtual. A stored node is like a materialized view. It actually contains the specified tuples. A virtual node is like a real view...90292-6695 DL-5 COLUMBIA UNIV/DEPT COMPUTER SCIENCi ATTN: OR GAIL £. KAISER 450 COMPUTER SCIENCE 3LDG 500 WEST 12ÖTH STRSET NEW YORK NY 10027
A Novel Artificial Bee Colony Approach of Live Virtual Machine Migration Policy Using Bayes Theorem

PubMed Central

Xu, Gaochao; Hu, Liang; Fu, Xiaodong

2013-01-01

Green cloud data center has become a research hotspot of virtualized cloud computing architecture. Since live virtual machine (VM) migration technology is widely used and studied in cloud computing, we have focused on the VM placement selection of live migration for power saving. We present a novel heuristic approach which is called PS-ABC. Its algorithm includes two parts. One is that it combines the artificial bee colony (ABC) idea with the uniform random initialization idea, the binary search idea, and Boltzmann selection policy to achieve an improved ABC-based approach with better global exploration's ability and local exploitation's ability. The other one is that it uses the Bayes theorem to further optimize the improved ABC-based process to faster get the final optimal solution. As a result, the whole approach achieves a longer-term efficient optimization for power saving. The experimental results demonstrate that PS-ABC evidently reduces the total incremental power consumption and better protects the performance of VM running and migrating compared with the existing research. It makes the result of live VM migration more high-effective and meaningful. PMID:24385877
A novel artificial bee colony approach of live virtual machine migration policy using Bayes theorem.

PubMed

Xu, Gaochao; Ding, Yan; Zhao, Jia; Hu, Liang; Fu, Xiaodong

2013-01-01

Green cloud data center has become a research hotspot of virtualized cloud computing architecture. Since live virtual machine (VM) migration technology is widely used and studied in cloud computing, we have focused on the VM placement selection of live migration for power saving. We present a novel heuristic approach which is called PS-ABC. Its algorithm includes two parts. One is that it combines the artificial bee colony (ABC) idea with the uniform random initialization idea, the binary search idea, and Boltzmann selection policy to achieve an improved ABC-based approach with better global exploration's ability and local exploitation's ability. The other one is that it uses the Bayes theorem to further optimize the improved ABC-based process to faster get the final optimal solution. As a result, the whole approach achieves a longer-term efficient optimization for power saving. The experimental results demonstrate that PS-ABC evidently reduces the total incremental power consumption and better protects the performance of VM running and migrating compared with the existing research. It makes the result of live VM migration more high-effective and meaningful.
Human Machine Interfaces for Teleoperators and Virtual Environments Conference

NASA Technical Reports Server (NTRS)

1990-01-01

In a teleoperator system the human operator senses, moves within, and operates upon a remote or hazardous environment by means of a slave mechanism (a mechanism often referred to as a teleoperator). In a virtual environment system the interactive human machine interface is retained but the slave mechanism and its environment are replaced by a computer simulation. Video is replaced by computer graphics. The auditory and force sensations imparted to the human operator are similarly computer generated. In contrast to a teleoperator system, where the purpose is to extend the operator's sensorimotor system in a manner that facilitates exploration and manipulation of the physical environment, in a virtual environment system, the purpose is to train, inform, alter, or study the human operator to modify the state of the computer and the information environment. A major application in which the human operator is the target is that of flight simulation. Although flight simulators have been around for more than a decade, they had little impact outside aviation presumably because the application was so specialized and so expensive.
Address tracing for parallel machines

NASA Technical Reports Server (NTRS)

Stunkel, Craig B.; Janssens, Bob; Fuchs, W. Kent

1991-01-01

Recently implemented parallel system address-tracing methods based on several metrics are surveyed. The issues specific to collection of traces for both shared and distributed memory parallel computers are highlighted. Five general categories of address-trace collection methods are examined: hardware-captured, interrupt-based, simulation-based, altered microcode-based, and instrumented program-based traces. The problems unique to shared memory and distributed memory multiprocessors are examined separately.

Mathematical defense method of networked servers with controlled remote backups

NASA Astrophysics Data System (ADS)

Kim, Song-Kyoo

2006-05-01

The networked server defense model is focused on reliability and availability in security respects. The (remote) backup servers are hooked up by VPN (Virtual Private Network) with high-speed optical network and replace broken main severs immediately. The networked server can be represent as "machines" and then the system deals with main unreliable, spare, and auxiliary spare machine. During vacation periods, when the system performs a mandatory routine maintenance, auxiliary machines are being used for back-ups; the information on the system is naturally delayed. Analog of the N-policy to restrict the usage of auxiliary machines to some reasonable quantity. The results are demonstrated in the network architecture by using the stochastic optimization techniques.
Default Parallels Plesk Panel Page

Science.gov Websites

services that small businesses want and need. Our software includes key building blocks of cloud service virtualized servers Service Provider Products ParallelsÂ® Automation Hosting, SaaS, and cloud computing , the leading hosting automation software. You see this page because there is no Web site at this
On Why It Is Impossible to Prove that the BDX90 Dispatcher Implements a Time-sharing System

NASA Technical Reports Server (NTRS)

Boyer, R. S.; Moore, J. S.

1983-01-01

The Software Implemented Fault Tolerance SIFT system, is written in PASCAL except for about a page of machine code. The SIFT system implements a small time sharing system in which PASCAL programs for separate application tasks are executed according to a schedule with real time constraints. The PASCAL language has no provision for handling the notion of an interrupt such as the B930 clock interrupt. The PASCAL language also lacks the notion of running a PASCAL subroutine for a given amount of time, suspending it, saving away the suspension, and later activating the suspension. Machine code was used to overcome these inadequacies of PASCAL. Code which handles clock interrupts and suspends processes is called a dispatcher. The time sharing/virtual machine idea is completely destroyed by the reconfiguration task. After termination of the reconfiguration task, the tasks run by the dispatcher have no relation to those run before reconfiguration. It is impossible to view the dispatcher as a time-sharing system implementing virtual BDX930s running concurrently when one process can wipe out the others.
Efficient diagonalization of the sparse matrices produced within the framework of the UK R-matrix molecular codes

NASA Astrophysics Data System (ADS)

Galiatsatos, P. G.; Tennyson, J.

2012-11-01

The most time consuming step within the framework of the UK R-matrix molecular codes is that of the diagonalization of the inner region Hamiltonian matrix (IRHM). Here we present the method that we follow to speed up this step. We use shared memory machines (SMM), distributed memory machines (DMM), the OpenMP directive based parallel language, the MPI function based parallel language, the sparse matrix diagonalizers ARPACK and PARPACK, a variation for real symmetric matrices of the official coordinate sparse matrix format and finally a parallel sparse matrix-vector product (PSMV). The efficient application of the previous techniques rely on two important facts: the sparsity of the matrix is large enough (more than 98%) and in order to get back converged results we need a small only part of the matrix spectrum.
A scheme for solving the plane-plane challenge in force measurements at the nanoscale.

PubMed

Siria, Alessandro; Huant, Serge; Auvert, Geoffroy; Comin, Fabio; Chevrier, Joel

2010-05-19

Non-contact interaction between two parallel flat surfaces is a central paradigm in sciences. This situation is the starting point for a wealth of different models: the capacitor description in electrostatics, hydrodynamic flow, thermal exchange, the Casimir force, direct contact study, third body confinement such as liquids or films of soft condensed matter. The control of parallelism is so demanding that no versatile single force machine in this geometry has been proposed so far. Using a combination of nanopositioning based on inertial motors, of microcrystal shaping with a focused-ion beam (FIB) and of accurate in situ and real-time control of surface parallelism with X-ray diffraction, we propose here a "gedanken" surface-force machine that should enable one to measure interactions between movable surfaces separated by gaps in the micrometer and nanometer ranges.
Virtualization of open-source secure web services to support data exchange in a pediatric critical care research network

PubMed Central

Sward, Katherine A; Newth, Christopher JL; Khemani, Robinder G; Cryer, Martin E; Thelen, Julie L; Enriquez, Rene; Shaoyu, Su; Pollack, Murray M; Harrison, Rick E; Meert, Kathleen L; Berg, Robert A; Wessel, David L; Shanley, Thomas P; Dalton, Heidi; Carcillo, Joseph; Jenkins, Tammara L; Dean, J Michael

2015-01-01

Objectives To examine the feasibility of deploying a virtual web service for sharing data within a research network, and to evaluate the impact on data consistency and quality. Material and Methods Virtual machines (VMs) encapsulated an open-source, semantically and syntactically interoperable secure web service infrastructure along with a shadow database. The VMs were deployed to 8 Collaborative Pediatric Critical Care Research Network Clinical Centers. Results Virtual web services could be deployed in hours. The interoperability of the web services reduced format misalignment from 56% to 1% and demonstrated that 99% of the data consistently transferred using the data dictionary and 1% needed human curation. Conclusions Use of virtualized open-source secure web service technology could enable direct electronic abstraction of data from hospital databases for research purposes. PMID:25796596
Scheduling of hybrid types of machines with two-machine flowshop as the first type and a single machine as the second type

NASA Astrophysics Data System (ADS)

Hsiao, Ming-Chih; Su, Ling-Huey

2018-02-01

This research addresses the problem of scheduling hybrid machine types, in which one type is a two-machine flowshop and another type is a single machine. A job is either processed on the two-machine flowshop or on the single machine. The objective is to determine a production schedule for all jobs so as to minimize the makespan. The problem is NP-hard since the two parallel machines problem was proved to be NP-hard. Simulated annealing algorithms are developed to solve the problem optimally. A mixed integer programming (MIP) is developed and used to evaluate the performance for two SAs. Computational experiments demonstrate the efficiency of the simulated annealing algorithms, the quality of the simulated annealing algorithms will also be reported.
Institute for Defense Analysis. Annual Report 1995.

DTIC Science & Technology

1995-01-01

staff have been involved in the community-wide development of MPI as well as in its application to specific NSA problems. 35 Parallel Groebner ...Basis Code — Symbolic Computing on Parallel Machines The Groebner basis method is a set of algorithms for reformulating very complex algebraic expres
Blocked inverted indices for exact clustering of large chemical spaces.

PubMed

Thiel, Philipp; Sach-Peltason, Lisa; Ottmann, Christian; Kohlbacher, Oliver

2014-09-22

The calculation of pairwise compound similarities based on fingerprints is one of the fundamental tasks in chemoinformatics. Methods for efficient calculation of compound similarities are of the utmost importance for various applications like similarity searching or library clustering. With the increasing size of public compound databases, exact clustering of these databases is desirable, but often computationally prohibitively expensive. We present an optimized inverted index algorithm for the calculation of all pairwise similarities on 2D fingerprints of a given data set. In contrast to other algorithms, it neither requires GPU computing nor yields a stochastic approximation of the clustering. The algorithm has been designed to work well with multicore architectures and shows excellent parallel speedup. As an application example of this algorithm, we implemented a deterministic clustering application, which has been designed to decompose virtual libraries comprising tens of millions of compounds in a short time on current hardware. Our results show that our implementation achieves more than 400 million Tanimoto similarity calculations per second on a common desktop CPU. Deterministic clustering of the available chemical space thus can be done on modern multicore machines within a few days.
Liberating Virtual Machines from Physical Boundaries through Execution Knowledge

DTIC Science & Technology

2015-12-01

trivial infrastructures such as VM distribution networks, clients need to wait for an extended period of time before launching a VM. In cloud settings...hardware support. MobiDesk [28] efficiently supports virtual desktops in mobile environments by decou- pling the user’s workload from host systems and...experiment set-up. VMs are migrated between a pair of source and destination hosts, which are connected through a backend 10 Gbps network for
Mapping, Awareness, and Virtualization Network Administrator Training Tool (MAVNATT) Architecture and Framework

DTIC Science & Technology

2015-06-01

unit may setup and teardown the entire tactical infrastructure multiple times per day. This tactical network administrator training is a critical...language and runs on Linux and Unix based systems. All provisioning is based around the Nagios Core application, a powerful backend solution for network...start up a large number of virtual machines quickly. CORE supports the simulation of fixed and mobile networks. CORE is open-source, written in Python
Humans and machines in space: The vision, the challenge, the payoff; Proceedings of the 29th Goddard Memorial Symposium, Washington, Mar. 14, 15, 1991

NASA Astrophysics Data System (ADS)

Johnson, Bradley; May, Gayle L.; Korn, Paula

The present conference discusses the currently envisioned goals of human-machine systems in spacecraft environments, prospects for human exploration of the solar system, and plausible methods for meeting human needs in space. Also discussed are the problems of human-machine interaction in long-duration space flights, remote medical systems for space exploration, the use of virtual reality for planetary exploration, the alliance between U.S. Antarctic and space programs, and the economic and educational impacts of the U.S. space program.
Fundamental Study about the Landscape Estimation and Analysis by CG

NASA Astrophysics Data System (ADS)

Nakashima, Yoshio; Miyagoshi, Takashi; Takamatsu, Mamoru; Sassa, Kazuhiro

In recent years, the color of advertising signboards or vending machines on the streets should be harmonized with the surrounding landscape. In this study, we investigated how the colors (red and white) of the vending machines virtually installed by CG would affect the traditional landscape. 20 subjects estimated landscape samples in Hida-Furukawa by the SD technique. The result of our experiment shows that the vending machines have great influence on the surrounding landscape. On the other hand, we have confirmed that they can harmonize with the circumference landscape by the color to use.
A Framework for Analyzing the Whole Body Surface Area from a Single View

PubMed Central

Doretto, Gianfranco; Adjeroh, Donald

2017-01-01

We present a virtual reality (VR) framework for the analysis of whole human body surface area. Usual methods for determining the whole body surface area (WBSA) are based on well known formulae, characterized by large errors when the subject is obese, or belongs to certain subgroups. For these situations, we believe that a computer vision approach can overcome these problems and provide a better estimate of this important body indicator. Unfortunately, using machine learning techniques to design a computer vision system able to provide a new body indicator that goes beyond the use of only body weight and height, entails a long and expensive data acquisition process. A more viable solution is to use a dataset composed of virtual subjects. Generating a virtual dataset allowed us to build a population with different characteristics (obese, underweight, age, gender). However, synthetic data might differ from a real scenario, typical of the physician’s clinic. For this reason we develop a new virtual environment to facilitate the analysis of human subjects in 3D. This framework can simulate the acquisition process of a real camera, making it easy to analyze and to create training data for machine learning algorithms. With this virtual environment, we can easily simulate the real setup of a clinic, where a subject is standing in front of a camera, or may assume a different pose with respect to the camera. We use this newly designated environment to analyze the whole body surface area (WBSA). In particular, we show that we can obtain accurate WBSA estimations with just one view, virtually enabling the possibility to use inexpensive depth sensors (e.g., the Kinect) for large scale quantification of the WBSA from a single view 3D map. PMID:28045895
Performance of machine-learning scoring functions in structure-based virtual screening

PubMed Central

Wójcikowski, Maciej; Ballester, Pedro J.; Siedlecki, Pawel

2017-01-01

Classical scoring functions have reached a plateau in their performance in virtual screening and binding affinity prediction. Recently, machine-learning scoring functions trained on protein-ligand complexes have shown great promise in small tailored studies. They have also raised controversy, specifically concerning model overfitting and applicability to novel targets. Here we provide a new ready-to-use scoring function (RF-Score-VS) trained on 15 426 active and 893 897 inactive molecules docked to a set of 102 targets. We use the full DUD-E data sets along with three docking tools, five classical and three machine-learning scoring functions for model building and performance assessment. Our results show RF-Score-VS can substantially improve virtual screening performance: RF-Score-VS top 1% provides 55.6% hit rate, whereas that of Vina only 16.2% (for smaller percent the difference is even more encouraging: RF-Score-VS top 0.1% achieves 88.6% hit rate for 27.5% using Vina). In addition, RF-Score-VS provides much better prediction of measured binding affinity than Vina (Pearson correlation of 0.56 and −0.18, respectively). Lastly, we test RF-Score-VS on an independent test set from the DEKOIS benchmark and observed comparable results. We provide full data sets to facilitate further research in this area (http://github.com/oddt/rfscorevs) as well as ready-to-use RF-Score-VS (http://github.com/oddt/rfscorevs_binary). PMID:28440302
Decentralized real-time simulation of forest machines

NASA Astrophysics Data System (ADS)

Freund, Eckhard; Adam, Frank; Hoffmann, Katharina; Rossmann, Juergen; Kraemer, Michael; Schluse, Michael

2000-10-01

To develop realistic forest machine simulators is a demanding task. A useful simulator has to provide a close- to-reality simulation of the forest environment as well as the simulation of the physics of the vehicle. Customers demand a highly realistic three dimensional forestry landscape and the realistic simulation of the complex motion of the vehicle even in rough terrain in order to be able to use the simulator for operator training under close-to- reality conditions. The realistic simulation of the vehicle, especially with the driver's seat mounted on a motion platform, greatly improves the effect of immersion into the virtual reality of a simulated forest and the achievable level of education of the driver. Thus, the connection of the real control devices of forest machines to the simulation system has to be supported, i.e. the real control devices like the joysticks or the board computer system to control the crane, the aggregate etc. Beyond, the fusion of the board computer system and the simulation system is realized by means of sensors, i.e. digital and analog signals. The decentralized system structure allows several virtual reality systems to evaluate and visualize the information of the control devices and the sensors. So, while the driver is practicing, the instructor can immerse into the same virtual forest to monitor the session from his own viewpoint. In this paper, we are describing the realized structure as well as the necessary software and hardware components and application experiences.
Multilevel Parallelization of AutoDock 4.2.

PubMed

Norgan, Andrew P; Coffman, Paul K; Kocher, Jean-Pierre A; Katzmann, David J; Sosa, Carlos P

2011-04-28

Virtual (computational) screening is an increasingly important tool for drug discovery. AutoDock is a popular open-source application for performing molecular docking, the prediction of ligand-receptor interactions. AutoDock is a serial application, though several previous efforts have parallelized various aspects of the program. In this paper, we report on a multi-level parallelization of AutoDock 4.2 (mpAD4). Using MPI and OpenMP, AutoDock 4.2 was parallelized for use on MPI-enabled systems and to multithread the execution of individual docking jobs. In addition, code was implemented to reduce input/output (I/O) traffic by reusing grid maps at each node from docking to docking. Performance of mpAD4 was examined on two multiprocessor computers. Using MPI with OpenMP multithreading, mpAD4 scales with near linearity on the multiprocessor systems tested. In situations where I/O is limiting, reuse of grid maps reduces both system I/O and overall screening time. Multithreading of AutoDock's Lamarkian Genetic Algorithm with OpenMP increases the speed of execution of individual docking jobs, and when combined with MPI parallelization can significantly reduce the execution time of virtual screens. This work is significant in that mpAD4 speeds the execution of certain molecular docking workloads and allows the user to optimize the degree of system-level (MPI) and node-level (OpenMP) parallelization to best fit both workloads and computational resources.
Multiprocessing the Sieve of Eratosthenes

NASA Technical Reports Server (NTRS)

Bokhari, S.

1986-01-01

The Sieve of Eratosthenes for finding prime numbers in recent years has seen much use as a benchmark algorithm for serial computers while its intrinsically parallel nature has gone largely unnoticed. The implementation of a parallel version of this algorithm for a real parallel computer, the Flex/32, is described and its performance discussed. It is shown that the algorithm is sensitive to several fundamental performance parameters of parallel machines, such as spawning time, signaling time, memory access, and overhead of process switching. Because of the nature of the algorithm, it is impossible to get any speedup beyond 4 or 5 processors unless some form of dynamic load balancing is employed. We describe the performance of our algorithm with and without load balancing and compare it with theoretical lower bounds and simulated results. It is straightforward to understand this algorithm and to check the final results. However, its efficient implementation on a real parallel machine requires thoughtful design, especially if dynamic load balancing is desired. The fundamental operations required by the algorithm are very simple: this means that the slightest overhead appears prominently in performance data. The Sieve thus serves not only as a very severe test of the capabilities of a parallel processor but is also an interesting challenge for the programmer.
A parallel algorithm for switch-level timing simulation on a hypercube multiprocessor

NASA Technical Reports Server (NTRS)

Rao, Hariprasad Nannapaneni

1989-01-01

The parallel approach to speeding up simulation is studied, specifically the simulation of digital LSI MOS circuitry on the Intel iPSC/2 hypercube. The simulation algorithm is based on RSIM, an event driven switch-level simulator that incorporates a linear transistor model for simulating digital MOS circuits. Parallel processing techniques based on the concepts of Virtual Time and rollback are utilized so that portions of the circuit may be simulated on separate processors, in parallel for as large an increase in speed as possible. A partitioning algorithm is also developed in order to subdivide the circuit for parallel processing.
Improved Learning Efficiency and Increased Student Collaboration through Use of Virtual Microscopy in the Teaching of Human Pathology

ERIC Educational Resources Information Center

Braun, Mark W.; Kearns, Katherine D.

2008-01-01

The implementation of virtual microscopy in the teaching of pathology at the Bloomington, Indiana extension of the Indiana University School of Medicine permitted the assessment of student attitudes, use and academic performance with respect to this new technology. A gradual and integrated approach allowed the parallel assessment with respect to…

Virtual terrain: a security-based representation of a computer network

NASA Astrophysics Data System (ADS)

Holsopple, Jared; Yang, Shanchieh; Argauer, Brian

2008-03-01

Much research has been put forth towards detection, correlating, and prediction of cyber attacks in recent years. As this set of research progresses, there is an increasing need for contextual information of a computer network to provide an accurate situational assessment. Typical approaches adopt contextual information as needed; yet such ad hoc effort may lead to unnecessary or even conflicting features. The concept of virtual terrain is, therefore, developed and investigated in this work. Virtual terrain is a common representation of crucial information about network vulnerabilities, accessibilities, and criticalities. A virtual terrain model encompasses operating systems, firewall rules, running services, missions, user accounts, and network connectivity. It is defined as connected graphs with arc attributes defining dynamic relationships among vertices modeling network entities, such as services, users, and machines. The virtual terrain representation is designed to allow feasible development and maintenance of the model, as well as efficacy in terms of the use of the model. This paper will describe the considerations in developing the virtual terrain schema, exemplary virtual terrain models, and algorithms utilizing the virtual terrain model for situation and threat assessment.
A Survey of Parallel Computing

DTIC Science & Technology

1988-07-01

Evaluating Two Massively Parallel Machines. Communications of the ACM .9, , , 176 BIBLIOGRAPHY 29, 8 (August), pp. 752-758. Gajski , D.D., Padua, D.A., Kuck...Computer Architecture, edited by Gajski , D. D., Milutinovic, V. M. Siegel, H. J. and Furht, B. P. IEEE Computer Society Press, Washington, D.C., pp. 387-407
Humans and machines in space: The vision, the challenge, the payoff; AAS Goddard Memorial Symposium, 29th, Washington, DC, March 14-15, 1991

NASA Astrophysics Data System (ADS)

Johnson, Bradley; May, Gayle L.; Korn, Paula

A recent symposium produced papers in the areas of solar system exploration, man machine interfaces, cybernetics, virtual reality, telerobotics, life support systems and the scientific and technology spinoff from the NASA space program. A number of papers also addressed the social and economic impacts of the space program. For individual titles, see A95-87468 through A95-87479.
Software Support Measurement and Estimating for Oracle Database Applications Using Mark II Function Points

DTIC Science & Technology

1992-12-01

36 V.33. Coe ncint of De minstioi ........................ 37 V3A. F-Raio .................................... 37 V3.5... de ations. Instructions ae defined as lines of code or card images. Thus, a line containin two or mome souce statements counts as one instruction; a...understand the productivity paradox, recall de concept of virtual machines. When a higher level machine groups ogether many instructm of a lower level
'Dilute-and-shoot' triple parallel mass spectrometry method for analysis of vitamin D and triacylglycerols in dietary supplements

USDA-ARS?s Scientific Manuscript database

A method is demonstrated for analysis of vitamin D-fortified dietary supplements that eliminates virtually all chemical pretreatment prior to analysis, and is referred to as a ‘dilute and shoot’ method. Three mass spectrometers, in parallel, plus a UV detector, an evaporative light scattering detec...
Simulation-Based Probabilistic Seismic Hazard Assessment Using System-Level, Physics-Based Models: Assembling Virtual California

NASA Astrophysics Data System (ADS)

Rundle, P. B.; Rundle, J. B.; Morein, G.; Donnellan, A.; Turcotte, D.; Klein, W.

2004-12-01

The research community is rapidly moving towards the development of an earthquake forecast technology based on the use of complex, system-level earthquake fault system simulations. Using these topologically and dynamically realistic simulations, it is possible to develop ensemble forecasting methods similar to that used in weather and climate research. To effectively carry out such a program, one needs 1) a topologically realistic model to simulate the fault system; 2) data sets to constrain the model parameters through a systematic program of data assimilation; 3) a computational technology making use of modern paradigms of high performance and parallel computing systems; and 4) software to visualize and analyze the results. In particular, we focus attention on a new version of our code Virtual California (version 2001) in which we model all of the major strike slip faults in California, from the Mexico-California border to the Mendocino Triple Junction. Virtual California is a "backslip model", meaning that the long term rate of slip on each fault segment in the model is matched to the observed rate. We use the historic data set of earthquakes larger than magnitude M > 6 to define the frictional properties of 650 fault segments (degrees of freedom) in the model. To compute the dynamics and the associated surface deformation, we use message passing as implemented in the MPICH standard distribution on a Beowulf clusters consisting of >10 cpus. We also will report results from implementing the code on significantly larger machines so that we can begin to examine much finer spatial scales of resolution, and to assess scaling properties of the code. We present results of simulations both as static images and as mpeg movies, so that the dynamical aspects of the computation can be assessed by the viewer. We compute a variety of statistics from the simulations, including magnitude-frequency relations, and compare these with data from real fault systems. We report recent results on use of Virtual California for probabilistic earthquake forecasting for several sub-groups of major faults in California. These methods have the advantage that system-level fault interactions are explicitly included, as well as laboratory-based friction laws.
Molecular Symmetry in Ab Initio Calculations

NASA Astrophysics Data System (ADS)

Madhavan, P. V.; Written, J. L.

1987-05-01

A scheme is presented for the construction of the Fock matrix in LCAO-SCF calculations and for the transformation of basis integrals to LCAO-MO integrals that can utilize several symmetry unique lists of integrals corresponding to different symmetry groups. The algorithm is fully compatible with vector processing machines and is especially suited for parallel processing machines.
Simulation fidelity of a virtual environment display

NASA Technical Reports Server (NTRS)

Nemire, Kenneth; Jacoby, Richard H.; Ellis, Stephen R.

1994-01-01

We assessed the degree to which a virtual environment system produced a faithful simulation of three-dimensional space by investigating the influence of a pitched optic array on the perception of gravity-referenced eye level (GREL). We compared the results with those obtained in a physical environment. In a within-subjects factorial design, 12 subjects indicated GREL while viewing virtual three-dimensional arrays at different static orientations. A physical array biased GREL more than did a geometrically identical virtual pitched array. However, addition of two sets of orthogonal parallel lines (a grid) to the virtual pitched array resulted in as large a bias as that obtained with the physical pitched array. The increased bias was caused by longitudinal, but not the transverse, components of the grid. We discuss implications of our results for spatial orientation models and for designs of virtual displays.
Performance Comparison of a Matrix Solver on a Heterogeneous Network Using Two Implementations of MPI: MPICH and LAM

NASA Technical Reports Server (NTRS)

Phillips, Jennifer K.

1995-01-01

Two of the current and most popular implementations of the Message-Passing Standard, Message Passing Interface (MPI), were contrasted: MPICH by Argonne National Laboratory, and LAM by the Ohio Supercomputer Center at Ohio State University. A parallel skyline matrix solver was adapted to be run in a heterogeneous environment using MPI. The Message-Passing Interface Forum was held in May 1994 which lead to a specification of library functions that implement the message-passing model of parallel communication. LAM, which creates it's own environment, is more robust in a highly heterogeneous network. MPICH uses the environment native to the machine architecture. While neither of these free-ware implementations provides the performance of native message-passing or vendor's implementations, MPICH begins to approach that performance on the SP-2. The machines used in this study were: IBM RS6000, 3 Sun4, SGI, and the IBM SP-2. Each machine is unique and a few machines required specific modifications during the installation. When installed correctly, both implementations worked well with only minor problems.
Telescoping magnetic ball bar test gage

DOEpatents

Bryan, James B.

1984-01-01

A telescoping magnetic ball bar test gage for determining the accuracy of machine tools, including robots, and those measuring machines having non-disengageable servo drives which cannot be clutched out. Two gage balls (10, 12) are held and separated from one another by a telescoping fixture which allows them relative radial motional freedom but not relative lateral motional freedom. The telescoping fixture comprises a parallel reed flexure unit (14) and a rigid member (16, 18, 20, 22, 24). One gage ball (10) is secured by a magnetic socket knuckle assembly (34) which fixes its center with respect to the machine being tested. The other gage ball (12) is secured by another magnetic socket knuckle assembly (38) which is engaged or held by the machine in such manner that the center of that ball (12) is directed to execute a prescribed trajectory, all points of which are equidistant from the center of the fixed gage ball (10). As the moving ball (12) executes its trajectory, changes in the radial distance between the centers of the two balls (10, 12) caused by inaccuracies in the machine are determined or measured by a linear variable differential transformer (LVDT) assembly (50, 52, 54, 56, 58, 60) actuated by the parallel reed flexure unit (14). Measurements can be quickly and easily taken for multiple trajectories about several different fixed ball (10) locations, thereby determining the accuracy of the machine.
Information integration and diagnosis analysis of equipment status and production quality for machining process

NASA Astrophysics Data System (ADS)

Zan, Tao; Wang, Min; Hu, Jianzhong

2010-12-01

Machining status monitoring technique by multi-sensors can acquire and analyze the machining process information to implement abnormity diagnosis and fault warning. Statistical quality control technique is normally used to distinguish abnormal fluctuations from normal fluctuations through statistical method. In this paper by comparing the advantages and disadvantages of the two methods, the necessity and feasibility of integration and fusion is introduced. Then an approach that integrates multi-sensors status monitoring and statistical process control based on artificial intelligent technique, internet technique and database technique is brought forward. Based on virtual instrument technique the author developed the machining quality assurance system - MoniSysOnline, which has been used to monitoring the grinding machining process. By analyzing the quality data and AE signal information of wheel dressing process the reason of machining quality fluctuation has been obtained. The experiment result indicates that the approach is suitable for the status monitoring and analyzing of machining process.
Large-scale Parallel Unstructured Mesh Computations for 3D High-lift Analysis

NASA Technical Reports Server (NTRS)

Mavriplis, Dimitri J.; Pirzadeh, S.

1999-01-01

A complete "geometry to drag-polar" analysis capability for the three-dimensional high-lift configurations is described. The approach is based on the use of unstructured meshes in order to enable rapid turnaround for complicated geometries that arise in high-lift configurations. Special attention is devoted to creating a capability for enabling analyses on highly resolved grids. Unstructured meshes of several million vertices are initially generated on a work-station, and subsequently refined on a supercomputer. The flow is solved on these refined meshes on large parallel computers using an unstructured agglomeration multigrid algorithm. Good prediction of lift and drag throughout the range of incidences is demonstrated on a transport take-off configuration using up to 24.7 million grid points. The feasibility of using this approach in a production environment on existing parallel machines is demonstrated, as well as the scalability of the solver on machines using up to 1450 processors.
Communication overhead on the Intel Paragon, IBM SP2 and Meiko CS-2

NASA Technical Reports Server (NTRS)

Bokhari, Shahid H.

1995-01-01

Interprocessor communication overhead is a crucial measure of the power of parallel computing systems-its impact can severely limit the performance of parallel programs. This report presents measurements of communication overhead on three contemporary commercial multicomputer systems: the Intel Paragon, the IBM SP2 and the Meiko CS-2. In each case the time to communicate between processors is presented as a function of message length. The time for global synchronization and memory access is discussed. The performance of these machines in emulating hypercubes and executing random pairwise exchanges is also investigated. It is shown that the interprocessor communication time depends heavily on the specific communication pattern required. These observations contradict the commonly held belief that communication overhead on contemporary machines is independent of the placement of tasks on processors. The information presented in this report permits the evaluation of the efficiency of parallel algorithm implementations against standard baselines.
A system for routing arbitrary directed graphs on SIMD architectures

NASA Technical Reports Server (NTRS)

Tomboulian, Sherryl

1987-01-01

There are many problems which can be described in terms of directed graphs that contain a large number of vertices where simple computations occur using data from connecting vertices. A method is given for parallelizing such problems on an SIMD machine model that is bit-serial and uses only nearest neighbor connections for communication. Each vertex of the graph will be assigned to a processor in the machine. Algorithms are given that will be used to implement movement of data along the arcs of the graph. This architecture and algorithms define a system that is relatively simple to build and can do graph processing. All arcs can be transversed in parallel in time O(T), where T is empirically proportional to the diameter of the interconnection network times the average degree of the graph. Modifying or adding a new arc takes the same time as parallel traversal.
Toward Millions of File System IOPS on Low-Cost, Commodity Hardware

PubMed Central

Zheng, Da; Burns, Randal; Szalay, Alexander S.

2013-01-01

We describe a storage system that removes I/O bottlenecks to achieve more than one million IOPS based on a user-space file abstraction for arrays of commodity SSDs. The file abstraction refactors I/O scheduling and placement for extreme parallelism and non-uniform memory and I/O. The system includes a set-associative, parallel page cache in the user space. We redesign page caching to eliminate CPU overhead and lock-contention in non-uniform memory architecture machines. We evaluate our design on a 32 core NUMA machine with four, eight-core processors. Experiments show that our design delivers 1.23 million 512-byte read IOPS. The page cache realizes the scalable IOPS of Linux asynchronous I/O (AIO) and increases user-perceived I/O performance linearly with cache hit rates. The parallel, set-associative cache matches the cache hit rates of the global Linux page cache under real workloads. PMID:24402052
Toward Millions of File System IOPS on Low-Cost, Commodity Hardware.

PubMed

Zheng, Da; Burns, Randal; Szalay, Alexander S

2013-01-01

We describe a storage system that removes I/O bottlenecks to achieve more than one million IOPS based on a user-space file abstraction for arrays of commodity SSDs. The file abstraction refactors I/O scheduling and placement for extreme parallelism and non-uniform memory and I/O. The system includes a set-associative, parallel page cache in the user space. We redesign page caching to eliminate CPU overhead and lock-contention in non-uniform memory architecture machines. We evaluate our design on a 32 core NUMA machine with four, eight-core processors. Experiments show that our design delivers 1.23 million 512-byte read IOPS. The page cache realizes the scalable IOPS of Linux asynchronous I/O (AIO) and increases user-perceived I/O performance linearly with cache hit rates. The parallel, set-associative cache matches the cache hit rates of the global Linux page cache under real workloads.
Computational Performance of a Parallelized Three-Dimensional High-Order Spectral Element Toolbox

NASA Astrophysics Data System (ADS)

Bosshard, Christoph; Bouffanais, Roland; Clémençon, Christian; Deville, Michel O.; Fiétier, Nicolas; Gruber, Ralf; Kehtari, Sohrab; Keller, Vincent; Latt, Jonas

In this paper, a comprehensive performance review of an MPI-based high-order three-dimensional spectral element method C++ toolbox is presented. The focus is put on the performance evaluation of several aspects with a particular emphasis on the parallel efficiency. The performance evaluation is analyzed with help of a time prediction model based on a parameterization of the application and the hardware resources. A tailor-made CFD computation benchmark case is introduced and used to carry out this review, stressing the particular interest for clusters with up to 8192 cores. Some problems in the parallel implementation have been detected and corrected. The theoretical complexities with respect to the number of elements, to the polynomial degree, and to communication needs are correctly reproduced. It is concluded that this type of code has a nearly perfect speed up on machines with thousands of cores, and is ready to make the step to next-generation petaflop machines.
Implementation of NASTRAN on the IBM/370 CMS operating system

NASA Technical Reports Server (NTRS)

Britten, S. S.; Schumacker, B.

1980-01-01

The NASA Structural Analysis (NASTRAN) computer program is operational on the IBM 360/370 series computers. While execution of NASTRAN has been described and implemented under the virtual storage operating systems of the IBM 370 models, the IBM 370/168 computer can also operate in a time-sharing mode under the virtual machine operating system using the Conversational Monitor System (CMS) subset. The changes required to make NASTRAN operational under the CMS operating system are described.
Virtual reality hardware for use in interactive 3D data fusion and visualization

NASA Astrophysics Data System (ADS)

Gourley, Christopher S.; Abidi, Mongi A.

1997-09-01

Virtual reality has become a tool for use in many areas of research. We have designed and built a VR system for use in range data fusion and visualization. One major VR tool is the CAVE. This is the ultimate visualization tool, but comes with a large price tag. Our design uses a unique CAVE whose graphics are powered by a desktop computer instead of a larger rack machine making it much less costly. The system consists of a screen eight feet tall by twenty-seven feet wide giving a variable field-of-view currently set at 160 degrees. A silicon graphics Indigo2 MaxImpact with the impact channel option is used for display. This gives the capability to drive three projectors at a resolution of 640 by 480 for use in displaying the virtual environment and one 640 by 480 display for a user control interface. This machine is also the first desktop package which has built-in hardware texture mapping. This feature allows us to quickly fuse the range and intensity data and other multi-sensory data. The final goal is a complete 3D texture mapped model of the environment. A dataglove, magnetic tracker, and spaceball are to be used for manipulation of the data and navigation through the virtual environment. This system gives several users the ability to interactively create 3D models from multiple range images.
An adaptive process-based cloud infrastructure for space situational awareness applications

NASA Astrophysics Data System (ADS)

Liu, Bingwei; Chen, Yu; Shen, Dan; Chen, Genshe; Pham, Khanh; Blasch, Erik; Rubin, Bruce

2014-06-01

Space situational awareness (SSA) and defense space control capabilities are top priorities for groups that own or operate man-made spacecraft. Also, with the growing amount of space debris, there is an increase in demand for contextual understanding that necessitates the capability of collecting and processing a vast amount sensor data. Cloud computing, which features scalable and flexible storage and computing services, has been recognized as an ideal candidate that can meet the large data contextual challenges as needed by SSA. Cloud computing consists of physical service providers and middleware virtual machines together with infrastructure, platform, and software as service (IaaS, PaaS, SaaS) models. However, the typical Virtual Machine (VM) abstraction is on a per operating systems basis, which is at too low-level and limits the flexibility of a mission application architecture. In responding to this technical challenge, a novel adaptive process based cloud infrastructure for SSA applications is proposed in this paper. In addition, the details for the design rationale and a prototype is further examined. The SSA Cloud (SSAC) conceptual capability will potentially support space situation monitoring and tracking, object identification, and threat assessment. Lastly, the benefits of a more granular and flexible cloud computing resources allocation are illustrated for data processing and implementation considerations within a representative SSA system environment. We show that the container-based virtualization performs better than hypervisor-based virtualization technology in an SSA scenario.

Experimental Realization of a Quantum Support Vector Machine

NASA Astrophysics Data System (ADS)

Li, Zhaokai; Liu, Xiaomei; Xu, Nanyang; Du, Jiangfeng

2015-04-01

The fundamental principle of artificial intelligence is the ability of machines to learn from previous experience and do future work accordingly. In the age of big data, classical learning machines often require huge computational resources in many practical cases. Quantum machine learning algorithms, on the other hand, could be exponentially faster than their classical counterparts by utilizing quantum parallelism. Here, we demonstrate a quantum machine learning algorithm to implement handwriting recognition on a four-qubit NMR test bench. The quantum machine learns standard character fonts and then recognizes handwritten characters from a set with two candidates. Because of the wide spread importance of artificial intelligence and its tremendous consumption of computational resources, quantum speedup would be extremely attractive against the challenges of big data.
Hierarchical virtual screening approaches in small molecule drug discovery.

PubMed

Kumar, Ashutosh; Zhang, Kam Y J

2015-01-01

Virtual screening has played a significant role in the discovery of small molecule inhibitors of therapeutic targets in last two decades. Various ligand and structure-based virtual screening approaches are employed to identify small molecule ligands for proteins of interest. These approaches are often combined in either hierarchical or parallel manner to take advantage of the strength and avoid the limitations associated with individual methods. Hierarchical combination of ligand and structure-based virtual screening approaches has received noteworthy success in numerous drug discovery campaigns. In hierarchical virtual screening, several filters using ligand and structure-based approaches are sequentially applied to reduce a large screening library to a number small enough for experimental testing. In this review, we focus on different hierarchical virtual screening strategies and their application in the discovery of small molecule modulators of important drug targets. Several virtual screening studies are discussed to demonstrate the successful application of hierarchical virtual screening in small molecule drug discovery. Copyright © 2014 Elsevier Inc. All rights reserved.
Network Hardware Virtualization for Application Provisioning in Core Networks

DOE PAGES

Gumaste, Ashwin; Das, Tamal; Khandwala, Kandarp; ...

2017-02-03

We present that service providers and vendors are moving toward a network virtualized core, whereby multiple applications would be treated on their own merit in programmable hardware. Such a network would have the advantage of being customized for user requirements and allow provisioning of next generation services that are built specifically to meet user needs. In this article, we articulate the impact of network virtualization on networks that provide customized services and how a provider's business can grow with network virtualization. We outline a decision map that allows mapping of applications with technology that is supported in network-virtualization - orientedmore » equipment. Analogies to the world of virtual machines and generic virtualization show that hardware supporting network virtualization will facilitate new customer needs while optimizing the provider network from the cost and performance perspectives. A key conclusion of the article is that growth would yield sizable revenue when providers plan ahead in terms of supporting network-virtualization-oriented technology in their networks. To be precise, providers have to incorporate into their growth plans network elements capable of new service deployments while protecting network neutrality. Finally, a simulation study validates our NV-induced model.« less
Network Hardware Virtualization for Application Provisioning in Core Networks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gumaste, Ashwin; Das, Tamal; Khandwala, Kandarp

We present that service providers and vendors are moving toward a network virtualized core, whereby multiple applications would be treated on their own merit in programmable hardware. Such a network would have the advantage of being customized for user requirements and allow provisioning of next generation services that are built specifically to meet user needs. In this article, we articulate the impact of network virtualization on networks that provide customized services and how a provider's business can grow with network virtualization. We outline a decision map that allows mapping of applications with technology that is supported in network-virtualization - orientedmore » equipment. Analogies to the world of virtual machines and generic virtualization show that hardware supporting network virtualization will facilitate new customer needs while optimizing the provider network from the cost and performance perspectives. A key conclusion of the article is that growth would yield sizable revenue when providers plan ahead in terms of supporting network-virtualization-oriented technology in their networks. To be precise, providers have to incorporate into their growth plans network elements capable of new service deployments while protecting network neutrality. Finally, a simulation study validates our NV-induced model.« less
Parallel image compression

NASA Technical Reports Server (NTRS)

Reif, John H.

1987-01-01

A parallel compression algorithm for the 16,384 processor MPP machine was developed. The serial version of the algorithm can be viewed as a combination of on-line dynamic lossless test compression techniques (which employ simple learning strategies) and vector quantization. These concepts are described. How these concepts are combined to form a new strategy for performing dynamic on-line lossy compression is discussed. Finally, the implementation of this algorithm in a massively parallel fashion on the MPP is discussed.
PUP: An Architecture to Exploit Parallel Unification in Prolog

DTIC Science & Technology

1988-03-01

environment stacking mo del similar to the Warren Abstract Machine [23] since it has been shown to be super ior to other known models (see [21]). The storage...execute in groups of independent operations. Unifications belonging to different group s may not overlap. Also unification operations belonging to the...since all parallel operations on the unification units must complete before any of the units can star t executing the next group of parallel
Fast adaptive composite grid methods on distributed parallel architectures

NASA Technical Reports Server (NTRS)

Lemke, Max; Quinlan, Daniel

1992-01-01

The fast adaptive composite (FAC) grid method is compared with the adaptive composite method (AFAC) under variety of conditions including vectorization and parallelization. Results are given for distributed memory multiprocessor architectures (SUPRENUM, Intel iPSC/2 and iPSC/860). It is shown that the good performance of AFAC and its superiority over FAC in a parallel environment is a property of the algorithm and not dependent on peculiarities of any machine.
Directions in parallel programming: HPF, shared virtual memory and object parallelism in pC++

NASA Technical Reports Server (NTRS)

Bodin, Francois; Priol, Thierry; Mehrotra, Piyush; Gannon, Dennis

1994-01-01

Fortran and C++ are the dominant programming languages used in scientific computation. Consequently, extensions to these languages are the most popular for programming massively parallel computers. We discuss two such approaches to parallel Fortran and one approach to C++. The High Performance Fortran Forum has designed HPF with the intent of supporting data parallelism on Fortran 90 applications. HPF works by asking the user to help the compiler distribute and align the data structures with the distributed memory modules in the system. Fortran-S takes a different approach in which the data distribution is managed by the operating system and the user provides annotations to indicate parallel control regions. In the case of C++, we look at pC++ which is based on a concurrent aggregate parallel model.
Suitability of virtual prototypes to support human factors/ergonomics evaluation during the design.

PubMed

Aromaa, Susanna; Väänänen, Kaisa

2016-09-01

In recent years, the use of virtual prototyping has increased in product development processes, especially in the assessment of complex systems targeted at end-users. The purpose of this study was to evaluate the suitability of virtual prototyping to support human factors/ergonomics evaluation (HFE) during the design phase. Two different virtual prototypes were used: augmented reality (AR) and virtual environment (VE) prototypes of a maintenance platform of a rock crushing machine. Nineteen designers and other stakeholders were asked to assess the suitability of the prototype for HFE evaluation. Results indicate that the system model characteristics and user interface affect the experienced suitability. The VE system was valued as being more suitable to support the assessment of visibility, reach, and the use of tools than the AR system. The findings of this study can be used as a guidance for the implementing virtual prototypes in the product development process. Copyright © 2016 Elsevier Ltd. All rights reserved.
An obstacle to building a time machine

NASA Astrophysics Data System (ADS)

Carroll, Sean M.; Farhi, Edward; Guth, Alan H.

1992-01-01

Gott (1991) has shown that a spacetime with two infinite parallel cosmic strings passing each other with sufficient velocity contains closed timelike curves. An attempt to build such a time machine is discussed. Using the energy-momentum conservation laws in the equivalent (2 + 1)-dimensional theory, the spacetime representing the decay of one gravitating particle into two is explicitly constructed; there is never enough mass in an open universe to build the time machine from the products of decays of stationary particles. More generally, the Gott time machine cannot exist in any open (2 + 1)-dimensional universe for which the total momentum is timelike.
RAMA: A file system for massively parallel computers

NASA Technical Reports Server (NTRS)

Miller, Ethan L.; Katz, Randy H.

1993-01-01

This paper describes a file system design for massively parallel computers which makes very efficient use of a few disks per processor. This overcomes the traditional I/O bottleneck of massively parallel machines by storing the data on disks within the high-speed interconnection network. In addition, the file system, called RAMA, requires little inter-node synchronization, removing another common bottleneck in parallel processor file systems. Support for a large tertiary storage system can easily be integrated in lo the file system; in fact, RAMA runs most efficiently when tertiary storage is used.
Code Optimization and Parallelization on the Origins: Looking from Users' Perspective

NASA Technical Reports Server (NTRS)

Chang, Yan-Tyng Sherry; Thigpen, William W. (Technical Monitor)

2002-01-01

Parallel machines are becoming the main compute engines for high performance computing. Despite their increasing popularity, it is still a challenge for most users to learn the basic techniques to optimize/parallelize their codes on such platforms. In this paper, we present some experiences on learning these techniques for the Origin systems at the NASA Advanced Supercomputing Division. Emphasis of this paper will be on a few essential issues (with examples) that general users should master when they work with the Origins as well as other parallel systems.
Design of virtual SCADA simulation system for pressurized water reactor

NASA Astrophysics Data System (ADS)

Wijaksono, Umar; Abdullah, Ade Gafar; Hakim, Dadang Lukman

2016-02-01

The Virtual SCADA system is a software-based Human-Machine Interface that can visualize the process of a plant. This paper described the results of the virtual SCADA system design that aims to recognize the principle of the Nuclear Power Plant type Pressurized Water Reactor. This simulation uses technical data of the Nuclear Power Plant Unit Olkiluoto 3 in Finland. This device was developed using Wonderware Intouch, which is equipped with manual books for each component, animation links, alarm systems, real time and historical trending, and security system. The results showed that in general this device can demonstrate clearly the principles of energy flow and energy conversion processes in Pressurized Water Reactors. This virtual SCADA simulation system can be used as instructional media to recognize the principle of Pressurized Water Reactor.
An Australian and New Zealand Scoping Study on the Use of 3D Immersive Virtual Worlds in Higher Education

ERIC Educational Resources Information Center

Dalgarno, Barney; Lee, Mark J. W.; Carlson, Lauren; Gregory, Sue; Tynan, Belinda

2011-01-01

This article describes the research design of, and reports selected findings from, a scoping study aimed at examining current and planned applications of 3D immersive virtual worlds at higher education institutions across Australia and New Zealand. The scoping study is the first of its kind in the region, intended to parallel and complement a…
Realizing a partial general quantum cloning machine with superconducting quantum-interference devices in a cavity QED

NASA Astrophysics Data System (ADS)

Fang, Bao-Long; Yang, Zhen; Ye, Liu

2009-05-01

We propose a scheme for implementing a partial general quantum cloning machine with superconducting quantum-interference devices coupled to a nonresonant cavity. By regulating the time parameters, our system can perform optimal symmetric (asymmetric) universal quantum cloning, optimal symmetric (asymmetric) phase-covariant cloning, and optimal symmetric economical phase-covariant cloning. In the scheme the cavity is only virtually excited, thus, the cavity decay is suppressed during the cloning operations.
A Unified Access Model for Interconnecting Heterogeneous Wireless Networks

DTIC Science & Technology

2015-05-01

Defined Networking, OpenFlow, WiFi, LTE 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF ABSTRACT UU 18. NUMBER OF PAGES 18 19a. NAME OF...Machine Configurations with WiFi and LTE 4 2.3 Three Virtual Machine Configurations with WiFi and LTE 5 3. Results and Discussion 5 4. Summary and...WiFi and long-term evolution ( LTE ), and created a communication pathway between them via a central controller node. Our simulation serves as a
Development of Fast Algorithms Using Recursion, Nesting and Iterations for Computational Electromagnetics

NASA Technical Reports Server (NTRS)

Chew, W. C.; Song, J. M.; Lu, C. C.; Weedon, W. H.

1995-01-01

In the first phase of our work, we have concentrated on laying the foundation to develop fast algorithms, including the use of recursive structure like the recursive aggregate interaction matrix algorithm (RAIMA), the nested equivalence principle algorithm (NEPAL), the ray-propagation fast multipole algorithm (RPFMA), and the multi-level fast multipole algorithm (MLFMA). We have also investigated the use of curvilinear patches to build a basic method of moments code where these acceleration techniques can be used later. In the second phase, which is mainly reported on here, we have concentrated on implementing three-dimensional NEPAL on a massively parallel machine, the Connection Machine CM-5, and have been able to obtain some 3D scattering results. In order to understand the parallelization of codes on the Connection Machine, we have also studied the parallelization of 3D finite-difference time-domain (FDTD) code with PML material absorbing boundary condition (ABC). We found that simple algorithms like the FDTD with material ABC can be parallelized very well allowing us to solve within a minute a problem of over a million nodes. In addition, we have studied the use of the fast multipole method and the ray-propagation fast multipole algorithm to expedite matrix-vector multiplication in a conjugate-gradient solution to integral equations of scattering. We find that these methods are faster than LU decomposition for one incident angle, but are slower than LU decomposition when many incident angles are needed as in the monostatic RCS calculations.
Highly fault-tolerant parallel computation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Spielman, D.A.

We re-introduce the coded model of fault-tolerant computation in which the input and output of a computational device are treated as words in an error-correcting code. A computational device correctly computes a function in the coded model if its input and output, once decoded, are a valid input and output of the function. In the coded model, it is reasonable to hope to simulate all computational devices by devices whose size is greater by a constant factor but which are exponentially reliable even if each of their components can fail with some constant probability. We consider fine-grained parallel computations inmore » which each processor has a constant probability of producing the wrong output at each time step. We show that any parallel computation that runs for time t on w processors can be performed reliably on a faulty machine in the coded model using w log{sup O(l)} w processors and time t log{sup O(l)} w. The failure probability of the computation will be at most t {center_dot} exp(-w{sup 1/4}). The codes used to communicate with our fault-tolerant machines are generalized Reed-Solomon codes and can thus be encoded and decoded in O(n log{sup O(1)} n) sequential time and are independent of the machine they are used to communicate with. We also show how coded computation can be used to self-correct many linear functions in parallel with arbitrarily small overhead.« less
Parallel spatial direct numerical simulations on the Intel iPSC/860 hypercube

NASA Technical Reports Server (NTRS)

Joslin, Ronald D.; Zubair, Mohammad

1993-01-01

The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube is documented. The direct numerical simulation approach is used to compute spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows. The feasibility of using the PSDNS on the hypercube to perform transition studies is examined. The results indicate that the direct numerical simulation approach can effectively be parallelized on a distributed-memory parallel machine. By increasing the number of processors nearly ideal linear speedups are achieved with nonoptimized routines; slower than linear speedups are achieved with optimized (machine dependent library) routines. This slower than linear speedup results because the Fast Fourier Transform (FFT) routine dominates the computational cost and because the routine indicates less than ideal speedups. However with the machine-dependent routines the total computational cost decreases by a factor of 4 to 5 compared with standard FORTRAN routines. The computational cost increases linearly with spanwise wall-normal and streamwise grid refinements. The hypercube with 32 processors was estimated to require approximately twice the amount of Cray supercomputer single processor time to complete a comparable simulation; however it is estimated that a subgrid-scale model which reduces the required number of grid points and becomes a large-eddy simulation (PSLES) would reduce the computational cost and memory requirements by a factor of 10 over the PSDNS. This PSLES implementation would enable transition simulations on the hypercube at a reasonable computational cost.
High-performance Chinese multiclass traffic sign detection via coarse-to-fine cascade and parallel support vector machine detectors

NASA Astrophysics Data System (ADS)

Chang, Faliang; Liu, Chunsheng

2017-09-01

The high variability of sign colors and shapes in uncontrolled environments has made the detection of traffic signs a challenging problem in computer vision. We propose a traffic sign detection (TSD) method based on coarse-to-fine cascade and parallel support vector machine (SVM) detectors to detect Chinese warning and danger traffic signs. First, a region of interest (ROI) extraction method is proposed to extract ROIs using color contrast features in local regions. The ROI extraction can reduce scanning regions and save detection time. For multiclass TSD, we propose a structure that combines a coarse-to-fine cascaded tree with a parallel structure of histogram of oriented gradients (HOG) + SVM detectors. The cascaded tree is designed to detect different types of traffic signs in a coarse-to-fine process. The parallel HOG + SVM detectors are designed to do fine detection of different types of traffic signs. The experiments demonstrate the proposed TSD method can rapidly detect multiclass traffic signs with different colors and shapes in high accuracy.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.