Science.gov

Sample records for shared memory multiprocessors

  1. Shared versus distributed memory multiprocessors

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.

    1991-01-01

    The question of whether multiprocessors should have shared or distributed memory has attracted a great deal of attention. Some researchers argue strongly for building distributed memory machines, while others argue just as strongly for programming shared memory multiprocessors. A great deal of research is underway on both types of parallel systems. Special emphasis is placed on systems with a very large number of processors for computation-intensive tasks, and research and implementation trends are considered. It appears that the two types of systems will likely converge to a common form for large scale multiprocessors.

  2. Parallel language support on shared memory multiprocessors

    SciTech Connect

    Sah, A.

    1991-01-01

    The study of general purpose parallel computing requires efficient and inexpensive platforms for parallel program execution. This helps in ascertaining tradeoff choices between hardware complexity and software solutions for massively parallel systems design. In this paper, the authors present an implementation of an efficient parallel execution model on shared memory multiprocessors based on a Threaded Abstract Machine. The authors discuss a k-way generalized locking strategy suitable for this model. The authors study the performance gains obtained by a queuing strategy which uses multiple queues with reduced access contention. The authors also present performance models for shared memory machines, related to lock contention and serialization in shared memory allocation. A bin-based memory management technique which reduces the serialization is presented. These issues are critical for obtaining an efficient parallel execution environment.

  3. Efficient ICCG on a shared memory multiprocessor

    NASA Technical Reports Server (NTRS)

    Hammond, Steven W.; Schreiber, Robert

    1989-01-01

    Different approaches are discussed for exploiting parallelism in the ICCG (Incomplete Cholesky Conjugate Gradient) method for solving large sparse symmetric positive definite systems of equations on a shared memory parallel computer. Techniques for efficiently solving triangular systems and computing sparse matrix-vector products are explored. Three methods for scheduling the tasks in solving triangular systems are implemented on the Sequent Balance 21000. Sample problems that are representative of a large class of problems solved using iterative methods are used. We show that a static analysis to determine data dependences in the triangular solve can greatly improve its parallel efficiency. We also show that ignoring symmetry and storing the whole matrix can reduce solution time substantially.

  4. MPF: A portable message passing facility for shared memory multiprocessors

    NASA Technical Reports Server (NTRS)

    Malony, Allen D.; Reed, Daniel A.; Mcguire, Patrick J.

    1987-01-01

    The design, implementation, and performance evaluation of a message passing facility (MPF) for shared memory multiprocessors are presented. The MPF is based on a message passing model conceptually similar to conversations. Participants (parallel processors) can enter or leave a conversation at any time. The message passing primitives for this model are implemented as a portable library of C function calls. The MPF is currently operational on a Sequent Balance 21000, and several parallel applications were developed and tested. Several simple benchmark programs are presented to establish interprocess communication performance for common patterns of interprocess communication. Finally, performance figures are presented for two parallel applications, linear systems solution, and iterative solution of partial differential equations.

  5. A robot arm simulation with a shared memory multiprocessor machine

    NASA Technical Reports Server (NTRS)

    Kim, Sung-Soo; Chuang, Li-Ping

    1989-01-01

    A parallel processing scheme for a single chain robot arm is presented for high speed computation on a shared memory multiprocessor. A recursive formulation that is derived from a virtual work form of the d'Alembert equations of motion is utilized for robot arm dynamics. A joint drive system that consists of a motor rotor and gears is included in the arm dynamics model, in order to take into account gyroscopic effects due to the spinning of the rotor. The fine grain parallelism of mechanical and control subsystem models is exploited, based on independent computation associated with bodies, joint drive systems, and controllers. Efficiency and effectiveness of the parallel scheme are demonstrated through simulations of a telerobotic manipulator arm. Two different mechanical subsystem models, i.e., with and without gyroscopic effects, are compared, to show the trade-off between efficiency and accuracy.

  6. Parallelization of NAS Benchmarks for Shared Memory Multiprocessors

    NASA Technical Reports Server (NTRS)

    Waheed, Abdul; Yan, Jerry C.; Saini, Subhash (Technical Monitor)

    1998-01-01

    This paper presents our experiences of parallelizing the sequential implementation of NAS benchmarks using compiler directives on an SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application, leaving the task of porting to new generations of high performance computing systems to parallelization tools and compilers. Due to the simplicity of programming shared-memory multiprocessors, compiler developers have provided various facilities to allow the users to exploit parallelism. Native compilers on the SGI Origin2000 support multiprocessing directives to allow users to exploit loop-level parallelism in their programs. Additionally, supporting tools can accomplish this process automatically and present the results of parallelization to the users. We experimented with these compiler directives and supporting tools by parallelizing the sequential implementation of NAS benchmarks. Results reported in this paper indicate that with minimal effort, the performance gain is comparable with the hand-parallelized, carefully optimized, message-passing implementations of the same benchmarks.

  7. Paging tradeoffs in distributed-shared-memory multiprocessors

    SciTech Connect

    Burger, D.C.; Hyder, R.S.; Miller, B.P.; Wood, D.A.

    1994-12-31

    Massively parallel processors have begun using commodity operating systems that support demand-paged virtual memory. To evaluate the utility of virtual memory, the authors measured the behavior of seven shared memory parallel application programs on a simulated distributed-shared-memory machine. The results (1) confirm the importance of gang CPU scheduling, (2) show that a page-faulting processor should spin rather than invoke a parallel context switch, (3) show that the parallel programs frequently touch most of their data, and (4) indicate that memory, not just CPUs, must be "gang scheduled". Overall, the experiments demonstrate that demand paging has limited value on current parallel machines because of the applications' synchronization and memory reference patterns and the machines' high page-fault and parallel-context-switch overheads.

  8. Fast, contention-free combining tree barriers for shared-memory multiprocessors

    SciTech Connect

    Scott, M.L.; Mellor-Crummey, J.M.

    1994-08-01

    In a previous article, Gupta and Hill introduced an adaptive combining tree algorithm for busy-wait barrier synchronization on shared-memory multiprocessors. The intent of the algorithm was to achieve a barrier in logarithmic time when processes arrive simultaneously, and in constant time after the last arrival when arrival times are skewed. A fuzzy version of the algorithm allows a process to perform useful work between the point at which it notifies other processes of its arrival and the point at which it waits for all other processes to arrive. Unfortunately, adaptive combining tree barriers as originally devised perform a large amount of work at each node of the tree, including the acquisition and release of locks. They also perform an unbounded number of accesses to nonlocal locations, inducing large amounts of memory and interconnect contention. We present new adaptive combining tree barriers that eliminate these problems. We compare the performance of the new algorithms to that of other fast barriers on a 64-node BBN Butterfly 1 multiprocessor, a 35-node BBN TC2000, and a 126-node KSR 1. The results reveal scenarios in which our algorithms outperform all known alternatives, and suggest that both adaptation and the combination of fuzziness with tree-style synchronization will be of increasing importance on future generations of shared-memory multiprocessors.

  9. Implementation of two projection methods on a shared memory multiprocessor - DEC VAX 6240

    NASA Technical Reports Server (NTRS)

    Kamath, C.; Weeratunga, S.

    1990-01-01

    The relative performance of two iterative schemes, based on projection techniques, is compared on a shared memory multiprocessor - VAX 6240. The CG accelerated Block-SSOR method and the CG accelerated Symmetric-Kaczmarz method are considered for the solution of large sparse nonsymmetric systems of linear equations. It is shown that the regular structure of many matrices can be exploited by the CG-accelerated Block-SSOR method to provide good speedup in a multiprocessing environment. However, the CG accelerated Symmetric-Kaczmarz method, while being a viable alternative on a scalar machine, is unable to benefit from multiprocessing.

  10. Performance Modeling and Measurement of Parallelized Code for Distributed Shared Memory Multiprocessors

    NASA Technical Reports Server (NTRS)

    Waheed, Abdul; Yan, Jerry

    1998-01-01

    This paper presents a model to evaluate the performance and overhead of parallelizing sequential code using compiler directives for multiprocessing on distributed shared memory (DSM) systems. With increasing popularity of shared address space architectures, it is essential to understand their performance impact on programs that benefit from shared memory multiprocessing. We present a simple model to characterize the performance of programs that are parallelized using compiler directives for shared memory multiprocessing. We parallelized the sequential implementation of NAS benchmarks using native Fortran77 compiler directives for an Origin2000, which is a DSM system based on a cache-coherent Non Uniform Memory Access (ccNUMA) architecture. We report measurement based performance of these parallelized benchmarks from four perspectives: efficacy of parallelization process; scalability; parallelization overhead; and comparison with hand-parallelized and -optimized version of the same benchmarks. Our results indicate that sequential programs can conveniently be parallelized for DSM systems using compiler directives but realizing performance gains as predicted by the performance model depends primarily on minimizing architecture-specific data locality overhead.

  11. Reader set encoding for directory of shared cache memory in multiprocessor system

    DOEpatents

    Ahn, Daniel; Ceze, Luis H.; Gara, Alan; Ohmacht, Martin; Xiaotong, Zhuang

    2014-06-10

    In a parallel processing system with speculative execution, conflict checking occurs in a directory lookup of a cache memory that is shared by all processors. In each case, the same physical memory address will map to the same set of that cache, no matter which processor originated that access. The directory includes a dynamic reader set encoding, indicating what speculative threads have read a particular line. This reader set encoding is used in conflict checking. A bitset encoding is used to specify particular threads that have read the line.
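
    As a rough illustration of the bitset idea described above, the following minimal sketch keeps one integer per directory line, with bit t set when speculative thread t has read the line. The class name `DirectoryLine` and its methods are hypothetical and are not taken from the patent; the conflict rule (a write conflicts with every other recorded reader) is an assumption made for the example.

```python
class DirectoryLine:
    """Hypothetical directory entry: a bitset of speculative reader thread IDs."""

    def __init__(self):
        self.readers = 0  # bit t is set when thread t has read this line

    def record_read(self, thread_id):
        # Mark the thread as a reader of this line.
        self.readers |= 1 << thread_id

    def conflicting_readers(self, writer_id):
        # Assumed rule: a write conflicts with every recorded reader
        # other than the writing thread itself.
        others = self.readers & ~(1 << writer_id)
        return [t for t in range(others.bit_length()) if (others >> t) & 1]


line = DirectoryLine()
for t in (0, 3, 5):
    line.record_read(t)
conflicts = line.conflicting_readers(3)
print(conflicts)
```

    A bitset makes the per-line check a couple of mask operations, at the cost of one bit per thread per line.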

  12. Shared performance monitor in a multiprocessor system

    DOEpatents

    Chiu, George; Gara, Alan G; Salapura, Valentina

    2014-12-02

    A performance monitoring unit (PMU) and method for monitoring performance of events occurring in a multiprocessor system. The multiprocessor system comprises a plurality of processor devices units, each processor device for generating signals representing occurrences of events in the processor device, and, a single shared counter resource for performance monitoring. The performance monitor unit is shared by all processor cores in the multiprocessor system. The PMU is further programmed to monitor event signals issued from non-processor devices.

  13. Sparse Gaussian elimination with controlled fill-in on a shared memory multiprocessor

    NASA Technical Reports Server (NTRS)

    Alaghband, Gita; Jordan, Harry F.

    1989-01-01

    It is shown that in sparse matrices arising from electronic circuits, it is possible to do computations on many diagonal elements simultaneously. A technique for obtaining an ordered compatible set directly from the ordered incompatible table is given. The ordering is based on the Markowitz number of the pivot candidates. This technique generates a set of compatible pivots with the property of generating few fills. A novel heuristic algorithm is presented that combines the idea of an order-compatible set with a limited binary tree search to generate several sets of compatible pivots in linear time. An elimination set for reducing the matrix is generated and selected on the basis of a minimum Markowitz sum number. The parallel pivoting technique presented is a stepwise algorithm and can be applied to any submatrix of the original matrix. Thus, it is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds. Parameters are suggested to obtain a balance between parallelism and fill-ins. Results of applying the proposed algorithms on several large application matrices using the HEP multiprocessor (Kowalik, 1985) are presented and analyzed.
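
    The Markowitz number of a diagonal pivot candidate k is commonly defined as (r_k - 1)(c_k - 1), where r_k and c_k count the nonzeros in row k and column k. The sketch below computes these numbers and then picks a compatible pivot set greedily; it is a simplified stand-in, not the ordered-table and tree-search heuristic of the paper. The function names and the compatibility test (pivots i and j are compatible when entries (i, j) and (j, i) are both zero) are assumptions made for the example.

```python
def markowitz_numbers(A):
    """Return {k: (r_k - 1) * (c_k - 1)} for every nonzero diagonal entry of A."""
    n = len(A)
    row_nz = [sum(1 for x in A[i] if x != 0) for i in range(n)]
    col_nz = [sum(1 for i in range(n) if A[i][j] != 0) for j in range(n)]
    return {k: (row_nz[k] - 1) * (col_nz[k] - 1)
            for k in range(n) if A[k][k] != 0}


def greedy_compatible_pivots(A):
    """Greedy stand-in: scan candidates by increasing Markowitz number and
    keep a pivot only if it does not couple with any pivot already chosen."""
    m = markowitz_numbers(A)
    chosen = []
    for k in sorted(m, key=lambda k: (m[k], k)):
        if all(A[k][p] == 0 and A[p][k] == 0 for p in chosen):
            chosen.append(k)
    return chosen


# Small illustrative matrix (made up for this example).
A = [
    [4, 1, 0, 0],
    [1, 4, 0, 0],
    [0, 0, 3, 1],
    [0, 1, 1, 2],
]
pivots = greedy_compatible_pivots(A)
print(markowitz_numbers(A), pivots)
```

    Pivots in a compatible set do not modify each other's rows or columns, which is what allows their eliminations to proceed simultaneously.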

  14. Interleaved synchronous bus access protocol for a shared memory multi-processor system

    SciTech Connect

    Moore, W.T.

    1989-01-10

    A method is described for providing asynchronous processors with inter-processor communication and access to several memory modules over a common bus which includes a first bus and a second bus, comprising: providing clock pulses on the common bus, each pulse having a period; asserting a request signal and placing priority signal on the common bus; polling the processors during the first period to determine whether the processors request access to the common bus and to determine which one processor has priority; sending a destination address from the one processor to a destination during a second period, the destination being chosen from the processors and the several memory modules; performing one of reading input data between the destination and the processor; multiplexing priority and reading input data signals on the first bus, and multiplexing address and writing output data signals on the second bus; generating poll inhibit signals prior to each reading input data signal and prior to each memory address signal preceding a writing output data operation; and queuing the input data in a first-in-first-out manner for each of the processors when the input data indicates an interprocessor interrupt.

  15. Shared performance monitor in a multiprocessor system

    DOEpatents

    Chiu, George; Gara, Alan G.; Salapura, Valentina

    2012-07-24

    A performance monitoring unit (PMU) and method for monitoring performance of events occurring in a multiprocessor system. The multiprocessor system comprises a plurality of processor devices units, each processor device for generating signals representing occurrences of events in the processor device, and, a single shared counter resource for performance monitoring. The performance monitor unit is shared by all processor cores in the multiprocessor system. The PMU comprises: a plurality of performance counters each for counting signals representing occurrences of events from one or more the plurality of processor units in the multiprocessor system; and, a plurality of input devices for receiving the event signals from one or more processor devices of the plurality of processor units, the plurality of input devices programmable to select event signals for receipt by one or more of the plurality of performance counters for counting, wherein the PMU is shared between multiple processing units, or within a group of processors in the multiprocessing system. The PMU is further programmed to monitor event signals issued from non-processor devices.

  16. A general model for memory interference in a multiprocessor system with memory hierarchy

    NASA Technical Reports Server (NTRS)

    Taha, Badie A.; Standley, Hilda M.

    1989-01-01

    The problem of memory interference in a multiprocessor system with a hierarchy of shared buses and memories is addressed. The behavior of the processors is represented by a sequence of memory requests with each followed by a determined amount of processing time. A statistical queuing network model for determining the extent of memory interference in multiprocessor systems with clusters of memory hierarchies is presented. The performance of the system is measured by the expected number of busy memory clusters. The results of the analytic model are compared with simulation results, and the correlation between them is found to be very high.
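
    For orientation only, the sketch below evaluates a classic flat (non-hierarchical) baseline rather than the hierarchical queuing network model of the paper: if n processors each issue one request to one of m memory modules chosen uniformly and independently, the expected number of busy modules is m(1 - (1 - 1/m)^n). The function name is hypothetical.

```python
def expected_busy_modules(n_procs, n_modules):
    """Expected number of distinct modules hit by n_procs independent,
    uniformly distributed requests over n_modules modules."""
    return n_modules * (1.0 - (1.0 - 1.0 / n_modules) ** n_procs)


busy = expected_busy_modules(4, 4)
print(busy)
```

    The gap between this value and the number of processors gives a first estimate of how many requests are blocked by interference in the flat case.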

  17. Preliminary basic performance analysis of the Cedar multiprocessor memory system

    NASA Technical Reports Server (NTRS)

    Gallivan, K.; Jalby, W.; Turner, S.; Veidenbaum, A.; Wijshoff, H.

    1991-01-01

    Some preliminary basic results on the performance of the Cedar multiprocessor memory system are presented. Empirical results are presented and used to calibrate a memory system simulator which is then used to discuss the scalability of the system.

  18. Scalable Triadic Analysis of Large-Scale Graphs: Multi-Core vs. Multi-Processor vs. Multi-Threaded Shared Memory Architectures

    SciTech Connect

    Chin, George; Marquez, Andres; Choudhury, Sutanay; Feo, John T.

    2012-09-01

    Triadic analysis encompasses a useful set of graph mining methods that is centered on the concept of a triad, which is a subgraph of three nodes and the configuration of directed edges across the nodes. Such methods are often applied in the social sciences as well as many other diverse fields. Triadic methods commonly operate on a triad census that counts the number of triads of every possible edge configuration in a graph. Like other graph algorithms, triadic census algorithms do not scale well when graphs reach tens of millions to billions of nodes. To enable the triadic analysis of large-scale graphs, we developed and optimized a triad census algorithm to efficiently execute on shared memory architectures. We will retrace the development and evolution of a parallel triad census algorithm. Over the course of several versions, we continually adapted the code’s data structures and program logic to expose more opportunities to exploit parallelism on shared memory that would translate into improved computational performance. We will recall the critical steps and modifications that occurred during code development and optimization. Furthermore, we will compare the performances of triad census algorithm versions on three specific systems: Cray XMT, HP Superdome, and AMD multi-core NUMA machine. These three systems have shared memory architectures but with markedly different hardware capabilities to manage parallelism.

  19. Program partitioning for NUMA multiprocessor computer systems. [Nonuniform memory access]

    SciTech Connect

    Wolski, R.M.; Feo, J.T.

    1993-11-01

    Program partitioning and scheduling are essential steps in programming non-shared-memory computer systems. Partitioning is the separation of program operations into sequential tasks, and scheduling is the assignment of tasks to processors. To be effective, automatic methods require an accurate representation of the model of computation and the target architecture. Current partitioning methods assume today's most prevalent models -- macro dataflow and a homogeneous/two-level multicomputer system. Based on communication channels, neither model represents well the emerging class of NUMA multiprocessor computer systems consisting of hierarchical read/write memories. Consequently, the partitions generated by extant methods do not execute well on these systems. In this paper, the authors extend the conventional graph representation of the macro-dataflow model to enable mapping heuristics to consider the complex communication options supported by NUMA architectures. They describe two such heuristics. Simulated execution times of program graphs show that the model and heuristics generate higher quality program mappings than current methods for NUMA architectures.

  20. Using Pin as a Memory Reference Generator for Multiprocessor Simulation

    SciTech Connect

    McCurdy, C

    2005-10-22

    In this paper we describe how we have used Pin to generate a multithreaded reference stream for simulation of a multiprocessor on a uniprocessor. We have taken special care to model as accurately as possible the effects of cache coherence protocol state, and lock and barrier synchronization on the performance of multithreaded applications running on multiprocessor hardware. We first describe a simplified version of the algorithm, which uses semaphores to synchronize instrumented application threads and the simulator on every memory reference. We then describe modifications to that algorithm to model the microarchitectural features of the Itanium2 that affect the timing of memory reference issue. An experimental evaluation determines that while cycle-accurate multithreaded simulation is possible using our approach, the use of semaphores has a negative impact on the performance of the simulator.

  1. Low Latency Messages on Distributed Memory Multiprocessors

    DOE PAGES Beta

    Rosing, Matt; Saltz, Joel

    1995-01-01

    This article describes many of the issues in developing an efficient interface for communication on distributed memory machines. Although the hardware component of message latency is less than 1 μs on many distributed memory machines, the software latency associated with sending and receiving typed messages is on the order of 50 μs. The reason for this imbalance is that the software interface does not match the hardware. By changing the interface to match the hardware more closely, applications with fine grained communication can be put on these machines. This article describes several tests performed and many of the issues involved in supporting low latency messages on distributed memory machines.

  2. Low latency messages on distributed memory multiprocessors

    NASA Technical Reports Server (NTRS)

    Rosing, Matthew; Saltz, Joel

    1993-01-01

    Many of the issues in developing an efficient interface for communication on distributed memory machines are described and a portable interface is proposed. Although the hardware component of message latency is less than one microsecond on many distributed memory machines, the software latency associated with sending and receiving typed messages is on the order of 50 microseconds. The reason for this imbalance is that the software interface does not match the hardware. By changing the interface to match the hardware more closely, applications with fine grained communication can be put on these machines. Based on several tests that were run on the iPSC/860, an interface that will better match current distributed memory machines is proposed. The model used in the proposed interface consists of a computation processor and a communication processor on each node. Communication between these processors and other nodes in the system is done through a buffered network. Information that is transmitted is either data or procedures to be executed on the remote processor. The dual processor system is better suited for efficiently handling asynchronous communications than a single processor system. The ability to send either data or procedures gives flexibility in minimizing message latency, depending on the type of communication being performed. The tests performed and the proposed interface are described.

  3. Software Coherence in Multiprocessor Memory Systems. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Bolosky, William Joseph

    1993-01-01

    Processors are becoming faster and multiprocessor memory interconnection systems are not keeping up. Therefore, it is necessary to have threads and the memory they access as near one another as possible. Typically, this involves putting memory or caches with the processors, which gives rise to the problem of coherence: if one processor writes an address, any other processor reading that address must see the new value. This coherence can be maintained by the hardware or with software intervention. Systems of both types have been built in the past; the hardware-based systems tended to outperform the software ones. However, the ratio of processor to interconnect speed is now so high that the extra overhead of the software systems may no longer be significant. This issue is explored both by implementing a software maintained system and by introducing and using the technique of offline optimal analysis of memory reference traces. It finds that in properly built systems, software maintained coherence can perform comparably to or even better than hardware maintained coherence. The architectural features necessary for efficient software coherence to be profitable include a small page size, a fast trap mechanism, and the ability to execute instructions while remote memory references are outstanding.

  4. Generation-based memory synchronization in a multiprocessor system with weakly consistent memory accesses

    DOEpatents

    Ohmacht, Martin

    2014-09-09

    In a multiprocessor system, a central memory synchronization module coordinates memory synchronization requests responsive to memory access requests in flight, a generation counter, and a reclaim pointer. The central module communicates via point-to-point communication. The module includes a global OR reduce tree for each memory access requesting device, for detecting memory access requests in flight. An interface unit is implemented associated with each processor requesting synchronization. The interface unit includes multiple generation completion detectors. The generation count and reclaim pointer do not pass one another.

  5. Denelcor HEP multiprocessor simulator

    SciTech Connect

    Dunigan, T.H.

    1986-06-01

    The structure and use of a simulator for the Denelcor HEP multiprocessor are described. The simulator provides a multitasking environment for the development of parallel programs in C or FORTRAN using a library of subroutines that simulate the parallel programming constructs available on the HEP, a shared-memory multiprocessor. The simulator also provides a trace file that can be used for debugging, performance analysis, or graphical display. 7 refs., 4 figs.

  6. Multi-ring performance of the Kendall square multiprocessor

    SciTech Connect

    Dunigan, T.H.

    1994-03-01

    Performance of the hierarchical shared-memory system of the Kendall Square Research multiprocessor is measured and characterized. The performance of prefetch is measured. Latency, bandwidth, and contention are analyzed on a 4-ring, 128 processor system. Scalability comparisons are made with other shared-memory and distributed-memory multiprocessors.

  7. Optical RAM-enabled cache memory and optical routing for chip multiprocessors: technologies and architectures

    NASA Astrophysics Data System (ADS)

    Pleros, Nikos; Maniotis, Pavlos; Alexoudi, Theonitsa; Fitsios, Dimitris; Vagionas, Christos; Papaioannou, Sotiris; Vyrsokinos, K.; Kanellos, George T.

    2014-03-01

    The processor-memory performance gap, commonly referred to as the "Memory Wall" problem, stems from the speed mismatch between processor and electronic RAM clock frequencies, forcing current Chip Multiprocessor (CMP) configurations to consume more than 50% of the chip real-estate for caching purposes. In this article, we present our recent work spanning from Si-based integrated optical RAM cell architectures up to complete optical cache memory architectures for Chip Multiprocessor configurations. Moreover, we discuss e/o router subsystems with up to Tb/s routing capacity for cache interconnection purposes within CMP configurations, currently pursued within the FP7 PhoxTrot project.

  8. Vienna FORTRAN: A FORTRAN language extension for distributed memory multiprocessors

    NASA Technical Reports Server (NTRS)

    Chapman, Barbara; Mehrotra, Piyush; Zima, Hans

    1991-01-01

    Exploiting the performance potential of distributed memory machines requires a careful distribution of data across the processors. Vienna FORTRAN is a language extension of FORTRAN which provides the user with a wide range of facilities for such mapping of data structures. However, programs in Vienna FORTRAN are written using global data references. Thus, the user has the advantage of a shared memory programming paradigm while explicitly controlling the placement of data. The basic features of Vienna FORTRAN are presented along with a set of examples illustrating the use of these features.

  9. Kendall Square multiprocessor: Early experiences and performance

    SciTech Connect

    Dunigan, T.H.

    1992-04-01

    Initial performance results and early experiences are reported for the Kendall Square Research multiprocessor. The basic architecture of the shared-memory multiprocessor is described, and computational and I/O performance is measured for both serial and parallel programs. Experiences in porting various applications are described.

  10. Recoverable distributed shared virtual memory

    NASA Technical Reports Server (NTRS)

    Wu, Kun-Lung; Fuchs, W. Kent

    1990-01-01

    The problem of rollback recovery in distributed shared virtual environments, in which the shared memory is implemented in software in a loosely coupled distributed multicomputer system, is examined. A user-transparent checkpointing recovery scheme and a new twin-page disk storage management technique are presented for implementing recoverable distributed shared virtual memory. The checkpointing scheme can be integrated with the memory coherence protocol for managing the shared virtual memory. The twin-page disk design allows checkpointing to proceed in an incremental fashion without an explicit undo at the time of recovery. The recoverable distributed shared virtual memory allows the system to restart computation from a checkpoint without a global restart.

  11. A single-assignment language in a distributed memory multiprocessor

    NASA Technical Reports Server (NTRS)

    Evripidou, P.; Najjar, W.; Gaudiot, J.-L.

    1989-01-01

    The implementation of the single-assignment programming language SISAL (McGraw et al., 1985) on a Symult 2010 parallel computer is described. The advantages of single-assignment languages over imperative languages in a multiprocessor environment are reviewed; the characteristics of SISAL are summarized; the program-graph generation and dynamic data partitioning procedures are explained; and the application of SISAL in constructing a concurrent iterative multigrid algorithm is discussed in detail and illustrated with diagrams.

  12. Conditional load and store in a shared memory

    DOEpatents

    Blumrich, Matthias A; Ohmacht, Martin

    2015-02-03

    A method, system and computer program product for implementing load-reserve and store-conditional instructions in a multi-processor computing system. The computing system includes a multitude of processor units and a shared memory cache, and each of the processor units has access to the memory cache. In one embodiment, the method comprises providing the memory cache with a series of reservation registers, and storing in these registers addresses reserved in the memory cache for the processor units as a result of issuing load-reserve requests. In this embodiment, when one of the processor units makes a request to store data in the memory cache using a store-conditional request, the reservation registers are checked to determine if an address in the memory cache is reserved for that processor unit. If an address in the memory cache is reserved for that processor, the data are stored at this address.
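
    The reservation-register mechanism in this abstract can be illustrated with a small, single-threaded model. This is only a sketch, not the patented design: the class name, the method names, and the rule that a successful store clears competing reservations on the same address are assumptions made for illustration.

```python
class ReservationCache:
    """Toy model of a shared cache with one reservation register per processor."""

    def __init__(self, num_processors):
        self.memory = {}                              # address -> value
        self.reservations = [None] * num_processors   # one register per processor

    def load_reserve(self, pid, addr):
        # Record the reserved address for this processor, then return the value.
        self.reservations[pid] = addr
        return self.memory.get(addr, 0)

    def store_conditional(self, pid, addr, value):
        # The store succeeds only if this processor holds a reservation on addr.
        if self.reservations[pid] != addr:
            return False
        self.memory[addr] = value
        # Assumption (not stated in the abstract): a successful store clears
        # every reservation on addr, so competing store-conditionals fail.
        for p, reserved in enumerate(self.reservations):
            if reserved == addr:
                self.reservations[p] = None
        return True


def atomic_increment(cache, pid, addr):
    # Retry loop built from the two primitives.
    while True:
        old = cache.load_reserve(pid, addr)
        if cache.store_conditional(pid, addr, old + 1):
            return old + 1
```

    If two processors reserve the same address and both attempt a store-conditional, only the first store succeeds in this model; the second processor has to repeat its load-reserve.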

  13. Combinatorial reliability analysis of multiprocessor computers

    SciTech Connect

    Hwang, K.; Tian-Pong Chang

    1982-12-01

    The authors propose a combinatorial method to evaluate the reliability of multiprocessor computers. Multiprocessor structures are classified as crossbar switch, time-shared buses, and multiport memories. Closed-form reliability expressions are derived via combinatorial path enumeration on the probabilistic-graph representation of a multiprocessor system. The method can analyze the reliability performance of real systems like C.mmp, Tandem 16, and Univac 1100/80. User-oriented performance levels are defined for measuring the performability of degradable multiprocessor systems. For a regularly structured multiprocessor system, it is fast and easy to use this technique for evaluating system reliability with statistically independent component reliabilities. System availability can also be evaluated by this reliability study. 6 references.
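
    To make the idea of combining independent component reliabilities concrete, here is a minimal sketch. It uses brute-force enumeration of component states rather than the closed-form path enumeration the authors derive, and the function name, component names, and example structure are hypothetical.

```python
from itertools import product


def system_reliability(reliabilities, works):
    """Sum the probability of every component-state combination in which
    the system works, assuming statistically independent components.

    reliabilities: dict mapping component name -> probability it is up
    works: function taking a dict name -> bool and returning True if the
           system is operational in that state
    """
    names = list(reliabilities)
    total = 0.0
    for states in product([True, False], repeat=len(names)):
        up = dict(zip(names, states))
        if works(up):
            p = 1.0
            for name in names:
                p *= reliabilities[name] if up[name] else 1.0 - reliabilities[name]
            total += p
    return total


# Hypothetical time-shared-bus system: it works if the bus and the memory
# are up and at least one of the two processors is up.
def bus_system_works(up):
    return up["bus"] and up["mem"] and (up["p1"] or up["p2"])


example = system_reliability(
    {"p1": 0.9, "p2": 0.9, "bus": 0.9, "mem": 0.9}, bus_system_works
)
```

    Enumeration costs 2^n evaluations for n components, so it is only practical for small systems; that cost is one reason closed-form expressions are attractive for regular structures.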

  14. Memory access in shared virtual memory

    SciTech Connect

    Berrendorf, R.

    1992-01-01

    Shared virtual memory (SVM) is a virtual memory layer with a single address space on top of a distributed real memory on parallel computers. We examine the behavior and performance of SVM running a parallel program with medium-grained, loop-level parallelism on top of it. A simulator for the underlying parallel architecture can be used to examine the behavior of SVM more deeply. The influence of several parameters, such as the number of processors, page size, cold or warm start, and restricted page replication, is studied.

  15. Memory access in shared virtual memory

    SciTech Connect

    Berrendorf, R.

    1992-09-01

    Shared virtual memory (SVM) is a virtual memory layer with a single address space on top of a distributed real memory on parallel computers. We examine the behavior and performance of SVM running a parallel program with medium-grained, loop-level parallelism on top of it. A simulator for the underlying parallel architecture can be used to examine the behavior of SVM more deeply. The influence of several parameters, such as the number of processors, page size, cold or warm start, and restricted page replication, is studied.

  16. Multiprocessor execution of functional programs

    SciTech Connect

    Goldberg, B.

    1988-10-01

    Functional languages have recently gained attention as vehicles for programming in a concise and elegant manner. In addition, it has been suggested that functional programming provides a natural methodology for programming multiprocessor computers. This paper describes research that was performed to demonstrate that multiprocessor execution of functional programs on current multiprocessors is feasible, and results in a significant reduction in their execution times. Two implementations of the functional language ALFL were built on commercially available multiprocessors. Alfalfa is an implementation on the Intel iPSC hypercube multiprocessor, and Buckwheat is an implementation on the Encore Multimax shared-memory multiprocessor. Each implementation includes a compiler that performs automatic decomposition of ALFL programs and a run-time system that supports their execution. The compiler is responsible for detecting the inherent parallelism in a program, and decomposing the program into a collection of tasks, called serial combinators, that can be executed in parallel. The abstract machine model supported by Alfalfa and Buckwheat is called heterogeneous graph reduction, which is a hybrid of graph reduction and conventional stack-oriented execution. This model supports parallelism, lazy evaluation, and higher order functions while at the same time making efficient use of the processors in the system. The Alfalfa and Buckwheat run-time systems support dynamic load balancing, interprocessor communication (if required), and storage management. A large number of experiments were performed on Alfalfa and Buckwheat for a variety of programs. The results of these experiments, as well as the conclusions drawn from them, are presented.

  17. Shared Memory Parallelization of an Implicit ADI-type CFD Code

    NASA Technical Reports Server (NTRS)

    Hauser, Th.; Huang, P. G.

    1999-01-01

    A parallelization study designed for ADI-type algorithms is presented using the OpenMP specification for shared-memory multiprocessor programming. Details of optimizations specifically addressed to cache-based computer architectures are described and performance measurements for the single and multiprocessor implementation are summarized. The paper demonstrates that optimization of memory access on a cache-based computer architecture controls the performance of the computational algorithm. A hybrid MPI/OpenMP approach is proposed for clusters of shared memory machines to further enhance the parallel performance. The method is applied to develop a new LES/DNS code, named LESTool. A preliminary DNS calculation of a fully developed channel flow at a Reynolds number of 180 (Re_τ = 180) has shown good agreement with existing data.

  18. Solution of large nonlinear quasistatic structural mechanics problems on distributed-memory multiprocessor computers

    SciTech Connect

    Blanford, M.

    1997-12-31

    Most commercially-available quasistatic finite element programs assemble element stiffnesses into a global stiffness matrix, then use a direct linear equation solver to obtain nodal displacements. However, for large problems (greater than a few hundred thousand degrees of freedom), the memory size and computation time required for this approach become prohibitive. Moreover, direct solution does not lend itself to the parallel processing needed for today's multiprocessor systems. This talk gives an overview of the iterative solution strategy of JAS3D, the nonlinear large-deformation quasistatic finite element program. Because its architecture is derived from an explicit transient-dynamics code, it does not ever assemble a global stiffness matrix. The author describes the approach he used to implement the solver on multiprocessor computers, and shows examples of problems run on hundreds of processors and more than a million degrees of freedom. Finally, he describes some of the work he is presently doing to address the challenges of iterative convergence for ill-conditioned problems.

  19. A multiprocessor computer simulation model employing a feedback scheduler/allocator for memory space and bandwidth matching and TMR processing

    NASA Technical Reports Server (NTRS)

    Bradley, D. B.; Irwin, J. D.

    1974-01-01

    A computer simulation model for a multiprocessor computer is developed that is useful for studying the problem of matching a multiprocessor's memory space, memory bandwidth and numbers and speeds of processors with aggregate job set characteristics. The model assumes an input work load of a set of recurrent jobs. The model includes a feedback scheduler/allocator which attempts to improve system performance through higher memory bandwidth utilization by matching individual job requirements for space and bandwidth with space availability and estimates of bandwidth availability at the times of memory allocation. The simulation model includes provisions for specifying precedence relations among the jobs in a job set, and provisions for specifying precedence execution of TMR (Triple Modular Redundant) and SIMPLEX (non-redundant) jobs.

  20. The performance of disk arrays in shared-memory database machines

    NASA Technical Reports Server (NTRS)

    Katz, Randy H.; Hong, Wei

    1993-01-01

    In this paper, we examine how disk arrays and shared memory multiprocessors lead to an effective method for constructing database machines for general-purpose complex query processing. We show that disk arrays can lead to cost-effective storage systems if they are configured from suitably small form-factor disk drives. We introduce the storage system metric data temperature as a way to evaluate how well a disk configuration can sustain its workload, and we show that disk arrays can sustain the same data temperature as a more expensive mirrored-disk configuration. We use the metric to evaluate the performance of disk arrays in XPRS, an operational shared-memory multiprocessor database system being developed at the University of California, Berkeley.

  1. A simple modern correctness condition for a space-based high-performance multiprocessor

    NASA Technical Reports Server (NTRS)

    Probst, David K.; Li, Hon F.

    1992-01-01

    A number of U.S. national programs, including space-based detection of ballistic missile launches, envisage putting significant computing power into space. Given sufficient progress in low-power VLSI, multichip-module packaging and liquid-cooling technologies, we will see design of high-performance multiprocessors for individual satellites. In very high speed implementations, performance depends critically on tolerating large latencies in interprocessor communication; without latency tolerance, performance is limited by the vastly differing time scales in processor and data-memory modules, including interconnect times. The modern approach to tolerating remote-communication cost in scalable, shared-memory multiprocessors is to use a multithreaded architecture, and alter the semantics of shared memory slightly, at the price of forcing the programmer either to reason about program correctness in a relaxed consistency model or to agree to program in a constrained style. The literature on multiprocessor correctness conditions has become increasingly complex, and sometimes confusing, which may hinder its practical application. We propose a simple modern correctness condition for a high-performance, shared-memory multiprocessor; the correctness condition is based on a simple interface between the multiprocessor architecture and the parallel programming system.

  2. Scheduling and process migration in partitioned multiprocessors

    SciTech Connect

    Gait, J.

    1990-03-01

    A partitioned multiprocessor (PM) has a shared global bus and nonshared local memories. This paper studies a process scheduler, called the two-tier scheduler (TTS), for a PM. In a PM, local scheduling amortizes the cost of loading processes in local memory. Global scheduling migrates processes to balance load. A tunable time quantum is adjusted so the average process completes execution on the processor on which it is first scheduled, and only relatively long-lived processes are rescheduled globally.

  3. Firefly: A multiprocessor workstation

    SciTech Connect

    Thacker, C.P.; Stewart, L.C.; Satterthwaite, E.H.

    1988-08-01

    Firefly is a shared memory multiprocessor workstation developed at the Digital Equipment Corporation Systems Research Center (SRC). A Firefly system consists of from one to nine VLSI VAX processors, each with a floating point accelerator and a cache. The caches are coherent, so that all processors see a consistent view of main memory. The Firefly runs a software system that emulates the Ultrix system call interface, and in addition provides support for multiprocessing through multiple threads of control in a single address space. Communication is provided uniformly through the use of remote procedure call. The authors describe the goals, hardware, software system, and performance of the Firefly, and discuss the extent to which SRC has been successful in providing software to take advantage of multi-processing.

  4. Performance modeling and measurement of real-time multiprocessors with time-shared buses

    SciTech Connect

    Woodbury, M.H.; Shin, K.G.

    1988-02-01

    A closed queueing network model is constructed to address workload effects on computer performance for a highly reliable unibus multiprocessor used in real-time control. The queueing model consists of multiserver nodes and a nonpreemptive priority queue. Use of this model requires partitioning the workload into task classes. The time average steady-state solution of the queueing model directly produces useful results that are necessary in performance evaluation. The model is experimentally justified with the Fault-Tolerant Multiprocessor (FTMP) located at the NASA AIRLAB. Extensive experiments are performed on FTMP with a synthetic workload generator (SWG) to directly measure performance parameters, such as processor idle time, system bus contention, and task processing times. These measurements determine values for parameters in the queueing model. Experimental and analytic results are then compared.

  5. Performing an allreduce operation using shared memory

    DOEpatents

    Archer, Charles J; Dozsa, Gabor; Ratterman, Joseph D; Smith, Brian E

    2014-06-10

    Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
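
    The work-unit scheme described in this abstract can be sketched with threads standing in for cores. This is a loose illustration under stated assumptions, not the patented method: the `JobStatus` class, the choice of index ranges as work units, and the use of summation as the reduction are all hypothetical.

```python
import threading


class JobStatus:
    """Hypothetical job status object: lists work units and hands out the next one."""

    def __init__(self, length, chunk):
        # Each work unit is a half-open index range [start, stop) of the buffers.
        self.units = [(s, min(s + chunk, length)) for s in range(0, length, chunk)]
        self.next_unit = 0
        self.lock = threading.Lock()

    def claim(self):
        # An available core asks for the next unclaimed work unit.
        with self.lock:
            if self.next_unit >= len(self.units):
                return None
            unit = self.units[self.next_unit]
            self.next_unit += 1
            return unit


def shared_memory_allreduce(contributions, num_cores=4, chunk=2):
    """Element-wise sum of all contribution buffers into one shared result."""
    length = len(contributions[0])
    result = [0] * length            # shared buffer visible to every core
    job = JobStatus(length, chunk)

    def core():
        while True:
            unit = job.claim()
            if unit is None:
                return
            start, stop = unit
            for i in range(start, stop):
                result[i] = sum(buf[i] for buf in contributions)

    threads = [threading.Thread(target=core) for _ in range(num_cores)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

    Because each work unit covers a disjoint index range, the threads never write the same element, so only the claim step needs a lock.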

  6. Performing an allreduce operation using shared memory

    DOEpatents

    Archer, Charles J.; Dozsa, Gabor; Ratterman, Joseph D.; Smith, Brian E.

    2012-04-17

    Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.

  7. Dynamically controlling false sharing in distributed shared memory

    SciTech Connect

    Freeh, V.W.; Andrews, G.R.

    1996-12-31

    Distributed shared memory (DSM) alleviates the need to program message passing explicitly on a distributed-memory machine. In order to reduce memory latency, a DSM replicates copies of data. This paper examines several current approaches to controlling thrashing caused by false sharing in a DSM. Then it introduces a novel memory consistency protocol, writer-owns, which detects and eliminates false sharing at run time. In iterative computations, where the data is accessed similarly every iteration, the writer-owns protocol can have tremendous benefits because the overhead of eliminating false sharing is only incurred once. Performance results show that the writer-owns protocol is competitive with and often better than existing approaches.

  8. Recoverable distributed shared virtual memory - Memory coherence and storage structures

    NASA Technical Reports Server (NTRS)

    Wu, Kun-Lung; Fuchs, W. Kent

    1989-01-01

    This paper examines the problem of implementing rollback recovery in multicomputer distributed shared virtual memory environments, in which the shared memory is implemented in software and exists only virtually. A user-transparent checkpointing recovery scheme and new twin-page disk storage management are presented to implement a recoverable distributed shared virtual memory. The checkpointing scheme is integrated with the shared virtual memory management. The twin-page disk approach allows incremental checkpointing without an explicit undo at the time of recovery. A single consistent checkpoint state is maintained on stable disk storage. The recoverable distributed shared virtual memory allows the system to restart computation from a previous checkpoint due to a processor failure without a global restart.

  9. Multiprocessor execution of functional programs

    SciTech Connect

    Goldberg, B.F.

    1988-01-01

    Functional languages have recently gained attention as vehicles for programming in a concise and elegant manner. In addition, it has been suggested that functional programming provides a natural methodology for programming multiprocessor computers. This dissertation demonstrates that multiprocessor execution of functional programs is feasible, and results in a significant reduction in their execution times. Two implementations of the functional language ALFL were built on commercially available multiprocessors. Alfalfa is an implementation on the Intel iPSC hypercube multiprocessor, and Buckwheat is an implementation on the Encore Multimax shared-memory multiprocessor. Each implementation includes a compiler that performs automatic decomposition of ALFL programs. The compiler is responsible for detecting the inherent parallelism in a program, and decomposing the program into a collection of tasks, called serial combinators, that can be executed in parallel. One of the primary goals of the compiler is to generate serial combinators exhibiting the coarsest granularity possible without sacrificing useful parallelism. This dissertation describes the algorithms used by the compiler to analyze, decompose, and optimize functional programs. The abstract machine model supported by Alfalfa and Buckwheat is called heterogeneous graph reduction, which is a hybrid of graph reduction and conventional stack-oriented execution. This model supports parallelism, lazy evaluation, and higher order functions while at the same time making efficient use of the processors in the system. The Alfalfa and Buckwheat run-time systems support dynamic load balancing, interprocessor communication (if required) and storage management. A large number of experiments were performed on Alfalfa and Buckwheat for a variety of programs. The results of these experiments, as well as the conclusions drawn from them, are presented.

  10. Direct access inter-process shared memory

    DOEpatents

    Brightwell, Ronald B; Pedretti, Kevin; Hudson, Trammell B

    2013-10-22

    A technique for directly sharing physical memory between processes executing on processor cores is described. The technique includes loading a plurality of processes into the physical memory for execution on a corresponding plurality of processor cores sharing the physical memory. An address space is mapped to each of the processes by populating a first entry in a top level virtual address table for each of the processes. The address space of each of the processes is cross-mapped into each of the processes by populating one or more subsequent entries of the top level virtual address table with the first entry in the top level virtual address table from other processes.

  11. Multiprocessor architectural study

    NASA Technical Reports Server (NTRS)

    Kosmala, A. L.; Stanten, S. F.; Vandever, W. H.

    1972-01-01

    An architectural design study was made of a multiprocessor computing system intended to meet functional and performance specifications appropriate to a manned space station application. Intermetrics' previous experience and accumulated knowledge of the multiprocessor field are used to generate a baseline philosophy for the design of a future SUMC multiprocessor. Interrupts are defined and the crucial questions of interrupt structure, such as processor selection and response time, are discussed. Memory hierarchy and performance are discussed extensively, with particular attention to the design approach which utilizes a cache memory associated with each processor. The ability of an individual processor to approach its theoretical maximum performance is then analyzed in terms of a hit ratio. Memory management is envisioned as a virtual memory system implemented either through segmentation or paging. Addressing is discussed in terms of the various register designs adopted by current computers and those of advanced design.
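
    The dependence of per-processor performance on hit ratio can be illustrated with the usual textbook model of average access time. The study's own analysis is not given in the abstract, so the formula, function names, and timing values below are assumptions; in particular, a miss is assumed to cost the main-memory time alone.

```python
def effective_access_time(hit_ratio, cache_time, memory_time):
    """Average access time when a fraction hit_ratio of accesses hit the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_time + (1.0 - hit_ratio) * memory_time


def fraction_of_peak(hit_ratio, cache_time, memory_time):
    """Ratio of the all-hits access time to the actual average access time."""
    return cache_time / effective_access_time(hit_ratio, cache_time, memory_time)


# Hypothetical timings: 1 time unit per cache access, 10 per main-memory access.
average = effective_access_time(0.9, 1.0, 10.0)
```

    With these hypothetical timings, even a 90% hit ratio leaves the processor well short of its all-hits rate, which is why the hit ratio is a natural parameter for this kind of analysis.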

  12. Debugging in a multi-processor environment

    SciTech Connect

    Spann, J.M.

    1981-09-29

    The Supervisory Control and Diagnostic System (SCDS) for the Mirror Fusion Test Facility (MFTF) consists of nine 32-bit minicomputers arranged in a tightly coupled distributed computer system utilizing a shared memory as the data exchange medium. Debugging of more than one program in the multi-processor environment is a difficult process. This paper describes what new tools were developed and how the testing of software is performed in the SCDS for the MFTF project.

  13. Comparative Study of Message Passing and Shared Memory Parallel Programming Models in Neural Network Training

    SciTech Connect

    Vitela, J.; Gordillo, J.; Cortina, L; Hanebutte, U.

    1999-12-14

    A comparative performance study is presented of a coarse-grained parallel neural network training code implemented in both OpenMP and MPI, the standards for shared memory and message passing parallel programming environments, respectively. In addition, these versions of the parallel training code are compared to an implementation utilizing SHMEM, the native SGI/Cray environment for shared memory programming. The multiprocessor platform used is an SGI/Cray Origin 2000 with up to 32 processors. It is shown that, in this study, the native Cray environment outperforms MPI for the entire range of processors used, while OpenMP shows better performance than the other two environments when using more than 19 processors. In this study, the efficiency is always greater than 60%, regardless of the parallel programming environment and the number of processors used.

  14. A shared memory environment for hypercubes

    SciTech Connect

    Agarwala, A.; Das, C.R.

    1994-12-31

    This paper describes the design and implementation of a shared virtual memory (SVM) system for the nCUBE 2 hypercube multicomputer. The SVM system provides the user a single coherent address space across all nodes. It is implemented at the user level in a C programming environment using high level constructs to support data sharing. Shared variables are treated as objects rather than pages. We have improved upon an existing algorithm for maintaining coherency in the SVM system, thus achieving a reduction in the number of inter-node messages required in coherency maintenance. Detailed timing analysis is conducted to analyze the feasibility of this shared environment. Experimental results indicate that parallel programs running under an SVM system show linear speedup, suggesting that SVM systems could provide an effective programming environment for the next generation of distributed memory parallel computers. A bottleneck of this implementation seems to be the expensive interrupt handling by the nCUBE 2 kernel.

  15. Rollback-recovery techniques and architectural support for multiprocessor systems

    SciTech Connect

    Chiang Chungyang.

    1991-01-01

    The author proposes efficient and robust fault diagnosis and rollback-recovery techniques to enhance system availability as well as performance in both distributed-memory and shared-bus shared-memory multiprocessor systems. Architectural support for the proposed rollback-recovery technique in a bus-based shared-memory multiprocessor system is also investigated to adaptively fine tune the proposed rollback-recovery technique in this type of system. A comparison of the performance of the proposed techniques with other existing techniques is made, a topic on which little quantitative information is available in the literature. New diagnosis concepts are introduced to show that the author's diagnosis technique yields higher diagnosis coverage and facilitates the performance evaluation of various fault-diagnosis techniques.

  16. Parallel Navier-Stokes computations on shared and distributed memory architectures

    NASA Technical Reports Server (NTRS)

    Hayder, M. Ehtesham; Jayasimha, D. N.; Pillay, Sasi Kumar

    1995-01-01

    We study a high order finite difference scheme to solve the time accurate flow field of a jet using the compressible Navier-Stokes equations. As part of our ongoing efforts, we have implemented our numerical model on three parallel computing platforms to study the computational, communication, and scalability characteristics. The platforms chosen for this study are a cluster of workstations connected through fast networks (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and a distributed memory multiprocessor (the IBM SP1). Our focus in this study is on the LACE testbed. We present some results for the Cray YMP and the IBM SP1 mainly for comparison purposes. On the LACE testbed, we study: (1) the communication characteristics of Ethernet, FDDI, and the ALLNODE networks and (2) the overheads induced by the PVM message passing library used for parallelizing the application. We demonstrate that clustering of workstations is effective and has the potential to be computationally competitive with supercomputers at a fraction of the cost.

  17. Exploring Shared Memory Protocols in FLASH

    SciTech Connect

    Horowitz, Mark; Kunz, Robert; Hall, Mary; Lucas, Robert; Chame, Jacqueline

    2007-04-01

    The goal of this project was to improve the performance of large scientific and engineering applications through collaborative hardware and software mechanisms to manage the memory hierarchy of non-uniform memory access time (NUMA) shared-memory machines, as well as their component individual processors. In spite of the programming advantages of shared-memory platforms, obtaining good performance for large scientific and engineering applications on such machines can be challenging. Because communication between processors is managed implicitly by the hardware, rather than expressed by the programmer, application performance may suffer from unintended communication – communication that the programmer did not consider when developing his/her application. In this project, we developed and evaluated a collection of hardware, compiler, languages and performance monitoring tools to obtain high performance on scientific and engineering applications on NUMA platforms by managing communication through alternative coherence mechanisms. Alternative coherence mechanisms have often been discussed as a means for reducing unintended communication, although architecture implementations of such mechanisms are quite rare. This report describes an actual implementation of a set of coherence protocols that support coherent, non-coherent and write-update accesses for a CC-NUMA shared-memory architecture, the Stanford FLASH machine. Such an approach has the advantages of using alternative coherence only where it is beneficial, and also provides an evolutionary migration path for improving application performance. We present data on two computations, RandomAccess from the HPC Challenge benchmarks and a forward solver derived from LS-DYNA, showing the performance advantages of the alternative coherence mechanisms. For RandomAccess, the non-coherent and write-update versions can outperform the coherent version by factors of 5 and 2.5, respectively. In LS-DYNA, we obtain

  18. Checkpointing Shared Memory Programs at the Application-level

    SciTech Connect

    Bronevetsky, G; Schulz, M; Szwed, P; Marques, D; Pingali, K

    2004-09-08

    Trends in high-performance computing are making it necessary for long-running applications to tolerate hardware faults. The most commonly used approach is checkpoint and restart (CPR): the state of the computation is saved periodically on disk, and when a failure occurs, the computation is restarted from the last saved state. At present, it is the responsibility of the programmer to instrument applications for CPR. Our group is investigating the use of compiler technology to instrument codes to make them self-checkpointing and self-restarting, thereby providing an automatic solution to the problem of making long-running scientific applications resilient to hardware faults. Our previous work focused on message-passing programs. In this paper, we describe such a system for shared-memory programs running on symmetric multiprocessors. The system has two components: (i) a pre-compiler for source-to-source modification of applications, and (ii) a runtime system that implements a protocol for coordinating CPR among the threads of the parallel application. For the sake of concreteness, we focus on a non-trivial subset of OpenMP that includes barriers and locks. One of the advantages of this approach is that the ability to tolerate faults becomes embedded within the application itself, so applications become self-checkpointing and self-restarting on any platform. We demonstrate this by showing that our transformed benchmarks can checkpoint and restart on three different platforms (Windows/x86, Linux/x86, and Tru64/Alpha). Our experiments show that the overhead introduced by this approach is usually quite small; they also suggest ways in which the current implementation can be tuned to reduce overheads further.
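
    The basic checkpoint-and-restart cycle described here (save state periodically, resume from the last saved state after a failure) can be sketched for a single-threaded loop. This is a hand-written illustration, not the pre-compiler or thread-coordination protocol from the paper; the function names, the pickle file format, and the `fail_at` fault injection are assumptions.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    # Write to a temporary file first, then replace the old checkpoint in one
    # step, so a crash during the write cannot corrupt the last good state.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)


def load_checkpoint(path, default):
    # Resume from the last saved state if one exists, else start fresh.
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return default


def run(path, steps, every=10, fail_at=None):
    """Sum the integers 0..steps-1, checkpointing every `every` iterations.

    fail_at (hypothetical fault injection) raises an error at that iteration.
    """
    state = load_checkpoint(path, {"i": 0, "total": 0})
    while state["i"] < steps:
        if fail_at is not None and state["i"] == fail_at:
            raise RuntimeError("simulated hardware fault")
        state["total"] += state["i"]
        state["i"] += 1
        if state["i"] % every == 0:
            save_checkpoint(path, state)
    return state["total"]
```

    Calling `run` again with the same path after a failure repeats only the iterations since the last checkpoint. A multithreaded program would additionally need the threads to agree on when a consistent checkpoint can be taken, which is the role of the coordination protocol in the paper.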

  19. C-MOS array design techniques: SUMC multiprocessor system study

    NASA Technical Reports Server (NTRS)

    Clapp, W. A.; Helbig, W. A.; Merriam, A. S.

    1972-01-01

    The current capabilities of LSI techniques for speed and reliability, plus the possibilities of assembling large configurations of LSI logic and storage elements, have demanded the study of multiprocessors and multiprocessing techniques, problems, and potentialities. Evaluated are three previous systems studies for a space ultrareliable modular computer multiprocessing system, and a new multiprocessing system is proposed that is flexibly configured with up to four central processors, four I/O processors, and 16 main memory units, plus auxiliary memory and peripheral devices. This multiprocessor system features a multilevel interrupt, qualified S/360 compatibility for ground-based generation of programs, virtual memory management of a storage hierarchy through I/O processors, and multiport access to multiple and shared memory units.

  20. Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 1: FTMP principles of operation

    NASA Technical Reports Server (NTRS)

    Smith, T. B., Jr.; Lala, J. H.

    1983-01-01

    The basic organization of the fault-tolerant multiprocessor (FTMP) is that of a general purpose homogeneous multiprocessor. Three processors operate on a shared system (memory and I/O) bus. Replication and tight synchronization of all elements, together with hardware voting, are employed to detect and correct any single fault. Reconfiguration is then employed to repair a fault. Multiple faults may be tolerated as a sequence of single faults with repair between fault occurrences.
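
    Voting across three replicated units can be sketched in software as follows. FTMP performs this in hardware; the function `vote` and its return convention are assumptions made purely for illustration.

```python
from collections import Counter


def vote(outputs):
    """Return (majority_value, indices_of_disagreeing_units).

    Raises ValueError when no value is held by a strict majority,
    i.e. when a single-fault assumption no longer explains the outputs.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise ValueError("no majority: more than one unit may be faulty")
    faulty = [i for i, v in enumerate(outputs) if v != value]
    return value, faulty
```

    With three units, one disagreeing output is both masked (the majority value is returned) and located (its index is reported), which is what allows a later reconfiguration step to replace the faulty unit.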

  1. Distributed shared memory for roaming large volumes.

    PubMed

    Castanié, Laurent; Mion, Christophe; Cavin, Xavier; Lévy, Bruno

    2006-01-01

    We present a cluster-based volume rendering system for roaming very large volumes. This system allows moving a gigabyte-sized probe inside a total volume of several tens or hundreds of gigabytes in real-time. While the size of the probe is limited by the total amount of texture memory on the cluster, the size of the total data set has no theoretical limit. The cluster is used as a distributed graphics processing unit that aggregates both graphics power and graphics memory. A hardware-accelerated volume renderer runs in parallel on the cluster nodes and the final image compositing is implemented using a pipelined sort-last rendering algorithm. Meanwhile, volume bricking and volume paging allow efficient data caching. On each rendering node, a distributed hierarchical cache system implements a global software-based distributed shared memory on the cluster. In case of a cache miss, this system first checks page residency on the other cluster nodes instead of directly accessing local disks. Using two Gigabit Ethernet network interfaces per node, we accelerate data fetching by a factor of 4 compared to directly accessing local disks. The system also implements asynchronous disk access and texture loading, which makes it possible to overlap data loading, volume slicing and rendering for optimal volume roaming. PMID:17080865

  2. Shared virtual memory and generalized speedup

    NASA Technical Reports Server (NTRS)

    Sun, Xian-He; Zhu, Jianping

    1994-01-01

    Generalized speedup is defined as parallel speed over sequential speed. The generalized speedup and its relation with other existing performance metrics, such as traditional speedup, efficiency, scalability, etc., are carefully studied. In terms of the introduced asymptotic speed, it was shown that the difference between the generalized speedup and the traditional speedup lies in the definition of the efficiency of uniprocessor processing, which is a very important issue in shared virtual memory machines. A scientific application was implemented on a KSR-1 parallel computer. Experimental and theoretical results show that the generalized speedup is distinct from the traditional speedup and provides a more reasonable measurement. In the study of different speedups, various causes of superlinear speedup are also presented.
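
    The definition in this abstract can be made concrete with a short sketch. It assumes that "speed" means work divided by elapsed time; the function names and the numbers in the usage note are hypothetical.

```python
def speed(work, seconds):
    """Speed as work per unit time (assumed definition)."""
    return work / seconds


def generalized_speedup(par_work, par_time, seq_work, seq_time):
    """Parallel speed divided by sequential speed."""
    return speed(par_work, par_time) / speed(seq_work, seq_time)


def traditional_speedup(seq_time, par_time):
    """Sequential time divided by parallel time, for the same problem."""
    return seq_time / par_time
```

    When both runs do the same amount of work the two metrics agree: `generalized_speedup(100, 2, 100, 10)` and `traditional_speedup(10, 2)` both give 5.0. They differ when the parallel run solves a larger problem than fits comfortably on one processor, since the generalized form compares rates rather than times.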

  3. Final Report: Programming Models for Shared Memory Clusters

    SciTech Connect

    May, J.; de Supinski, B.; Pudliner, B.; Taylor, S.; Baden, S.

    2000-01-04

    Most large parallel computers now built use a hybrid architecture called a shared memory cluster. In this design, a computer consists of several nodes connected by an interconnection network. Each node contains a pool of memory and multiple processors that share direct access to it. Because shared memory clusters combine architectural features of shared memory computers and distributed memory computers, they support several different styles of parallel programming or programming models. (Further information on the design of these systems and their programming models appears in Section 2.) The purpose of this project was to investigate the programming models available on these systems and to answer three questions: (1) How easy to use are the different programming models in real applications? (2) How do the hardware and system software on different computers affect the performance of these programming models? (3) What are the performance characteristics of different programming models for typical LLNL applications on various shared memory clusters?

  4. Reducing communication costs in the conjugate gradient algorithm on distributed memory multiprocessors

    SciTech Connect

    D'Azevedo, E.F.; Romine, C.H.

    1992-09-01

    The standard formulation of the conjugate gradient algorithm involves two inner product computations. The results of these two inner products are needed to update the search direction and the computed solution. In a distributed memory parallel environment, the computation and subsequent distribution of these two values requires two separate communication and synchronization phases. In this paper, we present a mathematically equivalent rearrangement of the standard algorithm that reduces the number of communication phases. We give a second derivation of the modified conjugate gradient algorithm in terms of the natural relationship with the underlying Lanczos process. We also present empirical evidence of the stability of this modified algorithm.
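
    For reference, the standard formulation mentioned here can be sketched in pure Python, with the two inner products marked. This is the textbook algorithm, not the authors' rearrangement; the helper names `dot` and `matvec` are assumptions, and on a distributed memory machine each marked inner product would correspond to one global reduction.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive definite A (list of rows)."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0
    p = list(r)          # initial search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        pAp = dot(p, Ap)                 # inner product 1
        alpha = rr / pAp
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)               # inner product 2 (needs the updated r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x
```

    Because the second inner product depends on the residual updated with `alpha`, which in turn depends on the first, the two reductions cannot be merged in this form. That dependency is the source of the two separate communication phases described above.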

  5. Impact of Load Balancing on Unstructured Adaptive Grid Computations for Distributed-Memory Multiprocessors

    NASA Technical Reports Server (NTRS)

    Sohn, Andrew; Biswas, Rupak; Simon, Horst D.

    1996-01-01

    The computational requirements for an adaptive solution of unsteady problems change as the simulation progresses. This causes workload imbalance among processors on a parallel machine which, in turn, requires significant data movement at runtime. We present a new dynamic load-balancing framework, called JOVE, that balances the workload across all processors with a global view. Whenever the computational mesh is adapted, JOVE is activated to eliminate the load imbalance. JOVE has been implemented on an IBM SP2 distributed-memory machine in MPI for portability. Experimental results for two model meshes demonstrate that mesh adaption with load balancing gives more than a sixfold improvement over one without load balancing. We also show that JOVE gives a 24-fold speedup on 64 processors compared to sequential execution.

  6. A parallel numerical simulation for supersonic flows using zonal overlapped grids and local time steps for common and distributed memory multiprocessors

    SciTech Connect

    Patel, N.R.; Sturek, W.B.; Hiromoto, R.

    1989-01-01

    Parallel Navier-Stokes codes are developed to solve both two-dimensional and three-dimensional flow fields in and around ramjet and nose tip configurations. A multi-zone overlapped grid technique is used to extend an explicit finite-difference method to more complicated geometries. Parallel implementations are developed for execution on both distributed and common-memory multiprocessor architectures. For the steady-state solutions, the use of the local time-step method has the inherent advantage of reducing the communications overhead commonly incurred by parallel implementations. Computational results of the codes are given for a series of test problems. The parallel partitioning of computational zones is also discussed. 5 refs., 18 figs.

  7. Solution of the Euler and Navier-Stokes equations on MIMD distributed memory multiprocessors using cyclic reduction

    SciTech Connect

    Curchitser, E.N.; Pelz, R.B.; Marconi, F. (Grumman Aerospace Corp., Bethpage, NY)

    1992-01-01

    The Euler and Navier-Stokes equations are solved for the steady, two-dimensional flow over a NACA 0012 airfoil using a 1024 node nCUBE/2 multiprocessor. Second-order, upwind-discretized difference equations are solved implicitly using ADI factorization. Parallel cyclic reduction is employed to solve the block tridiagonal systems. For realistic problems, communication times are negligible compared to calculation times. The processors are tightly synchronized, and their loads are well balanced. When the flux Jacobians are frozen, the wall-clock time for one implicit timestep is about equal to that of a multistage explicit scheme. 10 refs.

  8. The hierarchical spatial decomposition of three-dimensional particle-in-cell plasma simulations on MIMD distributed memory multiprocessors

    SciTech Connect

    Walker, D.W.

    1992-07-01

    The hierarchical spatial decomposition method is a promising approach to decomposing the particles and computational grid in parallel particle-in-cell application codes, since it is able to maintain approximate dynamic load balance while keeping communication costs low. In this paper we investigate issues in implementing a hierarchical spatial decomposition on a hypercube multiprocessor. Particular attention is focused on the communication needed to update guard ring data, and on the load balancing method. The hierarchical approach is compared with other dynamic load balancing schemes.

  9. Supporting shared data structures on distributed memory architectures

    NASA Technical Reports Server (NTRS)

    Koelbel, Charles; Mehrotra, Piyush; Vanrosendale, John

    1990-01-01

    Programming nonshared memory systems is more difficult than programming shared memory systems, since there is no support for shared data structures. Current programming languages for distributed memory architectures force the user to decompose all data structures into separate pieces, with each piece owned by one of the processors in the machine, and with all communication explicitly specified by low-level message-passing primitives. A new programming environment is presented for distributed memory architectures, providing a global name space and allowing direct access to remote parts of data values. The analysis and program transformations required to implement this environment are described, and the efficiency of the resulting code on the NCUBE/7 and IPSC/2 hypercubes is reported.

  10. A comparison of distributed memory and virtual shared memory parallel programming models

    SciTech Connect

    Keane, J.A.; Grant, A.J.; Xu, M.Q.

    1993-04-01

    The virtues of the different parallel programming models, shared memory and distributed memory, have been much debated. Conventionally the debate could be reduced to programming convenience on the one hand, and high scalability factors on the other. More recently the debate has become somewhat blurred with the provision of virtual shared memory models built on machines with physically distributed memory. The intention of such models/machines is to provide scalable shared memory, i.e. to provide both programmer convenience and high scalability. In this paper, the different models are considered from experiences gained with a number of systems ranging from applications in both commerce and science to languages and operating systems. Case studies are introduced as appropriate.

  11. Parallel implementation and evaluation of motion estimation system algorithms on a distributed memory multiprocessor using knowledge based mappings

    NASA Technical Reports Server (NTRS)

    Choudhary, Alok Nidhi; Leung, Mun K.; Huang, Thomas S.; Patel, Janak H.

    1989-01-01

    Several techniques for static and dynamic load balancing in vision systems are presented. These techniques are novel in the sense that they capture the computational requirements of a task by examining the data when it is produced. Furthermore, they can be applied to many vision systems because many algorithms in different systems are either the same, or have similar computational characteristics. These techniques are evaluated by applying them on a parallel implementation of the algorithms in a motion estimation system on a hypercube multiprocessor system. The motion estimation system consists of the following steps: (1) extraction of features; (2) stereo match of images in one time instant; (3) time match of images from different time instants; (4) stereo match to compute final unambiguous points; and (5) computation of motion parameters. It is shown that the performance gains when these data decomposition and load balancing techniques are used are significant and the overhead of using these techniques is minimal.

  12. Efficient multiprocessor architecture for digital signal processing

    SciTech Connect

    Auguin, M.; Boeri, F.

    1982-01-01

    There is continuing pressure for better processing performance in numerical signal processing. Effective utilization of LSI semiconductor technology allows the consideration of multiprocessor architectures. The problem of interconnecting the components of the architecture arises. The authors describe a control algorithm of the Benes interconnection network in an asynchronous multiprocessor system. A simulation study of the time-shared bus, of the omega network, of the Benes network and of the crossbar network gives a comparison of performances. 8 references.

  13. Distributed simulation using a real-time shared memory network

    NASA Technical Reports Server (NTRS)

    Simon, Donald L.; Mattern, Duane L.; Wong, Edmond; Musgrave, Jeffrey L.

    1993-01-01

    The Advanced Control Technology Branch of the NASA Lewis Research Center performs research in the area of advanced digital controls for aeronautic and space propulsion systems. This work requires the real-time implementation of both control software and complex dynamical models of the propulsion system. We are implementing these systems in a distributed, multi-vendor computer environment. Therefore, a need exists for real-time communication and synchronization between the distributed multi-vendor computers. A shared memory network is a potential solution which offers several advantages over other real-time communication approaches. A candidate shared memory network was tested for basic performance. The shared memory network was then used to implement a distributed simulation of a ramjet engine. The accuracy and execution time of the distributed simulation were measured and compared to the performance of the non-partitioned simulation. The ease of partitioning the simulation, the minimal time required to develop communication between the processors, and the resulting execution time all indicate that the shared memory network is a real-time communication technique worthy of serious consideration.

  14. Performance Analysis of Multilevel Parallel Applications on Shared Memory Architectures

    NASA Technical Reports Server (NTRS)

    Biegel, Bryan A. (Technical Monitor); Jost, G.; Jin, H.; Labarta, J.; Gimenez, J.; Caubet, J.

    2003-01-01

    Parallel programming paradigms include process level parallelism, thread level parallelization, and multilevel parallelism. This viewgraph presentation describes a detailed performance analysis of these paradigms for Shared Memory Architecture (SMA). This analysis uses the Paraver Performance Analysis System. The presentation includes diagrams of a flow of useful computations.

  15. Analysis of fork-join program response times on multiprocessors

    SciTech Connect

    Towsley, D.; Stankovic, J.A. (Dept. of Computer and Information Science); Rommel, C.G.

    1990-07-01

    In this paper, the authors develop analytic models for a shared memory multiprocessor that executes fork-join parallel programs. Here a fork-join program is one that consists of a set of n ≥ 1 parallel tasks. All of the tasks of a program arrive at the system simultaneously and the job is assumed to complete when the last task completes. They develop and analyze models for two processor sharing policies, called task scheduling processor sharing and job scheduling processor sharing. The first policy schedules tasks independently of each other and allows parallel execution of an individual program, whereas the second policy schedules each job as a unit and thereby does not allow parallel execution of an individual program.
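
    The difference between the two policies can be illustrated for the simplest possible case: a single job in an otherwise idle system with at least as many processors as tasks. This is a deliberately reduced sketch, not the authors' analytic model; the function names are hypothetical and contention between jobs is ignored.

```python
def response_task_scheduling(task_times):
    """Tasks run in parallel; the job ends when the last task ends."""
    return max(task_times)


def response_job_scheduling(task_times):
    """The job runs as a unit on one processor; tasks run back to back."""
    return sum(task_times)
```

    For task service times of 3, 1, and 2 time units, the first policy gives a response time of 3 and the second gives 6. Under load the comparison is less one-sided, which is what the queueing models in the paper address.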

  16. Performance Evaluation of Remote Memory Access (RMA) Programming on Shared Memory Parallel Computers

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Jost, Gabriele; Biegel, Bryan A. (Technical Monitor)

    2002-01-01

    The purpose of this study is to evaluate the feasibility of remote memory access (RMA) programming on shared memory parallel computers. We discuss different RMA based implementations of selected CFD application benchmark kernels and compare them to corresponding message passing based codes. For the message-passing implementation we use MPI point-to-point and global communication routines. For the RMA based approach we consider two different libraries supporting this programming model. One is a shared memory parallelization library (SMPlib) developed at NASA Ames, the other is the MPI-2 extensions to the MPI Standard. We give timing comparisons for the different implementation strategies and discuss the performance.

  17. Parallel processing and medium-scale multiprocessors

    SciTech Connect

    Wouk, A.

    1989-01-01

    For some time, the community interested in large-scale scientific computing has been attempting to come to terms with parallel computation using a number of processors sufficient to make their concurrent utilization interesting, challenging, and, in the long run, beneficial. Unexpected consequences of parallelization have been discovered. It is possible to obtain reduced performance, both relative and absolute, from an increased number of processors, as a result of inappropriate use of resources in a multiprocessor environment. This exemplifies one of the paradoxes which result from our cultural bias towards sequential thought processes. As a consequence there is a bias for sequential styles of program development in a multiprocessor environment. The authors have learned that the problem of automatic optimization in compilation of parallel programs is computationally hard. Early hopes that automatic, optimal parallelization of sequentially conceived programs would be as achievable as earlier automatic vectorization had been, have been dashed. The authors lack the insights and folklore which are needed to develop useful methodologies and heuristics in the area of parallel computation. The authors are embarked on a voyage of exploration of this new territory, and the work described in this volume can provide helpful guidance. The authors have to explore fully the differences between distributed memory systems, shared memory systems, and combinations, as well as the relative applicability of SIMD and MIMD architectures. Based on the information obtained in such exploration, useful steps towards efficient utilization of many processors should become possible. This paper covers several areas: systems programming, parallel/language/programming systems, and applications programming.

  18. Analysis and comparison of cache coherence protocols for a packet-switched multiprocessor

    SciTech Connect

    Yang, Q.; Bhuyan, L.N.; Liu, B.C.

    1989-08-01

    The use of private caches in a multiprocessor system causes inconsistency of the shared data among the caches and among caches and the main memory. A large number of protocols have been proposed to solve this coherence problem. In this paper, the authors develop analytical models for seven existing cache protocols, namely: Write-Once, Write-Through, Synapse, Berkeley, Illinois, Firefly, and Dragon. The protocols are implemented on a multiprocessor with a packet-switched shared bus. The models are based on queueing networks that consist of both open and closed classes of customers. The models incorporate the requests for invalidation signals, write-through, and write-back operations and the solution is based on the mean value analysis (MVA) algorithm. Performance comparison among these protocols under various system parameters is carried out based on our models.

  19. Multi-processor including data flow accelerator module

    DOEpatents

    Davidson, George S.; Pierce, Paul E.

    1990-01-01

    An accelerator module for a data flow computer includes an intelligent memory. The module is added to a multiprocessor arrangement and uses a shared tagged memory architecture in the data flow computer. The intelligent memory module assigns locations for holding data values in correspondence with arcs leading to a node in a data dependency graph. Each primitive computation is associated with a corresponding memory cell, including a number of slots for operands needed to execute a primitive computation, a primitive identifying pointer, and linking slots for distributing the result of the cell computation to other cells requiring that result as an operand. Circuitry is provided for utilizing tag bits to determine automatically when all operands required by a processor are available and for scheduling the primitive for execution in a queue. Each memory cell of the module may be associated with any of the primitives, and the particular primitive to be executed by the processor associated with the cell is identified by providing an index, such as the cell number for the primitive, to the primitive lookup table of starting addresses. The module thus serves to perform functions previously performed by a number of sections of data flow architectures and coexists with conventional shared memory therein. A multiprocessing system including the module operates in a hybrid mode, wherein the same processing modules are used to perform some processing in a sequential mode, under immediate control of an operating system, while performing other processing in a data flow mode.

  20. Rollback Hardware For Time Warp Multiprocessor Systems

    NASA Technical Reports Server (NTRS)

    Robb, Michael J.; Buzzell, Calvin A.

    1996-01-01

    Rollback Chip (RBC) module is computer circuit board containing special-purpose memory circuits for use in multiprocessor computer system. Designed to help realize speedup potential of parallel processing for simulation of discrete events by use of Time Warp operating system.

  1. Neural networks and MIMD-multiprocessors

    NASA Technical Reports Server (NTRS)

    Vanhala, Jukka; Kaski, Kimmo

    1990-01-01

    Two artificial neural network models are compared. They are the Hopfield Neural Network Model and the Sparse Distributed Memory model. Distributed algorithms for both of them are designed and implemented. The run time characteristics of the algorithms are analyzed theoretically and tested in practice. The storage capacities of the networks are compared. Implementations are done using a distributed multiprocessor system.

  2. Visual Tutoring System for Programming Multiprocessor Networks.

    ERIC Educational Resources Information Center

    Trichina, Elena

    1996-01-01

    Describes a visual tutoring system for programming distributive-memory multiprocessor networks. Highlights include difficulties of parallel programming, and three instructional modes in the system, including a hypertext-like lecture, a question-answer mode, and an expert aid mode. (Author/LRW)

  3. Experimenting With Multiprocessor Simulator Concepts

    NASA Technical Reports Server (NTRS)

    Blech, Richard A.; Williams, Anthony D.

    1989-01-01

    Multiple microcomputer system used to investigate application of parallel processing to real-time simulation. With dual-base architecture, each microcomputer communicates with corresponding microcomputer on opposite bus through dual-port interface memory. Transfers of data to and from front-end processor occur on interactive information bus. Transfers of data related to simulation calculations occur on real-time-information bus. System, called the real-time multiprocessor simulator (RTMPS), is tool for developing low-cost, portable, user-friendly simulators.

  4. Performance Analysis of Multilevel Parallel Applications on Shared Memory Architectures

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Jin, Haoqiang; Labarta, Jesus; Gimenez, Judit; Caubet, Jordi; Biegel, Bryan A. (Technical Monitor)

    2002-01-01

    In this paper we describe how to apply powerful performance analysis techniques to understand the behavior of multilevel parallel applications. We use the Paraver/OMPItrace performance analysis system for our study. This system consists of two major components: The OMPItrace dynamic instrumentation mechanism, which allows the tracing of processes and threads and the Paraver graphical user interface for inspection and analyses of the generated traces. We describe how to use the system to conduct a detailed comparative study of a benchmark code implemented in five different programming paradigms applicable for shared memory

  5. A garbage collection algorithm for shared memory parallel processors

    SciTech Connect

    Crammond, J.

    1988-12-01

    This paper describes a technique for adapting the Morris sliding garbage collection algorithm to execute on parallel machines with shared memory. The algorithm is described within the framework of an implementation of the parallel logic language Parlog. However, the algorithm is a general one and can easily be adapted to parallel Prolog systems and to other languages. The performance of the algorithm executing a few simple Parlog benchmarks is analyzed. Finally, it is shown how the technique for parallelizing the sequential algorithm can be adapted for a semi-space copying algorithm.

  6. High Performance, Dependable Multiprocessor

    NASA Technical Reports Server (NTRS)

    Ramos, Jeremy; Samson, John R.; Troxel, Ian; Subramaniyan, Rajagopal; Jacobs, Adam; Greco, James; Cieslewski, Grzegorz; Curreri, John; Fischer, Michael; Grobelny, Eric; George, Alan; Aggarwal, Vikas; Patel, Minesh; Some, Raphael

    2006-01-01

    With the ever increasing demand for higher bandwidth and processing capacity of today's space exploration, space science, and defense missions, the ability to efficiently apply commercial-off-the-shelf (COTS) processors for on-board computing is now a critical need. In response to this need, NASA's New Millennium Program office has commissioned the development of Dependable Multiprocessor (DM) technology for use in payload and robotic missions. The Dependable Multiprocessor technology is a COTS-based, power efficient, high performance, highly dependable, fault tolerant cluster computer. To date, Honeywell has successfully demonstrated a TRL4 prototype of the Dependable Multiprocessor [1], and is now working on the development of a TRL5 prototype. For the present effort Honeywell has teamed up with the University of Florida's High-performance Computing and Simulation (HCS) Lab, and together the team has demonstrated major elements of the Dependable Multiprocessor TRL5 system.

  7. Multiprocessors and runtime compilation

    NASA Technical Reports Server (NTRS)

    Saltz, Joel; Berryman, Harry; Wu, Janet

    1990-01-01

    Runtime preprocessing plays a major role in many efficient algorithms in computer science, as well as playing an important role in exploiting multiprocessor architectures. Examples are given that elucidate the importance of runtime preprocessing and show how these optimizations can be integrated into compilers. To support the arguments, transformations implemented in prototype multiprocessor compilers are described and benchmarks from the iPSC2/860, the CM-2, and the Encore Multimax/320 are presented.

  8. Parallel discrete event simulation: A shared memory approach

    NASA Technical Reports Server (NTRS)

    Reed, Daniel A.; Malony, Allen D.; Mccredie, Bradley D.

    1987-01-01

    With traditional event list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. Parallel simulation mimics the interacting servers and queues of a real system by assigning each simulated entity to a processor. By eliminating the event list and maintaining only sufficient synchronization to ensure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared memory experiments is presented using the Chandy-Misra distributed simulation algorithm to simulate networks of queues. Parameters include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models.
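
    The traditional event list technique that serves as the sequential baseline here can be sketched for a single FIFO server. This is not the Chandy-Misra algorithm; the function `simulate`, the fixed service time, and the tie-breaking rule (arrivals before departures at equal times) are assumptions made for illustration.

```python
import heapq

ARRIVAL, DEPARTURE = 0, 1


def simulate(arrivals, service_time):
    """Return each customer's departure time from one FIFO server."""
    # The event list: a priority queue ordered by (time, kind, customer).
    events = [(t, ARRIVAL, i) for i, t in enumerate(arrivals)]
    heapq.heapify(events)
    waiting = []      # customers queued behind the one in service
    busy = False
    departures = {}
    while events:
        time, kind, cust = heapq.heappop(events)  # always the earliest event
        if kind == ARRIVAL:
            if busy:
                waiting.append(cust)
            else:
                busy = True
                heapq.heappush(events, (time + service_time, DEPARTURE, cust))
        else:
            departures[cust] = time
            if waiting:
                nxt = waiting.pop(0)
                heapq.heappush(events, (time + service_time, DEPARTURE, nxt))
            else:
                busy = False
    return [departures[i] for i in range(len(arrivals))]
```

    Every event passes through the single global queue in timestamp order, which guarantees causality but serializes the whole simulation. Parallel schemes such as the one studied in this record remove that queue and must enforce ordering by other means.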

  9. Ensuring correct rollback recovery in distributed shared memory systems

    NASA Technical Reports Server (NTRS)

    Janssens, Bob; Fuchs, W. Kent

    1995-01-01

    Distributed shared memory (DSM) implemented on a cluster of workstations is an increasingly attractive platform for executing parallel scientific applications. Checkpointing and rollback techniques can be used in such a system to allow the computation to progress in spite of the temporary failure of one or more processing nodes. This paper presents the design of an independent checkpointing method for DSM that takes advantage of DSM's specific properties to reduce error-free and rollback overhead. The scheme reduces the dependencies that need to be considered for correct rollback to those resulting from transfers of pages. Furthermore, in-transit messages can be recovered without the use of logging. We extend the scheme to a DSM implementation using lazy release consistency, where the frequency of dependencies is further reduced.

  10. Reducing Interprocessor Dependence in Recoverable Distributed Shared Memory

    NASA Technical Reports Server (NTRS)

    Janssens, Bob; Fuchs, W. Kent

    1994-01-01

    Checkpointing techniques in parallel systems use dependency tracking and/or message logging to ensure that a system rolls back to a consistent state. Traditional dependency tracking in distributed shared memory (DSM) systems is expensive because of high communication frequency. In this paper we show that, if designed correctly, a DSM system only needs to consider dependencies due to the transfer of blocks of data, resulting in reduced dependency tracking overhead and reduced potential for rollback propagation. We develop an ownership timestamp scheme to tolerate the loss of block state information and develop a passive server model of execution where interactions between processors are considered atomic. With our scheme, dependencies are significantly reduced compared to the traditional message-passing model.

  11. A Parallel Saturation Algorithm on Shared Memory Architectures

    NASA Technical Reports Server (NTRS)

    Ezekiel, Jonathan; Siminiceanu

    2007-01-01

    Symbolic state-space generators are notoriously hard to parallelize. However, the Saturation algorithm implemented in the SMART verification tool differs from other sequential symbolic state-space generators in that it exploits the locality of firing events in asynchronous system models. This paper explores whether event locality can be utilized to efficiently parallelize Saturation on shared-memory architectures. Conceptually, we propose to parallelize the firing of events within a decision diagram node, which is technically realized via a thread pool. We discuss the challenges involved in our parallel design and conduct experimental studies on its prototypical implementation. On a dual-processor dual core PC, our studies show speed-ups for several example models, e.g., of up to 50% for a Kanban model, when compared to running our algorithm only on a single core.

  12. Translation techniques for distributed-shared memory programming models

    SciTech Connect

    Fuller, Douglas James

    2005-08-01

    The high performance computing community has experienced an explosive improvement in distributed-shared memory hardware. Driven by increasing real-world problem complexity, this explosion has ushered in vast numbers of new systems. Each new system presents new challenges to programmers and application developers. Part of the challenge is adapting to new architectures with new performance characteristics. Different vendors release systems with widely varying architectures that perform differently in different situations. Furthermore, since vendors need only provide a single performance number (total MFLOPS, typically for a single benchmark), they only have strong incentive initially to optimize the API of their choice. Consequently, only a fraction of the available APIs are well optimized on most systems. This causes issues porting and writing maintainable software, let alone issues for programmers burdened with mastering each new API as it is released. Also, programmers wishing to use a certain machine must choose their API based on the underlying hardware instead of the application. This thesis argues that a flexible, extensible translator for distributed-shared memory APIs can help address some of these issues. For example, a translator might take as input code in one API and output an equivalent program in another. Such a translator could provide instant porting for applications to new systems that do not support the application's library or language natively. While open-source APIs are abundant, they do not perform optimally everywhere. A translator would also allow performance testing using a single base code translated to a number of different APIs. Most significantly, this type of translator frees programmers to select the most appropriate API for a given application based on the application (and developer) itself instead of the underlying hardware.

  13. A Comparison of Shared Memory Parallel Programming Models

    SciTech Connect

    Mogill, Jace A; Haglin, David J

    2010-05-24

    The dominant parallel programming models for shared memory computers, Pthreads and OpenMP, are both thread-centric in that they are based on explicit management of tasks and enforce data dependencies and output ordering through task management. By comparison, the Cray XMT programming model is data-centric where the primary concern of the programmer is managing data dependencies, allowing threads to progress in a data flow fashion. The XMT implements this programming model by associating tag bits with each word of memory, affording efficient fine grained data synchronization independent of the number of processors or how tasks are scheduled. When task management is implicit and synchronization is abundant, efficient, and easy to use, programmers have viable alternatives to traditional thread-centric algorithms. In this paper we compare the amount of available parallelism relative to the amount of work in a variety of different algorithms and data structures when synchronization does not need to be rationed, as well as identify opportunities for platform and performance portability of the data-centric programming model on multi-core processors.
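
    The tag-bit synchronization described above can be illustrated with a small software emulation. This is only a sketch: the class name `FullEmptyWord` and the methods `write_ef` and `read_fe` are hypothetical names chosen here, and a Python condition variable stands in for what the abstract describes as per-word hardware tag bits.

```python
import threading


class FullEmptyWord:
    """One memory word guarded by a full/empty tag (software emulation)."""

    def __init__(self):
        self._value = None
        self._full = False
        self._cond = threading.Condition()

    def write_ef(self, value):
        """Wait until the word is empty, store the value, mark it full."""
        with self._cond:
            self._cond.wait_for(lambda: not self._full)
            self._value = value
            self._full = True
            self._cond.notify_all()

    def read_fe(self):
        """Wait until the word is full, take the value, mark it empty."""
        with self._cond:
            self._cond.wait_for(lambda: self._full)
            value = self._value
            self._full = False
            self._cond.notify_all()
            return value


def run_pipeline(items):
    """Pass items from a producer thread to a consumer through one word."""
    word = FullEmptyWord()
    received = []

    def producer():
        for item in items:
            word.write_ef(item)

    def consumer():
        for _ in items:
            received.append(word.read_fe())

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

    Neither thread manages the other explicitly; the ordering follows from the state of the data word alone, which is the data-centric style the abstract contrasts with Pthreads and OpenMP.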

  14. Coupled cluster algorithms for networks of shared memory parallel processors

    NASA Astrophysics Data System (ADS)

    Bentz, Jonathan L.; Olson, Ryan M.; Gordon, Mark S.; Schmidt, Michael W.; Kendall, Ricky A.

    2007-05-01

    As the popularity of using SMP systems as the building blocks for high performance supercomputers increases, so too increases the need for applications that can utilize the multiple levels of parallelism available in clusters of SMPs. This paper presents a dual-layer distributed algorithm, using both shared-memory and distributed-memory techniques to parallelize a very important algorithm (often called the "gold standard") used in computational chemistry, the single and double excitation coupled cluster method with perturbative triples, i.e. CCSD(T). The algorithm is presented within the framework of the GAMESS (General Atomic and Molecular Electronic Structure System) program suite [M.W. Schmidt, K.K. Baldridge, J.A. Boatz, S.T. Elbert, M.S. Gordon, J.J. Jensen, S. Koseki, N. Matsunaga, K.A. Nguyen, S. Su, T.L. Windus, M. Dupuis, J.A. Montgomery, General atomic and molecular electronic structure system, J. Comput. Chem. 14 (1993) 1347-1363] and the Distributed Data Interface (DDI) [M.W. Schmidt, G.D. Fletcher, B.M. Bode, M.S. Gordon, The distributed data interface in GAMESS, Comput. Phys. Comm. 128 (2000) 190]; however, the essential features of the algorithm (data distribution, load-balancing and communication overhead) can be applied to more general computational problems. Timing and performance data for our dual-level algorithm are presented on several large-scale clusters of SMPs.

  15. Efficient partitioning and assignment on programs for multiprocessor execution

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1993-01-01

    The general problem studied is that of segmenting or partitioning programs for distribution across a multiprocessor system. Efficient partitioning and the assignment of program elements are of great importance since the time consumed in this overhead activity may easily dominate the computation, effectively eliminating any gains made by the use of the parallelism. In this study, the partitioning of sequentially structured programs (written in FORTRAN) is evaluated. Heuristics developed for similar applications are examined. Finally, a model for queueing networks with finite queues is developed which may be used to analyze multiprocessor system architectures with a shared memory approach to the problem of partitioning. The properties of sequentially written programs form obstacles to large scale (at the procedure or subroutine level) parallelization. Data dependencies of even the minutest nature, reflecting the sequential development of the program, severely limit parallelism. The design of heuristic algorithms is tied to the experience gained in the parallel splitting. Parallelism obtained through the physical separation of data has seen some success, especially at the data element level. Data parallelism on a grander scale requires models that accurately reflect the effects of blocking caused by finite queues. A model for the approximation of the performance of finite queueing networks is developed. This model makes use of the decomposition approach combined with the efficiency of product form solutions.

  16. Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments

    SciTech Connect

    Jin, Shuangshuang; Chen, Yousu; Wu, Di; Diao, Ruisheng; Huang, Zhenyu

    2015-12-09

    Power system dynamic simulation computes the system response to a sequence of large disturbances, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operations. It involves solving a large set of differential and algebraic equations, which is computationally intensive and challenging with single-processor dynamic simulation. High-performance computing (HPC) based parallel computing is a very promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation: one using Open Multi-Processing (OpenMP) on a shared-memory platform, and one using the Message Passing Interface (MPI) on distributed-memory clusters. The differences between the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performance in running parallel dynamic simulation is compared and demonstrated.

  17. Shared Memory Parallelism for 3D Cartesian Discrete Ordinates Solver

    NASA Astrophysics Data System (ADS)

    Moustafa, Salli; Dutka-Malen, Ivan; Plagne, Laurent; Ponçot, Angélique; Ramet, Pierre

    2014-06-01

    This paper describes the design and the performance of DOMINO, a 3D Cartesian SN solver that implements two nested levels of parallelism (multicore+SIMD) on shared memory computation nodes. DOMINO is written in C++, a multi-paradigm programming language that enables the use of powerful and generic parallel programming tools such as Intel TBB and Eigen. These two libraries allow us to combine multi-thread parallelism with vector operations in an efficient and yet portable way. As a result, DOMINO can exploit the full power of modern multi-core processors and is able to tackle very large simulations, that usually require large HPC clusters, using a single computing node. For example, DOMINO solves a 3D full core PWR eigenvalue problem involving 26 energy groups, 288 angular directions (S16), 46 × 10^6 spatial cells and 1 × 10^12 DoFs within 11 hours on a single 32-core SMP node. This represents a sustained performance of 235 GFlops and 40.74% of the SMP node peak performance for the DOMINO sweep implementation. The very high Flops/Watt ratio of DOMINO makes it a very interesting building block for a future many-nodes nuclear simulation tool.
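
    As a quick consistency check on the figures quoted above, and reading the efficiency in the abstract as 40.74%, the implied node peak can be recovered from the sustained rate. The per-core figure is derived here and is not stated in the abstract.

```latex
\[
P_{\text{peak}} \approx \frac{235\ \text{GFlops}}{0.4074} \approx 576.8\ \text{GFlops},
\qquad
\frac{P_{\text{peak}}}{32\ \text{cores}} \approx 18.0\ \text{GFlops per core}.
\]
```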

  18. Low latency memory access and synchronization

    DOEpatents

    Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Heidelberger, Philip; Hoenicke, Dirk; Ohmacht, Martin; Steinmacher-Burow, Burkhard D.; Takken, Todd E.; Vranas, Pavlos M.

    2007-02-06

    A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers, rather than some other predictive algorithm, determine which memory line to prefetch. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
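
    The pointer-directed prefetching idea can be sketched as a toy simulation. Several things here are assumptions for illustration and are not taken from the patent text: the function name `prefetch_hits`, the single-entry prefetch buffer, and the rule that each line's pointer is set to whichever line was accessed after it last time.

```python
def prefetch_hits(pattern, repeats):
    """Count prefetch hits per pass over a repeated access pattern.

    Each memory line carries one pointer naming the line to prefetch
    when it is accessed. Assumption: the pointer is updated to the
    line that followed it on the most recent access.
    """
    next_ptr = {}      # line number -> line number to prefetch
    prefetched = None  # single prefetch buffer (assumed size of one line)
    prev = None
    hits_per_pass = []
    for _ in range(repeats):
        hits = 0
        for line in pattern:
            if line == prefetched:
                hits += 1                  # the pointer predicted this access
            if prev is not None:
                next_ptr[prev] = line      # record the observed successor
            prefetched = next_ptr.get(line)  # follow this line's pointer
            prev = line
        hits_per_pass.append(hits)
    return hits_per_pass


# Non-contiguous but repetitive pattern: no hits on the first pass,
# then hits once the pointers have been set.
print(prefetch_hits([7, 42, 3, 19], 3))  # [0, 3, 4]
```

    A stride-based predictor would not help with this pattern, because the line numbers have no regular spacing; the per-line pointer only needs the pattern to repeat.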

  19. Low latency memory access and synchronization

    DOEpatents

    Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Heidelberger, Philip; Hoenicke, Dirk; Ohmacht, Martin; Steinmacher-Burow, Burkhard D.; Takken, Todd E.; Vranas, Pavlos M.

    2010-10-19

    A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers, rather than some other predictive algorithm, determine which memory line to prefetch. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.

  20. Compiled execution of the reduce-or process model on multiprocessors

    SciTech Connect

    Ramkumar, B.; Kale, L.V.

    1989-01-01

    In this paper, the authors describe the abstract machine developed for the reduce-or process model (ROPM) and its implementation on a variety of multiprocessors. In keeping with the objective behind the ROPM, the abstract machine is suitable for execution on both shared and nonshared memory machines. It uses structure sharing, unlike most of the abstract machines based on the WAM. This is due to significant benefits in a nonshared memory context. It has currently been implemented on the Encore Multimax, the Sequent Symmetry, the Alliant, and the Intel iPSC/2. The authors provide preliminary performance data of their implementations on these machines in this paper. The benchmarks chosen illustrate the range of programs which ROPM can parallelize: and, or, as well as and/or parallel programs are effectively parallelized and sped up.

  1. Multiprocessor programming environment

    SciTech Connect

    Smith, M.B.; Fornaro, R.

    1988-12-01

    Programming tools and techniques have been well developed for traditional uniprocessor computer systems. The focus of this research project is on the development of a programming environment for a high speed real time heterogeneous multiprocessor system, with special emphasis on languages and compilers. The new tools and techniques will allow a smooth transition for programmers with experience only on single processor systems.

  2. MPI Support for Multi-core Architectures: Optimized Shared Memory Collectives

    SciTech Connect

    Graham, Richard L; Shipman, Galen M

    2008-01-01

    With local core counts on the rise, taking advantage of shared memory to optimize collective operations can improve performance. We study several on-host shared memory optimized algorithms for MPI_Bcast, MPI_Reduce, and MPI_Allreduce, using tree-based and reduce-scatter algorithms. For small data operations with relatively large synchronization costs, fan-in/fan-out algorithms generally perform best. For large messages, data manipulation constitutes the largest cost and reduce-scatter algorithms are best for reductions. These optimizations improve performance by up to a factor of three. Memory and cache sharing effects require deliberate process layout and careful radix selection for tree-based methods.
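
    The reduce-scatter approach to reductions can be sketched in a few lines. This is a single-process illustration under stated assumptions, not the authors' implementation: plain Python lists stand in for per-rank buffers in shared memory, and the function names `reduce_scatter` and `allreduce_sum` are hypothetical.

```python
def reduce_scatter(buffers):
    """Each rank reduces one chunk of the vector across all ranks.

    buffers[r] is rank r's input vector; all vectors have equal length.
    Returns owned[r], the fully reduced chunk that rank r is responsible for.
    """
    nranks = len(buffers)
    n = len(buffers[0])
    chunk = (n + nranks - 1) // nranks  # ceiling division
    owned = []
    for r in range(nranks):
        lo, hi = r * chunk, min((r + 1) * chunk, n)
        # Rank r sums only its own slice, so the arithmetic is split evenly.
        owned.append([sum(buf[i] for buf in buffers) for i in range(lo, hi)])
    return owned


def allreduce_sum(buffers):
    """Reduce-scatter followed by an all-gather of the reduced chunks."""
    owned = reduce_scatter(buffers)
    full = [x for part in owned for x in part]
    return [list(full) for _ in buffers]  # every rank receives the full result
```

    Because each rank touches only its own slice during the reduction, the data manipulation cost that dominates for large messages is divided among the ranks rather than concentrated at a tree root.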

  3. Distributed parallel messaging for multiprocessor systems

    DOEpatents

    Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M; Steinmacher-Burow, Burkhard; Sugawara, Yutaka

    2013-06-04

    A method and apparatus for distributed parallel messaging in a parallel computing system. The apparatus includes, at each node of a multiprocessor network, multiple injection messaging engine units and reception messaging engine units, each implementing a DMA engine and each supporting both multiple packet injection into and multiple reception from a network, in parallel. The reception side of the messaging unit (MU) includes a switch interface enabling writing of data of a packet received from the network to the memory system. The transmission side of the messaging unit includes a switch interface for reading from the memory system when injecting packets into the network.

  4. Solutions and debugging for data consistency in multiprocessors with noncoherent caches

    SciTech Connect

    Bernstein, D.; Mendelson, B.; Breternitz, M. Jr.; Gheith, A.M.

    1995-02-01

    We analyze two important problems that arise in shared-memory multiprocessor systems. The stale data problem involves ensuring that data items in local memory of individual processors are current, independent of writes done by other processors. False sharing occurs when two processors have copies of the same shared data block but update different portions of the block. The false sharing problem involves guaranteeing that subsequent writes are properly combined. In modern architectures these problems are usually solved in hardware, by exploiting mechanisms for hardware controlled cache consistency. This leads to more expensive and nonscalable designs. Therefore, we are concentrating on software methods for ensuring cache consistency that would allow for affordable and scalable multiprocessing systems. Unfortunately, providing software control is nontrivial, both for the compiler writer and for the application programmer. For this reason we are developing a debugging environment that will facilitate the development of compiler-based techniques and will help the programmer to tune his or her application using explicit cache management mechanisms. We extend the notion of a race condition for IBM Shared Memory System POWER/4, taking into consideration its noncoherent caches, and propose techniques for detection of false sharing problems. Identification of the stale data problem is discussed as well, and solutions are suggested.
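
    The false sharing problem described above can be made concrete with a toy model of two private copies of one block. The per-word dirty mask used for merging is an assumption chosen for illustration, as one possible software remedy; the abstract does not say which mechanism the authors use, and all function names are hypothetical.

```python
def write_back_whole(memory, cached):
    """Naive write-back: the whole cached block overwrites memory."""
    return list(cached)


def write_back_masked(memory, cached, dirty):
    """Merging write-back: only words this processor modified reach memory."""
    return [c if d else m for m, c, d in zip(memory, cached, dirty)]


def demo():
    block = [0, 0, 0, 0]
    # Both processors load private copies of the same block.
    copy_a, copy_b = list(block), list(block)
    # Processor A updates word 0; processor B updates word 3.
    copy_a[0] = 11
    dirty_a = [True, False, False, False]
    copy_b[3] = 44
    dirty_b = [False, False, False, True]
    # Without coherence, B's whole-block write-back erases A's update.
    naive = write_back_whole(write_back_whole(block, copy_a), copy_b)
    # With dirty masks, both updates are combined.
    merged = write_back_masked(
        write_back_masked(block, copy_a, dirty_a), copy_b, dirty_b)
    return naive, merged


print(demo())  # ([0, 0, 0, 44], [11, 0, 0, 44])
```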

  5. Considerations for Multiprocessor Topologies

    NASA Technical Reports Server (NTRS)

    Byrd, Gregory T.; Delagi, Bruce A.

    1987-01-01

    Choosing a multiprocessor interconnection topology may depend on high-level considerations, such as the intended application domain and the expected number of processors. It certainly depends on low-level implementation details, such as packaging and communications protocols. The authors first use rough measures of cost and performance to characterize several topologies. They then examine how implementation details can affect the realizable performance of a topology.

  6. Implementation of a parallel unstructured Euler solver on shared and distributed memory architectures

    NASA Technical Reports Server (NTRS)

    Mavriplis, D. J.; Das, Raja; Saltz, Joel; Vermeland, R. E.

    1992-01-01

    An efficient three dimensional unstructured Euler solver is parallelized on a Cray Y-MP C90 shared memory computer and on an Intel Touchstone Delta distributed memory computer. This paper relates the experiences gained and describes the software tools and hardware used in this study. Performance comparisons between two differing architectures are made.

  7. Working Memory Span Development: A Time-Based Resource-Sharing Model Account

    ERIC Educational Resources Information Center

    Barrouillet, Pierre; Gavens, Nathalie; Vergauwe, Evie; Gaillard, Vinciane; Camos, Valerie

    2009-01-01

    The time-based resource-sharing model (P. Barrouillet, S. Bernardin, & V. Camos, 2004) assumes that during complex working memory span tasks, attention is frequently and surreptitiously switched from processing to reactivate decaying memory traces before their complete loss. Three experiments involving children from 5 to 14 years of age…

  8. A shared neural ensemble links distinct contextual memories encoded close in time.

    PubMed

    Cai, Denise J; Aharoni, Daniel; Shuman, Tristan; Shobe, Justin; Biane, Jeremy; Song, Weilin; Wei, Brandon; Veshkini, Michael; La-Vu, Mimi; Lou, Jerry; Flores, Sergio E; Kim, Isaac; Sano, Yoshitake; Zhou, Miou; Baumgaertel, Karsten; Lavi, Ayal; Kamata, Masakazu; Tuszynski, Mark; Mayford, Mark; Golshani, Peyman; Silva, Alcino J

    2016-06-01

    Recent studies suggest that a shared neural ensemble may link distinct memories encoded close in time. According to the memory allocation hypothesis, learning triggers a temporary increase in neuronal excitability that biases the representation of a subsequent memory to the neuronal ensemble encoding the first memory, such that recall of one memory increases the likelihood of recalling the other memory. Here we show in mice that the overlap between the hippocampal CA1 ensembles activated by two distinct contexts acquired within a day is higher than when they are separated by a week. Several findings indicate that this overlap of neuronal ensembles links two contextual memories. First, fear paired with one context is transferred to a neutral context when the two contexts are acquired within a day but not across a week. Second, the first memory strengthens the second memory within a day but not across a week. Older mice, known to have lower CA1 excitability, do not show the overlap between ensembles, the transfer of fear between contexts, or the strengthening of the second memory. Finally, in aged mice, increasing cellular excitability and activating a common ensemble of CA1 neurons during two distinct context exposures rescued the deficit in linking memories. Taken together, these findings demonstrate that contextual memories encoded close in time are linked by directing storage into overlapping ensembles. Alteration of these processes by ageing could affect the temporal structure of memories, thus impairing efficient recall of related information. PMID:27251287

  9. A shared neural ensemble links distinct contextual memories encoded close in time

    NASA Astrophysics Data System (ADS)

    Cai, Denise J.; Aharoni, Daniel; Shuman, Tristan; Shobe, Justin; Biane, Jeremy; Song, Weilin; Wei, Brandon; Veshkini, Michael; La-Vu, Mimi; Lou, Jerry; Flores, Sergio E.; Kim, Isaac; Sano, Yoshitake; Zhou, Miou; Baumgaertel, Karsten; Lavi, Ayal; Kamata, Masakazu; Tuszynski, Mark; Mayford, Mark; Golshani, Peyman; Silva, Alcino J.

    2016-06-01

    Recent studies suggest that a shared neural ensemble may link distinct memories encoded close in time. According to the memory allocation hypothesis, learning triggers a temporary increase in neuronal excitability that biases the representation of a subsequent memory to the neuronal ensemble encoding the first memory, such that recall of one memory increases the likelihood of recalling the other memory. Here we show in mice that the overlap between the hippocampal CA1 ensembles activated by two distinct contexts acquired within a day is higher than when they are separated by a week. Several findings indicate that this overlap of neuronal ensembles links two contextual memories. First, fear paired with one context is transferred to a neutral context when the two contexts are acquired within a day but not across a week. Second, the first memory strengthens the second memory within a day but not across a week. Older mice, known to have lower CA1 excitability, do not show the overlap between ensembles, the transfer of fear between contexts, or the strengthening of the second memory. Finally, in aged mice, increasing cellular excitability and activating a common ensemble of CA1 neurons during two distinct context exposures rescued the deficit in linking memories. Taken together, these findings demonstrate that contextual memories encoded close in time are linked by directing storage into overlapping ensembles. Alteration of these processes by ageing could affect the temporal structure of memories, thus impairing efficient recall of related information.

  10. A system for simulating shared memory in heterogeneous distributed-memory networks with specialization for robotics applications

    SciTech Connect

    Jones, J.P.; Bangs, A.L.; Butler, P.L.

    1991-01-01

    Hetero Helix is a programming environment which simulates shared memory on a heterogeneous network of distributed-memory computers. The machines in the network may vary with respect to their native operating systems and internal representation of numbers. Hetero Helix presents a simple programming model to developers, and also considers the needs of designers, system integrators, and maintainers. The key software technology underlying Hetero Helix is the use of a "compiler" which analyzes the data structures in shared memory and automatically generates code which translates data representations from the format native to each machine into a common format, and vice versa. The design of Hetero Helix was motivated in particular by the requirements of robotics applications. Hetero Helix has been used successfully in an integration effort involving 27 CPUs in a heterogeneous network and a body of software totaling roughly 100,000 lines of code. 25 refs., 6 figs.
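
    The translation between native and common data representations can be sketched with Python's standard `struct` module. This is an illustration only: the choice of big-endian as the common format, the function names, and the example layout are assumptions, and the generated code in Hetero Helix itself is not shown in the abstract.

```python
import struct

COMMON = ">"  # assumed common format: big-endian with standard sizes


def to_common(raw, native_order, layout):
    """Translate a record from a machine's native byte order to the common one.

    native_order is "<" (little-endian) or ">" (big-endian); layout is a
    struct format string describing the shared data structure, e.g. "id"
    for an int followed by a double.
    """
    values = struct.unpack(native_order + layout, raw)
    return struct.pack(COMMON + layout, *values)


def from_common(raw, native_order, layout):
    """Translate a record from the common format back to a native byte order."""
    values = struct.unpack(COMMON + layout, raw)
    return struct.pack(native_order + layout, *values)


# A little-endian machine publishes a record; a big-endian machine reads it.
little_record = struct.pack("<id", 7, 2.5)
shared = to_common(little_record, "<", "id")
print(struct.unpack(">id", from_common(shared, ">", "id")))  # (7, 2.5)
```

    Describing each shared structure once by its layout, and deriving both translation directions from that description, is the role the abstract assigns to the generated code.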

  11. Forgetting the unforgettable through conversation: socially shared retrieval-induced forgetting of September 11 memories.

    PubMed

    Coman, Alin; Manier, David; Hirst, William

    2009-05-01

    A speaker's selective recounting of memories shared with a listener will induce both the speaker and the listener to forget unmentioned, related material more than unmentioned, unrelated material. We extended this finding of within-individual and socially shared retrieval-induced forgetting to well-rehearsed, emotionally intense memories that are similar for the speaker and listener, but differ in specifics. A questionnaire probed participants' memory of the September 11 terrorist attacks. Questions and responses were grouped into category-exemplar structures. Then, participants selectively rehearsed their answers (using a structured interview in Experiment 1 and a joint recounting between pairs in Experiment 2). In subsequent recognition tests, response times yielded evidence of within-individual retrieval-induced forgetting and socially shared retrieval-induced forgetting. This result indicates that conversations can alter memories of speakers and listeners in similar ways, even when the memories differ. We discuss socially shared retrieval-induced forgetting as a mechanism for the formation of collective memories. PMID:19476592

  12. Matrix factorization on a hypercube multiprocessor

    SciTech Connect

    Geist, G.A.; Heath, M.T.

    1985-08-01

    This paper is concerned with parallel algorithms for matrix factorization on distributed-memory, message-passing multiprocessors, with special emphasis on the hypercube. Both Cholesky factorization of symmetric positive definite matrices and LU factorization of nonsymmetric matrices using partial pivoting are considered. The use of the resulting triangular factors to solve systems of linear equations by forward and back substitutions is also considered. Efficiencies of various parallel computational approaches are compared in terms of empirical results obtained on an Intel iPSC hypercube. 19 refs., 6 figs., 2 tabs.
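
    For reference, the computation being parallelized can be written sequentially as follows. This is a minimal single-processor sketch of Cholesky factorization with forward and back substitution; it does not attempt the hypercube data distribution or message passing studied in the paper, and the function names are chosen here for illustration.

```python
import math


def cholesky(a):
    """Return lower-triangular L with A = L * L^T, for symmetric positive definite A."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry of column j.
        diag = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        l[j][j] = math.sqrt(diag)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def solve_spd(a, b):
    """Solve A x = b using the triangular factors."""
    l = cholesky(a)
    n = len(b)
    # Forward substitution: L y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    # Back substitution: L^T x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(l[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / l[i][i]
    return x
```

    Each column of the factor depends on the columns to its left, and the same dependency shapes the substitutions, which is why the distribution of columns or rows across processors matters for the parallel versions.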

  13. Automatic Generation of Directive-Based Parallel Programs for Shared Memory Parallel Systems

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Yan, Jerry; Frumkin, Michael

    2000-01-01

    The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. Due to its ease of programming and its good performance, the technique has become very popular. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate directive-based, OpenMP, parallel programs. We outline techniques used in the implementation of the tool and present test results on the NAS parallel benchmarks and ARC3D, a CFD application. This work demonstrates the great potential of using computer-aided tools to quickly port parallel programs and also achieve good performance.

  14. Programmable partitioning for high-performance coherence domains in a multiprocessor system

    DOEpatents

    Blumrich, Matthias A.; Salapura, Valentina

    2011-01-25

    A multiprocessor computing system and a method of logically partitioning a multiprocessor computing system are disclosed. The multiprocessor computing system comprises a multitude of processing units, and a multitude of snoop units. Each of the processing units includes a local cache, and the snoop units are provided for supporting cache coherency in the multiprocessor system. Each of the snoop units is connected to a respective one of the processing units and to all of the other snoop units. The multiprocessor computing system further includes a partitioning system for using the snoop units to partition the multitude of processing units into a plurality of independent, memory-consistent, adjustable-size processing groups. Preferably, when the processor units are partitioned into these processing groups, the partitioning system also configures the snoop units to maintain cache coherency within each of said groups.

  15. Socially shared mourning: construction and consumption of collective memory

    NASA Astrophysics Data System (ADS)

    Harju, Anu

    2015-04-01

    Social media, such as YouTube, is increasingly a site of collective remembering where personal tributes to celebrity figures become sites of public mourning. YouTube, especially, is rife with celebrity commemorations. Examining fans' online mourning practices on YouTube, this paper examines video tributes dedicated to the late Steve Jobs, with a focus on collective remembering and collective construction of memory. Combining netnography with critical discourse analysis, the analysis focuses on the user comments where the past unfolds in interaction and meanings are negotiated and contested. The paper argues that celebrity death may, for avid fans, be a source of disenfranchised grief, a type of grief characterised by inadequate social support, usually arising from lack of empathy for the loss. The paper sheds light on the functions digital memorials have for mourning fans (and fandom) and argues that social media sites have come to function as spaces of negotiation, legitimisation and alleviation of disenfranchised grief. It is also suggested that when it comes to disenfranchised grief, and grief work generally, the concept of community be widened to include communities of weak ties, a typical form of communal belonging on social media.

  16. A New Shared-Memory Programming Paradigm for Molecular Dynamics Simulations on the Intel Paragon

    SciTech Connect

    D'Azevedo, E.F.

    1995-01-01

    This report describes the use of shared memory emulation with DOLIB (Distributed Object Library) to simplify parallel programming on the Intel Paragon. A molecular dynamics application is used as an example to illustrate the use of the DOLIB shared memory library. SOTON-PAR, a parallel molecular dynamics code with explicit message-passing using a Lennard-Jones 6-12 potential, is rewritten using DOLIB primitives. The resulting code has no explicit message primitives and resembles a serial code. The new code can perform dynamic load balancing and achieves better performance than the original parallel code with explicit message-passing.

  17. A new shared-memory programming paradigm for molecular dynamics simulations on the Intel Paragon

    SciTech Connect

    D'Azevedo, E.F.; Romine, C.H.

    1994-12-01

    This report describes the use of shared memory emulation with DOLIB (Distributed Object Library) to simplify parallel programming on the Intel Paragon. A molecular dynamics application is used as an example to illustrate the use of the DOLIB shared memory library. SOTON-PAR, a parallel molecular dynamics code with explicit message-passing using a Lennard-Jones 6-12 potential, is rewritten using DOLIB primitives. The resulting code has no explicit message primitives and resembles a serial code. The new code can perform dynamic load balancing and achieves better performance than the original parallel code with explicit message-passing.

  18. Shared Representations in Language Processing and Verbal Short-Term Memory: The Case of Grammatical Gender

    ERIC Educational Resources Information Center

    Schweppe, Judith; Rummer, Ralf

    2007-01-01

    The general idea of language-based accounts of short-term memory is that retention of linguistic materials is based on representations within the language processing system. In the present sentence recall study, we address the question whether the assumption of shared representations holds for morphosyntactic information (here: grammatical gender…

  19. Functions of Memory Sharing and Mother-Child Reminiscing Behaviors: Individual and Cultural Variations

    ERIC Educational Resources Information Center

    Kulkofsky, Sarah; Wang, Qi; Koh, Jessie Bee Kim

    2009-01-01

    This study examined maternal beliefs about the functions of memory sharing and the relations between these beliefs and mother-child reminiscing behaviors in a cross-cultural context. Sixty-three European American and 47 Chinese mothers completed an open-ended questionnaire concerning their beliefs about the functions of parent-child memory…

  20. Method and apparatus for single-stepping coherence events in a multiprocessor system under software control

    DOEpatents

    Blumrich, Matthias A.; Salapura, Valentina

    2010-11-02

    An apparatus and method are disclosed for single-stepping coherence events in a multiprocessor system under software control in order to monitor the behavior of a memory coherence mechanism. Single-stepping coherence events in a multiprocessor system is made possible by adding one or more step registers. By accessing these step registers, one or more coherence requests are processed by the multiprocessor system. The step registers determine if the snoop unit will operate by proceeding in a normal execution mode, or operate in a single-step mode.

  1. Optical Shared Memory Computing and Multiple Access Protocols for Photonic Networks

    NASA Astrophysics Data System (ADS)

    Li, Kuang-Yu.

    In this research we investigate potential applications of optics in massively parallel computer systems, especially focusing on design issues in three-dimensional optical data storage and free-space photonic networks. An optical implementation of a shared memory uses a single photorefractive crystal and can realize the set of memory modules in a digital shared memory computer. A complete instruction set consists of READ, WRITE, SELECTIVE ERASE, and REFRESH, which can be applied to any memory module independent of (and in parallel with) instructions to the other memory modules. In addition, a memory module can execute a sequence of READ operations simultaneously with the execution of a WRITE operation to accommodate differences in optical recording and readout times common to optical volume storage media. An experimental shared memory system is demonstrated and its projected performance is analyzed. A multiplexing technique is presented to significantly reduce both grating- and beam-degeneracy crosstalk in volume holographic systems, by incorporating space, angle, and wavelength as the multiplexing parameters. In this approach, each hologram, which results from the interference between a single input node and an object array, partially overlaps with the other holograms in its neighborhood. This technique can offer improved interconnection density, optical throughput, signal fidelity, and space-bandwidth product utilization. Design principles and numerical simulation results are presented. A free-space photonic cellular hypercube parallel computer, with emphasis on the design of a collisionless multiple access protocol, is presented. This design incorporates wavelength-, space-, and time-multiplexing to achieve multiple access, wavelength reuse, dense connectivity, collisionless communications, and a simple control mechanism. Analytic models based on semi-Markov processes are employed to analyze this protocol. The performance of the

  2. Visual and Spatial Working Memory Are Not that Dissociated after All: A Time-Based Resource-Sharing Account

    ERIC Educational Resources Information Center

    Vergauwe, Evie; Barrouillet, Pierre; Camos, Valerie

    2009-01-01

    Examinations of interference between visual and spatial materials in working memory have suggested domain- and process-based fractionations of visuo-spatial working memory. The present study examined the role of central time-based resource sharing in visuo-spatial working memory and assessed its role in obtained interference patterns. Visual and…

  3. Principles for problem aggregation and assignment in medium scale multiprocessors

    NASA Technical Reports Server (NTRS)

    Nicol, David M.; Saltz, Joel H.

    1987-01-01

    One of the most important issues in parallel processing is the mapping of workload to processors. This paper considers a large class of problems having a high degree of potential fine grained parallelism, and execution requirements that are either not predictable, or are too costly to predict. The main issues in mapping such a problem onto medium scale multiprocessors are those of aggregation and assignment. We study a method of parameterized aggregation that makes few assumptions about the workload. The mapping of aggregate units of work onto processors is uniform, and exploits locality of workload intensity to balance the unknown workload. In general, a finer aggregate granularity leads to a better balance at the price of increased communication/synchronization costs; the aggregation parameters can be adjusted to find a reasonable granularity. The effectiveness of this scheme is demonstrated on three model problems: an adaptive one-dimensional fluid dynamics problem with message passing, a sparse triangular linear system solver on both a shared memory and a message-passing machine, and a two-dimensional time-driven battlefield simulation employing message passing. Using the model problems, the tradeoffs are studied between balanced workload and the communication/synchronization costs. Finally, an analytical model is used to explain why the method balances workload and minimizes the variance in system behavior.
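
    The granularity tradeoff described above can be illustrated with a minimal sketch. It is not the authors' parameterized aggregation scheme: the function name `max_load`, the round-robin assignment, and the workload values are all assumptions chosen for illustration. Consecutive work units are grouped into chunks, chunks are dealt out to processors in turn, and the heaviest processor load is reported.

```python
def max_load(work, num_procs, chunk):
    """Group consecutive work units into chunks of size `chunk`, assign the
    chunks round-robin to processors, and return the heaviest load."""
    loads = [0] * num_procs
    for start in range(0, len(work), chunk):
        proc = (start // chunk) % num_procs
        loads[proc] += sum(work[start:start + chunk])
    return max(loads)


# Hypothetical workload: 48 cheap units followed by a localized hot spot.
work = [1] * 48 + [10] * 16

coarse = max_load(work, num_procs=4, chunk=16)  # one processor gets the whole hot spot
fine = max_load(work, num_procs=4, chunk=1)     # hot spot is spread over all processors

print(coarse, fine)  # 160 52
```

    With the finest granularity every processor carries the ideal load of 52, at the price of many more chunk boundaries, which is where the communication and synchronization costs mentioned in the abstract would arise.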

  4. A prototype functional language implementation for hierarchical-memory architectures

    SciTech Connect

    Wolski, R.; Feo, J.; Cann, D.

    1991-06-05

    The first implementation of Sisal was designed for general shared-memory architectures. Since then, we have optimized the system for vector and coherent-cache multiprocessors. Coherent-cache systems can be thought of as simple, two-level hierarchical memory systems, where the memory hierarchy is managed by the hardware. The compiler and run-time system for such an architecture needs to maintain data locality so that the processor caches are used as much as possible. In this paper, we extend the coherent-cache implementation to include explicit compiler and run-time control for medium-grain and coarse-grain hierarchical-memory architectures. We implemented the extended system on the BBN Butterfly using interleaved shared memory exclusively for the purposes of data sharing and exploiting the per-processor local memories. We give preliminary performance results for this extended system. 10 refs., 7 figs.

  5. Parallel calculations on shared memory, NUMA-based computers using MATLAB

    NASA Astrophysics Data System (ADS)

    Krotkiewski, Marcin; Dabrowski, Marcin

    2014-05-01

    Achieving satisfactory computational performance in numerical simulations on modern computer architectures can be a complex task. Multi-core design makes it necessary to parallelize the code. Efficient parallelization on NUMA (Non-Uniform Memory Access) shared memory architectures necessitates explicit placement of the data in the memory close to the CPU that uses it. In addition, using more than 8 CPUs (~100 cores) requires a cluster solution of interconnected nodes, which involves (expensive) communication between the processors. It takes significant effort to overcome these challenges even when programming in low-level languages, which give the programmer full control over data placement and work distribution. Instead, many modelers use high-level tools such as MATLAB, which severely limit the optimization/tuning options available. Nonetheless, the advantage of programming simplicity and a large available code base can tip the scale in favor of MATLAB. We investigate whether MATLAB can be used for efficient, parallel computations on modern shared memory architectures. A common approach to performance optimization of MATLAB programs is to identify a bottleneck and migrate the corresponding code block to a MEX file implemented in, e.g., C. Instead, we aim at achieving a scalable parallel performance of MATLAB's core functionality. Some of MATLAB's internal functions (e.g., bsxfun, sort, BLAS3, operations on vectors) are multi-threaded. Achieving high parallel efficiency of those may potentially improve the performance of a significant portion of MATLAB's code base. Since we do not have MATLAB's source code, our performance tuning relies on the tools provided by the operating system alone. Most importantly, we use custom memory allocation routines, thread-to-CPU binding, and memory page migration.
The performance tests are carried out on multi-socket shared memory systems (2- and 4-way Intel-based computers), as well as a Distributed Shared Memory machine with 96 CPU

  6. HyperForest: A high performance multi-processor architecture for real-time intelligent systems

    SciTech Connect

    Garcia, P. Jr.; Rebeil, J.P.; Pollard, H.

    1997-04-01

    Intelligent Systems are characterized by the intensive use of computer power. The computer revolution of the last few years is what has made possible the development of the first generation of Intelligent Systems. Software for second generation Intelligent Systems will be more complex and will require more powerful computing engines in order to meet real-time constraints imposed by new robots, sensors, and applications. A multiprocessor architecture was developed that merges the advantages of message-passing and shared-memory structures: expandability and real-time compliance. The HyperForest architecture will provide an expandable real-time computing platform for computationally intensive Intelligent Systems and open the doors for the application of these systems to more complex tasks in environmental restoration and cleanup projects, flexible manufacturing systems, and DOE's own production and disassembly activities.

  7. Sequoia: A fault-tolerant tightly coupled multiprocessor for transaction processing

    SciTech Connect

    Bernstein, P.A.

    1988-02-01

    The Sequoia computer is a tightly coupled multiprocessor, and thus attains the performance advantages of this style of architecture. It avoids most of the fault-tolerance disadvantages of tight coupling by using a new fault-tolerance design. The Sequoia architecture is similar to other multimicroprocessor architectures, such as those of Encore and Sequent, in that it gives dozens of microprocessors shared access to a large main memory. It resembles the Stratus architecture in its extensive use of hardware fault-detection techniques. It resembles Stratus and Auragen in its ability to quickly recover all processes after a single point failure, transparently to the user. However, Sequoia is unique in its combination of a large-scale tightly coupled architecture with a hardware approach to fault tolerance. This article gives an overview of how the hardware architecture and operating systems (OS) work together to provide a high degree of fault tolerance with good system performance.

  8. Using memory in the Cedar system

    SciTech Connect

    McGrath, R.E.; Emrath, P.

    1987-01-01

    The design of the virtual memory system for the Cedar multiprocessor under construction at the University of Illinois is discussed. The Cedar architecture features a hierarchy of memory, some shared by all processors, and some shared by subsets of processors. The Xylem operating system is based on Alliant Computer Systems CONCENTRIX(TM) operating system, which is based on 4.2BSD UNIX(TM). Xylem supports multi-tasking and demand paging of parts of the memory hierarchy into a linear virtual address space. Memory may be private to a task or shared between all the tasks. The locality and attributes of a page may be modified during the execution of a program. Examples of how these mechanisms can be used are discussed. 14 figs.

  9. Vascular system modeling in parallel environment - distributed and shared memory approaches

    PubMed Central

    Jurczuk, Krzysztof; Kretowski, Marek; Bezy-Wendling, Johanne

    2011-01-01

    The paper presents two approaches in parallel modeling of vascular system development in internal organs. In the first approach, new parts of tissue are distributed among processors and each processor is responsible for perfusing its assigned parts of tissue to all vascular trees. Communication between processors is accomplished by passing messages and therefore this algorithm is perfectly suited for distributed memory architectures. The second approach is designed for shared memory machines. It parallelizes the perfusion process during which individual processing units perform calculations concerning different vascular trees. The experimental results, performed on a computing cluster and multi-core machines, show that both algorithms provide a significant speedup. PMID:21550891

  10. Embedded Multiprocessor Technology for VHSIC Insertion

    NASA Technical Reports Server (NTRS)

    Hayes, Paul J.

    1990-01-01

    Viewgraphs on embedded multiprocessor technology for VHSIC insertion are presented. The objective was to develop multiprocessor system technology providing user-selectable fault tolerance, increased throughput, and ease of application representation for concurrent operation. The approach was to develop graph management mapping theory for proper performance, model multiprocessor performance, and demonstrate performance in selected hardware systems.

  11. IMPACC: A Tightly Integrated MPI+OpenACC Framework Exploiting Shared Memory Parallelism

    SciTech Connect

    Lee, Seyong; Vetter, Jeffrey S

    2016-01-01

    We propose IMPACC, an MPI+OpenACC framework for heterogeneous accelerator clusters. IMPACC tightly integrates MPI and OpenACC, while exploiting the shared memory parallelism in the target system. IMPACC dynamically adapts the input MPI+OpenACC applications on the target heterogeneous accelerator clusters to fully exploit target system-specific features. IMPACC provides programmers with a unified virtual address space, automatic NUMA-friendly task-device mapping, efficient integrated communication routines, seamless streamlining of asynchronous executions, and transparent memory sharing. We have implemented IMPACC and evaluated its performance using three heterogeneous accelerator systems, including the Titan supercomputer. Results show that IMPACC can achieve easier programming, higher performance, and better scalability than the current MPI+OpenACC model.

  12. Fault tolerant onboard packet switch architecture for communication satellites: Shared memory per beam approach

    NASA Technical Reports Server (NTRS)

    Shalkhauser, Mary Jo; Quintana, Jorge A.; Soni, Nitin J.

    1994-01-01

    The NASA Lewis Research Center is developing a multichannel communication signal processing satellite (MCSPS) system which will provide low data rate, direct to user, commercial communications services. The focus of current space segment developments is a flexible, high-throughput, fault tolerant onboard information switching processor. This information switching processor (ISP) is a destination-directed packet switch which performs both space and time switching to route user information among numerous user ground terminals. Through both industry study contracts and in-house investigations, several packet switching architectures were examined. A contention-free approach, the shared memory per beam architecture, was selected for implementation. The shared memory per beam architecture, fault tolerance insertion, implementation, and demonstration plans are described.

  13. Performance measures for multiprocessor controllers

    NASA Technical Reports Server (NTRS)

    Krishna, C. M.; Shin, K. G.

    1982-01-01

    Performance measures to characterize fault tolerant multiprocessors used in the control of critical processes are considered. Our performance indices are based on controller response time. By relating this to the needs of the application, we have been able to derive indices that faithfully reflect the performance of the multiprocessor in the context of the application, that permit the objective comparison of rival computer systems, and that can either be definitively estimated or objectively measured. An example of a controller in an idealized satellite application is provided.

  14. Global arrays: A portable "shared-memory" programming model for distributed memory computers

    SciTech Connect

    Harrison, R.J.; Nieplocha, J.; Littlefield, R.J.

    1994-11-01

    Portability, efficiency, and ease of coding are all important considerations in choosing the programming model for a scalable parallel application. The message-passing programming model is widely used because of its portability, yet some applications are too complex to code in it while also trying to maintain a balanced computation load and avoid redundant computations. The shared-memory programming model simplifies coding, but it is not portable and often provides little control over interprocessor data transfer costs. This paper describes a new approach, called Global Arrays (GA), that combines the better features of both other models, leading to both simple coding and efficient execution. The key concept of GA is that it provides a portable interface through which each process in a MIMD parallel program can asynchronously access logical blocks of physically distributed matrices, with no need for explicit cooperation by other processes. The authors have implemented GA libraries on a variety of computer systems, including the Intel DELTA and Paragon, the IBM SP-1 (all message-passers), the Kendall Square KSR-2 (a nonuniform access shared-memory machine), and networks of Unix workstations. They discuss the design and implementation of these libraries, report their performance, illustrate the use of GA in the context of computational chemistry applications, and describe the use of a GA performance visualization tool.

  15. Shared and Distributed Memory Parallel Security Analysis of Large-Scale Source Code and Binary Applications

    SciTech Connect

    Quinlan, D; Barany, G; Panas, T

    2007-08-30

    Many forms of security analysis on large scale applications can be substantially automated but the size and complexity can exceed the time and memory available on conventional desktop computers. Most commercial tools are understandably focused on such conventional desktop resources. This paper presents research work on the parallelization of security analysis of both source code and binaries within our Compass tool, which is implemented using the ROSE source-to-source open compiler infrastructure. We have focused on both shared and distributed memory parallelization of the evaluation of rules implemented as checkers for a wide range of secure programming rules, applicable to desktop machines, networks of workstations and dedicated clusters. While Compass as a tool focuses on source code analysis and reports violations of an extensible set of rules, the binary analysis work uses the exact same infrastructure but is less well developed into an equivalent final tool.

  16. Data traffic reduction schemes for Cholesky factorization on asynchronous multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Naik, Vijay K.; Patrick, Merrell L.

    1989-01-01

    Communication requirements of Cholesky factorization of dense and sparse symmetric, positive definite matrices are analyzed. The communication requirement is characterized by the data traffic generated on multiprocessor systems with local and shared memory. Lower bound proofs are given to show that when the load is uniformly distributed the data traffic associated with factoring an n x n dense matrix using n^α (α ≤ 2) processors is Ω(n^(2 + α/2)). For n x n sparse matrices representing a √n x √n regular grid graph the data traffic is shown to be Ω(n^(1 + α/2)), α ≤ 1. Partitioning schemes that are variations of the block assignment scheme are described and it is shown that the data traffic generated by these schemes is asymptotically optimal. The schemes allow efficient use of up to O(n^2) processors in the dense case and up to O(n) processors in the sparse case before the total data traffic reaches the maximum value of O(n^3) and O(n^(3/2)), respectively. It is shown that the block based partitioning schemes allow a better utilization of the data accessed from shared memory and thus reduce the data traffic more than those based on column-wise wrap around assignment schemes.

  17. The Terracorrelator: a shared memory HPC facility for real-time seismological cross-correlation analyses

    NASA Astrophysics Data System (ADS)

    Atkinson, Malcolm; Bell, Andrew; Curtis, Andrew; Entwistle, Elizabeth; Filgueira, Rosa; Krause, Amrey; Main, Ian; Meles, Giovani; Miniter, Mike; Zhao, Youqian

    2015-04-01

    Earthquakes and volcanic eruptions may in some instances be preceded or accompanied by changes in the geophysical properties of the Earth, such as seismic velocities or event rates. The development of reliable probabilistic forecasting methods for these hazards requires real-time analysis of seismic data and truly prospective forecasting and testing to reduce bias. However, potential forecasting techniques, including seismic interferometry and earthquake "repeater" analysis, require a large number of waveform cross-correlations; this is computationally intensive, and is particularly challenging in real-time. Here we describe the "Terracorrelator", a new high performance computing facility at the University of Edinburgh designed for real-time cross-correlational analyses. The machine consists of two 2 TB shared memory nodes for cross-correlation and post-processing, and two Intel Xeon Phi nodes for pre-processing. The Terracorrelator has been tested on a seismic interferometry case study using ObsPy for seismic operations and processing, and Dispel4Py for writing and executing the workflow. The workflow is distributed automatically for parallel processing in a shared memory multicore environment. Preliminary results have demonstrated that data from 1000 seismic stations can be pre-processed, and each station cross-correlated with all others (499500 cross-correlations) in hourly or daily intervals sufficiently quickly to keep ahead of new data arriving, on one of the shared memory nodes. The second node is therefore free to perform interpretative analysis on the outputs, for example to look at changes in the resulting correlations. These promising results suggest that it will be possible to undertake real-time interferometric analysis using ~1000 stations, and to test the predictive power of current seismic velocity changes for future hazard occurrence.
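
    The figure of 499500 cross-correlations for 1000 stations is the number of unordered station pairs, n(n - 1)/2. A minimal sketch of that arithmetic follows; the function name `station_pairs` is an assumption made for illustration and is not part of ObsPy or Dispel4Py.

```python
from itertools import combinations


def station_pairs(n_stations):
    """Number of unordered station pairs, n * (n - 1) / 2."""
    return n_stations * (n_stations - 1) // 2


# Cross-check the closed form against explicit enumeration for a small case.
assert station_pairs(5) == len(list(combinations(range(5), 2)))

print(station_pairs(1000))  # 499500
```

    Because the count grows quadratically, doubling the number of stations roughly quadruples the number of cross-correlations per interval.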

  18. A multiprocessor operating system simulator

    SciTech Connect

    Johnston, G.M.; Campbell, R.H. (Dept. of Computer Science)

    1988-01-01

    This paper describes a multiprocessor operating system simulator that was developed by the authors in the Fall of 1987. The simulator was built in response to the need to provide students with an environment in which to build and test operating system concepts as part of the coursework of a third-year undergraduate operating systems course. Written in C++, the simulator uses the co-routine style task package that is distributed with the AT&T C++ Translator to provide a hierarchy of classes that represents a broad range of operating system software and hardware components. The class hierarchy closely follows that of the Choices family of operating systems for loosely and tightly coupled multiprocessors. During an operating system course, these classes are refined and specialized by students in homework assignments to facilitate experimentation with different aspects of operating system design and policy decisions. The current implementation runs on the IBM RT PC under 4.3bsd UNIX.

  19. A Multiprocessor Operating System Simulator

    NASA Technical Reports Server (NTRS)

    Johnston, Gary M.; Campbell, Roy H.

    1988-01-01

    This paper describes a multiprocessor operating system simulator that was developed by the authors in the Fall semester of 1987. The simulator was built in response to the need to provide students with an environment in which to build and test operating system concepts as part of the coursework of a third-year undergraduate operating systems course. Written in C++, the simulator uses the co-routine style task package that is distributed with the AT&T C++ Translator to provide a hierarchy of classes that represents a broad range of operating system software and hardware components. The class hierarchy closely follows that of the 'Choices' family of operating systems for loosely- and tightly-coupled multiprocessors. During an operating system course, these classes are refined and specialized by students in homework assignments to facilitate experimentation with different aspects of operating system design and policy decisions. The current implementation runs on the IBM RT PC under 4.3bsd UNIX.

  20. Reproducibility in a multiprocessor system

    DOEpatents

    Bellofatto, Ralph A; Chen, Dong; Coteus, Paul W; Eisley, Noel A; Gara, Alan; Gooding, Thomas M; Haring, Rudolf A; Heidelberger, Philip; Kopcsay, Gerard V; Liebsch, Thomas A; Ohmacht, Martin; Reed, Don D; Senger, Robert M; Steinmacher-Burow, Burkhard; Sugawara, Yutaka

    2013-11-26

    Fixing a problem is usually greatly aided if the problem is reproducible. To ensure reproducibility of a multiprocessor system, the following aspects are proposed: a deterministic system start state, a single system clock, phase alignment of clocks in the system, system-wide synchronization events, reproducible execution of system components, deterministic chip interfaces, zero-impact communication with the system, precise stop of the system and a scan of the system state.

  1. An experimental distributed microprocessor implementation with a shared memory communications and control medium

    NASA Technical Reports Server (NTRS)

    Mejzak, R. S.

    1980-01-01

    The distributed processing concept is defined in terms of control primitives, variables, and structures and their use in performing a decomposed discrete Fourier transform (DFT) application function. The design assumes interprocessor communications to be anonymous. In this scheme, all processors can access an entire common database by employing control primitives. Access to selected areas within the common database is random, enforced by a hardware lock, and determined by task and subtask pointers. This enables the number of processors to be varied in the configuration without any modifications to the control structure. Decompositional elements of the DFT application function in terms of tasks and subtasks are also described. The experimental hardware configuration consists of IMSAI 8080 chassis which are independent, 8 bit microcomputer units. These chassis are linked together to form a multiple processing system by means of a shared memory facility. This facility consists of hardware which provides a bus structure to enable up to six microcomputers to be interconnected. It provides polling and arbitration logic so that only one processor has access to shared memory at any one time.
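
    The control scheme described above, in which anonymous processors claim work through a lock-guarded task pointer in a common database, can be sketched in software. This is only an analogy under stated assumptions, not the report's implementation: a `threading.Lock` stands in for the hardware lock, a dictionary for the common database, and the function name `run_workers` and the squaring subtask are hypothetical placeholders.

```python
import threading


def run_workers(num_workers, tasks):
    """Process every task exactly once using a lock-guarded shared task pointer."""
    lock = threading.Lock()        # stands in for the hardware lock
    shared = {"next_task": 0}      # task pointer held in the common database
    results = [None] * len(tasks)

    def worker():
        while True:
            with lock:             # only one worker updates the pointer at a time
                index = shared["next_task"]
                if index >= len(tasks):
                    return         # no work left
                shared["next_task"] = index + 1
            # Placeholder subtask; each result slot is written by one worker only.
            results[index] = tasks[index] ** 2

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# The same control structure works unchanged for one worker or six.
print(run_workers(1, [1, 2, 3]) == run_workers(6, [1, 2, 3]))  # True
```

    No worker is addressed by name, so the worker count can change without touching the control logic, which mirrors the property the abstract attributes to anonymous interprocessor communication.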

  2. Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing

    NASA Astrophysics Data System (ADS)

    Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide

    2015-09-01

    The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.

  3. An implementation of SISAL for distributed-memory architectures

    SciTech Connect

    Beard, P.C.

    1995-06-01

    This thesis describes a new implementation of the implicitly parallel functional programming language SISAL, for massively parallel processor supercomputers. The Optimizing SISAL Compiler (OSC), developed at Lawrence Livermore National Laboratory, was originally designed for shared-memory multiprocessor machines and has been adapted to distributed-memory architectures. OSC has been relatively portable between shared-memory architectures, because they are architecturally similar, and OSC generates portable C code. However, distributed-memory architectures are not standardized -- each has a different programming model. Distributed-memory SISAL depends on a layer of software that provides a portable, distributed, shared-memory abstraction. This layer is provided by Split-C, a dialect of the C programming language developed at U.C. Berkeley, which has demonstrated good performance on distributed-memory architectures. Split-C provides important capabilities for good performance: support for program-specific distributed data structures, and split-phase memory operations. Distributed data structures help achieve good memory locality, while split-phase memory operations help tolerate the longer communication latencies inherent in distributed-memory architectures. The distributed-memory SISAL compiler and run-time system take advantage of these capabilities. The result of these efforts is a compiler that runs identically on the Thinking Machines Connection Machine (CM-5), and the Meiko Computing Surface (CS-2).

  4. Parallel Reduction of Large Radar Interferometry Scenes on a Mid-scale, Symmetric Multiprocessor Mainframe Computer

    NASA Astrophysics Data System (ADS)

    Harcke, L. J.; Zebker, H. A.

    2006-12-01

    We report on experiences in processing repeat-orbit interferometry data sets on a mid-scale multiprocessor mainframe computer. Newer applications of interferometric and polarimetric data processing, such as permanent scatterer deformation monitoring, require the generation of many tens of repeat-pass interferometry data pairs, perhaps 30 to 50, to provide sufficient input to the deformation model. Moving existing radar processing techniques toward massively parallel computation provides a path to coping with such large data sets, which can consist of 30 to 50 gigabytes (GB) of raw data. In June 2006, the Stanford School of Earth Sciences dedicated a new computation center for general research use. Two large machines compose the center: a single-node, symmetric multiprocessor (SMP) machine with 48 processor cores and a single 192 GB memory, and a 64 node distributed cluster containing 128 processor cores with at least 2 GB of memory per node. Distributed processing of the matched filter for synthetic aperture radar image formation requires a high communication-to-computation ratio. Experiments performed over a decade ago on distributed memory supercomputers, and repeated a half-decade ago on commodity workstation clusters, both demonstrated saturation of inter-node communication links. For this reason, we chose to parallelize the interferometric processor on the shared memory computer using the OpenMP programming standard. We find, not unexpectedly, that the input/output stage of processing standard 100-by-100 kilometer ERS-1 scenes quickly dominates the total computation time, and that only modest increases in processing time are achieved after 8 to 16 processor cores are brought to bear on a single data set. The input and output data sit in single, serially accessed disk files, creating a bottleneck for overall throughput.
This points to a scheme for efficient partitioning of mid-size (24 to 48 core) machines for reducing large Earth science data sets, where 3 to

  5. Multiprocessor computer overset grid method and apparatus

    DOEpatents

    Barnette, Daniel W.; Ober, Curtis C.

    2003-01-01

    A multiprocessor computer overset grid method and apparatus comprises associating points in each overset grid with processors and using mapped interpolation transformations to communicate intermediate values between processors assigned base and target points of the interpolation transformations. The method allows a multiprocessor computer to operate with effective load balance on overset grid applications.

  6. Coscheduling Technique for Symmetric Multiprocessor Clusters

    SciTech Connect

    Yoo, A B; Jette, M A

    2000-09-18

    Coscheduling is essential for obtaining good performance in a time-shared symmetric multiprocessor (SMP) cluster environment. However, the most common technique, gang scheduling, has limitations such as poor scalability and vulnerability to faults mainly due to explicit synchronization between its components. A decentralized approach called dynamic coscheduling (DCS) has been shown to be effective for networks of workstations (NOW), but this technique is not suitable for the workloads on a very large SMP-cluster with thousands of processors. Furthermore, its implementation can be prohibitively expensive for such a large-scale machine. In this paper, the authors propose a novel coscheduling technique based on the DCS approach which can achieve coscheduling on very large SMP-clusters in a scalable, efficient, and cost-effective way. In the proposed technique, each local scheduler achieves coscheduling based upon message traffic between the components of parallel jobs. Message trapping is carried out at the user-level, eliminating the need for unsupported hardware or device-level programming. A sending process attaches its status to outgoing messages so local schedulers on remote nodes can make more intelligent scheduling decisions. Once scheduled, processes are guaranteed some minimum period of time to execute. This provides an opportunity to synchronize the parallel job's components across all nodes and achieve good program performance. The results from a performance study reveal that the proposed technique is a promising approach that can reduce response time significantly over uncoordinated time-sharing and batch scheduling.

  7. Real-time topological image smoothing on shared memory parallel machines

    NASA Astrophysics Data System (ADS)

    Mahmoudi, Ramzi; Akil, Mohamed

    2011-03-01

    The smoothing filter is the method of choice for image preprocessing and pattern recognition. We present a new concurrent method for smoothing 2D objects in the binary case. The proposed method provides parallel computation while preserving topology by using homotopic transformations. We introduce an adapted parallelization strategy, called the split, distribute and merge (SDM) strategy, which allows efficient parallelization of a large class of topological operators including, mainly, smoothing, skeletonization, and watershed algorithms. To achieve a good speedup, we paid particular attention to task scheduling. Work distributed during the smoothing process is done by a variable number of threads. Tests on a 2D binary image (512×512), using a shared memory parallel machine (SMPM) with 8 CPU cores (2× Xeon E5405 running at a frequency of 2 GHz), showed an enhancement of 5.2; thus a rate of 32 images per second is achieved.
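
    The split, distribute and merge pattern named in this abstract can be sketched as follows. The operator used here, a plain 3x3 majority vote, is a stand-in chosen for brevity: it is not the authors' homotopic smoothing and does not preserve topology. The function names and the row-strip decomposition are likewise assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def majority3x3(img, r0, r1):
    """Stand-in operator: 3x3 majority vote for rows r0..r1-1 of a binary image.

    Neighbours are read from the full, unmodified input image, so each strip
    can be computed independently of the others.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(r0, r1):
        row = []
        for x in range(w):
            ones = total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        ones += img[yy][xx]
                        total += 1
            row.append(1 if 2 * ones > total else 0)
        out.append(row)
    return out


def sdm(img, workers=4):
    """Split a non-empty image into row strips, distribute them, merge results."""
    h = len(img)
    step = -(-h // workers)  # ceiling division: rows per strip
    # Split: contiguous row ranges.
    bounds = [(r, min(r + step, h)) for r in range(0, h, step)]
    # Distribute: one task per strip; pool.map keeps the strips in order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda b: majority3x3(img, *b), bounds))
    # Merge: concatenate the strips back into one image.
    return [row for part in parts for row in part]
```

    Because every strip reads only the original image, the merged result equals the serial result. In CPython, threads will not speed up this pure-Python loop; the sketch illustrates the structure of the strategy rather than its performance.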

  8. Exploring the use of I/O nodes for computation in a MIMD multiprocessor

    NASA Technical Reports Server (NTRS)

    Kotz, David; Cai, Ting

    1995-01-01

    As parallel systems move into the production scientific-computing world, the emphasis will be on cost-effective solutions that provide high throughput for a mix of applications. Cost effective solutions demand that a system make effective use of all of its resources. Many MIMD multiprocessors today, however, distinguish between 'compute' and 'I/O' nodes, the latter having attached disks and being dedicated to running the file-system server. This static division of responsibilities simplifies system management but does not necessarily lead to the best performance in workloads that need a different balance of computation and I/O. Of course, computational processes sharing a node with a file-system service may receive less CPU time, network bandwidth, and memory bandwidth than they would on a computation-only node. In this paper we begin to examine this issue experimentally. We found that high performance I/O does not necessarily require substantial CPU time, leaving plenty of time for application computation. There were some complex file-system requests, however, which left little CPU time available to the application. (The impact on network and memory bandwidth still needs to be determined.) For applications (or users) that cannot tolerate an occasional interruption, we recommend that they continue to use only compute nodes. For tolerant applications needing more cycles than those provided by the compute nodes, we recommend that they take full advantage of both compute and I/O nodes for computation, and that operating systems should make this possible.

  9. Improved multiprocessor garbage collection algorithms

    SciTech Connect

    Newman, I.A.; Stallard, R.P.; Woodward, M.C.

    1983-01-01

    Outlines the results of an investigation of existing multiprocessor garbage collection algorithms and introduces two new algorithms which significantly improve some aspects of the performance of their predecessors. The two algorithms arise from different starting assumptions. One considers the case where the algorithm will terminate successfully whatever list structure is being processed and assumes that the extra data space should be minimised. The other seeks a very fast garbage collection time for list structures that do not contain loops. Results of both theoretical and experimental investigations are given to demonstrate the efficacy of the algorithms. 7 references.

  10. Simulating an aerospace multiprocessor. [for space guidance computers]

    NASA Technical Reports Server (NTRS)

    Mallach, E. G.

    1976-01-01

    The paper describes a simulator which was used to evaluate the architecture of an aerospace multiprocessor. The simulator models interactions among the processors, memories, the central data bus, and a possible 'job stack'. Special features of the simulator are discussed, including the use of explicitly coded and individually distinguishable 'job models' instead of a statistically defined 'job mix' and a specialized Job Model Definition Language to automate the detailed coding of the models. Some results are presented which show that when the simulator was employed in conjunction with queuing theory and Markov-process analysis, more insight into system behavior was obtained than would have been with any one technique alone.

  11. Multiprocessor system with multiple concurrent modes of execution

    SciTech Connect

    Ahn, Daniel; Ceze, Luis H; Chen, Dong; Gara, Alan; Heidelberger, Philip; Ohmacht, Martin

    2013-12-31

    A multiprocessor system supports multiple concurrent modes of speculative execution. Speculation identification numbers (IDs) are allocated to speculative threads from a pool of available numbers. The pool is divided into domains, with each domain being assigned to a mode of speculation. Modes of speculation include TM, TLS, and rollback. Allocation of the IDs is carried out with respect to a central state table and using hardware pointers. The IDs are used for writing different versions of speculative results in different ways of a set in a cache memory.
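
    As a software analogy for the ID pool described above, the following sketch divides a range of IDs into equal domains and binds each domain to a speculation mode. The patent describes a hardware mechanism (a central state table and hardware pointers); the class name, the equal-size split, and the pool size used here are hypothetical.

```python
from collections import deque


class SpeculationIdPool:
    """Toy ID allocator: a pool split into domains, each bound to one mode."""

    def __init__(self, total_ids, domain_modes):
        # domain_modes[d] names the mode that owns domain d, e.g. "TM".
        size = total_ids // len(domain_modes)
        self.free = {}     # mode -> queue of free IDs
        self.mode_of = {}  # ID -> owning mode
        for d, mode in enumerate(domain_modes):
            for spec_id in range(d * size, (d + 1) * size):
                self.free.setdefault(mode, deque()).append(spec_id)
                self.mode_of[spec_id] = mode

    def allocate(self, mode):
        """Return a free ID from a domain assigned to `mode`, or None."""
        queue = self.free.get(mode)
        return queue.popleft() if queue else None

    def release(self, spec_id):
        """Return an ID to the free queue of the mode that owns it."""
        self.free[self.mode_of[spec_id]].append(spec_id)
```

    With `SpeculationIdPool(8, ["TM", "TLS", "TLS", "rollback"])`, the TM mode owns IDs 0 and 1, TLS owns 2 through 5, and rollback owns 6 and 7, so exhausting one mode's domain does not consume IDs from another.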

  12. Aho-Corasick String Matching on Shared and Distributed Memory Parallel Architectures

    SciTech Connect

    Tumeo, Antonino; Villa, Oreste; Chavarría-Miranda, Daniel

    2012-03-01

    String matching is at the core of many critical applications, including network intrusion detection systems, search engines, virus scanners, spam filters, DNA and protein sequencing, and data mining. For all of these applications string matching requires a combination of (sometimes all) the following characteristics: high and/or predictable performance, support for large data sets and flexibility of integration and customization. Many software based implementations targeting conventional cache-based microprocessors fail to achieve high and predictable performance requirements, while Field-Programmable Gate Array (FPGA) implementations and dedicated hardware solutions fail to support large data sets (dictionary sizes) and are difficult to integrate and customize. The advent of multicore, multithreaded, and GPU-based systems is opening the possibility for software based solutions to reach very high performance at a sustained rate. This paper compares several software-based implementations of the Aho-Corasick string searching algorithm for high performance systems. We discuss the implementation of the algorithm on several types of shared-memory high-performance architectures (Niagara 2, large x86 SMPs and Cray XMT), distributed memory with homogeneous processing elements (InfiniBand cluster of x86 multicores) and heterogeneous processing elements (InfiniBand cluster of x86 multicores with NVIDIA Tesla C10 GPUs). We describe in detail how each solution achieves the objectives of supporting large dictionaries, sustaining high performance, and enabling customization and flexibility using various data sets.
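
    For readers unfamiliar with the algorithm being compared, a minimal serial Aho-Corasick matcher is sketched below. It is a textbook formulation, not any of the paper's implementations; the function names are chosen for illustration. The automaton has three parts: a trie of the dictionary (`goto`), failure links to the longest proper suffix that is also a trie prefix (`fail`), and the patterns recognized at each state (`out`).

```python
from collections import deque


def build_automaton(patterns):
    """Build goto/fail/output tables for the given dictionary of patterns."""
    goto, fail, out = [{}], [0], [[]]
    # Phase 1: insert every pattern into the trie.
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    # Phase 2: breadth-first pass to compute failure links.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure target.
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out


def search(text, patterns):
    """Return (start_index, pattern) for every occurrence in text."""
    goto, fail, out = build_automaton(patterns)
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits
```

    The scan visits each input character once, regardless of dictionary size, which is why the algorithm suits large dictionaries. One common way to parallelize it, stated here as an assumption rather than as the paper's method, is to split the input into chunks that overlap by the longest pattern length minus one and scan the chunks concurrently.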

  13. Pushing away the communication bottleneck with optical interconnects in symmetric multiprocessors

    NASA Astrophysics Data System (ADS)

    Hlayhel, Wissam; Collet, Jacques H.; Rochange, Christine; Litaize, Daniel

    2000-05-01

    We analyze the bandwidth needed for transmitting the addresses in future symmetric multiprocessor machines (SMP), constructed around a shared bus due to the critical obligation to preserve the coherence of the memory hierarchy. We show that an address-transaction bandwidth as high as several hundreds of Gbit/s will be necessary not to slow down the execution of most applications in large SMP's. This communication bandwidth seems incompatible with the operation constraints of shared electrical busses, making necessary the search for other implementations of the address transmission network. We consider the introduction of optical interconnects (OI) in this context. We review several solutions, in the ascending order of complexity of the optical subsystems as one critical issue concerns the degree of sophistication of the optical solutions and their cost. We first consider simple point to point OI's for a SMP chipset. The interest in OI's comes from the low energy consumption and from the possibility, in the future, to integrate several thousands of optical input/outputs per electronic chip. Then we consider the implementation of an optical bus that is a multipoint optical line involving more optical functionality. We discuss the possibility of multiple accesses to the bus, and the constraints related to the necessity to maintain the coherence of caches.

  14. Performance and Application of Parallel OVERFLOW Codes on Distributed and Shared Memory Platforms

    NASA Technical Reports Server (NTRS)

    Djomehri, M. Jahed; Rizk, Yehia M.

    1999-01-01

    The presentation discusses recent studies on the performance of the two parallel versions of the aerodynamics CFD code, OVERFLOW_MPI and _MLP. Developed at NASA Ames, the serial version, OVERFLOW, is a multidimensional Navier-Stokes flow solver based on overset (Chimera) grid technology. The code has recently been parallelized in two ways. One is based on the explicit message-passing interface (MPI) across processors and uses the _MPI communication package. This approach is primarily suited for distributed memory systems and workstation clusters. The second, termed the multi-level parallel (MLP) method, is simple and uses shared memory for all communications. The _MLP code is suitable on distributed-shared memory systems. For both methods, the message passing takes place across the processors or processes at the advancement of each time step. This procedure is, in effect, the Chimera boundary conditions update, which is done in an explicit "Jacobi" style. In contrast, the update in the serial code is done in more of the "Gauss-Seidel" fashion. The programming effort for the _MPI code is more complicated than for the _MLP code; the former requires modification of the outer and some inner shells of the serial code, whereas the latter focuses only on the outer shell of the code. The _MPI version offers a great deal of flexibility in distributing grid zones across a specified number of processors in order to achieve load balancing. The approach is capable of partitioning zones across multiple processors or sending each zone and/or cluster of several zones into a single processor. The message passing across the processors consists of Chimera boundary and/or an overlap of "halo" boundary points for each partitioned zone. The MLP version is a new coarse-grain parallel concept at the zonal and intra-zonal levels. A grouping strategy is used to distribute zones into several groups forming sub-processes which will run in parallel. The total volume of grid points in each
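
    The "Jacobi" versus "Gauss-Seidel" distinction drawn in this abstract can be illustrated on a small linear system. The sketch below is a generic textbook analogy, not OVERFLOW code: in a Jacobi sweep every unknown is updated from the previous iterate only (as when all zones exchange boundary data once per time step), while in a Gauss-Seidel sweep each update immediately sees the values computed earlier in the same sweep (as when zones are processed one after another in a serial code).

```python
def jacobi_sweep(a, b, x):
    """One Jacobi sweep for a x = b: every update uses only the old x."""
    n = len(x)
    return [
        (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
        for i in range(n)
    ]


def gauss_seidel_sweep(a, b, x):
    """One Gauss-Seidel sweep: each update sees values already refreshed."""
    x = list(x)  # work on a copy so the caller's iterate is untouched
    n = len(x)
    for i in range(n):
        x[i] = (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
    return x
```

    Starting from zero with `a = [[4, 1], [1, 3]]` and `b = [1, 2]`, the first unknown is 0.25 after either sweep, but the second is 2/3 under Jacobi and (2 - 0.25)/3 under Gauss-Seidel, because only the latter uses the freshly updated first unknown. The Jacobi form has no ordering dependence within a sweep, which is what makes it natural for parallel updates.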

  15. Improvement of multiprocessing performance by using optical centralized shared bus

    NASA Astrophysics Data System (ADS)

    Han, Xuliang; Chen, Ray T.

    2004-06-01

    With the ever-increasing need to solve larger and more complex problems, multiprocessing is attracting more and more research efforts. One of the challenges facing the multiprocessor designers is to fulfill in an effective manner the communications among the processes running in parallel on multiple multiprocessors. The conventional electrical backplane bus provides narrow bandwidth as restricted by the physical limitations of electrical interconnects. In the electrical domain, in order to operate at high frequency, the backplane topology has been changed from the simple shared bus to the complicated switched medium. However, the switched medium is an indirect network. It cannot support multicast/broadcast as effectively as the shared bus. Besides the additional latency of going through the intermediate switching nodes, signal routing introduces substantial delay and considerable system complexity. Alternatively, optics has been well known for its interconnect capability. Therefore, it has become imperative to investigate how to improve multiprocessing performance by utilizing optical interconnects. From the implementation standpoint, the existing optical technologies still cannot fulfill the intelligent functions that a switch fabric should provide as effectively as their electronic counterparts. Thus, an innovative optical technology that can provide sufficient bandwidth capacity, while at the same time, retaining the essential merits of the shared bus topology, is highly desirable for the multiprocessing performance improvement. In this paper, the optical centralized shared bus is proposed for use in the multiprocessing systems. This novel optical interconnect architecture not only utilizes the beneficial characteristics of optics, but also retains the desirable properties of the shared bus topology. Meanwhile, from the architecture standpoint, it fits well in the centralized shared-memory multiprocessing scheme. Therefore, a smooth migration with substantial

  16. Multiprocessor Neural Network in Healthcare.

    PubMed

    Godó, Zoltán Attila; Kiss, Gábor; Kocsis, Dénes

    2015-01-01

    A possible way of creating a multiprocessor artificial neural network is by the use of microcontrollers. The RISC processors' high performance and the large number of I/O ports mean they are greatly suitable for creating such a system. During our research, we wanted to see if it is possible to efficiently create interaction between the artificial neural network and the natural nervous system. To achieve as much analogy to the living nervous system as possible, we created a frequency-modulated analog connection between the units. Our system is connected to the living nervous system through 128 microelectrodes. Two-way communication is provided through A/D transformation, which is even capable of testing psychopharmacons. The microcontroller-based analog artificial neural network can play a great role in medical signal processing, such as ECG, EEG etc. PMID:26152990

  17. Multiprocessor performance modeling with ADAS

    NASA Technical Reports Server (NTRS)

    Hayes, Paul J.; Andrews, Asa M.

    1989-01-01

    A graph managing strategy referred to as the Algorithm to Architecture Mapping Model (ATAMM) appears useful for the time-optimized execution of application algorithm graphs in embedded multiprocessors and for the performance prediction of graph designs. This paper reports the modeling of ATAMM in the Architecture Design and Assessment System (ADAS) to make an independent verification of ATAMM's performance prediction capability and to provide a user framework for the evaluation of arbitrary algorithm graphs. Following an overview of ATAMM and its major functional rules are descriptions of the ADAS model of ATAMM, methods to enter an arbitrary graph into the model, and techniques to analyze the simulation results. The performance of a 7-node graph example is evaluated using the ADAS model and verifies the ATAMM concept by substantiating previously published performance results.

  18. ATAMM enhancement and multiprocessor performance evaluation

    NASA Technical Reports Server (NTRS)

    Stoughton, John W.; Mielke, Roland R.; Som, Sukhamoy; Obando, Rodrigo; Malekpour, Mahyar R.; Jones, Robert L., III; Mandala, Brij Mohan V.

    1991-01-01

    ATAMM (Algorithm To Architecture Mapping Model) enhancement and multiprocessor performance evaluation is discussed. The following topics are included: the ATAMM model; ATAMM enhancement; ADM (Advanced Development Model) implementation of ATAMM; and ATAMM support tools.

  19. Parallel computational steering for HPC applications using HDF5 files in distributed shared memory.

    PubMed

    Biddiscombe, John; Soumagne, Jerome; Oger, Guillaume; Guibert, David; Piccinali, Jean-Guillaume

    2012-06-01

    Interfacing a GUI driven visualization/analysis package to an HPC application enables a supercomputer to be used as an interactive instrument. We achieve this by replacing the IO layer in the HDF5 library with a custom driver which transfers data in parallel between simulation and analysis. Our implementation, using ParaView as the interface, allows a flexible combination of parallel simulation, concurrent parallel analysis, and GUI client, either on the same or separate machines. Each MPI job may use different core counts or hardware configurations, allowing fine tuning of the amount of resources dedicated to each part of the workload. By making use of a distributed shared memory file, one may read data from the simulation, modify it using ParaView pipelines, and write it back to be reused by the simulation (or vice versa). This allows not only simple parameter changes, but complete remeshing of grids, or operations involving regeneration of field values over the entire domain. To avoid the problem of manually customizing the GUI for each application that is to be steered, we make use of XML templates that describe outputs from the simulation (and inputs back to it) to automatically generate GUI controls for manipulation of the simulation. PMID:22350196

  20. Shared Etiology of Phonological Memory and Vocabulary Deficits in School-Age Children

    PubMed Central

    Peterson, Robin L.; Pennington, Bruce F.; Samuelsson, Stefan; Byrne, Brian; Olson, Richard K.

    2012-01-01

    Purpose The goal of this study was to investigate the etiologic basis for the association between deficits in phonological memory (PM) and vocabulary in school-age children. Method Children with deficits in PM or vocabulary were identified within the International Longitudinal Twin Study (ILTS). The ILTS includes 1,045 twin pairs from the United States, Australia, and Scandinavia aged 5 to 8 years. We applied the DeFries-Fulker regression method to determine whether problems in PM and vocabulary tend to co-occur because of overlapping genes, overlapping environmental risk factors, or both. Results Among children with isolated PM deficits, we found significant bivariate heritability of PM and vocabulary weaknesses both within and across time. However, when probands were selected for a vocabulary deficit, there was no evidence for bivariate heritability. In this case, the PM-vocabulary relationship appeared to owe to common shared environmental experiences. Conclusions The findings are consistent with previous research on the heritability of specific language impairment and suggest that there are etiologic subgroups of children with poor vocabulary for different reasons, one more influenced by genes and another more influenced by environment. PMID:23275423

  1. MLP: A Parallel Programming Alternative to MPI for New Shared Memory Parallel Systems

    NASA Technical Reports Server (NTRS)

    Taft, James R.

    1999-01-01

    Recent developments at the NASA Ames Research Center's NAS Division have demonstrated that the new generation of NUMA based Symmetric Multi-Processing systems (SMPs), such as the Silicon Graphics Origin 2000, can successfully execute legacy vector oriented CFD production codes at sustained rates far exceeding processing rates possible on dedicated 16 CPU Cray C90 systems. This high level of performance is achieved via shared memory based Multi-Level Parallelism (MLP). This programming approach, developed at NAS and outlined below, is distinct from the message passing paradigm of MPI. It offers parallelism at both the fine and coarse grained level, with communication latencies that are approximately 50-100 times lower than typical MPI implementations on the same platform. Such latency reductions offer the promise of performance scaling to very large CPU counts. The method draws on, but is also distinct from, the newly defined OpenMP specification, which uses compiler directives to support a limited subset of multi-level parallel operations. The NAS MLP method is general, and applicable to a large class of NASA CFD codes.

  2. Testing and operating a multiprocessor chip with processor redundancy

    DOEpatents

    Bellofatto, Ralph E; Douskey, Steven M; Haring, Rudolf A; McManus, Moyra K; Ohmacht, Martin; Schmunkamp, Dietmar; Sugavanam, Krishnan; Weatherford, Bryan J

    2014-10-21

    A system and method for improving the yield rate of a multiprocessor semiconductor chip that includes primary processor cores and one or more redundant processor cores. A first tester conducts a first test on one or more processor cores, and encodes results of the first test in an on-chip non-volatile memory. A second tester conducts a second test on the processor cores, and encodes results of the second test in an external non-volatile storage device. An override bit of a multiplexer is set if a processor core fails the second test. In response to the override bit, the multiplexer selects a physical-to-logical mapping of processor IDs according to one of: the encoded results in the memory device or the encoded results in the external storage device. On-chip logic configures the processor cores according to the selected physical-to-logical mapping.
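
    The selection between two sets of test results can be sketched in software as follows. This is an interpretation under explicit assumptions, not the patented logic: it assumes each test yields one pass/fail flag per physical core, that the override bit is set when any core fails the second test, and that a set override bit selects the externally stored results. The function names are hypothetical.

```python
def logical_map(test_results, n_logical):
    """Map logical IDs 0..n_logical-1 onto the physical cores that passed."""
    good = [core for core, passed in enumerate(test_results) if passed]
    if len(good) < n_logical:
        raise ValueError("not enough working cores, even with the spares")
    return dict(enumerate(good[:n_logical]))


def select_mapping(on_chip_results, external_results, n_logical):
    """Choose which encoded results drive the physical-to-logical mapping."""
    # Assumption: the override bit is set when a core fails the second test.
    override = not all(external_results)
    # Assumption: a set override bit selects the external results.
    chosen = external_results if override else on_chip_results
    return logical_map(chosen, n_logical)
```

    With five physical cores and four logical IDs, a first-test failure of core 2 yields the mapping 0, 1, 3, 4; if instead core 1 fails only in the second test, the override path yields 0, 2, 3, 4. In both cases software sees four contiguous logical processor IDs.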

  3. Multiprocessor architecture to handle TJ-II VXI-based digitization channels

    NASA Astrophysics Data System (ADS)

    Crémy, C.; Vega, J.; Sánchez, E.; Dulya, C. M.; Portas, A.

    1999-01-01

    The data acquisition system (DAS) of the TJ-II stellarator provides up to 300 digitization channels integrated in register-based VXI modules designed in CIEMAT Laboratories. The modules are embedded into six 13-slot VXI chassis connected to the TJ-II DAS central computer by means of a dual LAN topology. During normal operation, remote control of the VXI systems and channel setup are accomplished through an Ethernet LAN, while two FDDI rings are dedicated to postdischarge fast data transfer. The former network link is performed by the bus controller whereas the latter one is provided through a FDDI node controller installed in the mainframe, thus creating a multiprocessor architecture. Dedicated software, running on the VxWorks operating system, has been developed to provide handling of the VXI systems including the following facilities: mainframe information readout, channel setup, real time digitization handling, and data transfer. This software, implemented in C++, is distributed over the two CPUs. Interprocessor communication for synchronization purposes is based on a backplane shared memory pool.

  4. Parallelization of a multiregion flow and transport code using software emulated global shared memory and high performance FORTRAN

    SciTech Connect

    D'Azevedo, E.F.; Gwo, Jin-Ping

    1997-02-01

    The objectives of this research are (1) to parallelize a suite of multiregion groundwater flow and solute transport codes that use Galerkin and Lagrangian-Eulerian finite element methods, (2) to test the compatibility of a global shared memory emulation software with a High Performance FORTRAN (HPF) compiler, and (3) to obtain performance characteristics and scalability of the parallel codes. The suite of multiregion flow and transport codes, 3DMURF and 3DMURT, were parallelized using the DOLIB shared memory emulation, in conjunction with the PGI HPF compiler, to run on the Intel Paragons at the Oak Ridge National Laboratory (ORNL) and a network of workstations. The novelty of this effort is first in the use of HPF and global shared memory emulation concurrently to facilitate the conversion of a serial code to a parallel code, and secondly the shared memory library enables efficient implementation of Lagrangian particle tracking along flow characteristics. The latter allows long-time-step-size simulation with particle tracking and dynamic particle redistribution for load balancing, thereby reducing the number of time steps needed for most transient problems. The parallel codes were applied to a pumping well problem to test the efficiency of the domain decomposition and particle tracking algorithms. The full problem domain consists of over 200,000 degrees of freedom with highly nonlinear soil property functions. Relatively good scalability was obtained for a preliminary test run on the Intel Paragons at the Center for Computational Sciences (CCS), ORNL. However, due to the difficulties we encountered in the PGI HPF compiler, as of the writing of this manuscript we are able to report results from 3DMURF only.

  5. Thread mapping using system-level model for shared memory multicores

    NASA Astrophysics Data System (ADS)

    Mitra, Reshmi

    Exploring thread-to-core mapping options for a parallel application on a multicore architecture is computationally very expensive. For the same algorithm, the mapping strategy (MS) with the best response time may change with data size and thread counts. The primary challenge is to design a fast, accurate and automatic framework for exploring these MSs for large data-intensive applications. This is to ensure that the users can explore the design space within reasonable machine hours, without a thorough understanding of how the code interacts with the platform. Response time is related to the cycles per instruction retired (CPI), taking into account both active and sleep states of the pipeline. This work establishes a hybrid approach, based on Markov Chain Model (MCM) and Model Tree (MT) for system-level steady state CPI prediction. It is designed for shared memory multicore processors with coarse-grained multithreading. The thread status is represented by the MCM states. The program characteristics are modeled as the transition probabilities, representing the system moving between active and suspended thread states. The MT model extrapolates these probabilities for the actual application size (AS) from the smaller AS performance. This aspect of the framework, along with the use of mathematical expressions for the actual AS performance information, results in a tremendous reduction in the CPI prediction time. The framework is validated using an electromagnetics application. The average performance prediction error for steady state CPI results with 12 different MSs is less than 1%. The total run time of the model is of the order of minutes, whereas the actual application execution time is in terms of days.
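
    The steady-state idea behind a Markov chain model of thread states can be illustrated with a two-state chain (active and suspended). The sketch below is a simplification built on assumptions that do not come from the thesis: the transition probabilities are made up, and the effective CPI is modelled simply as the CPI observed while active divided by the long-run fraction of cycles spent active.

```python
def steady_state(P, iters=1000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by repeated multiplication (power iteration)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def effective_cpi(P, active_state, cpi_when_active):
    """Assumed model: cycles in suspended states retire nothing, so the
    effective CPI is the active CPI scaled by 1 / (fraction of time active)."""
    return cpi_when_active / steady_state(P)[active_state]
```

    For example, with `P = [[0.9, 0.1], [0.3, 0.7]]` (state 0 active, state 1 suspended), the chain spends 0.3 / (0.1 + 0.3) = 75% of cycles active in the long run, so an active CPI of 1.2 corresponds to an effective CPI of 1.6 under this model.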

  6. Linear solvers on multiprocessor machines

    SciTech Connect

    Kalogerakis, M.A.

    1986-01-01

    Two new methods are introduced for the parallel solution of banded linear systems on multiprocessor machines. Moreover, some new techniques are obtained as variations of the two methods that are applicable to special instances of the problem. Comparisons with the best known methods are performed, from which it is concluded that the two methods are superior, while their variations for special instances are, in general, competitive and in some cases best. In the process, some new results on the parallel prefix problem are obtained and a new design for this problem is presented that is suitable for VLSI implementation. Furthermore, a general model is introduced for the analysis and classification of methods that are based on row transformations of matrices. It is seen that most known methods are included in this model. It is demonstrated that this model may be used as a basis for the analysis as well as the generation of important aspects of those methods, such as their arithmetic complexity and interprocessor communication requirements.

  7. Iterative algorithms for tridiagonal matrices on a WSI-multiprocessor

    NASA Astrophysics Data System (ADS)

    Gajski, D. D.; Sameh, A. H.; Wisniewski, J. A.

    With the rapid advances in semiconductor technology, the construction of Wafer Scale Integration (WSI)-multiprocessors consisting of a large number of processors is now feasible. The implementation of some basic linear algebra algorithms on such multiprocessors is illustrated.

  8. Multiprocessor smalltalk: Implementation, performance, and analysis

    SciTech Connect

    Pallas, J.I.

    1990-01-01

    Multiprocessor Smalltalk demonstrates the value of object-oriented programming on a multiprocessor. Its implementation and analysis shed light on three areas: concurrent programming in an object oriented language without special extensions, implementation techniques for adapting to multiprocessors, and performance factors in the resulting system. Adding parallelism to Smalltalk code is easy, because programs already use control abstractions like iterators. Smalltalk's basic control and concurrency primitives (lambda expressions, processes and semaphores) can be used to build parallel control abstractions, including parallel iterators, parallel objects, atomic objects, and futures. Language extensions for concurrency are not required. This implementation demonstrates that it is possible to build an efficient parallel object-oriented programming system and illustrates techniques for doing so. Three modification tools (serialization, replication, and reorganization) adapted the Berkeley Smalltalk interpreter to the Firefly multiprocessor. Multiprocessor Smalltalk's performance shows that the combination of multiprocessing and object-oriented programming can be effective: speedups (relative to the original serial version) exceed 2.0 for five processors on all the benchmarks; the median efficiency is 48%. Analysis shows both where performance is lost and how to improve and generalize the experimental results. Changes in the interpreter to support concurrency add at most 12% overhead; better access to per-process variables could eliminate much of that. Changes in the user code to express concurrency add as much as 70% overhead; this overhead could be reduced to 54% if blocks (lambda expressions) were reentrant. Performance is also lost when the program cannot keep all five processors busy.
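
    The claim that futures and parallel iterators can be built from lambda expressions, processes, and semaphores alone can be illustrated in Python, with threads standing in for Smalltalk processes. This is an analogy written for this listing, not code from the dissertation; the `Future` class and `parallel_collect` function are hypothetical names.

```python
import threading


class Future:
    """A value computed in the background, built only from a thread
    (standing in for a Smalltalk process), a block, and a semaphore."""

    def __init__(self, block):
        self._done = threading.Semaphore(0)  # starts closed: no value yet
        self._value = None
        self._error = None
        threading.Thread(target=self._run, args=(block,)).start()

    def _run(self, block):
        try:
            self._value = block()
        except BaseException as exc:  # keep the failure for the consumer
            self._error = exc
        finally:
            self._done.release()      # signal that a result is available

    def value(self):
        """Block until the result is ready, then return it (or re-raise)."""
        self._done.acquire()
        self._done.release()          # re-open so later callers do not block
        if self._error is not None:
            raise self._error
        return self._value


def parallel_collect(items, fn):
    """Parallel iterator: start one future per element, then gather in order."""
    futures = [Future(lambda x=x: fn(x)) for x in items]
    return [f.value() for f in futures]
```

    For instance, `parallel_collect([1, 2, 3], lambda x: x * x)` evaluates the three blocks concurrently and returns the results in input order. The default argument `x=x` binds each element at creation time, so every block captures its own value.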

  9. Multiprocessor switch with selective pairing

    SciTech Connect

    Gara, Alan; Gschwind, Michael K; Salapura, Valentina

    2014-03-11

    System, method and computer program product for a multiprocessing system to offer selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessor or processor cores to provide one highly reliable thread (or thread group). Each pair of microprocessors or processor cores that provides one highly reliable thread for high reliability connects with system components such as a memory "nest" (or memory hierarchy), an optional system controller, an optional interrupt controller, optional I/O or peripheral devices, etc. The memory nest is attached to the selective pairing facility via a switch or a bus.

  10. A combined PLC and CPU approach to multiprocessor control

    SciTech Connect

    Harris, J.J.; Broesch, J.D.; Coon, R.M.

    1995-10-01

    A sophisticated multiprocessor control system has been developed for use in the E-Power Supply System Integrated Control (EPSSIC) on the DIII-D tokamak. EPSSIC provides control and interlocks for the ohmic heating coil power supply and its associated systems. Of particular interest is the architecture of this system: both a Programmable Logic Controller (PLC) and a Central Processor Unit (CPU) have been combined on a standard VME bus. The PLC and CPU input and output signals are routed through signal conditioning modules, which provide the necessary voltage and ground isolation. Additionally these modules adapt the signal levels to that of the VME I/O boards. One set of I/O signals is shared between the two processors. The resulting multiprocessor system provides a number of advantages: redundant operation for mission critical situations, flexible communications using conventional TCP/IP protocols, the simplicity of ladder logic programming for the majority of the control code, and an easily maintained and expandable non-proprietary system.

  11. Associative-memory representations emerge as shared spatial patterns of theta activity spanning the primate temporal cortex.

    PubMed

    Nakahara, Kiyoshi; Adachi, Ken; Kawasaki, Keisuke; Matsuo, Takeshi; Sawahata, Hirohito; Majima, Kei; Takeda, Masaki; Sugiyama, Sayaka; Nakata, Ryota; Iijima, Atsuhiko; Tanigawa, Hisashi; Suzuki, Takafumi; Kamitani, Yukiyasu; Hasegawa, Isao

    2016-01-01

    Highly localized neuronal spikes in primate temporal cortex can encode associative memory; however, whether memory formation involves area-wide reorganization of ensemble activity, which often accompanies rhythmicity, or just local microcircuit-level plasticity, remains elusive. Using high-density electrocorticography, we capture local-field potentials spanning the monkey temporal lobes, and show that the visual pair-association (PA) memory is encoded in spatial patterns of theta activity in areas TE, 36, and, partially, in the parahippocampal cortex, but not in the entorhinal cortex. The theta patterns elicited by learned paired associates are distinct between pairs, but similar within pairs. This pattern similarity, emerging through novel PA learning, allows a machine-learning decoder trained on theta patterns elicited by a particular visual item to correctly predict the identity of those elicited by its paired associate. Our results suggest that the formation and sharing of widespread cortical theta patterns via learning-induced reorganization are involved in the mechanisms of associative memory representation. PMID:27282247

  12. Associative-memory representations emerge as shared spatial patterns of theta activity spanning the primate temporal cortex

    PubMed Central

    Nakahara, Kiyoshi; Adachi, Ken; Kawasaki, Keisuke; Matsuo, Takeshi; Sawahata, Hirohito; Majima, Kei; Takeda, Masaki; Sugiyama, Sayaka; Nakata, Ryota; Iijima, Atsuhiko; Tanigawa, Hisashi; Suzuki, Takafumi; Kamitani, Yukiyasu; Hasegawa, Isao

    2016-01-01

    Highly localized neuronal spikes in primate temporal cortex can encode associative memory; however, whether memory formation involves area-wide reorganization of ensemble activity, which often accompanies rhythmicity, or just local microcircuit-level plasticity, remains elusive. Using high-density electrocorticography, we capture local-field potentials spanning the monkey temporal lobes, and show that the visual pair-association (PA) memory is encoded in spatial patterns of theta activity in areas TE, 36, and, partially, in the parahippocampal cortex, but not in the entorhinal cortex. The theta patterns elicited by learned paired associates are distinct between pairs, but similar within pairs. This pattern similarity, emerging through novel PA learning, allows a machine-learning decoder trained on theta patterns elicited by a particular visual item to correctly predict the identity of those elicited by its paired associate. Our results suggest that the formation and sharing of widespread cortical theta patterns via learning-induced reorganization are involved in the mechanisms of associative memory representation. PMID:27282247

  13. Numerical Linear Algebra On The CEDAR Multiprocessor

    NASA Astrophysics Data System (ADS)

    Meier, Ulrike; Sameh, Ahmed

    1988-01-01

    In this paper we describe in some detail the architectural features of the CEDAR multiprocessor. We also discuss strategies for implementation of dense matrix computations, and present performance results on one cluster for a variety of linear system solvers, eigenvalue problem solvers, as well as algorithms for solving linear least squares problems.

  14. Spaceborne VHSIC multiprocessor system for AI applications

    NASA Technical Reports Server (NTRS)

    Lum, Henry, Jr.; Shrobe, Howard E.; Aspinall, John G.

    1988-01-01

    A multiprocessor system, under design for space-station applications, makes use of the latest generation symbolic processor and packaging technology. The result will be a compact, space-qualified system two to three orders of magnitude more powerful than present-day symbolic processing systems.

  15. Fault-tolerant interconnection networks for multiprocessor systems

    SciTech Connect

    Nassar, H.M.

    1989-01-01

    Interconnection networks represent the backbone of multiprocessor systems. A failure in the network, therefore, could seriously degrade the system performance. For this reason, fault tolerance has been regarded as a major consideration in interconnection network design. This thesis presents two novel techniques to provide fault tolerance capabilities to three major networks: the Baseline network, the Benes network, and the Clos network. First, the Simple Fault Tolerance Technique (SFT) is presented. The SFT technique is in fact the result of merging two widely known interconnection mechanisms: a normal interconnection network and a shared bus. This technique is most suitable for networks with small switches, such as the Baseline network and the Benes network. For the Clos network, whose switches may be large for the SFT, another technique is developed to produce the Fault-Tolerant Clos (FTC) network. In the FTC, one switch is added to each stage. The two techniques are described and thoroughly analyzed.

  16. Multi-core and Many-core Shared-memory Parallel Raycasting Volume Rendering Optimization and Tuning

    SciTech Connect

    Howison, Mark

    2012-01-31

    Given the computing industry trend of increasing processing capacity by adding more cores to a chip, the focus of this work is tuning the performance of a staple visualization algorithm, raycasting volume rendering, for shared-memory parallelism on multi-core CPUs and many-core GPUs. Our approach is to vary tunable algorithmic settings, along with known algorithmic optimizations and two different memory layouts, and measure performance in terms of absolute runtime and L2 memory cache misses. Our results indicate there is a wide variation in runtime performance on all platforms, as much as 254% for the tunable parameters we test on multi-core CPUs and 265% on many-core GPUs, and the optimal configurations vary across platforms, often in a non-obvious way. For example, our results indicate the optimal configurations on the GPU occur at a crossover point between those that maintain good cache utilization and those that saturate computational throughput. This result is likely to be extremely difficult to predict with an empirical performance model for this particular algorithm because it has an unstructured memory access pattern that varies locally for individual rays and globally for the selected viewpoint. Our results also show that optimal parameters on modern architectures are markedly different from those in previous studies run on older architectures. And, given the dramatic performance variation across platforms for both optimal algorithm settings and performance results, there is a clear benefit for production visualization and analysis codes to adopt a strategy for performance optimization through auto-tuning. These benefits will likely become more pronounced in the future as the number of cores per chip and the cost of moving data through the memory hierarchy both increase.

  17. Shared and distinct contributions of rostrolateral prefrontal cortex to analogical reasoning and episodic memory retrieval.

    PubMed

    Westphal, Andrew J; Reggente, Nicco; Ito, Kaori L; Rissman, Jesse

    2016-03-01

    Rostrolateral prefrontal cortex (RLPFC) is widely appreciated to support higher cognitive functions, including analogical reasoning and episodic memory retrieval. However, these tasks have typically been studied in isolation, and thus it is unclear whether they involve common or distinct RLPFC mechanisms. Here, we introduce a novel functional magnetic resonance imaging (fMRI) task paradigm to compare brain activity during reasoning and memory tasks while holding bottom-up perceptual stimulation and response demands constant. Univariate analyses on fMRI data from twenty participants identified a large swath of left lateral prefrontal cortex, including RLPFC, that showed common engagement on reasoning trials with valid analogies and memory trials with accurately retrieved source details. Despite broadly overlapping recruitment, multi-voxel activity patterns within left RLPFC reliably differentiated these two trial types, highlighting the presence of at least partially distinct information processing modes. Functional connectivity analyses demonstrated that while left RLPFC showed consistent coupling with the fronto-parietal control network across tasks, its coupling with other cortical areas varied in a task-dependent manner. During the memory task, this region strengthened its connectivity with the default mode and memory retrieval networks, whereas during the reasoning task it coupled more strongly with a nearby left prefrontal region (BA 45) associated with semantic processing, as well as with a superior parietal region associated with visuospatial processing. Taken together, these data suggest a domain-general role for left RLPFC in monitoring and/or integrating task-relevant knowledge representations and showcase how its function cannot solely be attributed to episodic memory or analogical reasoning computations. Hum Brain Mapp 37:896-912, 2016. © 2015 Wiley Periodicals, Inc. PMID:26663572

  18. Shared Etiology of Phonological Memory and Vocabulary Deficits in School-Age Children

    ERIC Educational Resources Information Center

    Peterson, Robin L.; Pennington, Bruce F.; Samuelsson, Stefan; Byrne, Brian; Olson, Richard K.

    2013-01-01

    Purpose: The goal of this study was to investigate the etiologic basis for the association between deficits in phonological memory (PM) and vocabulary in school-age children. Method: Children with deficits in PM or vocabulary were identified within the International Longitudinal Twin Study (ILTS; Samuelsson et al., 2005). The ILTS includes 1,045…

  19. Fault detection, isolation and reconfiguration in FTMP Methods and experimental results. [fault tolerant multiprocessor

    NASA Technical Reports Server (NTRS)

    Lala, J. H.

    1983-01-01

    The Fault-Tolerant Multiprocessor (FTMP) is a highly reliable computer designed to meet a goal of 10 to the -10th failures per hour and built with the objective of flying an active-control transport aircraft. Fault detection, identification, and recovery software is described, and experimental results obtained by injecting faults in the pin level in the FTMP are presented. Over 21,000 faults were injected in the CPU, memory, bus interface circuits, and error detection, masking, and error reporting circuits of one LRU of the multiprocessor. Detection, isolation, and reconfiguration times were recorded for each fault, and the results were found to agree well with earlier assumptions made in reliability modeling.

  20. Satisfiability Test with Synchronous Simulated Annealing on the Fujitsu AP1000 Massively-Parallel Multiprocessor

    NASA Technical Reports Server (NTRS)

    Sohn, Andrew; Biswas, Rupak

    1996-01-01

    Solving the hard Satisfiability Problem is time consuming even for modest-sized problem instances. Solving the Random L-SAT Problem is especially difficult due to the ratio of clauses to variables. This report presents a parallel synchronous simulated annealing method for solving the Random L-SAT Problem on a large-scale distributed-memory multiprocessor. In particular, we use a parallel synchronous simulated annealing procedure, called Generalized Speculative Computation, which guarantees the same decision sequence as sequential simulated annealing. To demonstrate the performance of the parallel method, we have selected problem instances varying in size from 100-variables/425-clauses to 5000-variables/21,250-clauses. Experimental results on the AP1000 multiprocessor indicate that our approach can satisfy 99.9 percent of the clauses while giving almost a 70-fold speedup on 500 processors.
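
    As a rough illustration of the underlying idea, the following is a minimal sketch of plain sequential simulated annealing applied to a random satisfiability instance. It is not the Generalized Speculative Computation method from the report and contains no parallelism. The clause length `k`, the cooling schedule, and all function names are assumptions chosen for illustration only.

```python
import math
import random


def random_instance(num_vars, num_clauses, k, rng):
    """Build random clauses as lists of signed ints (+v is variable v, -v its negation)."""
    clauses = []
    for _ in range(num_clauses):
        variables = rng.sample(range(1, num_vars + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses


def num_satisfied(clauses, assignment):
    """Count clauses that contain at least one true literal."""
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def anneal(clauses, num_vars, rng, t_start=2.0, t_end=0.05,
           cooling=0.95, flips_per_temp=100):
    """Return the largest number of satisfied clauses seen during annealing."""
    assignment = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}
    score = num_satisfied(clauses, assignment)
    best = score
    t = t_start
    while t > t_end and best < len(clauses):
        for _ in range(flips_per_temp):
            v = rng.randint(1, num_vars)
            assignment[v] = not assignment[v]          # propose one flip
            new_score = num_satisfied(clauses, assignment)
            delta = new_score - score
            # Always accept improvements; accept worse moves with
            # probability exp(delta / t), which shrinks as t cools.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                score = new_score
                best = max(best, score)
            else:
                assignment[v] = not assignment[v]      # reject: undo the flip
        t *= cooling
    return best
```

    A call such as `anneal(random_instance(100, 425, 3, random.Random(0)), 100, random.Random(1))` would mirror the smallest problem size quoted in the abstract, under the assumption of three literals per clause.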

  1. Neural substrates of shared attention as social memory: A hyperscanning functional magnetic resonance imaging study.

    PubMed

    Koike, Takahiko; Tanabe, Hiroki C; Okazaki, Shuntaro; Nakagawa, Eri; Sasaki, Akihiro T; Shimada, Koji; Sugawara, Sho K; Takahashi, Haruka K; Yoshihara, Kazufumi; Bosch-Bayard, Jorge; Sadato, Norihiro

    2016-01-15

    During a dyadic social interaction, two individuals can share visual attention through gaze, directed to each other (mutual gaze) or to a third person or an object (joint attention). Shared attention is fundamental to dyadic face-to-face interaction, but how attention is shared, retained, and neurally represented in a pair-specific manner has not been well studied. Here, we conducted a two-day hyperscanning functional magnetic resonance imaging study in which pairs of participants performed a real-time mutual gaze task followed by a joint attention task on the first day, and mutual gaze tasks several days later. The joint attention task enhanced eye-blink synchronization, which is believed to be a behavioral index of shared attention. When the same participant pairs underwent mutual gaze without joint attention on the second day, enhanced eye-blink synchronization persisted, and this was positively correlated with inter-individual neural synchronization within the right inferior frontal gyrus. Neural synchronization was also positively correlated with enhanced eye-blink synchronization during the previous joint attention task session. Consistent with the Hebbian association hypothesis, the right inferior frontal gyrus had been activated both by initiating and responding to joint attention. These results indicate that shared attention is represented and retained by pair-specific neural synchronization that cannot be reduced to the individual level. PMID:26514295

  2. LDRD final report : managing shared memory data distribution in hybrid HPC applications.

    SciTech Connect

    Merritt, Alexander M.; Pedretti, Kevin Thomas Tauke

    2010-09-01

    MPI is the dominant programming model for distributed memory parallel computers, and is often used as the intra-node programming model on multi-core compute nodes. However, application developers are increasingly turning to hybrid models that use threading within a node and MPI between nodes. In contrast to MPI, most current threaded models do not require application developers to deal explicitly with data locality. With increasing core counts and deeper NUMA hierarchies seen in the upcoming LANL/SNL 'Cielo' capability supercomputer, data distribution poses an upper boundary on intra-node scalability within threaded applications. Data locality therefore has to be identified at runtime using static memory allocation policies such as first-touch or next-touch, or specified by the application user at launch time. We evaluate several existing techniques for managing data distribution using micro-benchmarks on an AMD 'Magny-Cours' system with 24 cores among 4 NUMA domains and argue for the adoption of a dynamic runtime system implemented at the kernel level, employing a novel page table replication scheme to gather per-NUMA domain memory access traces.

  3. Shared Memory Performance of Multi-Computer Terminals in Distributed Information Systems.

    ERIC Educational Resources Information Center

    Reddi, Arumalla V.

    1984-01-01

    Presents a system model for transmission of input data that is coming from terminals of users in a limited user resource-sharing environment. Performance of a mini/microcomputer receiving mixture of picture-phone terminal data is analyzed with constant service times, synchronous transmission, and single-server interruptions through first-order…

  4. Autobiographical Memory Sharing in Everyday Life: Characteristics of a Good Story

    ERIC Educational Resources Information Center

    Baron, Jacqueline M.; Bluck, Susan

    2009-01-01

    Storytelling is a ubiquitous human activity that occurs across the lifespan as part of everyday life. Studies from three disparate literatures suggest that older adults (as compared to younger adults) are (a) less likely to recall story details, (b) more likely to go off-target when sharing stories, and, in contrast, (c) more likely to receive…

  5. The fault-tolerant multiprocessor computer

    NASA Technical Reports Server (NTRS)

    Smith, T. B., III (Editor); Lala, J. H. (Editor); Goldberg, J. (Editor); Kautz, W. H. (Editor); Melliar-Smith, P. M. (Editor); Green, M. W. (Editor); Levitt, K. N. (Editor); Schwartz, R. L. (Editor); Weinstock, C. B. (Editor); Palumbo, D. L. (Editor)

    1986-01-01

    The development and evaluation of fault-tolerant computer architectures and software-implemented fault tolerance (SIFT) for use in advanced NASA vehicles and potentially in flight-control systems are described in a collection of previously published reports prepared for NASA. Topics addressed include the principles of fault-tolerant multiprocessor (FTMP) operation; processor and slave regional designs; FTMP executive, facilities, acceptance-test/diagnostic, applications, and support software; FTMP reliability and availability models; SIFT hardware design; and SIFT validation and verification.

  6. Pipeline multiprocessor architecture for high speed cell image analysis

    SciTech Connect

    Castleman, K.R.; Price, K.H.; Eskenazi, R.; Ovadya, M.M.; Navon, M.A.

    1983-10-01

    A pipeline multiple-microprocessor architecture for high-speed digital image processing is being developed. The goal is a compact, fast, and low-cost pap smear analyzer for cervical cancer detection. Each processor communicates with one or two upstream processors and from one to 13 downstream processors via shared memory. Each of the identical pipeline modules (PC boards) has a Motorola MC6809 microprocessor with a 2 megabyte memory management unit, two 64kbyte dual-port image memories (shared with upstream processors) and one 64kbyte dual-port program memory (shared with a host computer). Intermodule communication is achieved by ribbon cables connected to connectors at the top of the boards. This allows considerable flexibility in configuring the system. This architecture should facilitate efficient (fast, low-cost) implementations of complex single-purpose image processing systems.

  7. Developmental improvements in the resolution and capacity of visual working memory share a common source.

    PubMed

    Simmering, Vanessa R; Miller, Hilary E

    2016-08-01

    The nature of visual working memory (VWM) representations is currently a source of debate between characterizations as slot-like versus a flexibly-divided pool of resources. Recently, a dynamic neural field model has been proposed as an alternative account that focuses more on the processes by which VWM representations are formed, maintained, and used in service of behavior. This dynamic model has explained developmental increases in VWM capacity and resolution through strengthening excitatory and inhibitory connections. Simulations of developmental improvements in VWM resolution suggest that one important change is the accuracy of comparisons between items held in memory and new inputs. Thus, the ability to detect changes is a critical component of developmental improvements in VWM performance across tasks, leading to the prediction that capacity and resolution should correlate during childhood. Comparing 5- to 8-year-old children's performance across color discrimination and change detection tasks revealed the predicted correlation between estimates of VWM capacity and resolution, supporting the hypothesis that increasing connectivity underlies improvements in VWM during childhood. These results demonstrate the importance of formalizing the processes that support the use of VWM, rather than focusing solely on the nature of representations. We conclude by considering our results in the broader context of VWM development. PMID:27329264

  8. Memory.

    ERIC Educational Resources Information Center

    McKean, Kevin

    1983-01-01

    Discusses current research (including that involving amnesiacs and snails) into the nature of the memory process, differentiating between and providing examples of "fact" memory and "skill" memory. Suggests that three brain parts (thalamus, fornix, mammillary body) are involved in the memory process. (JN)

  9. Directions in parallel programming: HPF, shared virtual memory and object parallelism in pC++

    NASA Technical Reports Server (NTRS)

    Bodin, Francois; Priol, Thierry; Mehrotra, Piyush; Gannon, Dennis

    1994-01-01

    Fortran and C++ are the dominant programming languages used in scientific computation. Consequently, extensions to these languages are the most popular for programming massively parallel computers. We discuss two such approaches to parallel Fortran and one approach to C++. The High Performance Fortran Forum has designed HPF with the intent of supporting data parallelism on Fortran 90 applications. HPF works by asking the user to help the compiler distribute and align the data structures with the distributed memory modules in the system. Fortran-S takes a different approach in which the data distribution is managed by the operating system and the user provides annotations to indicate parallel control regions. In the case of C++, we look at pC++ which is based on a concurrent aggregate parallel model.

  10. The change probability effect: incidental learning, adaptability, and shared visual working memory resources.

    PubMed

    van Lamsweerde, Amanda E; Beck, Melissa R

    2011-12-01

    Statistical properties in the visual environment can be used to improve performance on visual working memory (VWM) tasks. The current study examined the ability to incidentally learn that a change is more likely to occur to a particular feature dimension (shape, color, or location) and use this information to improve change detection performance for that dimension (the change probability effect). Participants completed a change detection task in which one change type was more probable than others. Change probability effects were found for color and shape changes, but not location changes, and intentional strategies did not improve the effect. Furthermore, the change probability effect developed and adapted to new probability information quickly. Finally, in some conditions, an improvement in change detection performance for a probable change led to an impairment in change detection for improbable changes. PMID:21963330

  11. An Analysis of an Improved Bus-Based Multiprocessor Architecture

    NASA Technical Reports Server (NTRS)

    Ricks, Kenneth G.; Wells, B. Earl

    1998-01-01

    This paper analyses the effectiveness of a hybrid multiprocessing/multicomputing architecture that is based upon a single-board-computer multiprocessor (SBCM) architecture. Based upon empirical analysis using discrete event simulations and Monte Carlo techniques, this hybrid architecture, called the enhanced single-board-computer multiprocessor (ESBCM), is shown to have improved performance and scalability characteristics over current SBCM designs.

  12. Use of a multiprocessor for control of a robotic system

    NASA Technical Reports Server (NTRS)

    Klein, C. A.; Wahawisan, W.

    1982-01-01

    In the case of complex industrial operations, the use of a multiprocessor for the control of a robotic system has a number of potential advantages over a single-processor design. In addition to the increased speed of parallel computation, multiprocessors can provide greater reliability. However, the control of a system with a multiprocessor is much more complicated, and requires resolution of a number of design criteria. The degree of coupling, the processor interconnection configuration, and the method of fault detection and correction are factors in the design of a multiprocessor which need to be considered for each particular application. In connection with the present investigation, a five-unit multiprocessor was built to experiment with the multiprocessor control of a robotic system. While the primary application of the multiprocessor involves the control of the Ohio State University (OSU) Hexapod, an 18-degree-of-freedom, motor-driven walking machine, the multiprocessor was designed to be versatile enough for a number of other uses.

  13. Partitioning of regular computation on multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Lee, Fung Fung

    1988-01-01

    Problem partitioning of regular computation over two dimensional meshes on multiprocessor systems is examined. The regular computation model considered involves repetitive evaluation of values at each mesh point with local communication. The computational workload and the communication pattern are the same at each mesh point. The regular computation model arises in numerical solutions of partial differential equations and simulations of cellular automata. Given a communication pattern, a systematic way to generate a family of partitions is presented. The influence of various partitioning schemes on performance is compared on the basis of computation to communication ratio.
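
    The computation-to-communication ratio mentioned above can be illustrated with a small worked example. This is a minimal sketch, assuming an n-by-n mesh, a nearest-neighbour (5-point) communication pattern, and an interior partition that exchanges one layer of boundary points per step. The two partition shapes and the function names are illustrative assumptions, not the family of partitions generated in the report.

```python
import math


def strip_ratio(n, p):
    """Ratio for p horizontal strips of an n-by-n mesh (interior strip)."""
    compute = (n * n) / p        # mesh points updated per processor
    communicate = 2 * n          # one boundary row sent to each of two neighbours
    return compute / communicate


def block_ratio(n, p):
    """Ratio for p square blocks; assumes p is a perfect square (interior block)."""
    side = n / math.isqrt(p)     # edge length of one block
    compute = side * side        # mesh points updated per processor
    communicate = 4 * side       # one boundary edge sent to each of four neighbours
    return compute / communicate


if __name__ == "__main__":
    n, p = 1024, 16
    print("strips:", strip_ratio(n, p))   # n / (2p)
    print("blocks:", block_ratio(n, p))   # n / (4 * sqrt(p))
```

    Under these assumptions the ratio is n/(2p) for strips and n/(4√p) for square blocks, so blocks do better once p exceeds 4.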

  14. Allocation for the SANDAC multiprocessor system

    SciTech Connect

    Ravi, T.M.; Ercegovac, M.D.

    1986-01-01

    This report describes an algorithm for the static allocation of tasks in a general Dataflow Multiprocessor and the SANDAC IV system in particular. Initially a model of execution and the underlying assumptions about the architecture are outlined. The authors then discuss a Graph Reduction algorithm for preprocessing the computation graph. The Graph Reduction algorithm reduces a fine grain graph to an optimal grain graph. The heuristic allocation algorithm is presented and is based on giving precedence to critical paths and minimizing the communication time between tasks. The performance of the algorithm is analyzed and the effect of varying parameters is studied. Subsequently an alternative variation with better characteristics is proposed.
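
    The idea of giving precedence to critical paths while accounting for communication time can be sketched as a generic list-scheduling heuristic. The code below is not the report's algorithm and omits its Graph Reduction step. The task graph format, the single uniform communication cost `comm`, and all names are assumptions made for illustration.

```python
def bottom_levels(cost, succ):
    """Longest path length from each task to an exit task, including the task itself."""
    memo = {}

    def level(task):
        if task not in memo:
            memo[task] = cost[task] + max(
                (level(s) for s in succ.get(task, [])), default=0
            )
        return memo[task]

    return {task: level(task) for task in cost}


def allocate(cost, succ, comm, num_procs):
    """Statically place tasks; returns {task: (processor, finish_time)}."""
    pred = {task: [] for task in cost}
    for task, children in succ.items():
        for child in children:
            pred[child].append(task)

    levels = bottom_levels(cost, succ)
    proc_free = [0] * num_procs          # time at which each processor becomes idle
    placed = {}
    remaining = set(cost)

    while remaining:
        # Ready tasks have all predecessors placed; pick the one on the
        # longest remaining path (the critical path gets precedence).
        ready = [t for t in remaining if all(q in placed for q in pred[t])]
        task = max(ready, key=lambda t: (levels[t], t))

        best = None
        for p in range(num_procs):
            # Data from a predecessor on another processor arrives comm later.
            data_ready = max(
                (placed[q][1] + (0 if placed[q][0] == p else comm)
                 for q in pred[task]),
                default=0,
            )
            finish = max(proc_free[p], data_ready) + cost[task]
            if best is None or finish < best[1]:
                best = (p, finish)

        placed[task] = best
        proc_free[best[0]] = best[1]
        remaining.remove(task)

    return placed
```

    For a diamond-shaped graph with `cost = {"a": 1, "b": 3, "c": 2, "d": 1}`, `succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}`, `comm = 1` and two processors, this sketch keeps the critical path a, b, d on one processor and moves c to the other.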

  15. A class hierarchical, object-oriented approach to virtual memory management

    NASA Technical Reports Server (NTRS)

    Russo, Vincent F.; Campbell, Roy H.; Johnston, Gary M.

    1989-01-01

    The Choices family of operating systems exploits class hierarchies and object-oriented programming to facilitate the construction of customized operating systems for shared memory and networked multiprocessors. The software is being used in the Tapestry laboratory to study the performance of algorithms, mechanisms, and policies for parallel systems. Described here are the architectural design and class hierarchy of the Choices virtual memory management system. The software and hardware mechanisms and policies of a virtual memory system implement a memory hierarchy that exploits the trade-off between response times and storage capacities. In Choices, the notion of a memory hierarchy is captured by abstract classes. Concrete subclasses of those abstractions implement a virtual address space, segmentation, paging, physical memory management, secondary storage, and remote (that is, networked) storage. Captured in the notion of a memory hierarchy are classes that represent memory objects. These classes provide a storage mechanism that contains encapsulated data and have methods to read or write the memory object. Each of these classes provides specializations to represent the memory hierarchy.

  16. A multiprocessor airborne lidar data system

    NASA Astrophysics Data System (ADS)

    Wright, C. W.; Bailey, S. A.; Heath, G. E.; Piazza, C. R.

    A new multiprocessor data acquisition system was developed for the existing Airborne Oceanographic Lidar (AOL). This implementation simultaneously utilizes five single board 68010 microcomputers, the UNIX System V operating system, and the real time executive VRTX. The original data acquisition system was implemented on a Hewlett Packard HP 21-MX 16 bit minicomputer using a multi-tasking real time operating system and a mixture of assembly and FORTRAN languages. The present collection of data sources produces data at widely varied rates and requires varied amounts of burdensome real time processing and formatting. It was decided to replace the aging HP 21-MX minicomputer with a multiprocessor system. A new and flexible recording format was devised and implemented to accommodate the constantly changing sensor configuration. A central feature of this data system is the minimization of non-remote sensing bus traffic. Therefore, it is highly desirable that each micro be capable of functioning as much as possible on-card or via private peripherals. The bus is used primarily for the transfer of remote sensing data to or from the buffer queue.

  17. A multiprocessor airborne lidar data system

    NASA Technical Reports Server (NTRS)

    Wright, C. W.; Bailey, S. A.; Heath, G. E.; Piazza, C. R.

    1988-01-01

    A new multiprocessor data acquisition system was developed for the existing Airborne Oceanographic Lidar (AOL). This implementation simultaneously utilizes five single board 68010 microcomputers, the UNIX System V operating system, and the real time executive VRTX. The original data acquisition system was implemented on a Hewlett Packard HP 21-MX 16 bit minicomputer using a multi-tasking real time operating system and a mixture of assembly and FORTRAN languages. The present collection of data sources produces data at widely varied rates and requires varied amounts of burdensome real time processing and formatting. It was decided to replace the aging HP 21-MX minicomputer with a multiprocessor system. A new and flexible recording format was devised and implemented to accommodate the constantly changing sensor configuration. A central feature of this data system is the minimization of non-remote sensing bus traffic. Therefore, it is highly desirable that each micro be capable of functioning as much as possible on-card or via private peripherals. The bus is used primarily for the transfer of remote sensing data to or from the buffer queue.

  18. Selection in spatial working memory is independent of perceptual selective attention, but they interact in a shared spatial priority map.

    PubMed

    Hedge, Craig; Oberauer, Klaus; Leonards, Ute

    2015-11-01

    We examined the relationship between the attentional selection of perceptual information and of information in working memory (WM) through four experiments, using a spatial WM-updating task. Participants remembered the locations of two objects in a matrix and worked through a sequence of updating operations, each mentally shifting one dot to a new location according to an arrow cue. Repeatedly updating the same object in two successive steps is typically faster than switching to the other object; this object switch cost reflects the shifting of attention in WM. In Experiment 1, the arrows were presented in random peripheral locations, drawing perceptual attention away from the selected object in WM. This manipulation did not eliminate the object switch cost, indicating that the mechanisms of perceptual selection do not underlie selection in WM. Experiments 2a and 2b corroborated the independence of selection observed in Experiment 1, but showed a benefit to reaction times when the placement of the arrow cue was aligned with the locations of relevant objects in WM. Experiment 2c showed that the same benefit also occurs when participants are not able to mark an updating location through eye fixations. Together, these data can be accounted for by a framework in which perceptual selection and selection in WM are separate mechanisms that interact through a shared spatial priority map. PMID:26341873

  19. Mapping of H.264 decoding on a multiprocessor architecture

    NASA Astrophysics Data System (ADS)

    van der Tol, Erik B.; Jaspers, Egbert G.; Gelderblom, Rob H.

    2003-05-01

    Due to the increasing significance of development costs in the competitive domain of high-volume consumer electronics, generic solutions are required to enable reuse of the design effort and to increase the potential market volume. As a result, Systems-on-Chip (SoCs) contain a growing number of fully programmable media processing devices as opposed to application-specific systems, which offered the most attractive solutions due to a high performance density. The following motivates this trend. First, SoCs are increasingly dominated by their communication infrastructure and embedded memory, thereby making the cost of the functional units less significant. Moreover, the continuously growing design costs require generic solutions that can be applied over a broad product range. Hence, powerful programmable SoCs are becoming increasingly attractive. However, to enable power-efficient designs that are also scalable over the advancing VLSI technology, parallelism should be fully exploited. Both task-level and instruction-level parallelism can be provided by means of e.g. a VLIW multiprocessor architecture. To provide the above-mentioned scalability, we propose to partition the data over the processors, instead of traditional functional partitioning. An advantage of this approach is the inherent locality of data, which is extremely important for communication-efficient software implementations. Consequently, a software implementation is discussed, enabling e.g. SD resolution H.264 decoding with a two-processor architecture, whereas High-Definition (HD) decoding can be achieved with an eight-processor system, executing the same software. Experimental results show that the data communication is reduced considerably, by up to 65%, directly improving the overall performance. Apart from considerable improvement in memory bandwidth, this novel concept of partitioning offers a natural approach for optimally balancing the load of all processors, thereby further improving the

  20. Insertion of coherence requests for debugging a multiprocessor

    DOEpatents

    Blumrich, Matthias A.; Salapura, Valentina

    2010-02-23

    A method and system are disclosed to insert coherence events in a multiprocessor computer system, and to present those coherence events to the processors of the multiprocessor computer system for analysis and debugging purposes. The coherence events are inserted in the computer system by adding one or more special insert registers. By writing into the insert registers, coherence events are inserted in the multiprocessor system as if they were generated by the normal coherence protocol. Once these coherence events are processed, the processing of coherence events can continue in the normal operation mode.

  1. A New coscheduling technique for a cluster of symmetric multiprocessors

    SciTech Connect

    Yoo, A B; Jette, M A

    2000-04-17

    Coscheduling is essential for obtaining good performance in a time-shared symmetric multiprocessor (SMP) cluster environment. However, the most common technique, gang scheduling, has limitations such as poor scalability and vulnerability to faults mainly due to explicit synchronization between its components. A decentralized approach called dynamic coscheduling (DCS) has been shown to be effective for network of workstations (NOW), but this technique is not suitable for the workloads on a very large SMP-cluster with thousands of processors. Furthermore, its implementation can be prohibitively expensive for such a large-scale machine. In this paper, we propose a novel coscheduling technique based on the DCS approach which can achieve coscheduling on very large SMP-clusters in a scalable, efficient, and cost-effective way. In the proposed technique, each local scheduler achieves coscheduling based upon message traffic between the components of parallel jobs. Message trapping is carried out at the user-level, eliminating the need for unsupported hardware or device-level programming. A sending process attaches its status to outgoing messages so local schedulers on remote nodes can make more intelligent scheduling decisions. Once scheduled, processes are guaranteed some minimum period of time to execute. This provides an opportunity to synchronize the parallel job's components across all nodes and achieve good program performance. The results from a performance study reveal that the proposed technique is a promising approach that can reduce response time significantly over uncoordinated scheduling.

  2. High throughput network for multiprocessor interconnections

    NASA Astrophysics Data System (ADS)

    Raatikainen, Pertti; Zidbeck, Juha

    1993-05-01

    Multiprocessor architectures are needed to support modern broadband applications, since traditional bus structures are not capable of providing high throughput. New bus structures are needed, especially in the area of network components and terminals. A study to find an efficient and cost effective interconnection topology for future high speed products is presented. The most common bus topologies are introduced, and their characteristics are estimated to decide which one of them offers the best performance and the lowest implementation cost. The ring topology is chosen to be studied in more detail. Four competing bus access schemes for the high throughput ring are introduced as well as simulation models for each of them. Using transfer delay and throughput results, as well as keeping the implementation point of view in mind, the best candidate is selected for further study and experimentation in the succeeding research project.

  3. Clocking and synchronization circuits in multiprocessor systems

    SciTech Connect

    Jeong, D.K.

    1989-01-01

    Microprocessors based on RISC (Reduced Instruction Set Computer) concepts have demonstrated an ability to provide more computing power at a given level of integration than conventional microprocessors. The next step is multiprocessors composed of RISC processing elements. Communication bandwidth among such microprocessors is critical in achieving efficient hardware utilization. This thesis focuses on the communication capability of VLSI circuits and presents new circuit techniques as a guide to build an interconnection network of VLSI microprocessors. Circuit techniques for PLL-based clock generation are described along with stability criteria. The main objective of the circuit is to realize a zero delay buffer. Experimental results show the feasibility of such circuits in VLSI. Synchronizer circuit configurations in both bipolar and MOS technology that best utilize each device, or overcome the technology limit using a bandwidth doubling technique are shown. Interface techniques including handshake mechanisms in such a system are also described.

  4. Fault diagnosis in sparse multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Blough, Douglas M.; Sullivan, Gregory F.; Masson, Gerald M.

    1988-01-01

    The problem of fault diagnosis in multiprocessor systems is considered under a uniformly probabilistic model in which processors are faulty with probability p. This work focuses on minimizing the number of tests that must be conducted in order to correctly diagnose the state of every processor in the system with high probability. A diagnosis algorithm that can correctly diagnose the state of every processor with probability approaching one in a class of systems performing slightly greater than a linear number of tests is presented. A nearly matching lower bound on the number of tests required to achieve correct diagnosis in arbitrary systems is also proven. The number of tests required under this probabilistic model is shown to be significantly less than under a bounded-size fault set model. Because the number of tests that must be conducted is a measure of the diagnosis overhead, these results represent a dramatic improvement in the performance of system-level diagnosis technique.

  5. Hardware for a real-time multiprocessor simulator

    NASA Technical Reports Server (NTRS)

    Blech, R. A.; Arpasi, D. J.

    1984-01-01

    The hardware for a real time multiprocessor simulator (RTMPS) developed at the NASA Lewis Research Center is described. The RTMPS is a multiple microprocessor system used to investigate the application of parallel processing concepts to real time simulation. It is designed to provide flexible data exchange paths between processors by using off the shelf microcomputer boards and minimal customized interfacing. A dedicated operator interface allows easy setup of the simulator and quick interpreting of simulation data. Simulations for the RTMPS are coded in a NASA designed real time multiprocessor language (RTMPL). This language is high level and geared to the multiprocessor environment. A real time multiprocessor operating system (RTMPOS) has also been developed that provides a user friendly operator interface. The RTMPS and supporting software are currently operational and are being evaluated at Lewis. The results of this evaluation will be used to specify the design of an optimized parallel processing system for real time simulation of dynamic systems.

  6. VME rollback hardware for time warp multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Robb, Michael J.; Buzzell, Calvin A.

    1992-01-01

    The purpose of the research effort is to develop and demonstrate innovative hardware to implement specific rollback and timing functions required for efficient queue management and precision timekeeping in multiprocessor discrete event simulations. The previously completed phase 1 effort demonstrated the technical feasibility of building hardware modules which eliminate the state saving overhead of the Time Warp paradigm used in distributed simulations on multiprocessor systems. The current phase 2 effort will build multiple pre-production rollback hardware modules integrated with a network of Sun workstations, and the integrated system will be tested by executing a Time Warp simulation. The rollback hardware will be designed to interface with the greatest number of multiprocessor systems possible. The authors believe that the rollback hardware will provide for significant speedup of large scale discrete event simulation problems and allow multiprocessors using Time Warp to dramatically increase performance.

  7. File-System Workload on a Scientific Multiprocessor

    NASA Technical Reports Server (NTRS)

    Kotz, David; Nieuwejaar, Nils

    1995-01-01

    Many scientific applications have intense computational and I/O requirements. Although multiprocessors have permitted astounding increases in computational performance, the formidable I/O needs of these applications cannot be met by current multiprocessors and their I/O subsystems. To prevent I/O subsystems from forever bottlenecking multiprocessors and limiting the range of feasible applications, new I/O subsystems must be designed. The successful design of computer systems (both hardware and software) depends on a thorough understanding of their intended use. A system designer optimizes the policies and mechanisms for the cases expected to be most common in the user's workload. In the case of multiprocessor file systems, however, designers have been forced to build file systems based only on speculation about how they would be used, extrapolating from file-system characterizations of general-purpose workloads on uniprocessor and distributed systems or scientific workloads on vector supercomputers (see sidebar on related work). To help these system designers, in June 1993 we began the Charisma Project, so named because the project sought to characterize I/O in scientific multiprocessor applications from a variety of production parallel computing platforms and sites. The Charisma project is unique in recording individual read and write requests in live, multiprogramming, parallel workloads (rather than from selected or nonparallel applications). In this article, we present the first results from the project: a characterization of the file-system workload on an iPSC/860 multiprocessor running production, parallel scientific applications at NASA's Ames Research Center.

  8. Evaluation of the Cedar memory system: Configuration of 16 by 16

    NASA Technical Reports Server (NTRS)

    Gallivan, K.; Jalby, W.; Wijshoff, H.

    1991-01-01

    Some basic results on the performance of the Cedar multiprocessor system are presented. Empirical results on the 16 processor 16 memory bank system configuration, which show the behavior of the Cedar system under different modes of operation are presented.

  9. FTMP (Fault Tolerant Multiprocessor) programmer's manual

    NASA Technical Reports Server (NTRS)

    Feather, F. E.; Liceaga, C. A.; Padilla, P. A.

    1986-01-01

    The Fault Tolerant Multiprocessor (FTMP) computer system was constructed using the Rockwell/Collins CAPS-6 processor. It is installed in the Avionics Integration Research Laboratory (AIRLAB) of NASA Langley Research Center. It is hosted by AIRLAB's System 10, a VAX 11/750, for the loading of programs and experimentation. The FTMP support software includes a cross compiler for a high level language called Automated Engineering Design (AED) System, an assembler for the CAPS-6 processor assembly language, and a linker. Access to this support software is through an automated remote access facility on the VAX which relieves the user of the burden of learning how to use the IBM 4381. This manual is a compilation of information about the FTMP support environment. It explains the FTMP software and support environment along with many of the finer points of running programs on FTMP. This will be helpful to the researcher trying to run an experiment on FTMP and even to the person probing FTMP with fault injections. Much of the information in this manual can be found in other sources; we are only attempting to bring together the basic points in a single source. If the reader should need points clarified, there is a list of support documentation in the back of this manual.

  10. Prefetching in file systems for MIMD multiprocessors

    NASA Technical Reports Server (NTRS)

    Kotz, David F.; Ellis, Carla Schlatter

    1990-01-01

    The question of whether prefetching blocks of the file into the block cache can effectively reduce overall execution time of a parallel computation, even under favorable assumptions, is considered. Experiments have been conducted with an interleaved file system testbed on the Butterfly Plus multiprocessor. Results of these experiments suggest that (1) the hit ratio, the accepted measure in traditional caching studies, may not be an adequate measure of performance when the workload consists of parallel computations and parallel file access patterns, (2) caching with prefetching can significantly improve the hit ratio and the average time to perform an I/O (input/output) operation, and (3) an improvement in overall execution time has been observed in most cases. In spite of these gains, prefetching sometimes results in increased execution times (a negative result, given the optimistic nature of the study). The authors explore why it is not trivial to translate savings on individual I/O requests into consistently better overall performance and identify the key problems that need to be addressed in order to improve the potential of prefetching techniques in the environment.

  11. Interrupt handling in a multiprocessor computing system

    SciTech Connect

    D'Amico, L.W.; Guyer, J.M.

    1989-01-03

    A multiprocessor computing system is described comprising: a system bus, including an address bus for carrying an address phase of an instruction and a data bus for carrying a data phase of an instruction; a plurality of processing units connected to the system bus, each processing unit including means for generating broadcast interrupt origin request instructions on the system bus; asynchronous input/output channel controllers connected to the system bus, each of the input/output channel controllers including means for generating a synchronizing signal in response to completion of an address phase of a broadcast instruction on the system bus, and corresponding to a different one of the processing units connected through each of the input/output channel controllers, the input/output channel controllers being arranged on the priority lines in order of priority, the priority lines being gated in an input/output channel controller so that priority is asserted over all lower priority input/output channel controllers on a priority line by an input/output channel controller if the input/output channel controller has an interrupt pending in the input/output channel controller for the processing unit corresponding to the priority line.

  12. Particle simulation in a multiprocessor environment

    NASA Technical Reports Server (NTRS)

    Mcdonald, Jeffrey D.

    1991-01-01

    A parallel implementation of a particle simulation method that is portable between a wide class of multiprocessor computers is presented. A fine grain spatial decomposition is utilized where several subdomains having a regular structure are computed at each processing node. This leads directly to an efficient and straightforward load balancing scheme if the number of subdomains at each processor is permitted to vary in an appropriate manner. Three dimensional simulations incorporating full thermochemical nonequilibrium are possible using the resulting code. Vectorizable algorithms are retained from earlier work allowing efficient use of deeply pipelined node processors where available. Performance results are presented from three different machine architectures demonstrating the portability of the code. On a 128-node Intel iPSC/860, performance is twice that of a single Cray-Y/MP CPU running a highly vectorized simulation code. Speedup is linear over the full range of number of processors on all target machines, indicating scalability of the method to higher degrees of parallelism.

  13. A fault-tolerant multiprocessor architecture for aircraft, volume 1. [autopilot configuration

    NASA Technical Reports Server (NTRS)

    Smith, T. B.; Hopkins, A. L.; Taylor, W.; Ausrotas, R. A.; Lala, J. H.; Hanley, L. D.; Martin, J. H.

    1978-01-01

    A fault-tolerant multiprocessor architecture is reported. This architecture, together with a comprehensive information system architecture, has important potential for future aircraft applications. A preliminary definition and assessment of a suitable multiprocessor architecture for such applications is developed.

  14. Memories.

    ERIC Educational Resources Information Center

    Brand, Judith, Ed.

    1998-01-01

    This theme issue of the journal "Exploring" covers the topic of "memories" and describes an exhibition at San Francisco's Exploratorium that ran from May 22, 1998 through January 1999 and that contained over 40 hands-on exhibits, demonstrations, artworks, images, sounds, smells, and tastes that demonstrated and depicted the biological,…

  15. Efficient static scheduling of loops on synchronous multiprocessors

    SciTech Connect

    Zaky, A.M.

    1989-01-01

    This dissertation investigates efficient compile-time scheduling techniques for exploiting parallelism on synchronous multiprocessors. Synchronous multiprocessors, e.g. Very Long Instruction Word (VLIW) machines, are very effective in utilizing unstructured fine-grained parallelism in programs. The effectiveness of such machines is crucially dependent on the static compile-time analysis and detection of potential parallelism. The first part of the dissertation focuses on scheduling sequential loops on multiprocessors with multiple identical processor units. The problem of determining the maximal initiation rate for the execution of a sequential loop with uniform dependence distances on a synchronous multiprocessor is addressed and cast as an eigenvalue problem in a path algebra. A low-order polynomial algorithm for the determination of the optimal loop initiation rate is developed, and a schedule that exploits fine-grained parallelism and achieves the optimal initiation rate is developed under an idealized unbounded processor model. Next, the concepts developed above are extended to deal with perfectly-nested loops with uniform dependences. A strategy is developed to identify both the loop level and fine-grained expression level parallelism in nested loops, and to efficiently schedule such loops on synchronous multiprocessors. Loop scheduling techniques such as Do-Across, Wavefront scheduling, and fine-grained scheduling techniques such as loop unfolding are shown to be derivable within the presented framework.
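
    The initiation-rate bound described above can be illustrated with a small sketch. It assumes the standard recurrence bound from the loop-scheduling literature: the smallest feasible initiation interval is the largest ratio of total latency to total dependence distance over any cycle of the dependence graph, and the maximal initiation rate is its reciprocal. Whether this matches the dissertation's path-algebra eigenvalue formulation exactly is an assumption, and the brute-force cycle enumeration below is not the low-order polynomial algorithm the abstract mentions. The function name and the example graph are hypothetical.

```python
from fractions import Fraction


def max_cycle_ratio(num_nodes, edges):
    """Largest (total latency / total distance) over all simple cycles.

    edges is a list of (src, dst, latency, distance) tuples, where distance
    is the number of loop iterations the dependence spans.
    Returns None if the dependence graph has no cycle.
    Brute force: only suitable for small illustrative graphs.
    """
    adj = {n: [] for n in range(num_nodes)}
    for src, dst, lat, dist in edges:
        adj[src].append((dst, lat, dist))
    best = None

    def dfs(start, node, lat_sum, dist_sum, visited):
        nonlocal best
        for nxt, lat, dist in adj[node]:
            if nxt == start:
                total_dist = dist_sum + dist
                if total_dist <= 0:
                    raise ValueError("cycle with zero distance: no valid schedule")
                ratio = Fraction(lat_sum + lat, total_dist)
                if best is None or ratio > best:
                    best = ratio
            elif nxt > start and nxt not in visited:
                # Only visit nodes larger than start so that each simple
                # cycle is enumerated once, from its smallest node.
                dfs(start, nxt, lat_sum + lat, dist_sum + dist, visited | {nxt})

    for start in range(num_nodes):
        dfs(start, start, 0, 0, {start})
    return best


# Hypothetical three-operation loop body (nodes 0, 1, 2).
EDGES = [
    (0, 1, 1, 0),  # op0 -> op1, same iteration
    (1, 2, 2, 0),  # op1 -> op2, same iteration
    (2, 0, 1, 1),  # op2 -> op0, next iteration
    (2, 1, 1, 1),  # op2 -> op1, next iteration
    (1, 1, 3, 2),  # op1 depends on itself two iterations back
]

if __name__ == "__main__":
    interval = max_cycle_ratio(3, EDGES)
    print("minimum initiation interval:", interval)
    print("maximal initiation rate:", 1 / interval)
```

    In this example the cycle through all three operations has total latency 4 and distance 1, so under the stated assumption a new iteration can start at most once every 4 time units, however many processors are available.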

  16. Clocking and synchronization circuits in multiprocessor systems

    SciTech Connect

    Jeong, Deog-Kyoon.

    1989-01-01

    Microprocessors based on RISC (Reduced Instruction Set Computer) concepts have demonstrated an ability to provide more computing power at a given level of integration than conventional microprocessors. The next step is multiprocessors composed of RISC processing elements. Communication bandwidth among such microprocessors is critical in achieving efficient hardware utilization. This thesis focuses on the communication capability of VLSI circuits and presents new circuit techniques as a guide to build an interconnection network of VLSI microprocessors. Two of the most prominent problems in a synchronous system, which most of the current computer systems are based on, have been clock skew and synchronization failure. A new concept called self-timed systems solves such problems but has not been accepted in microprocessor implementations yet because of its complex design procedure and increased overhead. With this in mind, this thesis concentrates on a system in which individual synchronous subsystems are connected asynchronously. Synchronous subsystems operate with a better control over clock skew using a phase locked loop (PLL) technique. Communication among subsystems is done asynchronously with a controlled synchronization failure rate. One advantage is that conventional VLSI design methodologies which are more efficient can still be applied. Circuit techniques for PLL-based clock generation are described along with stability criteria. The main objective of the circuit is to realize a zero delay buffer. Experimental results show the feasibility of such circuits in VLSI. Synchronizer circuit configurations in both bipolar and MOS technology that best utilize each device, or overcome the technology limit using a bandwidth doubling technique are shown. Interface techniques including handshake mechanisms in such a system are also described.

  17. Real-Time Multiprocessor Programming Language (RTMPL) user's manual

    NASA Technical Reports Server (NTRS)

    Arpasi, D. J.

    1985-01-01

    A real-time multiprocessor programming language (RTMPL) has been developed to provide for high-order programming of real-time simulations on systems of distributed computers. RTMPL is a structured, engineering-oriented language. The RTMPL utility supports a variety of multiprocessor configurations and types by generating assembly language programs according to user-specified targeting information. Many programming functions are assumed by the utility (e.g., data transfer and scaling) to reduce the programming chore. This manual describes RTMPL from a user's viewpoint. Source generation, applications, utility operation, and utility output are detailed. An example simulation is generated to illustrate many RTMPL features.

  18. Job-mix modeling and system analysis of an aerospace multiprocessor.

    NASA Technical Reports Server (NTRS)

    Mallach, E. G.

    1972-01-01

    An aerospace guidance computer organization, consisting of multiple processors and memory units attached to a central time-multiplexed data bus, is described. A job mix for this type of computer is obtained by analysis of Apollo mission programs. Multiprocessor performance is then analyzed using: 1) queuing theory, under certain 'limiting case' assumptions; 2) Markov process methods; and 3) system simulation. Results of the analyses indicate: 1) Markov process analysis is a useful and efficient predictor of simulation results; 2) efficient job execution is not seriously impaired even when the system is so overloaded that new jobs are inordinately delayed in starting; 3) job scheduling is significant in determining system performance; and 4) a system having many slow processors may or may not perform better than a system of equal power having few fast processors, but will not perform significantly worse.

  19. Development and evaluation of a Fault-Tolerant Multiprocessor (FTMP) computer. Volume 2: FTMP software

    NASA Technical Reports Server (NTRS)

    Lala, J. H.; Smith, T. B., III

    1983-01-01

    The software developed for the Fault-Tolerant Multiprocessor (FTMP) is described. The FTMP executive is a timer-interrupt driven dispatcher that schedules iterative tasks which run at 3.125, 12.5, and 25 Hz. Major tasks which run under the executive include system configuration control, flight control, and display. The flight control task includes autopilot and autoland functions for a jet transport aircraft. System Displays include status displays of all hardware elements (processors, memories, I/O ports, buses), failure log displays showing transient and hard faults, and an autopilot display. All software is in a higher order language (AED, an ALGOL derivative). The executive is a fully distributed general purpose executive which automatically balances the load among available processor triads. Provisions for graceful performance degradation under processing overload are an integral part of the scheduling algorithms.
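
    The three task rates quoted above are related by whole-number frame divisors (25 / 12.5 = 2 and 25 / 3.125 = 8). The sketch below shows how a timer-driven dispatcher could select rate groups per frame. It assumes the timer interrupt fires at the fastest listed rate, 25 Hz, which the abstract does not state; the function names are hypothetical and this is not the FTMP executive, which was written in AED.

```python
from fractions import Fraction

# Rates named in the abstract: 25, 12.5 and 3.125 Hz.
RATES_HZ = [Fraction(25), Fraction(25, 2), Fraction(25, 8)]
# Assumption: the timer frame runs at the fastest rate.
BASE_RATE_HZ = Fraction(25)


def frame_divisors(base, rates):
    """Map each rate to the number of base frames between its runs."""
    divisors = {}
    for rate in rates:
        ratio = base / rate
        if ratio.denominator != 1:
            raise ValueError(f"{rate} Hz does not divide {base} Hz evenly")
        divisors[rate] = ratio.numerator
    return divisors


def groups_due(frame, divisors):
    """Rates (in Hz) whose tasks should be dispatched on this frame."""
    return [float(rate) for rate, div in divisors.items() if frame % div == 0]


DIVISORS = frame_divisors(BASE_RATE_HZ, RATES_HZ)

if __name__ == "__main__":
    for frame in range(8):
        print(frame, groups_due(frame, DIVISORS))
```

    With these divisors the slowest group runs once every eight frames, so the schedule repeats every 8 frames, or 0.32 s at the assumed 25 Hz frame rate.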

  20. Memory System Technologies for Future High-End Computing Systems

    SciTech Connect

    McKee, S A; de Supinski, B R; Mueller, F; Tyson, G S

    2003-05-16

    Our ability to solve Grand Challenge Problems in computing hinges on the development of reliable and efficient High-End Computing systems. Unfortunately, the increasing gap between memory and processor speeds remains one of the major bottlenecks in modern architectures. Uniprocessor nodes still suffer, but symmetric multiprocessor nodes--where access to physical memory is shared among all processors--are among the hardest hit. In the latter case, the memory system must juggle multiple working sets and maintain memory coherence, on top of simply responding to access requests. To illustrate the severity of the current situation, consider two important examples: even the high-performance parallel supercomputers in use at Department of Energy National labs observe single-processor utilization rates as low as 5%, and transaction processing commercial workloads see utilizations of at most about 33%. A wealth of research demonstrates that traditional memory systems are incapable of bridging the processor/memory performance gap, and the problem continues to grow. The success of future High-End Computing platforms therefore depends on our developing hardware and software technologies to dramatically relieve the memory bottleneck. In order to take better advantage of the tremendous computing power of modern microprocessors and future High-End systems, we consider it crucial to develop the hardware for intelligent, adaptable memory systems; the middleware and OS modifications to manage them; and the compiler technology and performance tools to exploit them. Taken together, these will provide the foundations for meeting the requirements of future generations of performance-critical, parallel systems based on either uniprocessor or SMP nodes (including PIM organizations). 
    We feel that such solutions should not be vendor-specific, but should be sufficiently general and adaptable such that the technologies could be leveraged by any commercial vendor of High-End Computing systems.

  1. Fault tree models for fault tolerant hypercube multiprocessors

    NASA Technical Reports Server (NTRS)

    Boyd, Mark A.; Tuazon, Jezus O.

    1991-01-01

    Three candidate fault tolerant hypercube architectures are modeled, their reliability analyses are compared, and the resulting implications of these methods of incorporating fault tolerance into hypercube multiprocessors are discussed. In the course of performing the reliability analyses, the use of HARP and fault trees in modeling sequence dependent system behaviors is demonstrated.

  2. Shared Attention.

    PubMed

    Shteynberg, Garriy

    2015-09-01

    Shared attention is extremely common. In stadiums, public squares, and private living rooms, people attend to the world with others. Humans do so across all sensory modalities-sharing the sights, sounds, tastes, smells, and textures of everyday life with one another. The potential for attending with others has grown considerably with the emergence of mass media technologies, which allow for the sharing of attention in the absence of physical co-presence. In the last several years, studies have begun to outline the conditions under which attending together is consequential for human memory, motivation, judgment, emotion, and behavior. Here, I advance a psychological theory of shared attention, defining its properties as a mental state and outlining its cognitive, affective, and behavioral consequences. I review empirical findings that are uniquely predicted by shared-attention theory and discuss the possibility of integrating shared-attention, social-facilitation, and social-loafing perspectives. Finally, I reflect on what shared-attention theory implies for living in the digital world. PMID:26385997

  3. Characterizing parallel file-access patterns on a large-scale multiprocessor

    NASA Technical Reports Server (NTRS)

    Purakayastha, Apratim; Ellis, Carla Schlatter; Kotz, David; Nieuwejaar, Nils; Best, Michael

    1994-01-01

    Rapid increases in the computational speeds of multiprocessors have not been matched by corresponding performance enhancements in the I/O subsystem. To satisfy the large and growing I/O requirements of some parallel scientific applications, we need parallel file systems that can provide high-bandwidth and high-volume data transfer between the I/O subsystem and thousands of processors. Design of such high-performance parallel file systems depends on a thorough grasp of the expected workload. So far there have been no comprehensive usage studies of multiprocessor file systems. Our CHARISMA project intends to fill this void. The first results from our study involve an iPSC/860 at NASA Ames. This paper presents results from a different platform, the CM-5 at the National Center for Supercomputing Applications. The CHARISMA studies are unique because we collect information about every individual read and write request and about the entire mix of applications running on the machines. The results of our trace analysis lead to recommendations for parallel file system design. First, the file system should support efficient concurrent access to many files, and I/O requests from many jobs under varying load conditions. Second, it must efficiently manage large files kept open for long periods. Third, it should expect to see small requests, predominantly sequential access patterns, application-wide synchronous access, no concurrent file-sharing between jobs, appreciable byte and block sharing between processes within jobs, and strong interprocess locality. Finally, the trace data suggest that node-level write caches and collective I/O request interfaces may be useful in certain environments.

  4. Fast decision tree-based method to index large DNA-protein sequence databases using hybrid distributed-shared memory programming model.

    PubMed

    Jaber, Khalid Mohammad; Abdullah, Rosni; Rashid, Nur'Aini Abdul

    2014-01-01

    In recent times, the size of biological databases has increased significantly, with continuous growth in the number of users and the rate of queries, such that some databases have reached terabyte size. There is therefore an increasing need to access databases at the fastest rates possible. In this paper, the decision tree indexing model (PDTIM) was parallelised using a hybrid of distributed and shared memory on a resident database, with horizontal and vertical growth through Message Passing Interface (MPI) and POSIX Thread (PThread), to accelerate the index building time. The PDTIM was implemented using 1, 2, 4 and 5 processors on 1, 2, 3 and 4 threads respectively. The results show that the hybrid technique improved the speedup compared to a sequential version. It could be concluded from the results that the proposed PDTIM is appropriate for large data sets in terms of index building time. PMID:24794073

  5. Combined shared and distributed memory ab-initio computations of molecular-hydrogen systems in the correlated state: Process pool solution and two-level parallelism

    NASA Astrophysics Data System (ADS)

    Biborski, Andrzej; Kądzielawa, Andrzej P.; Spałek, Józef

    2015-12-01

    An efficient computational scheme devised for investigations of ground state properties of electronically correlated systems is presented. As an example, an (H2)n chain is considered with the long-range electron-electron interactions taken into account. The implemented procedure covers: (i) single-particle Wannier wave-function basis construction in the correlated state, (ii) microscopic parameters calculation, and (iii) ground state energy optimization. The optimization loop is based on a highly effective process-pool solution, a specific root-workers approach. Hierarchical, two-level parallelism was applied: both shared (by use of Open Multi-Processing) and distributed (by use of Message Passing Interface) memory models were utilized. We discuss in detail the feature that such an approach results in a substantial increase of the calculation speed, reaching a factor of 300 for the fully parallelized solution. The scheme elaborated in detail reflects the situation in which the most demanding task is the single-particle basis optimization.

  6. A prototype functional language implementation for hierarchical- memory architectures

    SciTech Connect

    Wolski, R.; Feo, J.; Cann, D.

    1992-01-14

    Programming languages are the most important tool at a programmer's disposal. All other tools correct, visualize, or evaluate the product crafted by this tool. The advent of multiprocessor computer systems has greatly complicated the programmer's task and increased his need for high-level languages capable of automatically taming these architectures. In this paper, we describe a prototype implementation of Sisal for multiprocessor, hierarchical-memory systems. The implementation includes explicit compiler and runtime control that effectively exploits the different levels of memory and manages interprocess communications (IPC). We give preliminary performance results for this system on the BBN TC2000.

  7. FTMP - A highly reliable Fault-Tolerant Multiprocessor for aircraft

    NASA Technical Reports Server (NTRS)

    Hopkins, A. L., Jr.; Smith, T. B., III; Lala, J. H.

    1978-01-01

    The FTMP (Fault-Tolerant Multiprocessor) is a complex multiprocessor computer that employs a form of redundancy related to systems considered by Mathur (1971), in which each major module can substitute for any other module of the same type. Despite the conceptual simplicity of the redundancy form, the implementation has many intricacies owing partly to the low target failure rate, and partly to the difficulty of eliminating single-fault vulnerability. An extensive analysis of the computer through the use of such modeling techniques as Markov processes and combinatorial mathematics shows that for random hard faults the computer can meet its requirements. It is also shown that the maintenance scheduled at intervals of 200 hr or more can be adequate most of the time.

  8. Analysis of a Multiprocessor Guidance Computer. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Maltach, E. G.

    1969-01-01

    The design of the next generation of spaceborne digital computers is described. It analyzes a possible multiprocessor computer configuration. For the analysis, a set of representative space computing tasks was abstracted from the Lunar Module Guidance Computer programs as executed during the lunar landing, from the Apollo program. This computer performs at this time about 24 concurrent functions, with iteration rates from 10 times per second to once every two seconds. These jobs were tabulated in a machine-independent form, and statistics of the overall job set were obtained. It was concluded, based on a comparison of simulation and Markov results, that the Markov process analysis is accurate in predicting overall trends and in configuration comparisons, but does not provide useful detailed information in specific situations. Using both types of analysis, it was determined that the job scheduling function is a critical one for efficiency of the multiprocessor. It is recommended that research into the area of automatic job scheduling be performed.

  9. An Or Processing Multiprocessor System For Artificial Intelligence

    NASA Astrophysics Data System (ADS)

    Fu, Hsin-Chia; Chiang, Cheng-Chin

    1989-03-01

    In this paper, an OR-Parallel Execution model based multiprocessor system is proposed. Our OR-parallel execution model addresses the following features: (1) Run-time Intelligent Backtracking, (2) Distributed process control and execution, (3) Minimization of data communication between processors, and (4) Minimization of parallel processing management overhead. Special hardware modules such as an Intelligent Backtracking Controller and a Forward Execution Controller are designed to further enhance these features at run time. A bus connected multiprocessor system is designed to experiment with the proposed OR-parallel execution model. Recent simulation results indicate that the OR-parallel execution model can be successfully used to conduct the parallel processing of most non-deterministic Prolog applications such as database systems, rule-based expert systems, natural language processing and theorem proving.

  10. Queueing analysis of a canonical model of real-time multiprocessors

    NASA Technical Reports Server (NTRS)

    Krishna, C. M.; Shin, K. G.

    1983-01-01

    A logical classification of multiprocessor structures from the point of view of control applications is presented. A computation of the response time distribution for a canonical model of a real time multiprocessor is presented. The multiprocessor is approximated by a blocking model. Two separate models are derived: one created from the system's point of view, and the other from the point of view of an incoming task.

  11. VME multiprocessor system for data acquisition at OSIRIS

    SciTech Connect

    Ziem, P.; Kiehne, T.; Beschorner, C.; Drescher, B.; Zahn, J.

    1987-08-01

    A VME multiprocessor system for data acquisition and data display utilizing several MC68XXX based CPUs and VMEbuses is described. The design of the VME system was stimulated by the data handling requirements of experiments using the anti-Compton spectrometer OSIRIS, i.e. data storage on optical disks and on-line accumulation of large 2-dimensional histograms (4096 x 4096 channels). Due to the general approach the VME system is easily applicable for other nuclear physics experiments.

  12. Fault-free performance validation of fault-tolerant multiprocessors

    NASA Technical Reports Server (NTRS)

    Czeck, Edward W.; Feather, Frank E.; Grizzaffi, Ann Marie; Segall, Zary Z.; Siewiorek, Daniel P.

    1987-01-01

    A validation methodology for testing the performance of fault-tolerant computer systems was developed and applied to the Fault-Tolerant Multiprocessor (FTMP) at NASA-Langley's AIRLAB facility. This methodology was claimed to be general enough to apply to any ultrareliable computer system. The goal of this research was to extend the validation methodology and to demonstrate the robustness of the validation methodology by its more extensive application to NASA's Fault-Tolerant Multiprocessor System (FTMP) and to the Software Implemented Fault-Tolerance (SIFT) Computer System. Furthermore, the performance of these two multiprocessors was compared by conducting similar experiments. An analysis of the results shows high level language instruction execution times for both SIFT and FTMP were consistent and predictable, with SIFT having greater throughput. At the operating system level, FTMP consumes 60% of the throughput for its real-time dispatcher and 5% on fault-handling tasks. In contrast, SIFT consumes 16% of its throughput for the dispatcher, but consumes 66% in fault-handling software overhead.

  13. Modelling parallel programs and multiprocessor architectures with AXE

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Fineman, Charles E.

    1991-01-01

    AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user-interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior is described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.

  14. Modeling and measurement of fault-tolerant multiprocessors

    NASA Technical Reports Server (NTRS)

    Shin, K. G.; Woodbury, M. H.; Lee, Y. H.

    1985-01-01

    The workload effects on computer performance are addressed first for a highly reliable unibus multiprocessor used in real-time control. As an approach to studying these effects, a modified Stochastic Petri Net (SPN) is used to describe the synchronous operation of the multiprocessor system. From this model the vital components affecting performance can be determined. However, because of the complexity in solving the modified SPN, a simpler model, i.e., a closed priority queuing network, is constructed that represents the same critical aspects. The use of this model for a specific application requires the partitioning of the workload into job classes. It is shown that the steady state solution of the queuing model directly produces useful results. The use of this model in evaluating an existing system, the Fault Tolerant Multiprocessor (FTMP) at the NASA AIRLAB, is outlined with some experimental results. Also addressed is the technique of measuring fault latency, an important microscopic system parameter. Most related works have assumed no or a negligible fault latency and then performed approximate analyses. To eliminate this deficiency, a new methodology for indirectly measuring fault latency is presented.

  15. Input/output system for multiprocessors

    SciTech Connect

    Bernick, D.L.; Chan, K.K.; Chan, W.M.; Dan, Y.F.; Hoang, D.M.; Hussain, Z.; Iswandhi, G.I.; Korpi, J.E.; Sanner, M.W.; Zwangerman, J.A.

    1989-04-11

    A device controller is described, comprising: a first port-input/output controller coupled to a first input/output channel bus; and a second port-input/output controller coupled to a second input/output channel bus; each of the first and second port-input/output controllers having: a first ownership latch means for granting shared ownership of the device controller to a first host processor to provide a first data path on a first I/O channel through the first port I/O controller between the first host processor and any peripheral, and at least a second ownership latch means operative independently of the first ownership latch means for granting shared ownership of the device controller to a second host processor independently of the first port input/output controller to provide a second data path on a second I/O channel through the second port I/O controller between the second host processor and any peripheral devices coupled to the device controller.

  16. Development and evaluation of a Fault-Tolerant Multiprocessor (FTMP) computer. Volume 3: FTMP test and evaluation

    NASA Technical Reports Server (NTRS)

    Lala, J. H.; Smith, T. B., III

    1983-01-01

    The experimental test and evaluation of the Fault-Tolerant Multiprocessor (FTMP) is described. Major objectives of this exercise include expanding validation envelope, building confidence in the system, revealing any weaknesses in the architectural concepts and in their execution in hardware and software, and in general, stressing the hardware and software. To this end, pin-level faults were injected into one LRU of the FTMP and the FTMP response was measured in terms of fault detection, isolation, and recovery times. A total of 21,055 stuck-at-0, stuck-at-1 and invert-signal faults were injected in the CPU, memory, bus interface circuits, Bus Guardian Units, and voters and error latches. Of these, 17,418 were detected. At least 80 percent of undetected faults are estimated to be on unused pins. The multiprocessor identified all detected faults correctly and recovered successfully in each case. Total recovery time for all faults averaged a little over one second. This can be reduced to half a second by including appropriate self-tests.
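
    The fault counts quoted in the abstract above imply a detection coverage that can be worked out directly. The short sketch below does that arithmetic. The variable names are illustrative, and fixing the unused-pin share at exactly 80 percent is an assumption made here to obtain a single number; the abstract only gives 80 percent as a lower bound.

```python
# Counts quoted in the abstract above.
injected = 21_055
detected = 17_418

undetected = injected - detected
raw_coverage = detected / injected

# The abstract estimates that at least 80 percent of undetected faults are on
# unused pins. Using exactly 80 percent is an assumption for illustration;
# a higher share would push the adjusted coverage up further.
unused_pin_share = 0.80
undetected_on_used_pins = undetected * (1 - unused_pin_share)
adjusted_coverage = detected / (detected + undetected_on_used_pins)

print(f"undetected faults: {undetected}")
print(f"raw coverage:      {raw_coverage:.1%}")
print(f"adjusted coverage: {adjusted_coverage:.1%}")
```

    Under that assumption, excluding faults on unused pins raises the estimated coverage well above the raw ratio, which is consistent with the abstract's remark about where the undetected faults lie.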

  17. Reconfigurable high-speed optoelectronic interconnect technology for multiprocessor computers

    NASA Astrophysics Data System (ADS)

    Cheng, Julian

    1995-06-01

    We describe a compact optoelectronic switching technology for interconnecting multiple computer processors and shared memory modules together through dynamically reconfigurable optical paths to provide simultaneous, high speed communication amongst different nodes. Each switch provides an optical link to other nodes as well as electrical access to an individual processor, and it can perform optical and optoelectronic switching to convert digital data between various electrical and optical input/output formats. This multifunctional switching technology is based on the monolithic integration of arrays of vertical-cavity surface-emitting lasers with photodetectors and heterojunction bipolar transistors. The various digital switching and routing functions, as well as optically cascaded multistage operation, have been experimentally demonstrated.

  18. Cell Multiprocessor Communication Network: Built for Speed

    SciTech Connect

    Kistler, Mike; Perrone, Michael; Petrini, Fabrizio

    2006-06-01

    The existence of major obstacles to the traditional path to processor performance improvement has led chip manufacturers to consider multi-core designs. These architectural solutions promise a variety of power/performance and area/performance benefits. But additional care must be taken to ensure that these benefits are not lost due to inadequate design of the on-chip communication network. This paper presents the design challenges of the on-chip network of the Cell Broadband Engine (Cell BE) processor, and describes in detail its architectural design and the network, communication and synchronization protocols. In the experimental evaluation, performed on an early prototype, we analyze the communication characteristics of the Cell BE processor, using a series of microbenchmarks involving various DMA traffic patterns and synchronization protocols. We find that the on-chip communication subsystem is well matched to the computational capacity of the processor. A Synergistic Processing Element (SPE) can issue an internal direct memory access (DMA) operation in less than 4 nanoseconds, and a DMA of a single cache line can be executed in less than 100 nanoseconds. SPEs can achieve the optimal bandwidth of 25.6 GB/second in point to point communication with surprisingly small messages, only a few KB, using batches of non-blocking DMAs. The aggregate network behavior under heavy load is also remarkably efficient, reaching almost 200 GB/second with collective patterns and optimal contention resolution under hot-spot traffic. Additionally, we demonstrate the consistency of these hardware results with identical experiments carried out using the Mambo simulator software for Cell BE.

  19. Cache directory look-up re-use as conflict check mechanism for speculative memory requests

    DOEpatents

    Ohmacht, Martin

    2013-09-10

    In a cache memory, energy and other efficiencies can be realized by saving a result of a cache directory lookup for sequential accesses to a same memory address. Where the cache is a point of coherence for speculative execution in a multiprocessor system, with directory lookups serving as the point of conflict detection, such saving becomes particularly advantageous.

  20. Fault-free validation of a fault-tolerant multiprocessor: Baseline experiments and workload implementation

    NASA Technical Reports Server (NTRS)

    Feather, F.; Siewiorek, D.; Segall, Z.

    1986-01-01

    In the future, aircraft employing active control technology must use highly reliable multiprocessors in order to achieve flight safety. Such computers must be experimentally validated before they are deployed. This project outlines a methodology for doing fault-free validation of reliable multiprocessors. The methodology begins with baseline experiments, which test single phenomena. As experiments progress, tools for performance testing are developed. This report presents the results of interrupt baseline experiments performed on the Fault-Tolerant Multiprocessor (FTMP) at NASA-Langley's AIRLAB. Interrupt-causing exception conditions were tested, and several were found to have unimplemented interrupt handling software while one had an unimplemented interrupt vector. A synthetic workload model for realtime multiprocessors is then developed as an application level performance analysis tool. Details of the workload implementation and calibration are presented. Both the experimental methodology and the synthetic workload model are general enough to be applicable to reliable multi-processors besides FTMP.

  1. A reconfigurable optoelectronic interconnect technology for multi-processor networks

    SciTech Connect

    Lu, Y.C.; Cheng, J.; Zolper, J.C.; Klem, J.

    1995-05-01

    This paper describes a new optical interconnect architecture and the integrated optoelectronic circuit technology for implementing a parallel, reconfigurable, multiprocessor network. The technology consists of monolithic arrays of optoelectronic switches that integrate vertical-cavity surface-emitting lasers with three-terminal heterojunction phototransistors, which effectively combined the functions of an optical transceiver and an optical spatial routing switch. These switches have demonstrated optical switching at 200 Mb/s, and electrical-to-optical data conversion at > 500 Mb/s, with a small-signal electrical-to-optical modulation bandwidth of approximately 4 GHz.

  2. Multi-processor developments in the United States for future high energy physics experiments and accelerators

    SciTech Connect

    Gaines, I.

    1988-03-01

    The use of multi-processors for analysis and high-level triggering in High Energy Physics experiments, pioneered by the early emulator systems, has reached maturity, in particular with the multiple microprocessor systems in use at Fermilab. It is widely acknowledged that such systems will fulfill the major portion of the computing needs of future large experiments. Recent developments at Fermilab's Advanced Computer Program will make such systems even more powerful, cost-effective, and easier to use than they are at present. The next generation of microprocessors, already available, will provide CPU power of about one VAX 780 equivalent/$300, while supporting most VMS FORTRAN extensions and large (>8MB) amounts of memory. Low cost high density mass storage devices (based on video tape cartridge technology) will allow parallel I/O to remove potential I/O bottlenecks in systems of over 1000 VAX-equivalent processors. New interconnection schemes and system software will allow more flexible topologies and extremely high data bandwidth, especially for on-line systems. This talk will summarize the work at the Advanced Computer Program and the rest of the US in this field. 3 refs., 4 figs.

  3. The design of a high performance dataflow processor for multiprocessor systems

    SciTech Connect

    Luc, K.Q.

    1989-01-01

    The objective of this work is to design a high performance dynamic dataflow processor for multiprocessor systems. The performance of contemporary dataflow processors is limited due to the presence of a component, called a matching unit. The function of this unit is to match instruction tokens in order to detect the executability of instructions. Since activities within the matching unit are sequential in nature and require multiple memory accesses, the unit has been identified as a major performance bottleneck in a prototype processor. The author proposes a natural way to partition the set of tokens and presents a new implementation for the matching unit, called an Instance-Based Matching Unit. The new unit requires tokens to be partitioned into blocks and allows matching of these blocks of tokens to proceed concurrently. With the new matching unit, substantial throughput enhancement for the unit is reported. He then analyzes the throughputs at various stages of a conventional dataflow processor. The results thus obtained lead him to propose an optimum configuration for an effective sub-processor. The maximum throughput of this sub-processor is determined by the throughput of a queue. With the sub-processor as a building block, a high performance dataflow processor is presented which consists of multiple copies of the sub-processor. Characteristics of the processor are studied with the Livermore Fortran Kernels as inputs. The performance of this processor is high, and the performance increases with the number of sub-processors.

  4. Scalable Multiprocessor for High-Speed Computing in Space

    NASA Technical Reports Server (NTRS)

    Lux, James; Lang, Minh; Nishimoto, Kouji; Clark, Douglas; Stosic, Dorothy; Bachmann, Alex; Wilkinson, William; Steffke, Richard

    2004-01-01

    A report discusses the continuing development of a scalable multiprocessor computing system for hard real-time applications aboard a spacecraft. "Hard realtime applications" signifies applications, like real-time radar signal processing, in which the data to be processed are generated at "hundreds" of pulses per second, each pulse "requiring" millions of arithmetic operations. In these applications, the digital processors must be tightly integrated with analog instrumentation (e.g., radar equipment), and data input/output must be synchronized with analog instrumentation, controlled to within fractions of a microsecond. The scalable multiprocessor is a cluster of identical commercial-off-the-shelf generic DSP (digital-signal-processing) computers plus generic interface circuits, including analog-to-digital converters, all controlled by software. The processors are computers interconnected by high-speed serial links. Performance can be increased by adding hardware modules and correspondingly modifying the software. Work is distributed among the processors in a parallel or pipeline fashion by means of a flexible master/slave control and timing scheme. Each processor operates under its own local clock; synchronization is achieved by broadcasting master time signals to all the processors, which compute offsets between the master clock and their local clocks.
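
    The synchronization scheme in the abstract above (a broadcast master time signal, with each processor computing the offset between the master clock and its own) can be sketched as follows. This is a minimal illustration, not the reported system: the `Processor` class, its method names, and the sample clock values are hypothetical, and signal propagation delay is assumed to be zero.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    """Hypothetical node with a free-running local clock (seconds)."""
    name: str
    local_time: float
    offset: float = 0.0  # master time minus local time, once measured

    def on_master_time(self, master_time):
        # Record how far the local clock is from the broadcast master time.
        # Assumption: the broadcast arrives with negligible delay.
        self.offset = master_time - self.local_time

    def to_master(self, local_timestamp):
        # Express a local timestamp on the master timeline.
        return local_timestamp + self.offset


def broadcast(master_time, processors):
    # Deliver the same master time signal to every processor.
    for p in processors:
        p.on_master_time(master_time)


# Two nodes whose local clocks disagree by a few microseconds.
nodes = [Processor("dsp0", 100.000002), Processor("dsp1", 99.999997)]
broadcast(100.0, nodes)
for node in nodes:
    print(node.name, f"offset = {node.offset:+.6f} s")
```

    Each node keeps running on its own clock and only applies the stored offset when a timestamp must be compared across nodes, which matches the abstract's description of local clocks plus computed offsets.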

  5. Experience with a Genetic Algorithm Implemented on a Multiprocessor Computer

    NASA Technical Reports Server (NTRS)

    Plassman, Gerald E.; Sobieszczanski-Sobieski, Jaroslaw

    2000-01-01

    Numerical experiments were conducted to find out the extent to which a Genetic Algorithm (GA) may benefit from a multiprocessor implementation, considering, on one hand, that analyses of individual designs in a population are independent of each other so that they may be executed concurrently on separate processors, and, on the other hand, that there are some operations in a GA that cannot be so distributed. The algorithm experimented with was based on a Gaussian distribution rather than bit exchange in the GA reproductive mechanism, and the test case was a hub frame structure of up to 1080 design variables. The experimentation engaging up to 128 processors confirmed expectations of radical elapsed time reductions compared to a conventional single processor implementation. It also demonstrated that the time spent in the non-distributable parts of the algorithm and the attendant cross-processor communication may have a very detrimental effect on the efficient utilization of the multiprocessor machine and on the number of processors that can be used effectively in a concurrent manner. Three techniques were devised and tested to mitigate that effect, resulting in efficiency increasing to exceed 99 percent.
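
    The sensitivity of efficiency to the non-distributable part of the algorithm can be illustrated with Amdahl's law. The abstract does not name this model, so it is brought in here purely as an illustration; it also ignores communication cost, which the abstract identifies as a separate factor. The serial fractions below are made-up values, not measurements from the report.

```python
def amdahl_speedup(serial_fraction, processors):
    """Ideal speedup when a fixed fraction of the work cannot be distributed."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def efficiency(serial_fraction, processors):
    """Speedup divided by processor count (1.0 means perfect utilization)."""
    return amdahl_speedup(serial_fraction, processors) / processors


def max_serial_fraction(target_efficiency, processors):
    """Largest serial fraction that still reaches the target efficiency.

    Follows from E = 1 / (1 + s * (p - 1)), solved for s.
    """
    return (1.0 / target_efficiency - 1.0) / (processors - 1)


p = 128  # processor count mentioned in the abstract
for s in (0.01, 0.001, 0.0001):  # illustrative serial fractions
    print(f"serial fraction {s}: efficiency {efficiency(s, p):.3f}")

print("serial fraction allowed for 99% efficiency:",
      max_serial_fraction(0.99, p))
```

    Under this simplified model, even a serial share of one percent leaves most of a 128-processor machine idle, which suggests why the mitigation techniques mentioned in the abstract matter for reaching high efficiency.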

  6. A prototype functional language implementation for hierarchical- memory architectures. Revision 1

    SciTech Connect

    Wolski, R.; Feo, J.; Cann, D.

    1992-01-14

    Programming languages are the most important tool at a programmer's disposal. All other tools correct, visualize, or evaluate the product crafted by this tool. The advent of multiprocessor computer systems has greatly complicated the programmer's task and increased his need for high-level languages capable of automatically taming these architectures. In this paper, we describe a prototype implementation of Sisal for multiprocessor, hierarchical-memory systems. The implementation includes explicit compiler and runtime control that effectively exploits the different levels of memory and manages interprocess communications (IPC). We give preliminary performance results for this system on the BBN TC2000.

  7. On some parallel algorithms on a ring of processors

    NASA Astrophysics Data System (ADS)

    Sameh, A.

    1985-07-01

    In this paper we describe some linear algebra multiprocessor algorithms which are suitable for a ring of processors. These algorithms are organized in such a way as to be easily modified for general-purpose multiprocessors with shared global memories.

  8. A Tensor Product Formulation of Strassen's Matrix Multiplication Algorithm with Memory Reduction

    DOE PAGESBeta

    Kumar, B.; Huang, C. -H.; Sadayappan, P.; Johnson, R. W.

    1995-01-01

    In this article, we present a program generation strategy of Strassen's matrix multiplication algorithm using a programming methodology based on tensor product formulas. In this methodology, block recursive programs such as the fast Fourier Transforms and Strassen's matrix multiplication algorithm are expressed as algebraic formulas involving tensor products and other matrix operations. Such formulas can be systematically translated to high-performance parallel/vector codes for various architectures. In this article, we present a nonrecursive implementation of Strassen's algorithm for shared memory vector processors such as the Cray Y-MP. A previous implementation of Strassen's algorithm synthesized from tensor product formulas required working storage of size O(7^n) for multiplying 2^n × 2^n matrices. We present a modified formulation in which the working storage requirement is reduced to O(4^n). The modified formulation exhibits sufficient parallelism for efficient implementation on a shared memory multiprocessor. Performance results on a Cray Y-MP8/64 are presented.
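
    For readers unfamiliar with the algorithm, the sketch below shows the textbook recursive form of Strassen's method, which replaces eight block multiplications with seven at each level. It is not the nonrecursive tensor product formulation of the article, and the helper names are illustrative. It assumes square matrices whose size is a power of two, stored as lists of lists.

```python
def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(M):
    """Return the four quadrants (top-left, top-right, bottom-left, bottom-right)."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B):
    """Multiply two 2^n x 2^n matrices with seven recursive products per level."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    # Stitch the quadrants back into one matrix.
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom


def naive(A, B):
    """Reference triple-loop product used only to check the result."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[2, 0, 1, 3], [1, 1, 0, 2], [4, 2, 2, 0], [0, 3, 1, 1]]
print(strassen(A, B) == naive(A, B))
```

    Seven products per level over n levels give 7^n scalar multiplications, while a 2^n × 2^n matrix holds 4^n elements. This is one way to read the two storage bounds quoted in the abstract, though the article's own derivation is not reproduced here.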

  9. Dynamically reconfigurable optical interconnect architecture for parallel multiprocessor systems

    NASA Astrophysics Data System (ADS)

    Girard, Mary M.; Husbands, Charles R.; Antoszewska, Reza

    1991-12-01

    The progress in parallel processing technology in recent years has resulted in increased requirements to process large amounts of data in real time. The massively parallel architectures proposed for these applications require the use of a high speed interconnect system to achieve processor-to-processor connectivity without incurring excessive delays. The characteristics of optical components permit high speed operation while the nonconductive nature of the optical medium eliminates ground loop and transmission line problems normally associated with a conductive medium. The MITRE Corp. is evaluating an optical wavelength division multiple access interconnect network design to improve interconnectivity within parallel processor systems and to allow reconfigurability of processor communication paths. This paper describes the architecture and control of, and highlights the results from, an 8-channel multiprocessor prototype with effective throughput of 3.2 Gigabits per second (Gbps).

  10. Fault-free performance validation of avionic multiprocessors

    NASA Technical Reports Server (NTRS)

    Czeck, Edward W.; Feather, Frank E.; Grizzaffi, Ann Marie; Segall, Zary Z.; Finelli, George B.

    1986-01-01

    This paper describes the application of a portion of a validation methodology to NASA's Fault-Tolerant Multiprocessor System (FTMP) and the Software Implemented Fault-Tolerance (SIFT) computer system. The methodology entails a building block approach, starting with simple baseline experiments and building to more complex experiments. The goal of the validation methodology is to thoroughly test and characterize the performance and behavior of ultrareliable computer systems. The validation methodology presented in this paper showed that the methodology is not machine specific and can be used in lieu of life testing approaches. By applying a building block approach at the systems level, the machine complexity was broken down to manageable levels independent of system implementation.

  11. Multiprocessor system with daisy-chained processor selection

    SciTech Connect

    Yamanaka, K.

    1988-09-27

    A multiprocessor system is described comprising: a bus, a master operation processing unit connected to the bus for generating on the bus data to be processed and a command for processing the data, slave operation processing units connected to the bus for receiving the data and the command from the master operation processing unit. The slave operation processing unit includes a priority discriminator which sequentially selects the slave operation processing units in a preset priority sequence. The priority discriminator also includes means for determining conditions of each of the slave operation processing units and means for initiating execution of the command in the first slave operation processing units selected in the preset priority sequence which has its determined conditions meeting preselected conditions. The initiated slave operation processing unit includes means for processing the data in accordance with the command.
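
    The selection rule in the abstract above (walk the slave units in a preset priority order and start the command on the first one whose condition qualifies) can be modeled in a few lines. This is a software sketch of the logic only, not the patented hardware: the `SlaveUnit` class, the `busy` flag used as the "condition", and the sample command are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SlaveUnit:
    """Hypothetical slave operation processing unit."""
    name: str
    busy: bool

    def process(self, command, data):
        # Stand-in for executing the master's command on the data.
        return f"{self.name} executed {command} on {data}"


def select_slave(slaves_in_priority_order, is_eligible):
    """Return the first unit, in preset priority order, that meets the condition."""
    for slave in slaves_in_priority_order:
        if is_eligible(slave):
            return slave
    return None  # no unit currently qualifies


# Preset priority sequence: s0 first, then s1, then s2.
chain = [SlaveUnit("s0", busy=True),
         SlaveUnit("s1", busy=False),
         SlaveUnit("s2", busy=False)]

# Assumed eligibility condition for this sketch: the unit is idle.
chosen = select_slave(chain, lambda unit: not unit.busy)
if chosen is not None:
    print(chosen.process("ADD", [1, 2, 3]))
```

    Because the scan stops at the first qualifying unit, lower-priority units are only used when every unit ahead of them fails the condition, which is the behavior a daisy chain provides.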

  12. A measurement-based performability model for a multiprocessor system

    NASA Technical Reports Server (NTRS)

    Hsueh, M. C.; Iyer, Ravi K.; Trivedi, K. S.

    1987-01-01

    A measurement-based performability model based on real error data collected on a multiprocessor system is described. Model development from the raw error data to the estimation of cumulative reward is described. Both normal and failure behavior of the system are characterized. The measured data show that the holding times in key operational and failure states are not simple exponential and that a semi-Markov process is necessary to model the system behavior. A reward function, based on the service rate and the error rate in each state, is then defined in order to estimate the performability of the system and to depict the cost of different failure types and recovery procedures.
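
    The idea of accumulating reward over measured state holding times can be sketched as follows. The abstract does not give the reward formula, so the form used here (service rate divided by service rate plus error rate) is an assumption, and the state names and rates in `RATES` are invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical per-state rates (events per hour); not taken from the paper.
RATES: Dict[str, Tuple[float, float]] = {
    # state: (service_rate, error_rate)
    "normal": (100.0, 0.0),
    "error": (60.0, 40.0),
    "failed": (0.0, 10.0),
}


def reward_rate(service_rate: float, error_rate: float) -> float:
    """Assumed reward form: share of activity that is useful service."""
    total = service_rate + error_rate
    return service_rate / total if total > 0 else 0.0


def cumulative_reward(trajectory: List[Tuple[str, float]]) -> float:
    """Sum reward_rate(state) * holding_time over an observed trajectory.

    Holding times are taken directly from measurements, so no exponential
    assumption about their distribution is made here.
    """
    return sum(reward_rate(*RATES[state]) * hours for state, hours in trajectory)


observed = [("normal", 10.0), ("error", 2.0), ("failed", 0.5), ("normal", 5.0)]
total_reward = cumulative_reward(observed)
print(total_reward)
```

    Because the trajectory carries its own holding times, the same function works whether those times are exponential or not, which is the property the semi-Markov treatment relies on.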

  13. Parallel algorithm of VLBI software correlator under multiprocessor environment

    NASA Astrophysics Data System (ADS)

    Zheng, Weimin; Zhang, Dong

    2007-11-01

    The correlator is the key signal processing equipment of a Very Long Baseline Interferometry (VLBI) synthetic aperture telescope. It receives the mass data collected by the VLBI observatories and produces the visibility function of the target, which can be used for spacecraft positioning, baseline length measurement, synthesis imaging, and other scientific applications. VLBI data correlation is both data intensive and computation intensive. This paper presents the algorithms of two parallel software correlators under multiprocessor environments. A near real-time correlator for spacecraft tracking adopts pipelining and thread-parallel technology, and runs on SMP (Symmetric Multiple Processor) servers. Another high-speed prototype correlator using a mixed Pthreads and MPI (Message Passing Interface) parallel algorithm is realized on a small Beowulf cluster platform. Both correlators have a flexible structure, are scalable, and can correlate data from 10 stations.

  14. Fault recovery characteristics of the fault tolerant multi-processor

    NASA Technical Reports Server (NTRS)

    Padilla, Peter A.

    1990-01-01

    The fault handling performance of the fault tolerant multiprocessor (FTMP) was investigated. Fault handling errors detected during fault injection experiments were characterized. In these fault injection experiments, the FTMP disabled a working unit instead of the faulted unit once every 500 faults, on the average. System design weaknesses allow active faults to exercise a part of the fault management software that handles byzantine or lying faults. It is pointed out that these weak areas in the FTMP's design increase the probability that, for any hardware fault, a good LRU (line replaceable unit) is mistakenly disabled by the fault management software. It is concluded that fault injection can help detect and analyze the behavior of a system in the ultra-reliable regime. Although fault injection testing cannot be exhaustive, it has been demonstrated that it provides a unique capability to unmask problems and to characterize the behavior of a fault-tolerant system.

  15. A parallel multigrid method for data-driven multiprocessor systems

    SciTech Connect

    Lin, C.H.; Gaudiot, J.L.; Proskurowski, W.

    1989-12-31

    The multigrid algorithm (MG) is recognized as an efficient and rapidly converging method to solve a wide family of partial differential equations (PDE). When this method is implemented on a multiprocessor system, its major drawback is the low utilization of processors. Due to the sequentiality of the standard algorithm, the fine grid levels cannot start relaxation until the coarse grid levels complete their own relaxation. Indeed, of all processors active on the fine two-dimensional grid level, only one fourth will be active at the coarse grid level, leaving a full 75% idle. In this paper, a novel parallel V-cycle multigrid (PVM) algorithm is proposed to cure the idle-processor problem. Highly programmable systems such as data-flow architectures are then applied to support this new algorithm. The experiments based on the proposed architecture show that the convergence rate of the new algorithm is about twice as fast as that of the standard method and that system utilization is twice as efficient.
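
    The 75% idle figure follows from simple arithmetic, sketched below. The sketch assumes one processor per fine-grid point and coarsening by a factor of 2 in each dimension; the equal-time-per-level averaging in `mean_utilization` is an additional simplifying assumption, not a result from the paper.

```python
from fractions import Fraction
from typing import List


def active_fraction(level: int, dims: int = 2) -> Fraction:
    """Fraction of fine-grid processors with work at a given level.

    Level 0 is the finest grid; each coarser level halves the point
    count in each of `dims` dimensions.
    """
    return Fraction(1, 2 ** (dims * level))


def mean_utilization(levels: int, dims: int = 2) -> Fraction:
    """Average active fraction if every level takes one equal-length step."""
    fractions: List[Fraction] = [active_fraction(k, dims) for k in range(levels)]
    return sum(fractions, Fraction(0)) / levels


idle_on_first_coarse_level = 1 - active_fraction(1)
print(idle_on_first_coarse_level)
print(mean_utilization(4))
```

    Under these assumptions three quarters of the processors have no work on the first coarse level, and the idle share grows on every coarser level after that.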

  16. Operating system for a real-time multiprocessor propulsion system simulator. User's manual

    NASA Technical Reports Server (NTRS)

    Cole, G. L.

    1985-01-01

    The NASA Lewis Research Center is developing and evaluating experimental hardware and software systems to help meet future needs for real-time, high-fidelity simulations of air-breathing propulsion systems. Specifically, the real-time multiprocessor simulator project focuses on the use of multiple microprocessors to achieve the required computing speed and accuracy at relatively low cost. Operating systems for such hardware configurations are generally not available. A real time multiprocessor operating system (RTMPOS) that supports a variety of multiprocessor configurations was developed at Lewis. With some modification, RTMPOS can also support various microprocessors. RTMPOS, by means of menus and prompts, provides the user with a versatile, user-friendly environment for interactively loading, running, and obtaining results from a multiprocessor-based simulator. The menu functions are described and an example simulation session is included to demonstrate the steps required to go from the simulation loading phase to the execution phase.

  17. An embedded multiprocessor computer for proof-of-principle testing of exploratory systems concepts

    SciTech Connect

    Borgman, C.R.; Dalton, L.J.

    1987-01-01

    This paper discusses the SANDAC V multiprocessor embedded computer hardware and software. Its expandable design provides adequate computing power for testing of various proof-of-principle (POP) exploratory system concepts. It is built from state-of-the-art integrated circuits with ASIC glue chips. A powerful software development system, multiprocessor on-board debugger, and a multitasking operating system kernel provide a user friendly software environment to complement the hardware.

  18. Evict on write, a management strategy for a prefetch unit and/or first level cache in a multiprocessor system with speculative execution

    DOEpatents

    Gara, Alan; Ohmacht, Martin

    2014-09-16

    In a multiprocessor system with at least two levels of cache, a speculative thread may run on a core processor in parallel with other threads. When the thread seeks to do a write to main memory, this access is to be written through the first level cache to the second level cache. After the write-through, the corresponding line is deleted from the first level cache and/or prefetch unit, so that any further accesses to the same location in main memory have to be retrieved from the second level cache. The second level cache keeps track of multiple versions of data, where more than one speculative thread is running in parallel, while the first level cache does not have any of the versions during speculation. A switch allows choosing between modes of operation of a speculation blind first level cache.
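
    The write path described in this record can be modelled with a toy cache. This is a minimal sketch only: the `EvictOnWriteL1` class is hypothetical, the second level cache is reduced to a plain dictionary, and the multi-version tracking for speculative threads is not modelled.

```python
from typing import Dict


class EvictOnWriteL1:
    """Toy first-level cache in front of a second-level store (a dict)."""

    def __init__(self, l2: Dict[int, int]) -> None:
        self.l2 = l2
        self.lines: Dict[int, int] = {}
        self.l2_reads = 0  # counts how often a read had to go to L2

    def read(self, addr: int) -> int:
        if addr not in self.lines:      # L1 miss: fetch the line from L2
            self.l2_reads += 1
            self.lines[addr] = self.l2[addr]
        return self.lines[addr]

    def write(self, addr: int, value: int) -> None:
        self.l2[addr] = value           # write through to the second level
        self.lines.pop(addr, None)      # then evict the first-level copy


l2_store = {0x40: 1}
l1 = EvictOnWriteL1(l2_store)
first = l1.read(0x40)     # miss, filled from L2
second = l1.read(0x40)    # hit in L1
l1.write(0x40, 2)         # write-through, then eviction
third = l1.read(0x40)     # miss again, so the value comes from L2
print(first, second, third, l1.l2_reads)
```

    The read after the write misses in the first level, which is the point of the strategy: the first level never holds a line that speculation may have versioned in the second level.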

  19. Memory interface simulator: A computer design aid

    NASA Technical Reports Server (NTRS)

    Taylor, D. S.; Williams, T.; Weatherbee, J. E.

    1972-01-01

    Results are presented of a study conducted with a digital simulation model being used in the design of the Automatically Reconfigurable Modular Multiprocessor System (ARMMS), a candidate computer system for future manned and unmanned space missions. The model simulates the activity involved as instructions are fetched from random access memory for execution in one of the system central processing units. A series of model runs measured instruction execution time under various assumptions pertaining to the CPU's and the interface between the CPU's and RAM. Design tradeoffs are presented in the following areas: Bus widths, CPU microprogram read only memory cycle time, multiple instruction fetch, and instruction mix.

  20. Meeting the Memory Challenges of Brain-Scale Network Simulation

    PubMed Central

    Kunkel, Susanne; Potjans, Tobias C.; Eppler, Jochen M.; Plesser, Hans Ekkehard; Morrison, Abigail; Diesmann, Markus

    2012-01-01

    The development of high-performance simulation software is crucial for studying the brain connectome. Using connectome data to generate neurocomputational models requires software capable of coping with models on a variety of scales: from the microscale, investigating plasticity, and dynamics of circuits in local networks, to the macroscale, investigating the interactions between distinct brain regions. Prior to any serious dynamical investigation, the first task of network simulations is to check the consistency of data integrated in the connectome and constrain ranges for yet unknown parameters. Thanks to distributed computing techniques, it is possible today to routinely simulate local cortical networks of around 10^5 neurons with up to 10^9 synapses on clusters and multi-processor shared-memory machines. However, brain-scale networks are orders of magnitude larger than such local networks, in terms of numbers of neurons and synapses as well as in terms of computational load. Such networks have been investigated in individual studies, but the underlying simulation technologies have neither been described in sufficient detail to be reproducible nor made publicly available. Here, we discover that as the network model sizes approach the regime of meso- and macroscale simulations, memory consumption on individual compute nodes becomes a critical bottleneck. This is especially relevant on modern supercomputers such as the Blue Gene/P architecture where the available working memory per CPU core is rather limited. We develop a simple linear model to analyze the memory consumption of the constituent components of neuronal simulators as a function of network size and the number of cores used. This approach has multiple benefits. The model enables identification of key contributing components to memory saturation and prediction of the effects of potential improvements to code before any implementation takes place. As a consequence, development cycles can be shorter and less
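
    A linear memory model of the kind the abstract mentions can be sketched as below. The decomposition into a fixed base cost, a per-neuron bookkeeping cost paid on every core, and costs for locally stored neurons and synapses is an assumption made for illustration, and every coefficient is a hypothetical placeholder rather than a value from the paper.

```python
def memory_per_core(
    n_neurons: int,
    n_cores: int,
    synapses_per_neuron: int,
    base_bytes: float = 50e6,           # hypothetical fixed cost per process
    overhead_per_neuron: float = 8.0,   # hypothetical bookkeeping for every neuron, local or not
    bytes_per_neuron: float = 1000.0,   # hypothetical cost of a locally stored neuron
    bytes_per_synapse: float = 50.0,    # hypothetical cost of a locally stored synapse
) -> float:
    """Estimate bytes used on one core as a linear function of network size."""
    local_neurons = n_neurons / n_cores
    local_synapses = local_neurons * synapses_per_neuron
    return (
        base_bytes
        + overhead_per_neuron * n_neurons     # does not shrink with more cores
        + bytes_per_neuron * local_neurons    # shrinks as 1 / n_cores
        + bytes_per_synapse * local_synapses  # shrinks as 1 / n_cores
    )


small = memory_per_core(10**5, 100, 10**4)
more_cores = memory_per_core(10**5, 1000, 10**4)
print(small, more_cores)
```

    With terms separated this way, adding cores reduces only the local neuron and synapse terms, so for large networks the per-neuron bookkeeping term sets a floor that more cores cannot lower.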

  1. Meeting the memory challenges of brain-scale network simulation.

    PubMed

    Kunkel, Susanne; Potjans, Tobias C; Eppler, Jochen M; Plesser, Hans Ekkehard; Morrison, Abigail; Diesmann, Markus

    2011-01-01

    The development of high-performance simulation software is crucial for studying the brain connectome. Using connectome data to generate neurocomputational models requires software capable of coping with models on a variety of scales: from the microscale, investigating plasticity, and dynamics of circuits in local networks, to the macroscale, investigating the interactions between distinct brain regions. Prior to any serious dynamical investigation, the first task of network simulations is to check the consistency of data integrated in the connectome and constrain ranges for yet unknown parameters. Thanks to distributed computing techniques, it is possible today to routinely simulate local cortical networks of around 10^5 neurons with up to 10^9 synapses on clusters and multi-processor shared-memory machines. However, brain-scale networks are orders of magnitude larger than such local networks, in terms of numbers of neurons and synapses as well as in terms of computational load. Such networks have been investigated in individual studies, but the underlying simulation technologies have neither been described in sufficient detail to be reproducible nor made publicly available. Here, we discover that as the network model sizes approach the regime of meso- and macroscale simulations, memory consumption on individual compute nodes becomes a critical bottleneck. This is especially relevant on modern supercomputers such as the Blue Gene/P architecture where the available working memory per CPU core is rather limited. We develop a simple linear model to analyze the memory consumption of the constituent components of neuronal simulators as a function of network size and the number of cores used. This approach has multiple benefits. The model enables identification of key contributing components to memory saturation and prediction of the effects of potential improvements to code before any implementation takes place. As a consequence, development cycles can be shorter and

  2. Efficient diagnosis of multiprocessor systems under probabilistic models

    NASA Technical Reports Server (NTRS)

    Blough, Douglas M.; Sullivan, Gregory F.; Masson, Gerald M.

    1989-01-01

    The problem of fault diagnosis in multiprocessor systems is considered under a probabilistic fault model. The focus is on minimizing the number of tests that must be conducted in order to correctly diagnose the state of every processor in the system with high probability. A diagnosis algorithm that can correctly diagnose the state of every processor with probability approaching one in a class of systems performing slightly greater than a linear number of tests is presented. A nearly matching lower bound on the number of tests required to achieve correct diagnosis in arbitrary systems is also proven. Lower and upper bounds on the number of tests required for regular systems are also presented. A class of regular systems which includes hypercubes is shown to be correctly diagnosable with high probability. In all cases, the number of tests required under this probabilistic model is shown to be significantly less than under a bounded-size fault set model. Because the number of tests that must be conducted is a measure of the diagnosis overhead, these results represent a dramatic improvement in the performance of system-level diagnosis techniques.

  3. MULTIPROCESSOR AND DISTRIBUTED PROCESSING BIBLIOGRAPHIC DATA BASE SOFTWARE SYSTEM

    NASA Technical Reports Server (NTRS)

    Miya, E. N.

    1994-01-01

    Multiprocessors and distributed processing are undergoing increased scientific scrutiny for many reasons. It is more and more difficult to keep track of the existing research in these fields. This package consists of a large machine-readable bibliographic data base which, in addition to the usual keyword searches, can be used for producing citations, indexes, and cross-references. The data base is compiled from smaller existing multiprocessing bibliographies, and tables of contents from journals and significant conferences. There are approximately 4,000 entries covering topics such as parallel and vector processing, networks, supercomputers, fault-tolerant computers, and cellular automata. Each entry is represented by 21 fields including keywords, author, referencing book or journal title, volume and page number, and date and city of publication. The data base contains UNIX 'refer' formatted ASCII data and can be implemented on any computer running under the UNIX operating system. The data base requires approximately one megabyte of secondary storage. The documentation for this program is included with the distribution tape, although it can be purchased for the price below. This bibliography was compiled in 1985 and updated in 1988.

  4. Solution of open region electromagnetic scattering problems on hypercube multiprocessors

    SciTech Connect

    Gedney, S.D.

    1991-01-01

    This thesis focuses on development of parallel algorithms that exploit hypercube multiprocessor computers for the solution of the scattering of electromagnetic fields by bodies situated in an unbounded space. Initially, algorithms based on the method of moments are investigated for coarse-grained MIMD hypercubes as well as fine-grained MIMD and SIMD hypercubes. It is shown that by exploiting the architecture of each hypercube, supercomputer performance can be obtained using the JPL Mark III hypercube and the Thinking Machines CM-2. Second, the use of the finite-element method for solution of the scattering by bodies of composite materials is presented. For finite bodies situated in an unbounded space, use of an absorbing boundary condition is investigated. A method known as the mixed-χ formulation is presented, which reduces the mesh density in the regions away from the scatterer, enhancing the use of an absorbing boundary condition. The scattering by troughs or slots is also investigated using a combined FEM/MoM formulation. This method is extended to the problem of the diffraction of electromagnetic waves by thick conducting and/or dielectric gratings. Finally, the adaptation of the FEM method onto a coarse-grained hypercube is presented.

  5. Performance and economy of a fault-tolerant multiprocessor

    NASA Technical Reports Server (NTRS)

    Lala, J. H.; Smith, C. J.

    1979-01-01

    The FTMP (Fault-Tolerant Multiprocessor) is one of two central aircraft fault-tolerant architectures now in the prototype phase under NASA sponsorship. The intended application of the computer includes such critical real-time tasks as 'fly-by-wire' active control and completely automatic Category III landings of commercial aircraft. The FTMP architecture is briefly described and it is shown that it is a viable solution to the multi-faceted problems of safety, speed, and cost. Three job dispatch strategies are described, and their results with respect to job-starting delay are presented. The first strategy is a simple First-Come-First-Serve (FCFS) job dispatch executive. The other two schedulers are an adaptive FCFS and an interrupt driven scheduler. Three failure modes are discussed, and the FTMP survival probability in the face of random hard failures is evaluated. It is noted that the hourly cost of operating two FTMPs in a transport aircraft can be as little as one-to-two percent of the total flight-hour cost of the aircraft.

  6. Memory management and compiler support for rapid recovery from failures in computer systems

    NASA Technical Reports Server (NTRS)

    Fuchs, W. K.

    1991-01-01

    This paper describes recent developments in the use of memory management and compiler technology to support rapid recovery from failures in computer systems. The techniques described include cache coherence protocols for user transparent checkpointing in multiprocessor systems, compiler-based checkpoint placement, compiler-based code modification for multiple instruction retry, and forward recovery in distributed systems utilizing optimistic execution.

  7. Validation of a fault-tolerant multiprocessor: Baseline experiments and workload implementation

    NASA Technical Reports Server (NTRS)

    Feather, Frank; Siewiorek, Daniel; Segall, Zary

    1985-01-01

    In the future, aircraft must employ highly reliable multiprocessors in order to achieve flight safety. Such computers must be experimentally validated before they are deployed. This project outlines a methodology for validating reliable multiprocessors. The methodology begins with baseline experiments, each of which tests a single phenomenon. As experiments progress, tools for performance testing are developed. The methodology is used, in part, on the Fault Tolerant Multiprocessor (FTMP) at NASA-Langley's AIRLAB facility. Experiments are designed to evaluate the fault-free performance of the system. Presented are the results of interrupt baseline experiments performed on FTMP. Interrupt-causing exception conditions were tested, and several were found to have unimplemented interrupt handling software while one had an unimplemented interrupt vector. A synthetic workload model for real-time multiprocessors is then developed as an application level performance analysis tool. Details of the workload implementation and calibration are presented. Both the experimental methodology and the synthetic workload model are general enough to be applicable to reliable multiprocessors besides FTMP.

  8. Performance of the butterfly processor-memory interconnection in a vector environment

    NASA Astrophysics Data System (ADS)

    Brooks, E. D., III

    1985-02-01

    A fundamental hurdle impeding the development of large N common memory multiprocessors is the performance limitation in the switch connecting the processors to the memory modules. Multistage networks currently considered for this connection have a memory latency which grows like α log₂ N. For scientific computing, it is natural to look for a multiprocessor architecture that will enable the use of vector operations to mask memory latency. The problem to be overcome here is the chaotic behavior introduced by conflicts occurring in the switch. The performance of the butterfly or indirect binary n-cube network in a vector processing environment is examined. A simple modification of the standard 2x2 switch node used in such networks which adaptively removes chaotic behavior during a vector operation is described.
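
    The logarithmic latency growth comes from the number of switch stages a request must cross. The sketch below counts stages for a network built from 2x2 switches and multiplies by a per-stage delay; the function names and the `alpha` value are assumptions for illustration, and switch conflicts are ignored.

```python
def stages(n_ports: int, radix: int = 2) -> int:
    """Number of switch stages needed to reach n_ports with radix-way switches."""
    count = 0
    reach = 1
    while reach < n_ports:  # each stage multiplies the reachable ports by radix
        reach *= radix
        count += 1
    return count


def latency(n_ports: int, alpha: float) -> float:
    """Conflict-free latency, assuming a delay of alpha per stage."""
    return alpha * stages(n_ports)


for n in (64, 1024):
    print(n, stages(n), latency(n, alpha=2.0))
```

    Going from 64 to 1024 processors multiplies the machine size by 16 but adds only four stages, which is why the latency term is tolerable if vector operations can keep enough requests in flight.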

  9. Clock synchronization of a large multiprocessor system in the presence of malicious faults

    NASA Technical Reports Server (NTRS)

    Shin, Kang G.; Ramanathan, P.

    1987-01-01

    An interconnection algorithm is presented for achieving clock synchronization in a multiprocessor system. The system is assumed to be maliciously faulty, i.e., some processors are out of synchronization and lie about their clock state to other intragroup or intergroup processors. A phase-locked clock network design is proposed which groups the clocks in the system into diverse clusters. The clusters are then treated as single clock units from the perspective of the network. The algorithm minimizes the number of interconnections while permitting synchronization of large multiprocessor systems controlling time-critical applications such as aircraft, nuclear reactors and industrial processes.

  10. Programmable Optoelectronic Multiprocessors: Design, Performance and CAD Development

    NASA Astrophysics Data System (ADS)

    Kiamilev, Fouad Eskender

    1992-01-01

    This thesis describes the development of Programmable Optoelectronic Multiprocessor (POEM) architectures and systems. POEM systems combine simple electronic processing elements with free-space optical interconnects to implement high-performance, massively-parallel computers. POEM architectures are fundamentally different from architectures used in conventional VLSI systems. Novel system partitioning and processing element design methods have been developed to ensure efficient implementation of POEM architectures with optoelectronic technology. The main contributions of this thesis are: architecture and software design for the POEM prototype built at UCSD; detailed technology design-tradeoff and comparison studies for POEM interconnection networks; and application of the VHSIC Hardware Description Language (VHDL) to the design, simulation, and synthesis of POEM computers. A general-purpose POEM SIMD parallel computer architecture has been designed for symbolic computing applications. A VHDL simulation of this architecture was written to test the POEM hardware running parallel programs prior to prototype fabrication. Detailed performance comparison of this architecture with all-optical computing, based on symbolic substitution, has also been carried out to show that POEMs offer higher computational efficiency. A detailed technological design of a packet-switched POEM multistage interconnection network system has been performed. This design uses optically interconnected stages of K x K electronic switching elements, where K is a variable parameter, called grain-size, that determines the ratio of optics to electronics in the system. A thorough cost and performance comparison between this design and existing VLSI implementations was undertaken to show that the POEM approach offers better scalability and higher performance. The grain-size was optimized, showing that switch sizes of 16 x 16 to 256 x 256 provide maximum performance/cost. The effects of varying

  11. Hybrid Simulation of the Interaction of Europa's Atmosphere with the Jovian Plasma: Multiprocessor Simulations

    NASA Astrophysics Data System (ADS)

    Dols, V. J.; Delamere, P. A.; Bagenal, F.; Cassidy, T. A.; Crary, F. J.

    2014-12-01

    We model the interaction of Europa's tenuous atmosphere with the plasma of Jupiter's torus with an improved version of our hybrid plasma code. In a hybrid plasma code, the ions are treated as kinetic macro-particles moving under the Lorentz force and the electrons as a fluid leading to a generalized formulation of Ohm's law. In this version, the spatial simulation domain is decomposed in 2 directions and is non-uniform in the plasma convection direction. The code is run on a multi-processor supercomputer that offers 16416 cores and 2 GB RAM per core. This new version allows us to tap into the large memory of the supercomputer and simulate the full interaction volume (R_Europa = 1561 km) with a high spatial resolution (50 km). Compared to Io, Europa's atmosphere is about 100 times more tenuous, the ambient magnetic field is weaker and the density of incident plasma is lower. Consequently, the electrodynamic interaction is also weaker and substantial fluxes of thermal torus ions might reach and sputter the icy surface. Molecular O2 is the dominant atmospheric product of this surface sputtering. Observations of oxygen UV emissions (specifically the ratio of OI 1356 Å / 1304 Å emissions) are roughly consistent with an atmosphere that is composed predominantly of O2 with a small amount of atomic O. Galileo observations along flybys close to Europa have revealed the existence of induced currents in a conducting ocean under the icy crust. They also showed that, from flyby to flyby, the plasma interaction is very variable. Asymmetries of the plasma density and temperature in the wake of Europa were also observed and still elude a clear explanation. Galileo mag data also detected ion cyclotron waves, which is an indication of heavy ion pickup close to the moon. We prescribe an O2 atmosphere with a vertical density column consistent with UV observations and model the plasma properties along several Galileo flybys of the moon. We compare our results with the magnetometer

  12. Apparatus for multiprocessor-based control of a multiagent robot

    NASA Technical Reports Server (NTRS)

    Peters, II, Richard Alan (Inventor)

    2009-01-01

    An architecture for robot intelligence enables a robot to learn new behaviors and create new behavior sequences autonomously and interact with a dynamically changing environment. Sensory information is mapped onto a Sensory Ego-Sphere (SES) that rapidly identifies important changes in the environment and functions much like short term memory. Behaviors are stored in a DBAM that creates an active map from the robot's current state to a goal state and functions much like long term memory. A dream state converts recent activities stored in the SES and creates or modifies behaviors in the DBAM.

  13. Evaluation of the impact chip multiprocessors have on SNL application performance.

    SciTech Connect

    Doerfler, Douglas W.

    2009-10-01

    This report describes trans-organizational efforts to investigate the impact of chip multiprocessors (CMPs) on the performance of important Sandia application codes. The impact of CMPs on the performance and applicability of Sandia's system software was also investigated. The goal of the investigation was to make algorithmic and architectural recommendations for next generation platform acquisitions.

  14. SIFT - Multiprocessor architecture for Software Implemented Fault Tolerance flight control and avionics computers

    NASA Technical Reports Server (NTRS)

    Forman, P.; Moses, K.

    1979-01-01

    A brief description of a SIFT (Software Implemented Fault Tolerance) Flight Control Computer with emphasis on implementation is presented. A multiprocessor system that relies on software-implemented fault detection and reconfiguration algorithms is described. A high level of reliability and fault tolerance is achieved by the replication of computing tasks among processing units.
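
    One common way to exploit replicated tasks for fault detection is majority voting over the replicas' results. The abstract does not describe the detection algorithm, so the sketch below is an assumed illustration of that general technique; the `vote` function and the processor names are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Optional, Tuple


def vote(results: Dict[str, Hashable]) -> Tuple[Optional[Hashable], List[str]]:
    """Return (majority value or None, processors that disagreed with it)."""
    if not results:
        return None, []
    counts = Counter(results.values())
    value, hits = counts.most_common(1)[0]
    if hits * 2 <= len(results):  # no strict majority, so no decision
        return None, []
    suspects = sorted(p for p, r in results.items() if r != value)
    return value, suspects


# Three processing units ran the same task; P3 returned a different value.
agreed, suspects = vote({"P1": 42, "P2": 42, "P3": 41})
print(agreed, suspects)
```

    The list of disagreeing units is what a reconfiguration step would consume, for example to stop assigning replicas to a unit that is repeatedly outvoted.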

  15. Architecture and applications of the HEP multiprocessor computer system

    SciTech Connect

    Smith, B.J.

    1981-01-01

    The HEP computer system is a large scale scientific parallel computer employing shared-resource MIMD architecture. The hardware and software facilities provided by the system are described, and techniques found useful in programming the system are discussed. 3 references.

  16. Principle for possible memory structures with extra high density by using the electron sharing mechanisms of atoms in an inflective orbit

    NASA Astrophysics Data System (ADS)

    Sengor, T.

    2014-10-01

    Both qualitative and quantitative knowledge of electromagnetic fields at the inter-atomic scale leads to useful applications. From this point of view, the aim is to bring possible new insights and solutions to atom-electron-photon-atom and/or molecule interactions in the near field at the inter-atomic scale, and to their potential applications. The electron sharing processes between neighboring atoms are considered as an inflective surface system and an inflective guiding process. The critical pass and transition structures are derived. Structures that trigger these transition mechanisms may be suitable for designing extra-high-density and fast data storage processes. The electron sharing processes between two nearby atomic systems are modelled with gate mechanisms involving two distinct passages: a continuous pass and a discontinuous pass. Even though stochastic processes are applicable in these cases, a theoretical approach that introduces an influence such as inner and external dipole mechanisms fits the situation best and provides an almost deterministic scheme, which has the potential to estimate some processes that could be used to design new electronic structures and devices. We call all such structures and/or devices orbitrons. The boundary value problem of an atomic system sharing an electron, following the electron passage model, is formulated in an inflective spherical coordinate system. The wave phenomenon is studied near spherical inflection points. The analytical essentials are derived for the solution of Helmholtz's equation when inflective boundaries are included. The evaluation is obtained by the extracted separation method. The results are given using the spherically inflective wave series. The method is reshaped for the solution of the Schrödinger equation.

  17. Myrmics Memory Allocator

    Energy Science and Technology Software Center (ESTSC)

    2011-09-23

    MMA is a stand-alone memory management system for MPI clusters. It implements a shared Partitioned Global Address Space, where multiple MPI processes request objects from the allocator and the latter provides them with system-wide unique memory addresses for each object. It provides applications with an intuitive way of managing the memory system in a unified way, thus enabling easier writing of irregular application code.

  18. Mapping of video decoder software on a VLIW DSP multiprocessor

    NASA Astrophysics Data System (ADS)

    Freimann, Achim; Brune, Thomas; Pirsch, Peter

    1998-03-01

    When implementing today's video compression standards on programmable processors, it is essential to optimize the algorithms with respect to the underlying hardware. As an example, the core decoder functions of the H.263 hybrid coding scheme were implemented on a SIMD controlled processor with four parallel VLIW data paths, the HiPAR-DSP. The decoder tasks were implemented employing local memory, parallelization on several levels, and data statistics. Special effort was devoted to the computation-intensive tasks, IDCT and motion-compensated frame reconstruction. To speed up the IDCT computation, a data-dependent approach was chosen, which distinguishes different block types. The determination of the IDCT block type could be parallelized together with other tasks, so no additional overhead is required. Frame reconstruction mainly benefits from data-parallel operations and transparent DMA transfers to and from external memory.
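
    A data-dependent IDCT can be sketched by classifying each 8x8 coefficient block and taking a shortcut for simple blocks. The abstract does not list the block types the authors used, so the three classes here ("zero", "dc_only", "full") and all function names are assumptions; the shortcut relies on the fact that an orthonormal 8x8 IDCT of a DC-only block is a constant block with value DC/8.

```python
import math
from typing import List

Block = List[List[float]]
N = 8  # block size used by H.263


def classify(block: Block) -> str:
    """Assumed block types: all zero, DC coefficient only, or general."""
    nonzero = [(u, v) for u in range(N) for v in range(N) if block[u][v] != 0]
    if not nonzero:
        return "zero"
    if nonzero == [(0, 0)]:
        return "dc_only"
    return "full"


def idct_full(block: Block) -> Block:
    """Reference orthonormal 2-D IDCT, written for clarity rather than speed."""
    def c(k: int) -> float:
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (c(u) * c(v) * block[u][v]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[x][y] = total * 2 / N
    return out


def idct(block: Block) -> Block:
    """Pick the cheapest path that gives the same result as idct_full."""
    kind = classify(block)
    if kind == "zero":
        return [[0.0] * N for _ in range(N)]
    if kind == "dc_only":
        return [[block[0][0] / N] * N for _ in range(N)]
    return idct_full(block)


dc_block = [[0.0] * N for _ in range(N)]
dc_block[0][0] = 80.0
print(classify(dc_block), idct(dc_block)[3][5])
```

    The classification itself is a cheap scan, which is consistent with the abstract's remark that it can be overlapped with other decoder tasks.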

  19. Animal models of source memory.

    PubMed

    Crystal, Jonathon D

    2016-01-01

    Source memory is the aspect of episodic memory that encodes the origin (i.e., source) of information acquired in the past. Episodic memory (i.e., our memories for unique personal past events) typically involves source memory because those memories focus on the origin of previous events. Source memory is at work when, for example, someone tells a favorite joke to a person while avoiding retelling the joke to the friend who originally shared the joke. Importantly, source memory permits differentiation of one episodic memory from another because source memory includes features that were present when the different memories were formed. This article reviews recent efforts to develop an animal model of source memory using rats. Experiments are reviewed which suggest that source memory is dissociated from other forms of memory. The review highlights strengths and weaknesses of a number of animal models of episodic memory. Animal models of source memory may be used to probe the biological bases of memory. Moreover, these models can be combined with genetic models of Alzheimer's disease to evaluate pharmacotherapies that ultimately have the potential to improve memory. PMID:26609644

  20. Architecture and applications of the HEP multiprocessor computer system

    SciTech Connect

    Smith, B.J.; Fink, D.J.

    1982-01-01

    The HEP computer system is a large scale scientific parallel computer employing shared resource MIMD architecture. The hardware and software facilities provided by the system are described, and techniques found to be useful in programming the system are also discussed. 3 references.

  1. Validation of fault-free behavior of a reliable multiprocessor system - FTMP: A case study. [Fault-Tolerant Multi-Processor avionics]

    NASA Technical Reports Server (NTRS)

    Clune, E.; Segall, Z.; Siewiorek, D.

    1984-01-01

    A program of experiments has been conducted at NASA-Langley to test the fault-free performance of a Fault-Tolerant Multiprocessor (FTMP) avionics system for next-generation aircraft. Baseline measurements of an operating FTMP system were obtained with respect to the following parameters: instruction execution time, frame size, and the variation of clock ticks. The mechanisms of frame stretching were also investigated. The experimental results are summarized in a table. Areas of interest for future tests are identified, with emphasis given to the implementation of a synthetic workload generation mechanism on FTMP.

  2. Shared Intentionality

    ERIC Educational Resources Information Center

    Tomasello, Michael; Carpenter, Malinda

    2007-01-01

    We argue for the importance of processes of shared intentionality in children's early cognitive development. We look briefly at four important social-cognitive skills and how they are transformed by shared intentionality. In each case, we look first at a kind of individualistic version of the skill--as exemplified most clearly in the behavior of…

  3. Intercrate-communications controller for multicrate multiprocessor systems that use CAMAC-COMPEX protocol

    SciTech Connect

    Basiladze, S.G.; Kybnikov, V.M.

    1986-07-01

    This paper describes a controller for organization of intercrate communications in multicrate multiprocessor systems by means of an intercrate dataway that follows the structure of a COMPEX dataway. The intercrate controller makes it possible to create multicrate multiprocessor systems with ordinary CAMAC slave modules. The crate-communications controller is implemented as an auxiliary crate controller and requires at the 25th station an EUR6500e-compatible device for communications between the main and auxiliary dataways. Two light-emitting diodes on the front panel indicate which of the dataways has been captured by the controller. The controller is designed for operation with intercrate dataway lines that are matched by means of one or two terminators. The matching network must have the equivalent circuit of a resistor of 110-130 ohm connected to a source of 3.5-4.5 V.

  4. Safe and Efficient Support for Embedded Multi-Processors in ADA

    NASA Astrophysics Data System (ADS)

    Ruiz, Jose F.

    2010-08-01

    New software demands increasing processing power, and multi-processor platforms are spreading as the answer to achieve the required performance. Embedded real-time systems are also subject to this trend, but in the case of real-time mission-critical systems, the properties of reliability, predictability and analyzability are also paramount. The Ada 2005 language defined a subset of its tasking model, the Ravenscar profile, that provides the basis for the implementation of deterministic and time analyzable applications on top of a streamlined run-time system. This Ravenscar tasking profile, originally designed for single processors, has proven remarkably useful for modelling verifiable real-time single-processor systems. This paper proposes a simple extension to the Ravenscar profile to support multi-processor systems using a fully partitioned approach. The implementation of this scheme is simple, and it can be used to develop applications amenable to schedulability analysis.

  5. Method for wiring allocation and switch configuration in a multiprocessor environment

    DOEpatents

    Aridor, Yariv; Domany, Tamar; Frachtenberg, Eitan; Gal, Yoav; Shmueli, Edi; Stockmeyer, legal representative, Robert E.; Stockmeyer, Larry Joseph

    2008-07-15

    A method for wiring allocation and switch configuration in a multiprocessor computer, the method including employing depth-first tree traversal to determine a plurality of paths among a plurality of processing elements allocated to a job along a plurality of switches and wires in a plurality of D-lines, and selecting one of the paths in accordance with at least one selection criterion.
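    The path search described above can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the topology, the node names (PE for processing element, S for switch), and the selection criterion (fewest hops) are all hypothetical, and a small general graph stands in for the tree mentioned in the abstract.

```python
def dfs_paths(graph, src, dst, path=None):
    """Yield every simple path from src to dst by depth-first traversal."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in graph.get(src, ()):
        if nxt not in path:  # never reuse a node within one path
            yield from dfs_paths(graph, nxt, dst, path)


def select_path(paths, cost=len):
    """Pick one path by a selection criterion (assumed here: fewest hops)."""
    return min(paths, key=cost)


# Hypothetical wiring: two processing elements joined through three switches
graph = {
    "PE0": ["S0"],
    "S0": ["PE0", "S1", "S2"],
    "S1": ["S0", "S2", "PE1"],
    "S2": ["S0", "S1", "PE1"],
    "PE1": ["S1", "S2"],
}

paths = list(dfs_paths(graph, "PE0", "PE1"))
best = select_path(paths)
```

    Passing a different `cost` function to `select_path` (for example, one that penalises already-allocated wires) changes the criterion without touching the traversal.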

  6. Operating system for a real-time multiprocessor propulsion system simulator

    NASA Technical Reports Server (NTRS)

    Cole, G. L.

    1984-01-01

    The success of the Real Time Multiprocessor Operating System (RTMPOS) in the development and evaluation of experimental hardware and software systems for real time interactive simulation of air breathing propulsion systems was evaluated. The Real Time Multiprocessor Operating System (RTMPOS) provides the user with a versatile, interactive means for loading, running, debugging and obtaining results from a multiprocessor based simulator. A front end processor (FEP) serves as the simulator controller and interface between the user and the simulator. These functions are facilitated by the RTMPOS which resides on the FEP. The RTMPOS acts in conjunction with the FEP's manufacturer supplied disk operating system that provides typical utilities like an assembler, linkage editor, text editor, file handling services, etc. Once a simulation is formulated, the RTMPOS provides for engineering level, run time operations such as loading, modifying and specifying computation flow of programs, simulator mode control, data handling and run time monitoring. Run time monitoring is a powerful feature of RTMPOS that allows the user to record all actions taken during a simulation session and to receive advisories from the simulator via the FEP. The RTMPOS is programmed mainly in PASCAL along with some assembly language routines. The RTMPOS software is easily modified to be applicable to hardware from different manufacturers.

  7. RTMPL: A structured programming and documentation utility for real-time multiprocessor simulations

    NASA Technical Reports Server (NTRS)

    Arpasi, D. J.

    1984-01-01

    The NASA Lewis Research Center is developing and evaluating experimental hardware and software systems to help meet future needs for real time simulations of air-breathing propulsion systems. The Real Time Multiprocessor Simulator (RTMPS) project is aimed at developing a prototype simulator system that uses multiple microprocessors to achieve the desired computing speed and accuracy at relatively low cost. Software utilities are being developed to provide engineering-level programming and interactive operation of the simulator. Two major software development efforts were undertaken in the RTMPS project. A real time multiprocessor operating system was developed to provide for interactive operation of the simulator. The second effort was aimed at developing a structured, high-level, engineering-oriented programming language and translator that would facilitate the programming of the simulator. The Real Time Multiprocessor Programming Language (RTMPL) allows the user to describe simulation tasks for each processor in a straightforward, structured manner. The RTMPL utility acts as an assembly language programmer, translating the high-level simulation description into time-efficient assembly language code for the processors. The utility sets up all of the interfaces between the simulator hardware, firmware, and operating system.

  8. Computational principles of memory.

    PubMed

    Chaudhuri, Rishidev; Fiete, Ila

    2016-03-01

    The ability to store and later use information is essential for a variety of adaptive behaviors, including integration, learning, generalization, prediction and inference. In this Review, we survey theoretical principles that can allow the brain to construct persistent states for memory. We identify requirements that a memory system must satisfy and analyze existing models and hypothesized biological substrates in light of these requirements. We also highlight open questions, theoretical puzzles and problems shared with computer science and information theory. PMID:26906506

  9. Mechanical memory

    DOEpatents

    Gilkey, Jeffrey C.; Duesterhaus, Michelle A.; Peter, Frank J.; Renn, Rosemarie A.; Baker, Michael S.

    2006-08-15

    A first-in-first-out (FIFO) microelectromechanical memory apparatus (also termed a mechanical memory) is disclosed. The mechanical memory utilizes a plurality of memory cells, with each memory cell having a beam which can be bowed in either of two directions of curvature to indicate two different logic states for that memory cell. The memory cells can be arranged around a wheel which operates as a clocking actuator to serially shift data from one memory cell to the next. The mechanical memory can be formed using conventional surface micromachining, and can be formed as either a nonvolatile memory or as a volatile memory.
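    The serial-shift behaviour described above corresponds to a clocked shift register. The following is a minimal software model only; the class and method names are hypothetical, and the two bow directions of each beam are simply represented as the bits 0 and 1.

```python
class MechanicalFifo:
    """Chain of bistable cells; each clock tick shifts every bit one cell onward."""

    def __init__(self, n_cells):
        # 0 and 1 stand for the two directions of curvature of a beam
        self.cells = [0] * n_cells

    def clock(self, bit_in):
        """One actuation of the clocking wheel: shift in bit_in, return the bit shifted out."""
        bit_out = self.cells[-1]
        self.cells = [bit_in] + self.cells[:-1]
        return bit_out


fifo = MechanicalFifo(3)
outputs = [fifo.clock(b) for b in [1, 0, 1, 0, 0, 0]]
```

    With three cells, the bits written on the first three ticks emerge in the same order on the following three ticks, which is the first-in-first-out property.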

  10. Mechanical memory

    DOEpatents

    Gilkey, Jeffrey C.; Duesterhaus, Michelle A.; Peter, Frank J.; Renn, Rosemarie A.; Baker, Michael S.

    2006-05-16

    A first-in-first-out (FIFO) microelectromechanical memory apparatus (also termed a mechanical memory) is disclosed. The mechanical memory utilizes a plurality of memory cells, with each memory cell having a beam which can be bowed in either of two directions of curvature to indicate two different logic states for that memory cell. The memory cells can be arranged around a wheel which operates as a clocking actuator to serially shift data from one memory cell to the next. The mechanical memory can be formed using conventional surface micromachining, and can be formed as either a nonvolatile memory or as a volatile memory.

  11. Parallelizing a molecular dynamics algorithm on a multiprocessor workstation using OpenMP.

    PubMed

    Tarmyshov, Konstantin B; Müller-Plathe, Florian

    2005-01-01

    The atomistic molecular dynamics program YASP has been parallelized for shared-memory computer architectures. Parallelization was restricted to the most CPU-time-consuming parts: neighbor-list construction, calculation of nonbonded, angle and dihedral forces, and constraints. Most of the sequential FORTRAN code was kept; parallel constructs were inserted as compiler directives using the OpenMP standard. Only in the case of the neighbor list did the data structure have to be changed. The parallel code achieves a useful speedup over the sequential version for systems of several thousand atoms and above. On an IBM Regatta p690+, the throughput increases with the number of processors up to a maximum of 12-16 processors depending on the characteristics of the simulated systems. On dual-processor Xeon systems, the speedup is about 1.7. PMID:16309301
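    As a rough reading of the dual-processor figure, one can ask what parallel fraction would produce a speedup of 1.7 on two processors. The sketch below assumes Amdahl's law applies, which the abstract does not claim; it ignores synchronisation and memory-bandwidth effects, and the function names are illustrative.

```python
def amdahl_speedup(p, n_procs):
    """Amdahl's law: speedup for parallel fraction p on n_procs processors."""
    return 1.0 / ((1.0 - p) + p / n_procs)


def parallel_fraction(speedup, n_procs):
    """Invert Amdahl's law, S = 1 / ((1 - p) + p / n), for p."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_procs)


# Dual-processor Xeon figure quoted in the abstract
p = parallel_fraction(1.7, 2)
```

    Under this assumption, `p` comes out at roughly 0.82, i.e. about four fifths of the run time would be spent in the parallelized parts.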

  12. Shared Cataloguing.

    ERIC Educational Resources Information Center

    Westby, Barbara M.

    The National Program for Acquisition and Cataloging (NPAC) authorized under Title IIC of the Higher Education Act of 1965 is called the Shared Cataloging Program. Under this Act the Library of Congress is authorized to: (1) acquire for its own collections all materials currently published throughout the world that are of value to scholarship and…

  13. Towards Scalable 1024 Processor Shared Memory Systems

    NASA Technical Reports Server (NTRS)

    Ciotti, Robert B.; Thigpen, William W. (Technical Monitor)

    2001-01-01

    Over the past 3 years, NASA Ames has been involved in a cooperative effort with SGI to develop the largest single system image systems available. Currently a 1024 Origin3000 is under development, with first boot expected later in the summer of 2001. This paper discusses some early results with a 512p Origin3000 system and some arcane IRIX system calls that can dramatically improve scaling performance.

  14. Debugging Fortran on a shared memory machine

    SciTech Connect

    Allen, T.R.; Padua, D.A.

    1987-01-01

    Debugging on a parallel processor is more difficult than debugging on a serial machine because errors in a parallel program may introduce nondeterminism. The approach to parallel debugging presented here attempts to reduce the problem of debugging on a parallel machine to that of debugging on a serial machine by automatically detecting nondeterminism. 20 refs., 6 figs.

  15. Parallel implementation of arbitrary-shaped MPEG-4 decoder for multiprocessor systems

    NASA Astrophysics Data System (ADS)

    Pastrnak, Milan; de With, Peter H. N.; Stuijk, Sander; van Meerbergen, Jef

    2006-01-01

    MPEG-4 is the first standard that combines synthetic objects, like 2D/3D graphics objects, with natural rectangular and non-rectangular video objects. The independent access to individual synthetic video objects for further manipulation creates a large space for future applications. This paper addresses the optimization of such complex multimedia algorithms for implementation on multiprocessor platforms. It is shown that when choosing the correct granularity of processing for enhanced parallelism and splitting time-critical tasks, a substantial improvement in processing efficiency can be obtained. In our work, we focus on decoders for non-rectangular (also called arbitrary-shaped) video objects. In previous work, we motivated the use of a multiprocessor System-on-Chip (SoC) setup that satisfies the requirements on the overall computation capacity. We propose optimizations of the MPEG-4 algorithm that increase the decoding throughput and make more efficient use of the multiprocessor architecture. First, we present a modification of the Repetitive Padding to increase the pipelining at block level. We identified the part of the padding algorithm that can be executed in parallel with the DCT-coefficient decoding and modified the original algorithm into two communicating tasks. Second, we introduce a synchronization mechanism that allows the processing for the Extended Padding and postprocessing (Deblocking & Deringing) filters at block level. The first optimization results in a decrease of about 58% in the computational requirements of the original Repetitive-Padding task. By introducing the previously proposed data-level parallelism and exploiting the inherent parallelism between the separated color components (Y, Cr, Cb), the computational savings are about 72% on average. Moreover, the proposed optimizations reduce the processing latency from the order of a frame to the order of a slice.

  16. ScalaBLAST 2.0: Rapid and robust BLAST calculations on multiprocessor systems

    SciTech Connect

    Oehmen, Christopher S.; Baxter, Douglas J.

    2013-03-15

    BLAST remains one of the most widely used tools in computational biology. The rate at which new sequence data is available continues to grow exponentially, driving the emergence of new fields of biological research. At the same time multicore systems and conventional clusters are more accessible. ScalaBLAST has been designed to run on conventional multiprocessor systems with an eye to extreme parallelism, enabling parallel BLAST calculations using over 16,000 processing cores with a portable, robust, fault-resilient design. ScalaBLAST 2.0 source code can be freely downloaded from http://omics.pnl.gov/software/ScalaBLAST.php.

  17. Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 4: FTMP executive summary

    NASA Technical Reports Server (NTRS)

    Smith, T. B., III; Lala, J. H.

    1984-01-01

    The FTMP architecture is a high reliability computer concept modeled after a homogeneous multiprocessor architecture. Elements of the FTMP are operated in tight synchronism with one another and hardware fault-detection and fault-masking is provided which is transparent to the software. Operating system design and user software design is thus greatly simplified. Performance of the FTMP is also comparable to that of a simplex equivalent due to the efficiency of fault handling hardware. The FTMP project constructed an engineering module of the FTMP, programmed the machine and extensively tested the architecture through fault injection and other stress testing. This testing confirmed the soundness of the FTMP concepts.

  18. A high speed multi-tasking, multi-processor telemetry system

    SciTech Connect

    Wu, Kung Chris

    1996-12-31

    This paper describes a small size, light weight, multitasking, multiprocessor telemetry system capable of collecting 32 channels of differential signals at a sampling rate of 6.25 kHz per channel. The system is designed to collect data from remote wind turbine research sites and transfer the data via wireless communication. A description of operational theory, hardware components, and itemized cost is provided. Synchronization with other data acquisition systems and test data on data transmission rates is also given. 11 refs., 7 figs., 4 tabs.

  19. Energy-efficient fault tolerance in multiprocessor real-time systems

    NASA Astrophysics Data System (ADS)

    Guo, Yifeng

    The recent progress in multiprocessor/multicore systems has important implications for real-time system design and operation. From vehicle navigation to space applications as well as industrial control systems, the trend is to deploy multiple processors in real-time systems: systems with 4-8 processors are common, and it is expected that many-core systems with dozens of processing cores will be available in the near future. For such systems, in addition to the general temporal requirements common to all real-time systems, two additional operational objectives are seen as critical: energy efficiency and fault tolerance. An intriguing dimension of the problem is that energy efficiency and fault tolerance are typically conflicting objectives, because tolerating faults (e.g., permanent/transient) often requires extra resources with high energy consumption potential. In this dissertation, various techniques for energy-efficient fault tolerance in multiprocessor real-time systems have been investigated. First, the Reliability-Aware Power Management (RAPM) framework, which can preserve the system reliability with respect to transient faults when Dynamic Voltage Scaling (DVS) is applied for energy savings, is extended to support parallel real-time applications with precedence constraints. Next, the traditional Standby-Sparing (SS) technique for dual processor systems, which takes both transient and permanent faults into consideration while saving energy, is generalized to support multiprocessor systems with an arbitrary number of identical processors. Observing the inefficient usage of slack time in the SS technique, a Preference-Oriented Scheduling Framework is designed to address the problem where tasks are given preferences for being executed as soon as possible (ASAP) or as late as possible (ALAP). A preference-oriented earliest deadline (POED) scheduler is proposed and its application in multiprocessor systems for energy-efficient fault tolerance is

  20. Multi-objective two-stage multiprocessor flow shop scheduling - a subgroup particle swarm optimisation approach

    NASA Astrophysics Data System (ADS)

    Huang, Rong-Hwa; Yang, Chang-Lin; Hsu, Chun-Ting

    2015-12-01

    Flow shop production system - compared to other economically important production systems - is popular in real manufacturing environments. This study focuses on the flow shop with multiprocessor scheduling problem (FSMP), and develops an improved particle swarm optimisation heuristic to solve it. Additionally, this study designs an integer programming model to perform effectiveness and robustness testing on the proposed heuristic. Experimental results demonstrate a 10% to 50% improvement in the effectiveness of the proposed heuristic in small-scale problem tests, and a 10% to 40% improvement in the robustness of the heuristic in large-scale problem tests, indicating extremely satisfactory performance.

  1. Mapping virtual addresses to different physical addresses for value disambiguation for thread memory access requests

    DOEpatents

    Gala, Alan; Ohmacht, Martin

    2014-09-02

    A multiprocessor system includes nodes. Each node includes a data path that includes a core, a TLB, and a first level cache implementing disambiguation. The system also includes at least one second level cache and a main memory. For thread memory access requests, the core uses an address associated with an instruction format of the core. The first level cache uses an address format related to the size of the main memory plus an offset corresponding to hardware thread meta data. The second level cache uses a physical main memory address plus software thread meta data to store the memory access request. The second level cache accesses the main memory using the physical address with neither the offset nor the thread meta data after resolving speculation. In short, this system includes mapping of a virtual address to different physical addresses for value disambiguation for different threads.

  2. System and method for memory allocation in a multiclass memory system

    DOEpatents

    Loh, Gabriel; Meswani, Mitesh; Ignatowski, Michael; Nutter, Mark

    2016-06-28

    A system for memory allocation in a multiclass memory system includes a processor coupleable to a plurality of memories sharing a unified memory address space, and a library store to store a library of software functions. The processor identifies a type of a data structure in response to a memory allocation function call to the library for allocating memory to the data structure. Using the library, the processor allocates portions of the data structure among multiple memories of the multiclass memory system based on the type of the data structure.

  3. Data Sharing.

    PubMed

    Longo, Dan L; Drazen, Jeffrey M

    2016-01-21

    The aerial view of the concept of data sharing is beautiful. What could be better than having high-quality information carefully reexamined for the possibility that new nuggets of useful data are lying there, previously unseen? The potential for leveraging existing results for even more benefit pays appropriate increased tribute to the patients who put themselves at risk to generate the data. The moral imperative to honor their collective sacrifice is the trump card that takes this trick. However, many of us who have actually conducted clinical research, managed clinical studies and data collection and analysis, and curated data sets have . . . PMID:26789876

  4. A hybrid pyramid multiprocessor system for image processing. Volumes I and II

    SciTech Connect

    Chen, Inching.

    1989-01-01

    Various multiprocessor architectures have been considered by many researchers to handle the high computational requirements of image processing and analysis applications. However, many of these architectures are efficient only for a small class of image processing algorithms. In this research, a multiprocessor system has been proposed, designed and constructed taking into consideration various input-output and other characteristics of image processing applications. It is a hybrid pyramid with five 68020-68881 based processor nodes in the top two layers and sixteen DSP56001 based processor nodes in the third layer. The DSP (RISC) processor nodes at the bottom level are optimized for low-level image processing operations and the CISC (68020) processor nodes handle high-level tasks more efficiently. Experiments using algorithms that operate on neighborhoods of different sizes have shown consistent improvement in performance if the FIFO cache is enabled. Larger neighborhoods result in greater savings in time. Preliminary tests indicate that the top five processor nodes can execute five times as fast as a single node for many image processing tasks. Finally, the versatile image I/O with the MMU has created a simpler programming environment, while facilitating various I/O structures. The OSU pyramid is a general-purpose image processing system, utilizing a pyramidal architecture of hybrid processors, with additional hardware to retain the advantageous features of array processors, as well as to overcome some of the inherent deficiencies of pipeline processors and cellular arrays.

  5. Development of FPGA based NURBS interpolator and motion controller with multiprocessor technique

    NASA Astrophysics Data System (ADS)

    Zhao, Huan; Zhu, Limin; Xiong, Zhenhua; Ding, Han

    2013-09-01

    High-speed computational performance is gained at the cost of large hardware resources, which restricts the application of high-accuracy algorithms because hardware cost is limited in practical use. To solve this problem, a novel method for designing a field programmable gate array (FPGA)-based non-uniform rational B-spline (NURBS) interpolator and motion controller, which adopts the embedded multiprocessor technique, is proposed in this study. The hardware and software design for the multiprocessor, in which one processor performs NURBS interpolation and the other performs position servo control, is presented. Performance analysis and experiments on an X-Y table are carried out, and the hardware cost and the time consumed by interpolation and motion control are compared with those of existing methods. The experimental and comparison results indicate that, compared with the existing methods, the proposed method can reduce the hardware cost by 97.5% while using a higher-accuracy interpolation algorithm within a period of 0.5 ms. The proposed method ensures real-time performance and interpolation accuracy, significantly reduces hardware cost, and is practical for industrial applications.

  6. BioThreads: a novel VLIW-based chip multiprocessor for accelerating biomedical image processing applications.

    PubMed

    Stevens, David; Chouliaras, Vassilios; Azorin-Peris, Vicente; Zheng, Jia; Echiadis, Angelos; Hu, Sijung

    2012-06-01

    We discuss BioThreads, a novel, configurable, extensible system-on-chip multiprocessor and its use in accelerating biomedical signal processing applications such as imaging photoplethysmography (IPPG). BioThreads is derived from the LE1 open-source VLIW chip multiprocessor and efficiently handles instruction, data and thread-level parallelism. In addition, it supports a novel mechanism for the dynamic creation, and allocation of software threads to uncommitted processor cores by implementing key POSIX Threads primitives directly in hardware, as custom instructions. In this study, the BioThreads core is used to accelerate the calculation of the oxygen saturation map of living tissue in an experimental setup consisting of a high speed image acquisition system, connected to an FPGA board and to a host system. Results demonstrate near-linear acceleration of the core kernels of the target blood perfusion assessment with increasing number of hardware threads. The BioThreads processor was implemented on both standard-cell and FPGA technologies; in the first case and for an issue width of two, full real-time performance is achieved with 4 cores whereas on a mid-range Xilinx Virtex6 device this is achieved with 10 dual-issue cores. An 8-core LE1 VLIW FPGA prototype of the system achieved 240 times faster execution time than the scalar Microblaze processor demonstrating the scalability of the proposed solution to a state-of-the-art FPGA vendor provided soft CPU core. PMID:23853147

  7. Memory Matters

    MedlinePlus

    ... different parts. Some of them are important for memory. The hippocampus (say: hih-puh-KAM-pus) is one of the more important parts of the brain that processes memories. Old information and new information, or memories, are ...

  8. Sharing values, sharing a vision

    SciTech Connect

    Not Available

    1993-12-31

    Teamwork, partnership and shared values emerged as recurring themes at the Third Technology Transfer/Communications Conference. The program drew about 100 participants who sat through a packed two days to find ways for their laboratories and facilities to better help American business and the economy. Co-hosts were the Lawrence Livermore National Laboratory and the Lawrence Berkeley Laboratory, where most meetings took place. The conference followed traditions established at the First Technology Transfer/Communications Conference, conceived of and hosted by the Pacific Northwest Laboratory in May 1992 in Richland, Washington, and the second conference, hosted by the National Renewable Energy Laboratory in January 1993 in Golden, Colorado. As at the other conferences, participants at the third session represented the fields of technology transfer, public affairs and communications. They came from Department of Energy headquarters and DOE offices, laboratories and production facilities. Contained in this report are the keynote address, panel discussion, workshops, and presentations in technology transfer.

  9. Event parallelism: Distributed memory parallel computing for high energy physics experiments

    SciTech Connect

    Nash, T.

    1989-05-01

    This paper describes the present and expected future development of distributed memory parallel computers for high energy physics experiments. It covers the use of event parallel microprocessor farms, particularly at Fermilab, including both ACP multiprocessors and farms of MicroVAXES. These systems have proven very cost effective in the past. A case is made for moving to the more open environment of UNIX and RISC processors. The 2nd Generation ACP Multiprocessor System, which is based on powerful RISC systems, is described. Given the promise of still more extraordinary increases in processor performance, a new emphasis on point to point, rather than bussed, communication will be required. Developments in this direction are described. 6 figs.

  10. [Memory systems and memory disorders].

    PubMed

    Van der Linden, Martial; Juillerat, Anne-Claude

    2003-02-15

    Recent cognitive models suggest that memory has a complex structure, composed of several independent systems (working memory, and four long-term memory systems: episodic memory, semantic memory, perceptual representation system, and procedural memory). Furthermore, neuropsychological studies show that a brain lesion can selectively impair some systems or some particular process in a system, while others are spared. In this theoretical context, the objective of assessment is to detect the impaired memory systems and processes as well as those that remain intact. To do this, the clinician has to use various tests specifically designed to assess the integrity of each memory system and process. PMID:12708274

  11. The OpenMP Memory Model

    SciTech Connect

    Hoeflinger, J P; de Supinski, B R

    2005-06-01

    The memory model of OpenMP has been widely misunderstood since the first OpenMP specification was published in 1997 (Fortran 1.0). The proposed OpenMP specification (version 2.5) includes a memory model section to address this issue. This section unifies and clarifies the text about the use of memory in all previous specifications, and relates the model to well-known memory consistency semantics. In this paper, we discuss the memory model and show its implications for future distributed shared memory implementations of OpenMP.

  12. Bipartite memory network architectures for parallel processing

    SciTech Connect

    Smith, W.; Kale, L.V. (Dept. of Computer Science)

    1990-01-01

    Parallel architectures are broadly classified as either shared memory or distributed memory architectures. In this paper, the authors propose a third family of architectures, called bipartite memory network architectures. In this architecture, processors and memory modules constitute a bipartite graph, where each processor is allowed to access a small subset of the memory modules, and each memory module allows access from a small set of processors. The architecture is particularly suitable for computations requiring dynamic load balancing. The authors explore the properties of this architecture by examining the Perfect Difference set based topology for the graph. Extensions of this topology are also suggested.
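
    A minimal sketch of how a perfect difference set can define such a bipartite graph is given below. The specific set {0, 1, 3} modulo 7 and the rule "processor p reaches module (p + d) mod n" are assumptions chosen for illustration; the paper's exact construction may differ.

```python
from itertools import combinations

def is_perfect_difference_set(ds, n):
    """True if every nonzero residue mod n is the difference of exactly one ordered pair."""
    diffs = sorted((a - b) % n for a in ds for b in ds if a != b)
    return diffs == list(range(1, n))

def bipartite_links(ds, n):
    """Map each processor p to the set of memory modules (p + d) mod n it may access."""
    return {p: {(p + d) % n for d in ds} for p in range(n)}

D, N = (0, 1, 3), 7  # illustrative perfect difference set modulo 7
links = bipartite_links(D, N)

# Number of memory modules shared by each pair of distinct processors.
shared = {len(links[p] & links[q]) for p, q in combinations(range(N), 2)}
```

    With this choice every processor reaches only three of the seven modules, yet any two processors have exactly one module in common, so data can move between any pair through a single shared module.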

  13. Commodity multi-processor systems in the ATLAS level-2 trigger

    SciTech Connect

    Abolins, M.; Blair, R.; Bock, R.; Bogaerts, A.; Dawson, J.; Ermoline, Y.; Hauser, R.; Kugel, A.; Lay, R.; Muller, M.; Noffz, K.-H.; Pope, B.; Schlereth, J.; Werner, P.

    2000-05-23

    Low cost SMP (Symmetric Multi-Processor) systems provide substantial CPU and I/O capacity. These features together with the ease of system integration make them an attractive and cost effective solution for a number of real-time applications in event selection. In ATLAS the authors consider them as intelligent input buffers (active ROB complex), as event flow supervisors or as powerful processing nodes. Measurements of the performance of one off-the-shelf commercial 4-processor PC with two PCI buses, equipped with commercial FPGA based data source cards (microEnable) and running commercial software are presented and mapped on such applications together with a long-term program of work. The SMP systems may be considered as an important building block in future data acquisition systems.

  14. Closed-form solutions of performability. [modeling of a degradable buffer/multiprocessor system]

    NASA Technical Reports Server (NTRS)

    Meyer, J. F.

    1981-01-01

    Methods which yield closed form performability solutions for continuous valued variables are developed. The models are similar to those employed in performance modeling (i.e., Markovian queueing models) but are extended so as to account for variations in structure due to faults. In particular, the modeling of a degradable buffer/multiprocessor system is considered whose performance Y is the (normalized) average throughput rate realized during a bounded interval of time. To avoid known difficulties associated with exact transient solutions, an approximate decomposition of the model is employed permitting certain submodels to be solved in equilibrium. These solutions are then incorporated in a model with fewer transient states and by solving the latter, a closed form solution of the system's performability is obtained. In conclusion, some applications of this solution are discussed and illustrated, including an example of design optimization.

  15. A partitioning strategy for efficient nonlinear finite element dynamic analysis on multiprocessor computer

    NASA Technical Reports Server (NTRS)

    Noor, Ahmed K.; Peters, Jeanne M.

    1989-01-01

    A computational procedure is presented for the nonlinear dynamic analysis of unsymmetric structures on vector multiprocessor systems. The procedure is based on a novel hierarchical partitioning strategy in which the response of the unsymmetric structure is approximated by a combination of symmetric and antisymmetric response vectors (modes), each obtained by using only a fraction of the degrees of freedom of the original finite element model. The three key elements of the procedure which result in a high degree of concurrency throughout the solution process are: (1) mixed (or primitive variable) formulation with independent shape functions for the different fields; (2) operator splitting or restructuring of the discrete equations at each time step to delineate the symmetric and antisymmetric vectors constituting the response; and (3) two-level iterative process for generating the response of the structure. An assessment is made of the effectiveness of the procedure on the CRAY X-MP/4 computers.

  16. Partitioning strategy for efficient nonlinear finite element dynamic analysis on multiprocessor computers

    NASA Technical Reports Server (NTRS)

    Noor, Ahmed K.; Peters, Jeanne M.

    1989-01-01

    A computational procedure is presented for the nonlinear dynamic analysis of unsymmetric structures on vector multiprocessor systems. The procedure is based on a novel hierarchical partitioning strategy in which the response of the unsymmetric structure is approximated by a combination of symmetric and antisymmetric response vectors (modes), each obtained by using only a fraction of the degrees of freedom of the original finite element model. The three key elements of the procedure which result in a high degree of concurrency throughout the solution process are: (1) mixed (or primitive variable) formulation with independent shape functions for the different fields; (2) operator splitting or restructuring of the discrete equations at each time step to delineate the symmetric and antisymmetric vectors constituting the response; and (3) two-level iterative process for generating the response of the structure. An assessment is made of the effectiveness of the procedure on the CRAY X-MP/4 computers.

  17. Analysis of Photonic Networks for a Chip Multiprocessor Using Scientific Applications

    SciTech Connect

    Kamil, Shoaib A; Hendry, Gilbert; Biberman, Aleksandr; Chan, Johnnie; Lee, Benjamin G.; Mohiyuddin, Marghoob; Jain, Ankit; Bergman, Keren; Carloni, Luca; Kubiatowicz, John; Oliker, Leonid; Shalf, John

    2009-01-31

    As multiprocessors scale to unprecedented numbers of cores in order to sustain performance growth, it is vital that these gains are not nullified by high energy consumption from inter-core communication. With recent advances in 3D Integration CMOS technology, the possibility for realizing hybrid photonic-electronic networks-on-chip warrants investigating real application traces on functionally comparable photonic and electronic network designs. We present a comparative analysis using both synthetic benchmarks and real applications, run through detailed cycle-accurate models implemented under the OMNeT++ discrete event simulation environment. Results show that when utilizing standard process-to-processor mapping methods, this hybrid network can achieve 75X improvement in energy efficiency (defined as network performance per energy spent) for synthetic benchmarks and up to 37X improvement for real scientific applications, over an electronic mesh for large messages across a variety of communication patterns.

  18. Simulating a small turboshaft engine in real-time multiprocessor simulator (RTMPS) environment

    NASA Technical Reports Server (NTRS)

    Milner, E. J.; Arpasi, D. J.

    1986-01-01

    A Real-Time Multiprocessor Simulator (RTMPS) has been developed at NASA Lewis Research Center. The RTMPS uses parallel microprocessors to achieve computing speeds needed for real-time engine simulation. This report describes the use of the RTMPS system to simulate a small turboshaft engine. The process of programming the engine equations and distributing them over one, two, and four processors is discussed. Steady-state and transient results from the RTMPS simulation are compared with results from a main-frame-based simulation. Processor execution times and the associated execution time savings for the two and four processor cases are presented using actual data obtained from the RTMPS system. Included is a discussion of why the minimum achievable calculation time for the turboshaft engine model was attained using four processors. Finally, future enhancements to the RTMPS system are discussed including the development of a generalized partitioning algorithm to automatically distribute the system equations among the processors in optimum fashion.

  19. Characterizing parallel file-access patterns on a large-scale multiprocessor

    NASA Technical Reports Server (NTRS)

    Purakayastha, A.; Ellis, Carla; Kotz, David; Nieuwejaar, Nils; Best, Michael L.

    1995-01-01

    High-performance parallel file systems are needed to satisfy tremendous I/O requirements of parallel scientific applications. The design of such high-performance parallel file systems depends on a comprehensive understanding of the expected workload, but so far there have been very few usage studies of multiprocessor file systems. This paper is part of the CHARISMA project, which intends to fill this void by measuring real file-system workloads on various production parallel machines. In particular, we present results from the CM-5 at the National Center for Supercomputing Applications. Our results are unique because we collect information about nearly every individual I/O request from the mix of jobs running on the machine. Analysis of the traces leads to various recommendations for parallel file-system design.

  20. Memory Palaces

    ERIC Educational Resources Information Center

    Wood, Marianne

    2007-01-01

    This article presents a lesson called Memory Palaces. A memory palace is a memory tool used to remember information, usually as visual images, in a sequence that is logical to the person remembering it. In his book, "In the Palaces of Memory", George Johnson calls them "...structure(s) for arranging knowledge. Lots of connections to language arts,…

  1. Real-time aware rendering of scalable arbitrary-shaped MPEG-4 decoder for multiprocessor systems

    NASA Astrophysics Data System (ADS)

    Pastrnak, Milan; de With, Peter H. N.; van Meerbergen, Jef

    2007-02-01

    The MPEG-4 video standard extends the traditional frame-based processing with the option to compose several video objects (VO) superimposed on a background sprite image. In our previous work, we presented a distributed, multiprocessor based, scalable implementation of an MPEG-4 arbitrary-shaped decoder, which, together with the background sprite decoder, forms an essential part of further scene rendering. For control of the multiprocessor architecture, we have constructed a Quality-of-Service (QoS) management that monitors the availability of required data and distributes the processing of individual tasks with guaranteed or best-effort services of the platform. However, the proposed architecture with the combined guaranteed and best-effort services poses problems for real-time scene rendering. In this paper, we present a technique for proper run-time rendering of the final scene after decoding one VO Layer. The individual video-object monitors check the data availability and select the highest quality for the final scene rendering. The algorithm operates hierarchically both at the scene level and at the task level of the video object processing. Whereas the earlier work on scalable implementation concentrated only on guaranteed services, we now introduce a new element in the system architecture for the real-time control and fallback mechanism of the best-effort services. This element is based on, first, controlling data availability at task level and, second, introducing the propagation service to QoS management. We present our simulation results in comparison with the standard "frame-skipping" technique, which is the only currently available solution for rendering with this type of scalable processing.

  2. Involuntary memory chains: what do they tell us about autobiographical memory organisation?

    PubMed

    Mace, John H; Clevinger, Amanda M; Bernas, Ronan S

    2013-04-01

    Involuntary memory chains are spontaneous recollections of the past that occur as a sequence of associated memories. This memory phenomenon has provided some insights into the nature of associations in autobiographical memory. For example, it has shown that conceptually associated memories (memories sharing similar content, such as the same people or themes) are more prevalent than general-event associated memories (memories from the same extended event period, such as a trip). This finding has suggested that conceptual associations are a central organisational principle in the autobiographical memory system. This study used involuntary memory chains to gain additional insights into the associative structure of autobiographical memory. Among the main results, we found that general-event associations have higher rates of forgetting than conceptual associations, and in long memory chains (i.e., those with more than two memories) conceptually associated memories were more likely to activate memories in their associative class, whereas general-event associated memories were less likely to activate memories in their associative class. We interpret the results as further evidence that conceptual associations are a major organising principle in the autobiographical memory system, and attempt to explain why general-event associations have shorter lifespans than conceptual associations. PMID:23016577

  3. The evolution of episodic memory

    PubMed Central

    Allen, Timothy A.; Fortin, Norbert J.

    2013-01-01

    One prominent view holds that episodic memory emerged recently in humans and lacks a “(neo)Darwinian evolution” [Tulving E (2002) Annu Rev Psychol 53:1–25]. Here, we review evidence supporting the alternative perspective that episodic memory has a long evolutionary history. We show that fundamental features of episodic memory capacity are present in mammals and birds and that the major brain regions responsible for episodic memory in humans have anatomical and functional homologs in other species. We propose that episodic memory capacity depends on a fundamental neural circuit that is similar across mammalian and avian species, suggesting that protoepisodic memory systems exist across amniotes and, possibly, all vertebrates. The implication is that episodic memory in diverse species may primarily be due to a shared underlying neural ancestry, rather than the result of evolutionary convergence. We also discuss potential advantages that episodic memory may offer, as well as species-specific divergences that have developed on top of the fundamental episodic memory architecture. We conclude by identifying possible time points for the emergence of episodic memory in evolution, to help guide further research in this area. PMID:23754432

  4. Abnormal fault-recovery characteristics of the fault-tolerant multiprocessor uncovered using a new fault-injection methodology

    NASA Astrophysics Data System (ADS)

    Padilla, Peter A.

    1991-03-01

    An investigation was made in AIRLAB of the fault handling performance of the Fault Tolerant MultiProcessor (FTMP). Fault handling errors detected during fault injection experiments were characterized. In these fault injection experiments, the FTMP disabled a working unit instead of the faulted unit once in every 500 faults, on the average. System design weaknesses allow active faults to exercise a part of the fault management software that handles Byzantine or lying faults. Byzantine faults behave such that the faulted unit points to a working unit as the source of errors. The design's problems involve: (1) the design and interface between the simplex error detection hardware and the error processing software, (2) the functional capabilities of the FTMP system bus, and (3) the communication requirements of a multiprocessor architecture. These weak areas in the FTMP's design increase the probability that, for any hardware fault, a good line replacement unit (LRU) is mistakenly disabled by the fault management software.

  5. Abnormal fault-recovery characteristics of the fault-tolerant multiprocessor uncovered using a new fault-injection methodology

    NASA Technical Reports Server (NTRS)

    Padilla, Peter A.

    1991-01-01

    An investigation was made in AIRLAB of the fault handling performance of the Fault Tolerant MultiProcessor (FTMP). Fault handling errors detected during fault injection experiments were characterized. In these fault injection experiments, the FTMP disabled a working unit instead of the faulted unit once in every 500 faults, on the average. System design weaknesses allow active faults to exercise a part of the fault management software that handles Byzantine or lying faults. Byzantine faults behave such that the faulted unit points to a working unit as the source of errors. The design's problems involve: (1) the design and interface between the simplex error detection hardware and the error processing software, (2) the functional capabilities of the FTMP system bus, and (3) the communication requirements of a multiprocessor architecture. These weak areas in the FTMP's design increase the probability that, for any hardware fault, a good line replacement unit (LRU) is mistakenly disabled by the fault management software.

  6. Memory systems.

    PubMed

    Wolk, David A; Budson, Andrew E

    2010-08-01

    Converging evidence from patient and neuroimaging studies suggests that memory is a collection of abilities that use different neuroanatomic systems. Neurologic injury may impair one or more of these memory systems. Episodic memory allows us to mentally travel back in time and relive an episode of our life. Episodic memory depends on the hippocampus, other medial temporal lobe structures, the limbic system, and the frontal lobes, as well as several other brain regions. Semantic memory provides our general knowledge about the world and is unconnected to any specific episode of our life. Although semantic memory likely involves much of the neocortex, the inferolateral temporal lobes (particularly the left) are most important. Procedural memory enables us to learn cognitive and behavioral skills and algorithms that operate at an automatic, unconscious level. Damage to the basal ganglia, cerebellum, and supplementary motor area often impairs procedural memory. PMID:22810510

  7. Loop transformations to prevent false sharing

    SciTech Connect

    Granston, E.D.; Montaut, T.; Bodin, F.

    1995-08-01

    To date, page management in shared virtual memory (SVM) systems has been primarily the responsibility of the run-time system. However, there are some problems that are difficult to resolve efficiently at run time. Chief among these is false sharing. In this paper, a loop transformation theory is developed for identifying and eliminating potential sources of multiple-writer false sharing and other sources of page migration resulting from regular references in numerical applications. Loop nests of one and two dimensions (before blocking) with single-level, DOALL-style parallelism are covered. The potential of these transformations is demonstrated experimentally.
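
    The multiple-writer false sharing problem mentioned above can be made concrete with a small counting model. The sketch below is not the paper's transformation theory; it only assumes a DOALL loop in which iteration i writes element a[i], with hypothetical array, page, and processor counts, and compares how two iteration-to-processor mappings spread writers across pages.

```python
def multi_writer_pages(n, page_elems, owner):
    """Count pages written by more than one processor when iteration i writes a[i]."""
    writers = {}
    for i in range(n):
        writers.setdefault(i // page_elems, set()).add(owner(i))
    return sum(1 for procs in writers.values() if len(procs) > 1)

N, PAGE, P = 4096, 512, 4  # hypothetical sizes: 8 pages, 4 processors
chunk = N // P             # 1024 iterations per processor, a multiple of PAGE

# Cyclic mapping: consecutive iterations go to different processors.
cyclic = multi_writer_pages(N, PAGE, lambda i: i % P)

# Blocked mapping: each processor gets one contiguous, page-aligned range.
blocked = multi_writer_pages(N, PAGE, lambda i: i // chunk)
```

    In this model the cyclic mapping leaves all 8 pages with several writers, while the page-aligned blocked mapping leaves none, which is the kind of effect a compile-time loop transformation aims for.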

  8. Cognitive memory.

    PubMed

    Widrow, Bernard; Aragon, Juan Carlos

    2013-05-01

    Regarding the workings of the human mind, memory and pattern recognition seem to be intertwined. You generally do not have one without the other. Taking inspiration from life experience, a new form of computer memory has been devised. Certain conjectures about human memory are keys to the central idea. The design of a practical and useful "cognitive" memory system is contemplated, a memory system that may also serve as a model for many aspects of human memory. The new memory does not function like a computer memory where specific data is stored in specific numbered registers and retrieval is done by reading the contents of the specified memory register, or done by matching key words as with a document search. Incoming sensory data would be stored at the next available empty memory location, and indeed could be stored redundantly at several empty locations. The stored sensory data would neither have key words nor would it be located in known or specified memory locations. Sensory inputs concerning a single object or subject are stored together as patterns in a single "file folder" or "memory folder". When the contents of the folder are retrieved, sights, sounds, tactile feel, smell, etc., are obtained all at the same time. Retrieval would be initiated by a query or a prompt signal from a current set of sensory inputs or patterns. A search through the memory would be made to locate stored data that correlates with or relates to the prompt input. The search would be done by a retrieval system whose first stage makes use of autoassociative artificial neural networks and whose second stage relies on exhaustive search. Applications of cognitive memory systems have been made to visual aircraft identification, aircraft navigation, and human facial recognition. Concerning human memory, reasons are given why it is unlikely that long-term memory is stored in the synapses of the brain's neural networks. Reasons are given suggesting that long-term memory is stored in DNA or RNA.

  9. 3-dimensional magnetotelluric inversion including topography using deformed hexahedral edge finite elements and direct solvers parallelized on symmetric multiprocessor computers - Part II: direct data-space inverse solution

    NASA Astrophysics Data System (ADS)

    Kordy, M.; Wannamaker, P.; Maris, V.; Cherkaev, E.; Hill, G.

    2016-01-01

    Following the creation described in Part I of a deformable edge finite-element simulator for 3-D magnetotelluric (MT) responses using direct solvers, in Part II we develop an algorithm named HexMT for 3-D regularized inversion of MT data including topography. Direct solvers parallelized on large-RAM, symmetric multiprocessor (SMP) workstations are used also for the Gauss-Newton model update. By exploiting the data-space approach, the computational cost of the model update becomes much less in both time and computer memory than the cost of the forward simulation. In order to regularize using the second norm of the gradient, we factor the matrix related to the regularization term and apply its inverse to the Jacobian, which is done using the MKL PARDISO library. For dense matrix multiplication and factorization related to the model update, we use the PLASMA library which shows very good scalability across processor cores. A synthetic test inversion using a simple hill model shows that including topography can be important; in this case depression of the electric field by the hill can cause false conductors at depth or mask the presence of resistive structure. With a simple model of two buried bricks, a uniform spatial weighting for the norm of model smoothing recovered more accurate locations for the tomographic images compared to weightings which were a function of parameter Jacobians. We implement joint inversion for static distortion matrices tested using the Dublin secret model 2, for which we are able to reduce nRMS to ~1.1 while avoiding oscillatory convergence. Finally we test the code on field data by inverting full impedance and tipper MT responses collected around Mount St Helens in the Cascade volcanic chain. Among several prominent structures, the north-south trending, eruption-controlling shear zone is clearly imaged in the inversion.

  10. Destination memory impairment in older people.

    PubMed

    Gopie, Nigel; Craik, Fergus I M; Hasher, Lynn

    2010-12-01

    Older adults are assumed to have poor destination memory (knowing to whom they tell particular information), and anecdotes about them repeating stories to the same people are cited as informal evidence for this claim. Experiment 1 assessed young and older adults' destination memory by having participants tell facts (e.g., "A dime has 118 ridges around its edge") to pictures of famous people (e.g., Oprah Winfrey). Surprise recognition memory tests, which also assessed confidence, revealed that older adults, compared to young adults, were disproportionately impaired on destination memory relative to spared memory for the individual components (i.e., facts, faces) of the episode. Older adults also were more confident that they had not told a fact to a particular person when they actually had (i.e., a miss); this presumably causes them to repeat information more often than young adults. When the direction of information transfer was reversed in Experiment 2, such that the famous people shared information with the participants (i.e., a source memory experiment), age-related memory differences disappeared. In contrast to the destination memory experiment, older adults in the source memory experiment were more confident than young adults that someone had shared a fact with them when a different person actually had shared the fact (i.e., a false alarm). Overall, accuracy and confidence jointly influence age-related changes to destination memory, a fundamental component of successful communication. PMID:20718537

  11. The neural basis of involuntary episodic memories.

    PubMed

    Hall, Shana A; Rubin, David C; Miles, Amanda; Davis, Simon W; Wing, Erik A; Cabeza, Roberto; Berntsen, Dorthe

    2014-10-01

    Voluntary episodic memories require an intentional memory search, whereas involuntary episodic memories come to mind spontaneously without conscious effort. Cognitive neuroscience has largely focused on voluntary memory, leaving the neural mechanisms of involuntary memory largely unknown. We hypothesized that, because the main difference between voluntary and involuntary memory is the controlled retrieval processes required by the former, there would be greater frontal activity for voluntary than involuntary memories. Conversely, we predicted that other components of the episodic retrieval network would be similarly engaged in the two types of memory. During encoding, all participants heard sounds, half paired with pictures of complex scenes and half presented alone. During retrieval, paired and unpaired sounds were presented, panned to the left or to the right. Participants in the involuntary group were instructed to indicate the spatial location of the sound, whereas participants in the voluntary group were asked to additionally recall the pictures that had been paired with the sounds. All participants reported the incidence of their memories in a postscan session. Consistent with our predictions, voluntary memories elicited greater activity in dorsal frontal regions than involuntary memories, whereas other components of the retrieval network, including medial-temporal, ventral occipitotemporal, and ventral parietal regions were similarly engaged by both types of memories. These results clarify the distinct role of dorsal frontal and ventral occipitotemporal regions in predicting strategic retrieval and recalled information, respectively, and suggest that, although there are neural differences in retrieval, involuntary memories share neural components with established voluntary memory systems. PMID:24702453

  12. Microsupercomputers: Design and implementation. Technical progress report, November 1988-March 1989

    SciTech Connect

    Hennessy, J.L.; Horowitz, M.A.

    1989-03-01

    Contents: (1) parallel processor architecture; (2) parallel software; (3) unit processor architecture; (4) computer-aided design tools; (5) very large scale integration. Keywords: scalable shared memory multiprocessors, high performance cache design.

  13. Memory protection

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.

    1988-01-01

    Accidental overwriting of files or of memory regions belonging to other programs, browsing of personal files by superusers, Trojan horses, and viruses are examples of breakdowns in workstations and personal computers that would be significantly reduced by memory protection. Memory protection is the capability of an operating system and supporting hardware to delimit segments of memory, to control whether segments can be read from or written into, and to confine accesses of a program to its segments alone. The absence of memory protection in many operating systems today is the result of a bias toward a narrow definition of performance as maximum instruction-execution rate. A broader definition, including the time to get the job done, makes clear that cost of recovery from memory interference errors reduces expected performance. The mechanisms of memory protection are well understood, powerful, efficient, and elegant. They add to performance in the broad sense without reducing instruction execution rate.
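
    The definition above (delimit segments, control read and write access, confine a program to its own segments) can be sketched as a simple access check. The `Segment` fields, the `ProtectionFault` exception, and the addresses used here are hypothetical names invented for the illustration; real systems perform this check in hardware on every access.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    base: int        # first address of the segment
    length: int      # number of addressable units in the segment
    readable: bool
    writable: bool

class ProtectionFault(Exception):
    """Raised when an access falls outside the allowed segments or modes."""

def check_access(segments, addr, write):
    """Return the segment containing addr if the access is permitted, else raise."""
    for seg in segments:
        if seg.base <= addr < seg.base + seg.length:
            if (write and not seg.writable) or (not write and not seg.readable):
                raise ProtectionFault(f"access mode not permitted at {addr:#x}")
            return seg
    raise ProtectionFault(f"address {addr:#x} is outside the program's segments")

# One read-only segment (for example, code) and one read-write segment (data).
program_segments = [
    Segment(base=0x1000, length=0x1000, readable=True, writable=False),
    Segment(base=0x4000, length=0x800, readable=True, writable=True),
]
```

    Under these assumptions a write to 0x4010 is allowed, a write to 0x1004 faults because the segment is read-only, and any access to 0x9000 faults because the address belongs to no segment of the program.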

  14. Quantum memory

    NASA Astrophysics Data System (ADS)

    Le Gouët, Jean-Louis; Moiseev, Sergey

    2012-06-01

    Interaction of quantum radiation with multi-particle ensembles has sparked off intense research efforts during the past decade. Emblematic of this field is the quantum memory scheme, where a quantum state of light is mapped onto an ensemble of atoms and then recovered in its original shape. While opening new access to the basics of light-atom interaction, quantum memory also appears as a key element for information processing applications, such as linear optics quantum computation and long-distance quantum communication via quantum repeaters. Not surprisingly, it is far from trivial to practically recover a stored quantum state of light and, although impressive progress has already been accomplished, researchers are still struggling to reach this ambitious objective. This special issue provides an account of the state-of-the-art in a fast-moving research area that makes physicists, engineers and chemists work together at the forefront of their discipline, involving quantum fields and atoms in different media, magnetic resonance techniques and material science. Various strategies have been considered to store and retrieve quantum light. The explored designs belong to three main, while still overlapping, classes. In architectures derived from photon echo, information is mapped over the spectral components of inhomogeneously broadened absorption bands, such as those encountered in rare earth ion doped crystals and atomic gases in external gradient magnetic field. Protocols based on electromagnetically induced transparency also rely on resonant excitation and are ideally suited to the homogeneous absorption lines offered by laser cooled atomic clouds or ion Coulomb crystals. Finally, off-resonance approaches are illustrated by Faraday and Raman processes. Coupling with an optical cavity may enhance the storage process, even for negligibly small atom number. Multiple scattering is also proposed as a way to enlarge the quantum interaction distance of light with matter. The

  15. Solving Partial Differential Equations in a data-driven multiprocessor environment

    SciTech Connect

    Gaudiot, J.L.; Lin, C.M.; Hosseiniyar, M.

    1988-12-31

    Partial differential equations can be found in a host of engineering and scientific problems. The emergence of new parallel architectures has spurred research in the definition of parallel PDE solvers. Concurrently, highly programmable systems such as data-flow architectures have been proposed for the exploitation of large scale parallelism. The implementation of some Partial Differential Equation solvers (such as the Jacobi method) on a tagged token data-flow graph is demonstrated here. Asynchronous methods (chaotic relaxation) are studied and new scheduling approaches (the Token No-Labeling scheme) are introduced in order to support the implementation of the asynchronous methods in a data-driven environment. New high-level data-flow language program constructs are introduced in order to handle chaotic operations. Finally, the performance of the program graphs is demonstrated by a deterministic simulation of a message passing data-flow multiprocessor. An analysis of the overhead in the data-flow graphs is undertaken to demonstrate the limits of parallel operations in data-flow PDE program graphs.
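
    For readers unfamiliar with the two relaxation styles named above, the sketch below contrasts them on a one-dimensional Laplace problem. It is a sequential toy model under stated assumptions: the grid size, boundary values, and sweep count are arbitrary, and the "chaotic" sweep only imitates asynchronous updates by visiting points in random order and using whatever neighbour values are current. It does not model the tagged-token data-flow machine or the Token No-Labeling scheme.

```python
import random

def jacobi_sweep(u):
    """One Jacobi sweep: every interior point is computed from the previous iterate."""
    interior = [(u[i - 1] + u[i + 1]) / 2 for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]

def chaotic_sweep(u, rng):
    """One in-place sweep in random order, using the neighbour values currently stored."""
    order = list(range(1, len(u) - 1))
    rng.shuffle(order)
    for i in order:
        u[i] = (u[i - 1] + u[i + 1]) / 2
    return u

n = 9
exact = [i / (n - 1) for i in range(n)]  # solution of u'' = 0 with u(0) = 0, u(1) = 1

u = [0.0] * (n - 1) + [1.0]
for _ in range(500):
    u = jacobi_sweep(u)

v = [0.0] * (n - 1) + [1.0]
rng = random.Random(0)  # fixed seed so the run is repeatable
for _ in range(500):
    v = chaotic_sweep(v, rng)

jacobi_error = max(abs(a - b) for a, b in zip(u, exact))
chaotic_error = max(abs(a - b) for a, b in zip(v, exact))
```

    Both variants converge to the same linear profile here; the practical difference is that the Jacobi sweep needs a synchronization point between iterations, whereas the chaotic variant tolerates updates arriving in any order.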

  16. Fault-free behavior of reliable multiprocessor systems: FTMP experiments in AIRLAB

    NASA Technical Reports Server (NTRS)

    Clune, E.; Segall, Z.; Siewiorek, D.

    1985-01-01

    This report describes a set of experiments which were implemented on the Fault tolerant Multi-Processor (FTMP) at NASA/Langley's AIRLAB facility. These experiments are part of an effort to formulate and evaluate validation methodologies for fault-tolerant computers. This report deals with the measurement of single parameters (baselines) of a fault free system. The initial set of baseline experiments led to the following conclusions: (1) the system clock is constant and independent of workload in the tested cases; (2) the instruction execution times are constant; (3) the R4 frame size is 40 ms with some variation; (4) the frame stretching mechanism has some flaws in its implementation that allow the possibility of an infinite stretching of frame duration. Future experiments are planned. Some will broaden the results of these initial experiments. Others will measure the system more dynamically. The implementation of a synthetic workload generation mechanism for FTMP is planned to enhance the experimental environment of the system.

  17. Common interface real-time multiprocessor operating system for embedded systems. Master's thesis

    SciTech Connect

    Rottman, M.S.

    1991-03-04

    Large real time applications such as aerospace avionics systems, battle management, and factory automation place many demands and constraints on the computing system not found in other applications. Software development is hindered by software dependence on the computer architecture and the lack of portability between systems. This thesis specifies and designs a real time multiprocessor operating system (RTMOS) that implements a consistent programming model, enabling the development of real time parallel software independent of the target architecture. The RTMOS defines the core functionality required to demonstrate the programming model. The RTMOS functional requirements are specified using Structured Analysis and Design Technique (SADT). A hybrid of the Design Approach for Real-Time Software (DARTS) is used to perform the preliminary and detailed designs. The preliminary design is architecture-independent; the detailed design phase maps the design to a specific parallel system, the Intel iPSC/2 hypercube. The modular RTMOS design partitions operating system operations and data structures from hardware-dependent functions for portability.

  18. MIMD (multiple instruction multiple data) multiprocessor system for real-time image processing

    NASA Astrophysics Data System (ADS)

    Pirsch, Peter; Jeschke, Hartwig

    1991-06-01

    A novel MIMD (Multiple Instruction Multiple Data) based architecture consisting of multiple processing elements (PE) has been developed. This architecture is adapted to real-time processing of sequences of different tasks for local image segments. Each PE contains an arithmetic processing unit (APU), adapted to parallel processing of low level operations, and a high level and control processor (HLCP) for medium and high level operations and control of the PE. This HLCP can be a standard signal processor or a RISC processor. Because of the local control of each PE by the HLCP and a SIMD structure of the APU, the overall system architecture is characterized as MIMD based with a local SIMD structure for low level processing. Owing to overlapped computation and communication, the multiprocessor system achieves a linear speedup compared to a single processing element. Main parts of the PE have been realized as two ASICs in a 1.5 µm CMOS process. With a system clock rate of 25 MHz, each PE provides a peak performance of 400 Mega operations per second (MOPS).

  19. Reusing existing resources for testing a multi-processor system-on-chip

    NASA Astrophysics Data System (ADS)

    Lee, Seung Eun

    2013-03-01

    In this article, we propose a test strategy for a multi-processor system-on-chip and model the test time for distributed Intellectual Property (IP) cores. The proposed test methodology uses the existing on-chip resources, IP cores and network elements in network-on-chip. The use of embedded IP cores as a built-in self-test (BIST) module completes the test much faster than an external test and provides flexibility in the test program. Moreover, the reuse of the existing network resources as a test medium eliminates additional test access mechanism (TAM) wires in the design and increases test parallelism, reducing the area and test time. Based on the proposed test methodology, we evaluate the test time for distributed IP cores. First, we define the model for a distributed IP core with four parameters in the context of test purposes. Next, the required test time is derived. Finally, we show the characteristics of IP cores for parallel testing that provide useful information for the test scheduling.

  20. Declarative memory.

    PubMed

    Riedel, Wim J; Blokland, Arjan

    2015-01-01

    Declarative Memory consists of memory for events (episodic memory) and facts (semantic memory). Methods to test declarative memory are key in investigating effects of potential cognition-enhancing substances--medicinal drugs or nutrients. A number of cognitive performance tests assessing declarative episodic memory tapping verbal learning, logical memory, pattern recognition memory, and paired associates learning are described. These tests have been used as outcome variables in 34 studies in humans that have been described in the literature in the past 10 years. The use of episodic tests in animal research is also discussed in relation to the drug effects in these tasks. The results show that nutritional supplementation of polyunsaturated fatty acids has been investigated most abundantly and, in a number of cases, but not all, shows indications of positive effects on declarative memory, more so in elderly than in young subjects. Studies investigating effects of registered anti-Alzheimer drugs, cholinesterase inhibitors in mild cognitive impairment, show positive and negative effects on declarative memory. Studies mainly carried out in healthy volunteers investigating the effects of acute dopamine stimulation indicate enhanced memory consolidation as manifested specifically by better delayed recall, especially at time points long after learning and more so when drug is administered after learning and if word lists are longer. The animal studies reveal a different picture with respect to the effects of different drugs on memory performance. This suggests that at least for episodic memory tasks, the translational value is rather poor. For the human studies, detailed parameters of the compositions of word lists for declarative memory tests are discussed and it is concluded that tailored adaptations of tests to fit the hypothesis under study, rather than "off-the-shelf" use of existing tests, are recommended. PMID:25977084

  1. The FORCE - A highly portable parallel programming language

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    This paper explains why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.

  2. Destination Memory Impairment in Older People

    PubMed Central

    Gopie, Nigel; Craik, Fergus I. M.; Hasher, Lynn

    2012-01-01

    Older adults are assumed to have poor destination memory— knowing to whom they tell particular information—and anecdotes about them repeating stories to the same people are cited as informal evidence for this claim. Experiment 1 assessed young and older adults’ destination memory by having participants tell facts (e.g., “A dime has 118 ridges around its edge”) to pictures of famous people (e.g., Oprah Winfrey). Surprise recognition memory tests, which also assessed confidence, revealed that older adults, compared to young adults, were disproportionately impaired on destination memory relative to spared memory for the individual components (i.e., facts, faces) of the episode. Older adults also were more confident that they had not told a fact to a particular person when they actually had (i.e., a miss); this presumably causes them to repeat information more often than young adults. When the direction of information transfer was reversed in Experiment 2, such that the famous people shared information with the participants (i.e., a source memory experiment), age-related memory differences disappeared. In contrast to the destination memory experiment, older adults in the source memory experiment were more confident than young adults that someone had shared a fact with them when a different person actually had shared the fact (i.e., a false alarm). Overall, accuracy and confidence jointly influence age-related changes to destination memory, a fundamental component of successful communication. PMID:20718537

  3. SHARING EDUCATIONAL SERVICES.

    ERIC Educational Resources Information Center

    Catskill Area Project in Small School Design, Oneonta, NY.

    SHARED SERVICES, A COOPERATIVE SCHOOL RESOURCE PROGRAM, IS DEFINED IN DETAIL. INCLUDED IS A DISCUSSION OF THEIR NEED, ADVANTAGES, GROWTH, DESIGN, AND OPERATION. SPECIFIC PROCEDURES FOR OBTAINING STATE AID IN SHARED SERVICES, EFFECTS OF SHARED SERVICES ON THE SCHOOL, AND HINTS CONCERNING SHARED SERVICES ARE DESCRIBED. CHARACTERISTICS OF THE SMALL…

  4. Children's Working Memory: Investigating Performance Limitations in Complex Span Tasks

    ERIC Educational Resources Information Center

    Conlin, J.A.; Gathercole, S.E.; Adams, J.W.

    2005-01-01

    Three experiments investigated the roles of resource-sharing and intrinsic memory demands in complex working memory span performance in 7- and 9-year-olds. In Experiment 1, the processing complexity of arithmetic operations was varied under conditions in which processing times were equivalent. Memory span did not differ as a function of processing…

  5. Virtual memory

    NASA Technical Reports Server (NTRS)

    Denning, P. J.

    1986-01-01

    Virtual memory was conceived as a way to automate overlaying of program segments. Modern computers have very large main memories, but need automatic solutions to the relocation and protection problems. Virtual memory serves this need as well and is thus useful in computers of all sizes. The history of the idea is traced, showing how it has become a widespread, little noticed feature of computers today.

  6. Runtime and Programming Support for Memory Adaptation in Scientific Applications via Local Disk and Remote Memory

    SciTech Connect

    Mills, Richard T; Yue, Chuan; Andreas, Stathopoulos; Nikolopoulos, Dimitrios S

    2007-01-01

    The ever increasing memory demands of many scientific applications and the complexity of today's shared computational resources still require the occasional use of virtual memory, network memory, or even out-of-core implementations, with well known drawbacks in performance and usability. In Mills et al. (Adapting to memory pressure from within scientific applications on multiprogrammed COWS. In: International Parallel and Distributed Processing Symposium, IPDPS, Santa Fe, NM, 2004), we introduced a basic framework for a runtime, user-level library, MMlib, in which DRAM is treated as a dynamic size cache for large memory objects residing on local disk. Application developers can specify and access these objects through MMlib, enabling their application to execute optimally under variable memory availability, using as much DRAM as fluctuating memory levels will allow. In this paper, we first extend our earlier MMlib prototype from a proof of concept to a usable, robust, and flexible library. We present a general framework that enables fully customizable memory malleability in a wide variety of scientific applications. We provide several necessary enhancements to the environment sensing capabilities of MMlib, and introduce a remote memory capability, based on MPI communication of cached memory blocks between 'compute nodes' and designated memory servers. The increasing speed of interconnection networks makes a remote memory approach attractive, especially at the large granularity present in large scientific applications. We show experimental results from three important scientific applications that require the general MMlib framework. The memory-adaptive versions perform nearly optimally under constant memory pressure and execute harmoniously with other applications competing for memory, without thrashing the memory system. Under constant memory pressure, we observe execution time improvements of factors between three and

  7. Memories Are Made of This

    ERIC Educational Resources Information Center

    Chang, Christine

    2010-01-01

    In this article, the author shares her memories of Sally Smith, the founder of The Lab School of Washington, where she works as the director of the Occupational Therapy. When the author first met Smith, Smith asked her what brought her to The Lab School at that point in her career. She told Smith that her background was rather eclectic, since she…

  8. Ferroelectric memory

    NASA Astrophysics Data System (ADS)

    Vorotilov, K. A.; Sigov, A. S.

    2012-05-01

    The current status of developments in the field of ferroelectric memory devices has been considered. The rapidly growing market of non-volatile memory devices has been analyzed, and the current state of the art and prospects for the scaling of parameters of non-volatile memory devices of different types have been considered. The basic constructive and technological solutions in the field of the design of ferroelectric memory devices, as well as the "roadmaps" of the development of this technology, have been discussed.

  9. When visual and verbal memories compete: evidence of cross-domain limits in working memory.

    PubMed

    Morey, Candice C; Cowan, Nelson

    2004-04-01

    Recently, investigators have suggested that visual working memory operates in a manner unaffected by the retention of verbal material. We question that conclusion on the basis of a simple dual-task experiment designed to rule out phonological memory and to identify a more central faculty as the source of a shared limitation. With a visual working memory task in which two arrays of color squares were to be compared, performance was unaffected by concurrent recitation of a two-digit list or a known seven-digit sequence. However, visual working memory performance decreased markedly when paired with a load of seven random digits. This was not a simple tradeoff, inasmuch as errors on the visual array and high digit load tasks tended to co-occur. Working memory for digits and visual information thus are both subject to at least one type of shared limit, not just domain-specific limitations. The nature of the shared limit is discussed. PMID:15260196

  10. Memory in health and in schizophrenia

    PubMed Central

    Gur, Ruben C.; Gur, Raquel E.

    2013-01-01

    Memory is an important capacity needed for survival in a changing environment, and its principles are shared across species. These principles have been studied since the inception of behavioral science, and more recently neuroscience has helped understand brain systems and mechanisms responsible for enabling aspects of memory. Here we outline the history of work on memory and its neural underpinning, and describe the major dimensions of memory processing that have been evaluated by cognitive neuroscience, focusing on episodic memory. We present evidence in healthy populations for sex differences—females outperforming in verbal and face memory, and age effects—slowed memory processes with age. We then describe deficits associated with schizophrenia. Impairment in schizophrenia is more severe in patients with negative symptoms—especially flat affect—who also show deficits in measures of social cognition. This evidence implicates medial temporal and frontal regions in schizophrenia. PMID:24459407

  11. Efficient mapping algorithms for scheduling robot inverse dynamics computation on a multiprocessor system

    NASA Technical Reports Server (NTRS)

    Lee, C. S. G.; Chen, C. L.

    1989-01-01

    Two efficient mapping algorithms for scheduling the robot inverse dynamics computation consisting of m computational modules with precedence relationship to be executed on a multiprocessor system consisting of p identical homogeneous processors with processor and communication costs to achieve minimum computation time are presented. An objective function is defined in terms of the sum of the processor finishing time and the interprocessor communication time. The minimax optimization is performed on the objective function to obtain the best mapping. This mapping problem can be formulated as a combination of the graph partitioning and the scheduling problems; both have been known to be NP-complete. Thus, to speed up the searching for a solution, two heuristic algorithms were proposed to obtain fast but suboptimal mapping solutions. The first algorithm utilizes the level and the communication intensity of the task modules to construct an ordered priority list of ready modules and the module assignment is performed by a weighted bipartite matching algorithm. For a near-optimal mapping solution, the problem can be solved by the heuristic algorithm with simulated annealing. These proposed optimization algorithms can solve various large-scale problems within a reasonable time. Computer simulations were performed to evaluate and verify the performance and the validity of the proposed mapping algorithms. Finally, experiments for computing the inverse dynamics of a six-jointed PUMA-like manipulator based on the Newton-Euler dynamic equations were implemented on an NCUBE/ten hypercube computer to verify the proposed mapping algorithms. Computer simulation and experimental results are compared and discussed.
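
    The minimax objective described above can be illustrated with a small Python sketch. This is not the authors' algorithm: it ignores precedence constraints, charges each inter-processor transfer to both endpoints (an assumption of this sketch), and searches exhaustively, which is only feasible for a handful of modules. The names mapping_cost and best_mapping are hypothetical.

```python
from itertools import product


def mapping_cost(assign, compute, comm):
    """Return the largest per-processor load for one module-to-processor mapping.

    assign[m]    -> processor index chosen for module m
    compute[m]   -> execution time of module m
    comm[(a, b)] -> cost paid when modules a and b sit on different processors
    """
    load = {}
    for m, p in enumerate(assign):
        load[p] = load.get(p, 0.0) + compute[m]
    for (a, b), cost in comm.items():
        if assign[a] != assign[b]:
            # Assumption of this sketch: both endpoints pay for the transfer.
            load[assign[a]] += cost
            load[assign[b]] += cost
    return max(load.values())


def best_mapping(compute, comm, n_procs):
    """Exhaustive minimax search over all mappings (tiny problems only)."""
    candidates = product(range(n_procs), repeat=len(compute))
    return min(candidates, key=lambda a: mapping_cost(a, compute, comm))


if __name__ == "__main__":
    compute = [4, 3, 2, 1]
    comm = {(0, 1): 5, (2, 3): 1}
    best = best_mapping(compute, comm, n_procs=2)
    print(best, mapping_cost(best, compute, comm))
```

    The exhaustive search grows as p^m, which is why heuristics such as those in the abstract are needed for realistic problem sizes.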

  12. Collaboratively Sharing Scientific Data

    NASA Astrophysics Data System (ADS)

    Wang, Fusheng; Vergara-Niedermayr, Cristobal

    Scientific research becomes increasingly reliant on multi-disciplinary, multi-institutional collaboration through sharing experimental data. Indeed, data sharing is mandatory by government research agencies such as NIH. The major hurdles for data sharing come from: i) the lack of data sharing infrastructure to make data sharing convenient for users; ii) users’ fear of losing control of their data; iii) difficulty on sharing schemas and incompatible data from sharing partners; and iv) inconsistent data under schema evolution. In this paper, we develop a collaborative data sharing system SciPort, to support consistency preserved data sharing among multiple distributed organizations. The system first provides Central Server based lightweight data integration architecture, so data and schemas can be conveniently shared across multiple organizations. Through distributed schema management, schema sharing and evolution is made possible, while data consistency is maintained and data compatibility is enforced. With this data sharing system, distributed sites can now consistently share their research data and their associated schemas with much convenience and flexibility. SciPort has been successfully used for data sharing in biomedical research, clinical trials and large scale research collaboration.

  13. Childhood Memories.

    ERIC Educational Resources Information Center

    Danielson, Kathy Everts

    1989-01-01

    Provides numerous ideas for helping students write about special memories in the following categories: growing up--future dreams; authors and illustrators; family history; special places; and special memories. Describes how to write a "bio poem," and includes a bibliography of children's books that enhance and enrich student learning and writing.…

  14. Memory Magic.

    ERIC Educational Resources Information Center

    Hartman, Thomas G.; Nowak, Norman

    This paper outlines several "tricks" that aid students in improving their memories. The distinctions between operational and figural thought processes are noted. Operational memory is described as something that allows adults to make generalizations about numbers and the rules by which they may be combined, thus leading to easier memorization.…

  15. Collaging Memories

    ERIC Educational Resources Information Center

    Wallach, Michele

    2011-01-01

    Even middle school students can have memories of their childhoods, of an earlier time. The art of Romare Bearden and the writings of Paul Auster can be used to introduce ideas about time and memory to students and inspire works of their own. Bearden is an exceptional role model for young artists, not only because of his astounding art, but also…

  16. Episodic Memories

    ERIC Educational Resources Information Center

    Conway, Martin A.

    2009-01-01

    An account of episodic memories is developed that focuses on the types of knowledge they represent, their properties, and the functions they might serve. It is proposed that episodic memories consist of "episodic elements," summary records of experience often in the form of visual images, associated to a "conceptual frame" that provides a…

  17. Robert Hooke's model of memory.

    PubMed

    Hintzman, Douglas L

    2003-03-01

    In 1682 the scientist and inventor Robert Hooke read a lecture to the Royal Society of London, in which he described a mechanistic model of human memory. Yet few psychologists today seem to have heard of Hooke's memory model. The lecture addressed questions of encoding, memory capacity, repetition, retrieval, and forgetting--some of these in a surprisingly modern way. Hooke's model shares several characteristics with the theory of Richard Semon, which came more than 200 years later, but it is more complete. Among the model's interesting properties are that (1) it allows for attention and other top-down influences on encoding; (2) it uses resonance to implement parallel, cue-dependent retrieval; (3) it explains memory for recency; (4) it offers a single-system account of repetition priming; and (5) the power law of forgetting can be derived from the model's assumptions in a straightforward way. PMID:12747488

  18. Generalized quantum secret sharing

    SciTech Connect

    Singh, Sudhir Kumar; Srikanth, R.

    2005-01-01

    We explore a generalization of quantum secret sharing (QSS) in which classical shares play a complementary role to quantum shares, exploring further consequences of an idea first studied by Nascimento, Mueller-Quade, and Imai [Phys. Rev. A 64, 042311 (2001)]. We examine three ways, termed inflation, compression, and twin thresholding, by which the proportion of classical shares can be augmented. This has the important application that it reduces quantum (information processing) players by replacing them with their classical counterparts, thereby making quantum secret sharing considerably easier and less expensive to implement in a practical setting. In compression, a QSS scheme is turned into an equivalent scheme with fewer quantum players, compensated for by suitable classical shares. In inflation, a QSS scheme is enlarged by adding only classical shares and players. In a twin-threshold scheme, we invoke two separate thresholds for classical and quantum shares based on the idea of information dilution.

  19. Share Your Values

    MedlinePlus

    ... Share Your Values ... Today, teenagers are bombarded ... mid-twenties. The Most Effective Way to Instill Values? By Example. Your words will carry more weight ...

  20. Process Management and Exception Handling in Multiprocessor Operating Systems Using Object-Oriented Design Techniques. Revised Sep. 1988

    NASA Technical Reports Server (NTRS)

    Russo, Vincent; Johnston, Gary; Campbell, Roy

    1988-01-01

    The programming of the interrupt handling mechanisms, process switching primitives, scheduling mechanism, and synchronization primitives of an operating system for a multiprocessor requires both efficient code in order to support the needs of high-performance or real-time applications and careful organization to facilitate maintenance. Although many advantages have been claimed for object-oriented class hierarchical languages and their corresponding design methodologies, the application of these techniques to the design of the primitives within an operating system has not been widely demonstrated. To investigate the role of class hierarchical design in systems programming, the authors have constructed the Choices multiprocessor operating system architecture using the C++ programming language. During the implementation, it was found that many operating system design concerns can be represented advantageously using a class hierarchical approach, including: the separation of mechanism and policy; the organization of an operating system into layers, each of which represents an abstract machine; and the notions of process and exception management. In this paper, we discuss an implementation of the low-level primitives of this system and outline the strategy by which we developed our solution.

  1. Parallel processing of real-time dynamic systems simulation on OSCAR (Optimally SCheduled Advanced multiprocessoR)

    NASA Technical Reports Server (NTRS)

    Kasahara, Hironori; Honda, Hiroki; Narita, Seinosuke

    1989-01-01

    Parallel processing of real-time dynamic systems simulation on a multiprocessor system named OSCAR is presented. In the simulation of dynamic systems, generally, the same calculations are repeated every time step. However, we cannot apply the Do-all or the Do-across techniques for parallel processing of the simulation since there exist data dependencies from the end of an iteration to the beginning of the next iteration and furthermore data-input and data-output are required every sampling time period. Therefore, parallelism inside the calculation required for a single time step, or a large basic block which consists of arithmetic assignment statements, must be used. In the proposed method, near fine grain tasks, each of which consists of one or more floating point operations, are generated to extract the parallelism from the calculation and assigned to processors by using optimal static scheduling at compile time in order to reduce large run time overhead caused by the use of near fine grain tasks. The practicality of the scheme is demonstrated on OSCAR (Optimally SCheduled Advanced multiprocessoR) which has been developed to extract advantageous features of static scheduling algorithms to the maximum extent.
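
    To make the idea of static scheduling of fine grain tasks concrete, here is a hedged Python sketch of a simple list-scheduling heuristic over a task graph. It is a stand-in, not the optimal scheduler used for OSCAR: tasks are prioritised by their longest path to the end of the graph, communication cost is ignored, and the function name and example graph are hypothetical.

```python
def list_schedule(cost, preds, n_procs):
    """Assign the tasks of a DAG to processors before execution starts.

    cost[t]  -> execution time of task t
    preds[t] -> tasks that must finish before t may start (may be absent)
    Returns {task: (processor, start_time)}.
    """
    succs = {t: [] for t in cost}
    for t, ps in preds.items():
        for p in ps:
            succs[p].append(t)

    level = {}

    def longest_path(t):
        # Priority: length of the longest path from t to the end of the graph.
        if t not in level:
            level[t] = cost[t] + max((longest_path(s) for s in succs[t]), default=0)
        return level[t]

    finish = {}
    proc_free = [0] * n_procs
    schedule = {}
    remaining = set(cost)
    while remaining:
        # sorted() keeps tie-breaking deterministic across runs.
        ready = [t for t in sorted(remaining)
                 if all(p in finish for p in preds.get(t, []))]
        task = max(ready, key=longest_path)
        proc = min(range(n_procs), key=lambda i: proc_free[i])
        start = max([proc_free[proc]] + [finish[p] for p in preds.get(task, [])])
        schedule[task] = (proc, start)
        finish[task] = start + cost[task]
        proc_free[proc] = finish[task]
        remaining.remove(task)
    return schedule


if __name__ == "__main__":
    cost = {"a": 2, "b": 3, "c": 1, "d": 2}
    preds = {"c": ["a", "b"], "d": ["c"]}
    print(list_schedule(cost, preds, n_procs=2))
```

    Because the whole schedule is computed ahead of time, no run-time dispatcher is needed, which is the overhead the abstract says static scheduling is meant to avoid.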

  2. DMA shared byte counters in a parallel computer

    DOEpatents

    Chen, Dong; Gara, Alan G.; Heidelberger, Philip; Vranas, Pavlos

    2010-04-06

    A parallel computer system is constructed as a network of interconnected compute nodes. Each of the compute nodes includes at least one processor, a memory and a DMA engine. The DMA engine includes a processor interface for interfacing with the at least one processor, DMA logic, a memory interface for interfacing with the memory, a DMA network interface for interfacing with the network, injection and reception byte counters, injection and reception FIFO metadata, and status registers and control registers. The injection FIFOs maintain memory locations of the injection FIFO metadata memory locations including its current head and tail, and the reception FIFOs maintain the reception FIFO metadata memory locations including its current head and tail. The injection byte counters and reception byte counters may be shared between messages.
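
    The abstract does not spell out how a byte counter is shared between messages. The Python sketch below assumes one plausible convention: software adds each message's length to the counter, the DMA engine subtracts bytes as they are transferred, and a value of zero means every message sharing the counter has completed. The class and method names are hypothetical and model the bookkeeping only, not the hardware.

```python
import threading


class SharedByteCounter:
    """One counter tracking the outstanding bytes of several messages."""

    def __init__(self):
        self._remaining = 0
        self._lock = threading.Lock()

    def add_message(self, nbytes):
        # Software raises the counter before handing a message to the DMA engine.
        with self._lock:
            self._remaining += nbytes

    def bytes_moved(self, nbytes):
        # The (simulated) DMA engine lowers the counter as data is transferred.
        with self._lock:
            if nbytes > self._remaining:
                raise ValueError("more bytes moved than were registered")
            self._remaining -= nbytes

    def all_done(self):
        # Zero means every message that shares this counter has completed.
        with self._lock:
            return self._remaining == 0


if __name__ == "__main__":
    counter = SharedByteCounter()
    counter.add_message(100)
    counter.add_message(50)
    counter.bytes_moved(100)
    print(counter.all_done())
    counter.bytes_moved(50)
    print(counter.all_done())
```

    Under this convention a single poll of one counter replaces one completion check per message, at the price of not knowing which individual message has finished.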

  3. LACC Shared Governance Model.

    ERIC Educational Resources Information Center

    Spangler, Mary

    This document discusses Los Angeles City College's (LACC) (California) Shared Governance Model. In response to California Assembly Bill 1725, LACC set forth a plan to implement the statutory requirements of shared governance. Shared governance is a concept grounded in the idea that decision-making is a process that affects the entire campus…

  4. Experiences with Transitioning Science Data Production from a Symmetric Multiprocessor Platform to a Linux Cluster Environment

    NASA Astrophysics Data System (ADS)

    Walter, R. J.; Protack, S. P.; Harris, C. J.; Caruthers, C.; Kusterer, J. M.

    2008-12-01

    NASA's Atmospheric Science Data Center at the NASA Langley Research Center performs all of the science data processing for the Multi-angle Imaging SpectroRadiometer (MISR) instrument. MISR is one of the five remote sensing instruments flying aboard NASA's Terra spacecraft. From the time of Terra launch in December 1999 until February 2008, all MISR science data processing was performed on a Silicon Graphics, Inc. (SGI) platform. However, dramatic improvements in commodity computing technology coupled with steadily declining project budgets during that period eventually made transitioning MISR processing to a commodity computing environment both feasible and necessary. The Atmospheric Science Data Center has successfully ported the MISR science data processing environment from the SGI platform to a Linux cluster environment. There were a multitude of technical challenges associated with this transition. Even though the core architecture of the production system did not change, the manner in which it interacted with underlying hardware was fundamentally different. In addition, there are more potential throughput bottlenecks in a cluster environment than there are in a symmetric multiprocessor environment like the SGI platform and each of these had to be addressed. Once all the technical issues associated with the transition were resolved, the Atmospheric Science Data Center had a MISR science data processing system with significantly higher throughput than the SGI platform at a fraction of the cost. In addition to the commodity hardware, free and open source software such as S4PM, Sun Grid Engine, PostgreSQL and Ganglia play a significant role in the new system. Details of the technical challenges and resolutions, software systems, performance improvements, and cost savings associated with the transition will be discussed. 
The Atmospheric Science Data Center in Langley's Science Directorate leads NASA's program for the processing, archival and distribution of Earth

  5. State recovery and lockstep execution restart in a system with multiprocessor pairing

    SciTech Connect

    Gara, Alan; Gschwind, Michael K; Salapura, Valentina

    2014-01-21

    System, method and computer program product for a multiprocessing system to offer selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessor or processor cores to provide one highly reliable thread (or thread group). Each set of paired microprocessor or processor cores that provides one highly reliable thread connects with system components such as a memory "nest" (or memory hierarchy), an optional system controller, an optional interrupt controller, optional I/O or peripheral devices, etc. The memory nest is attached to a selective pairing facility via a switch or a bus. Each selectively paired processor core includes a transactional execution facility, wherein the system is configured to enable processor rollback to a previous state and reinitialize lockstep execution in order to recover from an incorrect execution when an incorrect execution has been detected by the selective pairing facility.

  6. Memory loss

    MedlinePlus

    ... usually include asking questions of family members and friends. For this reason, they should come to the appointment. Medical history questions may include: Type of memory loss, such as short-term or long-term ...

  7. Multiprocessor computer system which includes n parallel-operating modules and an external apparatus, and computer module for use in such a system

    SciTech Connect

    Krol, T.; Van Gils, W.J.

    1989-11-28

    This patent describes a multiprocessor computer system. It includes a plurality of n parallel-operating computer modules, each of which is situated in its own fault isolation area. The computer system has a failure protection means for maintaining operability under failure of at most t of the computer modules, where 1 ≤ t < n.

  8. Commissioning of SharePlan: The Liverpool Experience

    NASA Astrophysics Data System (ADS)

    Xing, Aitang; Deshpande, Shrikant; Arumugam, Sankar; George, Armia; Holloway, Lois; Goozee, Gary

    2014-03-01

    SharePlan is a treatment planning system developed by Raysearch Laboratories AB to enable creation of a linear accelerator intensity modulated radiotherapy (IMRT) plan as a backup for a Tomotherapy plan. A 6MV Elekta Synergy Linear accelerator photon beam was modelled in SharePlan. The beam model was validated using Matrix Evolution, a 2D ion chamber array, for two head-neck and three prostate plans using 3%/3mm Gamma criteria. For 39 IMRT beams, the minimum and maximum Gamma pass rates are 95.4% and 98.7%. SharePlan is able to generate backup IMRT plans which are deliverable on a traditional linear accelerator and accurate in terms of clinical criteria. During use of SharePlan, however, an out-of-memory error frequently occurred and SharePlan was forced to be closed. This error occurred occasionally at any of these steps: loading the Tomotherapy plan into SharePlan, generating the IMRT plan, selecting the optimal plan, approving the plan and setting up a QA plan. The out-of-memory error was caused by memory leakage in one or more of the C/C++ functions implemented in SharePlan fluence engine, dose engine or optimizer, as acknowledged by the manufacturer. Because of the interruption caused by out-of-memory errors, SharePlan has not been implemented in our clinic although accuracy has been verified. A new software program is now being provided to our centre to replace SharePlan.

  9. Musical and verbal memory in Alzheimer's disease: a study of long-term and short-term memory.

    PubMed

    Ménard, Marie-Claude; Belleville, Sylvie

    2009-10-01

    Musical memory was tested in Alzheimer patients and in healthy older adults using long-term and short-term memory tasks. Long-term memory (LTM) was tested with a recognition procedure using unfamiliar melodies. Short-term memory (STM) was evaluated with same/different judgment tasks on short series of notes. Musical memory was compared to verbal memory using a task that used pseudowords (LTM) or syllables (STM). Results indicated impaired musical memory in AD patients relative to healthy controls. The deficit was found for both long-term and short-term memory. Furthermore, it was of the same magnitude for both musical and verbal domains whether tested with short-term or long-term memory tasks. No correlation was found between musical and verbal LTM. However, there was a significant correlation between verbal and musical STM in AD participants and healthy older adults, which suggests that the two domains may share common mechanisms. PMID:19398148

  10. Proactive quantum secret sharing

    NASA Astrophysics Data System (ADS)

    Qin, Huawang; Dai, Yuewei

    2015-11-01

    A proactive quantum secret sharing scheme is proposed, in which the participants can update their key shares periodically. In an updating period, one participant randomly generates the EPR pairs, and the other participants update their key shares and perform the corresponding unitary operations on the particles of the EPR pairs. Then, the participant who generated the EPR pairs performs the Bell-state measurement and updates his key share according to the result of the Bell-state measurement. After an updating period, each participant can change his key share, but the secret is changeless, and the old key shares will be useless even if they have been stolen by the attacker. The proactive property of our scheme is very useful to resist the mobile attacker.

  11. Memory consolidation.

    PubMed

    Squire, Larry R; Genzel, Lisa; Wixted, John T; Morris, Richard G

    2015-08-01

    Conscious memory for a new experience is initially dependent on information stored in both the hippocampus and neocortex. Systems consolidation is the process by which the hippocampus guides the reorganization of the information stored in the neocortex such that it eventually becomes independent of the hippocampus. Early evidence for systems consolidation was provided by studies of retrograde amnesia, which found that damage to the hippocampus impaired memories formed in the recent past, but typically spared memories formed in the more remote past. Systems consolidation has been found to occur for both episodic and semantic memories and for both spatial and nonspatial memories, although empirical inconsistencies and theoretical disagreements remain about these issues. Recent work has begun to characterize the neural mechanisms that underlie the dialogue between the hippocampus and neocortex (e.g., "neural replay," which occurs during sharp wave ripple activity). New work has also identified variables, such as the amount of preexisting knowledge, that affect the rate of consolidation. The increasing use of molecular genetic tools (e.g., optogenetics) can be expected to further improve understanding of the neural mechanisms underlying consolidation. PMID:26238360

  12. Fear Memory.

    PubMed

    Izquierdo, Ivan; Furini, Cristiane R G; Myskiw, Jociane C

    2016-04-01

    Fear memory is the best-studied form of memory. It was thoroughly investigated in the past 60 years mostly using two classical conditioning procedures (contextual fear conditioning and fear conditioning to a tone) and one instrumental procedure (one-trial inhibitory avoidance). Fear memory is formed in the hippocampus (contextual conditioning and inhibitory avoidance), in the basolateral amygdala (inhibitory avoidance), and in the lateral amygdala (conditioning to a tone). The circuitry involves, in addition, the pre- and infralimbic ventromedial prefrontal cortex, the central amygdala subnuclei, and the dentate gyrus. Fear learning models, notably inhibitory avoidance, have also been very useful for the analysis of the biochemical mechanisms of memory consolidation as a whole. These studies have capitalized on in vitro observations on long-term potentiation and other kinds of plasticity. The effect of a very large number of drugs on fear learning has been intensively studied, often as a prelude to the investigation of effects on anxiety. The extinction of fear learning involves to an extent a reversal of the flow of information in the mentioned structures and is used in the therapy of posttraumatic stress disorder and fear memories in general. PMID:26983799

  13. System and method for programmable bank selection for banked memory subsystems

    DOEpatents

    Blumrich, Matthias A.; Chen, Dong; Gara, Alan G.; Giampapa, Mark E.; Hoenicke, Dirk; Ohmacht, Martin; Salapura, Valentina; Sugavanam, Krishnan

    2010-09-07

    A programmable memory system and method for enabling one or more processor devices access to shared memory in a computing environment, the shared memory including one or more memory storage structures having addressable locations for storing data. The system comprises: one or more first logic devices associated with a respective one or more processor devices, each first logic device for receiving physical memory address signals and programmable for generating a respective memory storage structure select signal upon receipt of pre-determined address bit values at selected physical memory address bit locations; and, a second logic device responsive to each of the respective select signal for generating an address signal used for selecting a memory storage structure for processor access. The system thus enables each processor device of a computing environment memory storage access distributed across the one or more memory storage structures.
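
    The bank-selection idea in this abstract can be illustrated with a minimal sketch. It assumes that the "select signal" is simply a bank index assembled from programmer-chosen physical address bits; the function name `make_bank_selector` and the example bit positions are hypothetical and do not come from the patent.

```python
def make_bank_selector(bit_positions):
    """Return a function mapping a physical address to a bank index.

    bit_positions: address bit locations (bit 0 = least significant) chosen
    by the programmer; the first entry becomes the least significant bit of
    the bank index. This layout is an assumption made for illustration.
    """
    def select(addr):
        bank = 0
        for i, pos in enumerate(bit_positions):
            # Extract the chosen address bit and place it in the bank index.
            bank |= ((addr >> pos) & 1) << i
        return bank
    return select


# Hypothetical configuration: 4 banks selected by address bits 6 and 7,
# so consecutive 64-byte blocks rotate across the banks.
select_bank = make_bank_selector([6, 7])
banks = [select_bank(block * 64) for block in range(8)]
print(banks)  # [0, 1, 2, 3, 0, 1, 2, 3]
```

    Choosing different bit positions changes how addresses interleave across banks, which is one plausible reading of "programmable" here.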

  14. A design fix to supervisory control for fault-tolerant scheduling of real-time multiprocessor systems with aperiodic tasks

    NASA Astrophysics Data System (ADS)

    Devaraj, Rajesh; Sarkar, Arnab; Biswas, Santosh

    2015-11-01

    In the article 'Supervisory control for fault-tolerant scheduling of real-time multiprocessor systems with aperiodic tasks', Park and Cho presented a systematic way of computing a largest fault-tolerant and schedulable language that provides information on whether the scheduler (i.e., supervisor) should accept or reject a newly arrived aperiodic task. The computation of such a language is mainly dependent on the task execution model presented in their paper. However, the task execution model is unable to capture the situation when the fault of a processor occurs even before the task has arrived. Consequently, under a task execution model that does not capture this fact, a task may be assigned for execution on a faulty processor. This problem has been illustrated with an appropriate example. Then, the task execution model of Park and Cho has been modified to strengthen the requirement that none of the tasks are assigned for execution on a faulty processor.

  15. Memory clinics

    PubMed Central

    Jolley, D; Benbow, S M; Grizzell, M

    2006-01-01

    Memory clinics were first described in the 1980s. They have become accepted worldwide as useful vehicles for improving practice in the identification, investigation, and treatment of memory disorders, including dementia. They are provided in various settings, the setting determining clientele and practice. All aim to facilitate referral from GPs, other specialists, or by self referral, in the early stages of impairment, and to avoid the stigma associated with psychiatric services. They bring together professionals with a range of skills for the benefit of patients, carers, and colleagues, and contribute to health promotion, health education, audit, and research, as well as service to patients. PMID:16517802

  16. Models, Norms and Sharing.

    ERIC Educational Resources Information Center

    Harris, Mary B.

    To investigate the effect of modeling on altruism, 156 third and fifth grade children were exposed to a model who either shared with them, gave to a charity, or refused to share. The test apparatus, identified as a game, consisted of a box with signal lights and a chute through which marbles were dispensed. Subjects and the model played the game…

  17. Shared Parenting Dysfunction.

    ERIC Educational Resources Information Center

    Turkat, Ira Daniel

    2002-01-01

    Joint custody of children is the most prevalent court ordered arrangement for families of divorce. A growing body of literature indicates that many parents engage in behaviors that are incompatible with shared parenting. This article provides specific criteria for a definition of the Shared Parenting Dysfunction. Clinical aspects of the phenomenon…

  18. Rethinking Resource Sharing

    ERIC Educational Resources Information Center

    Beaubien, Anne; Stevens, Patricia

    2008-01-01

    This article describes the need for rethinking resource sharing to offer both library users and nonlibrary users options to obtain the material they seek from both libraries and commercial sources. The article discusses several programs that are emerging including the "GoGetter" function, the Rethinking Resource Sharing Manifesto, user needs, and…

  19. Support for non-locking parallel reception of packets belonging to a single memory reception FIFO

    DOEpatents

    Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M.; Steinmacher-Burow, Burkhard; Sugawara, Yutaka

    2011-01-27

    A method and apparatus for distributed parallel messaging in a parallel computing system. A plurality of DMA engine units are configured in a multiprocessor system to operate in parallel, one DMA engine unit for transferring a current packet received at a network reception queue to a memory location in a memory FIFO (rmFIFO) region of a memory. A control unit implements logic to determine whether any prior received packet destined for that rmFIFO is still in a process of being stored in the associated memory by another DMA engine unit of the plurality, and prevent the one DMA engine unit from indicating completion of storing the current received packet in the reception memory FIFO (rmFIFO) until all prior received packets destined for that rmFIFO are completely stored by the other DMA engine units. Thus, there is provided non-locking support so that multiple packets destined for a single rmFIFO are transferred and stored in parallel to predetermined locations in a memory.
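
    The ordering rule described above (packets may be stored in parallel, but completion is only indicated once all earlier packets are stored) can be sketched as a toy, single-threaded model. The class `ReceptionFifo` and its methods are hypothetical names for illustration; the patent describes hardware control logic, not this code.

```python
class ReceptionFifo:
    """Toy model: slots are reserved in arrival order, filled in any order."""

    def __init__(self):
        self.slots = []      # payloads; None while a store is still in flight
        self.committed = 0   # number of packets visible to the consumer

    def reserve(self):
        """Reserve the next slot for an arriving packet; return its index."""
        self.slots.append(None)
        return len(self.slots) - 1

    def store_done(self, index, payload):
        """Record that one engine finished storing its packet.

        The commit point only advances over a contiguous prefix of stored
        packets, so a later packet never becomes visible before an earlier one.
        """
        self.slots[index] = payload
        while (self.committed < len(self.slots)
               and self.slots[self.committed] is not None):
            self.committed += 1
        return self.committed


fifo = ReceptionFifo()
a, b, c = fifo.reserve(), fifo.reserve(), fifo.reserve()
progress = [fifo.store_done(c, "C"),   # C finishes first, nothing visible yet
            fifo.store_done(a, "A"),   # A is now visible
            fifo.store_done(b, "B")]   # B and the waiting C become visible
print(progress)  # [0, 1, 3]
```

    Because each engine writes to its own pre-reserved slot, no lock on the FIFO contents is needed in this model; only the commit point is ordered.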

  20. Memory dysfunction.

    PubMed

    Amici, Serena

    2012-01-01

    Memory is the cognitive ability that allows one to acquire, store and recall information; its dysfunction is called amnesia and can be a presentation of unilateral ischemic stroke in the territory of the posterior cerebral and anterior choroidal artery as well as subarachnoid hemorrhage. PMID:22377863

  1. Retracing Memories

    ERIC Educational Resources Information Center

    Harrison, David L.

    2005-01-01

    There are plenty of paths to poetry but few are as accessible as retracing one's own memories. When students are asked to write about something they remember, they are given the gift of choosing from events that are important enough to recall. They remember because what happened was funny or scary or embarrassing or heartbreaking or silly.…

  2. Memory Loss

    ERIC Educational Resources Information Center

    Cassebaum, Anne

    2011-01-01

    In four decades of teaching college English, the author has watched many good teaching jobs morph into second-class ones. Worse, she has seen the memory and then the expectation of teaching jobs with decent status, security, and salary depart along with principles and collegiality. To help reverse this downward spiral, she contends that what is…

  3. Fueling Memories

    PubMed Central

    Powell, Jonathan D.; Pollizzi, Kristen

    2012-01-01

    A hallmark of the adaptive immune response is rapid and robust activation upon rechallenge. In the current issue of Immunity van der Windt et al. (2012) provide an important link between mitochondrial respiratory capacity and the development of CD8+ T cell memory. PMID:22284413

  4. Share with thy neighbors

    NASA Astrophysics Data System (ADS)

    Chandra, Surendar; Yu, Xuwen

    2007-01-01

    Peer to peer (P2P) systems are traditionally designed to scale to a large number of nodes. However, we focus on scenarios where the sharing is effected only among neighbors. Localized sharing is particularly attractive in scenarios where wide area network connectivity is undesirable, expensive or unavailable. On the other hand, local neighbors may not offer the wide variety of objects possible in a much larger system. The goal of this paper is to investigate a P2P system that shares contents with its neighbors. We analyze the sharing behavior of Apple iTunes users in a university setting. iTunes restricts the sharing of audio and video objects to peers within the same LAN sub-network. We show that users are already making a significant amount of content available for local sharing. We show that these systems are not appropriate for applications that require access to a specific object. We argue that mechanisms that allow the user to specify classes of interesting objects are better suited for these systems. Mechanisms such as Bloom filters can allow each peer to summarize the contents available in the neighborhood, reducing network search overhead. This research can form the basis for future storage systems that utilize the shared storage available in neighbors and build a probabilistic storage for local consumption.
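
    As a rough illustration of how a Bloom filter could summarize a peer's contents, the sketch below builds a small filter and merges two peers' summaries. The class, its parameters (`num_bits`, `num_hashes`) and the item names are assumptions for illustration, not details from the paper.

```python
import hashlib


class BloomFilter:
    """Compact set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def union(self, other):
        """Merge two summaries; both must use the same size and hash count."""
        merged = BloomFilter(self.num_bits, self.num_hashes)
        merged.bits = self.bits | other.bits
        return merged


# Hypothetical peers summarizing their shared libraries.
peer_a, peer_b = BloomFilter(), BloomFilter()
peer_a.add("album-1")
peer_b.add("album-2")
neighborhood = peer_a.union(peer_b)
print("album-1" in neighborhood, "album-2" in neighborhood)  # True True
```

    A lookup that returns False means the object is definitely absent from the neighborhood, so a peer can skip the network search; a True result may be a false positive and still has to be confirmed with the neighbor.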

  5. Chimpanzees share forbidden fruit.

    PubMed

    Hockings, Kimberley J; Humle, Tatyana; Anderson, James R; Biro, Dora; Sousa, Claudia; Ohashi, Gaku; Matsuzawa, Tetsuro

    2007-01-01

    The sharing of wild plant foods is infrequent in chimpanzees, but in chimpanzee communities that engage in hunting, meat is frequently used as a 'social tool' for nurturing alliances and social bonds. Here we report the only recorded example of regular sharing of plant foods by unrelated, non-provisioned wild chimpanzees, and the contexts in which these sharing behaviours occur. From direct observations, adult chimpanzees at Bossou (Republic of Guinea, West Africa) very rarely transferred wild plant foods. In contrast, they shared cultivated plant foods much more frequently (58 out of 59 food sharing events). Sharing primarily consists of adult males allowing reproductively cycling females to take food that they possess. We propose that hypotheses focussing on 'food-for-sex and -grooming' and 'showing-off' strategies plausibly account for observed sharing behaviours. A changing human-dominated landscape presents chimpanzees with fresh challenges, and our observations suggest that crop-raiding provides adult male chimpanzees at Bossou with highly desirable food commodities that may be traded for other currencies. PMID:17849015

  6. Programming distributed memory architectures using Kali

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush; Vanrosendale, John

    1990-01-01

    Programming nonshared memory systems is more difficult than programming shared memory systems, in part because of the relatively low level of current programming environments for such machines. A new programming environment is presented, Kali, which provides a global name space and allows direct access to remote data values. In order to retain efficiency, Kali provides a system of annotations, allowing the user to control those aspects of the program critical to performance, such as data distribution and load balancing. The primitives and constructs provided by the language are described, and some of the issues raised in translating a Kali program for execution on distributed memory systems are also discussed.

  7. A compact PE memory for vision chips

    NASA Astrophysics Data System (ADS)

    Cong, Shi; Zhe, Chen; Jie, Yang; Nanjian, Wu; Zhihua, Wang

    2014-09-01

    This paper presents a novel compact memory in the processing element (PE) for single-instruction multiple-data (SIMD) vision chips. The PE memory is constructed with 8 × 8 register cells, where one latch in the slave stage is shared by eight latches in the master stage. The memory supports simultaneous read and write on the same address in one clock cycle. Its compact area of 14.33 μm²/bit promises a higher integration level of the processor. A prototype chip with a 64 × 64 PE array is fabricated in a UMC 0.18 μm CMOS technology. Five types of the PE memory cell structure are designed and compared. The testing results demonstrate that the proposed PE memory architecture well satisfies the requirement of the vision chip in high-speed real-time vision applications, such as 1000 fps edge extraction.

  8. A comparison of multiprocessor scheduling methods for iterative data flow architectures

    NASA Technical Reports Server (NTRS)

    Storch, Matthew

    1993-01-01

    A comparative study is made between the Algorithm to Architecture Mapping Model (ATAMM) and three other related multiprocessing models from the published literature. The primary focus of all four models is the non-preemptive scheduling of large-grain iterative data flow graphs as required in real-time systems, control applications, signal processing, and pipelined computations. Important characteristics of the models such as injection control, dynamic assignment, multiple node instantiations, static optimum unfolding, range-chart guided scheduling, and mathematical optimization are identified. The models from the literature are compared with the ATAMM for performance, scheduling methods, memory requirements, and complexity of scheduling and design procedures.

  9. Shared decision making

    MedlinePlus

    ... Shared decision making to improve care and reduce costs. N Engl J Med . 2013 Jan 3;368(1):6-8. ... UW Medicine, School of Medicine, University of Washington, Seattle, WA. Also reviewed by David ...

  10. Sharing a Faculty Position.

    ERIC Educational Resources Information Center

    O'Kane, Patricia K.; Meyer, Mary

    1982-01-01

    Describes the experience of two nursing faculty members who shared an assistant professor of nursing position. Discusses positive and negative aspects of the experience and notes that a unified and creative approach must be taken for it to succeed. (JOW)

  11. A Sharing Proposition.

    ERIC Educational Resources Information Center

    Sturgeon, Julie

    2002-01-01

    Describes how the University of Vermont and St. Michael's College in Burlington, Vermont cooperated to share a single card access system. Discusses the planning, financial, and marketplace advantages of the cooperation. (EV)

  12. Shared decision making

    MedlinePlus

    Shared decision making is when health care providers and patients work together to decide the best way to test ... you. The two of you will make a decision based on your provider's expertise and your values ...

  13. Accelerating Spectrum Sharing Technologies

    SciTech Connect

    Juan D. Deaton; Lynda L. Brighton; Rangam Subramanian; Hussein Moradi; Jose Loera

    2013-09-01

    Spectrum sharing potentially holds the promise of solving the emerging spectrum crisis. However, technology innovators face the conundrum of developing spectrum sharing technologies without the ability to experiment and test with real incumbent systems. Interference with operational incumbents can prevent critical services, and the cost of deploying and operating an incumbent system can be prohibitive. Thus, the lack of incumbent systems and frequency authorization for technology incubation and demonstration has stymied spectrum sharing research. To this end, industry, academia, and regulators all require a test facility for validating hypotheses and demonstrating functionality without affecting operational incumbent systems. This article proposes a four-phase program supported by our spectrum accountability architecture. We propose that our comprehensive experimentation and testing approach for technology incubation and demonstration will accelerate the development of spectrum sharing technologies.

  14. Parallel variable-band Choleski solvers for computational structural analysis applications on vector multiprocessor supercomputers

    NASA Technical Reports Server (NTRS)

    Poole, E. L.; Overman, A. L.

    1991-01-01

    A Choleski method used to solve linear systems of equations that arise in large scale structural analyses is described. The method uses a novel variable-band storage scheme and is structured to exploit fast local memory caches while minimizing data access delays between main memory and vector registers. Several parallel implementations of this method are described for the CRAY-2 and CRAY Y-MP computers demonstrating the use of microtasking and autotasking directives. A portable parallel language, FORCE, is also used for two different parallel implementations, demonstrating the use of CRAY macrotasking. Results are presented comparing the matrix factorization times for three representative structural analysis problems from runs made in both dedicated and multi-user modes on both the CRAY-2 and CRAY Y-MP computers. CPU and wall clock timings are given for the various parallel methods and are compared to single processor timings of the same algorithm. Computation rates over 1 GIGAFLOP (1 billion floating point operations per second) on a four processor CRAY-2 and over 2 GIGAFLOPS on an eight processor CRAY Y-MP are demonstrated as measured by wall clock time in a dedicated environment. Reduced wall clock times for the parallel methods relative to the single processor implementation of the same Choleski algorithm are also demonstrated for runs made in multi-user mode.

  15. Developmental Differences in the Use of Recognition Memory Rejection Mechanisms

    ERIC Educational Resources Information Center

    Odegard, Timothy N.; Jenkins, Kara M.; Koen, Joshua D.

    2010-01-01

    The current experiment examined the use of plausibility judgments by children to reject distractors presented on "yes/no" recognition memory tests. Participants studied two lists of word pairs that shared either a categorical or rhyme association, which constituted the global nature of the two study conditions. During the recognition memory tests,…

  16. Secure Information Sharing

    Energy Science and Technology Software Center (ESTSC)

    2005-09-09

    We are developing a peer-to-peer system to support secure, location-independent information sharing in the scientific community. Once complete, this system will allow seamless and secure sharing of information between multiple collaborators. The owners of information will be able to control how the information is stored, managed, and shared. In addition, users will have faster access to information updates within a collaboration. Groups collaborating on scientific experiments have a need to share information and data. This information and data is often represented in the form of files and database entries. In a typical scientific collaboration, there are many different locations where data would naturally be stored. This makes it difficult for collaborators to find and access the information they need. Our goal is to create a lightweight file-sharing system that makes it easy for collaborators to find and use the data they need. This system must be easy to use, easy to administer, and secure. Our information-sharing tool uses group communication, in particular the InterGroup protocols, to reliably deliver each query to all of the current participants in a scalable manner, without having to discover all of their identities. We will use the Secure Group Layer (SGL) and Akenti to provide security to the participants of our environment. SGL will provide confidentiality, integrity, authenticity, and authorization enforcement for the InterGroup protocols, and Akenti will provide access control to other resources.

  17. Information partnerships--shared data, shared scale.

    PubMed

    Konsynski, B R; McFarlan, F W

    1990-01-01

    How can one company gain access to another's resources or customers without merging ownership, management, or plotting a takeover? The answer is found in new information partnerships, enabling diverse companies to develop strategic coalitions through the sharing of data. The key to cooperation is a quantum improvement in the hardware and software supporting relational databases: new computer speeds, cheaper mass-storage devices, the proliferation of fiber-optic networks, and networking architectures. Information partnerships mean that companies can distribute the technological and financial exposure that comes with huge investments. For the customer's part, partnerships inevitably lead to greater simplification on the desktop and more common standards around which vendors have to compete. The most common types of partnership are: joint marketing partnerships, such as American Airlines' award of frequent flyer miles to customers who use Citibank's credit card; intraindustry partnerships, such as the insurance value-added network service (which links insurance and casualty companies to independent agents); customer-supplier partnerships, such as Baxter Healthcare's electronic channel to hospitals for medical and other equipment; and IT vendor-driven partnerships, exemplified by ESAB (a European welding supplies and equipment company), whose expansion strategy was premised on a technology platform offered by an IT vendor. Partnerships that succeed have shared vision at the top, reciprocal skills in information technology, concrete plans for an early success, persistence in the development of usable information for all partners, coordination on business policy, and a new and imaginative business architecture. PMID:10107083

  18. CaLRS: A Critical-Aware Shared LLC Request Scheduling Algorithm on GPGPU

    PubMed Central

    Ma, Jianliang; Meng, Jinglei; Chen, Tianzhou; Wu, Minghui

    2015-01-01

    Ultra-high thread-level parallelism in modern GPUs usually produces numerous simultaneous memory requests, so there are always plenty of memory requests waiting at each bank of the shared LLC (L2 in this paper) and global memory. For global memory, various schedulers have already been developed to adjust the request sequence, but we find that little work has focused on the service sequence at the shared LLC. We measured that requests from a large number of GPU applications always queue at the LLC banks for service, which provides an opportunity to optimize the service order at the LLC. By adjusting the service order of GPU memory requests, we can improve the schedulability of the SM. We therefore propose a critical-aware shared LLC request scheduling algorithm (CaLRS) in this paper. The representation of memory request priority is critical for CaLRS. We use the number of memory requests that originate from the same warp but have not been serviced when they arrive at the shared LLC bank to represent the criticality of each warp. Experiments show that the proposed scheme can effectively boost SM schedulability by promoting the scheduling priority of memory requests with high criticality, and thereby indirectly improves GPU performance. PMID:25729772
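
    A much-simplified sketch of criticality-based ordering at a single LLC bank follows. It assumes that a warp's criticality is the count of its requests still pending in this bank's queue, that a larger count means higher priority, and that ties fall back to arrival order; these are illustrative assumptions, and the function names are hypothetical rather than the authors' implementation.

```python
from collections import Counter


def pick_next(queue):
    """Return the index of the request to service next.

    queue: list of (warp_id, address) tuples pending at one LLC bank,
    in arrival order.
    """
    pending = Counter(warp for warp, _ in queue)
    # max() returns the first maximal element, so ties keep FIFO order.
    return max(range(len(queue)), key=lambda i: pending[queue[i][0]])


def service_order(queue):
    """Drain a copy of the queue, recomputing criticality after each service."""
    queue = list(queue)
    order = []
    while queue:
        order.append(queue.pop(pick_next(queue)))
    return order


# Hypothetical queue: warp 1 has three pending requests, warps 0 and 2 one each.
requests = [(0, 0x100), (1, 0x200), (1, 0x240), (2, 0x300), (1, 0x280)]
warps = [warp for warp, _ in service_order(requests)]
print(warps)  # [1, 1, 0, 2, 1]
```

    In this toy model a warp's priority drops as its requests are serviced, so once warp 1 has a single request left it competes with the other warps in arrival order.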

  19. Reciprocal food sharing in the vampire bat

    NASA Astrophysics Data System (ADS)

    Wilkinson, Gerald S.

    1984-03-01

    Behavioural reciprocity can be evolutionarily stable [1-3]. Initial increase in frequency depends, however, on reciprocal altruists interacting predominantly with other reciprocal altruists either by associating within kin groups or by having sufficient memory to recognize and not aid nonreciprocators. Theory thus suggests that reciprocity should evolve more easily among animals which live in kin groups. Data are available separating reciprocity from nepotism only for unrelated nonhuman animals [4]. Here, I show that food sharing by regurgitation of blood among wild vampire bats (Desmodus rotundus) depends equally and independently on degree of relatedness and an index of opportunity for reciprocation. That reciprocity operates within groups containing both kin and nonkin is supported further with data on the availability of blood-sharing occasions, estimates of the economics of sharing blood, and experiments which show that unrelated bats will reciprocally exchange blood in captivity.

  20. Shared direct memory access on the Explorer 2-LX

    NASA Technical Reports Server (NTRS)

    Musgrave, Jeffrey L.

    1990-01-01

    Advances in expert system technology and artificial intelligence have provided a framework for applying automated intelligence to the solution of problems which were generally perceived as intractable using more classical approaches. As a result, hybrid architectures and parallel processing capability have become more common in computing environments. The Texas Instruments Explorer II-LX is an example of a machine which combines a symbolic processing environment and a computationally oriented environment in a single chassis for integrated problem solutions. This user's manual is an attempt to make these capabilities more accessible to a wider range of engineers and programmers with problems well suited to solution in such an environment.

  1. HEP - A semaphore-synchronized multiprocessor with central control. [Heterogeneous Element Processor]

    NASA Technical Reports Server (NTRS)

    Gilliland, M. C.; Smith, B. J.; Calvert, W.

    1976-01-01

    The paper describes the design concept of the Heterogeneous Element Processor (HEP), a system tailored to the special needs of scientific simulation. In order to achieve high-speed computation required by simulation, HEP features a hierarchy of processes executing in parallel on a number of processors, with synchronization being largely accomplished by hardware. A full-empty-reserve scheme of synchronization is realized by zero-one-valued hardware semaphores. A typical system has, besides the control computer and the scheduler, an algebraic module, a memory module, a first-in first-out (FIFO) module, an integrator module, and an I/O module. The architecture of the scheduler and the algebraic module is examined in detail.
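    The full/empty idea in the abstract above can be illustrated in software. The following is a minimal sketch only, assuming a single cell with a two-state (full or empty) flag; the class name FullEmptyCell, its methods, and run_demo are hypothetical, the "reserve" state mentioned in the abstract is not modelled, and nothing here reflects the actual HEP hardware design.

```python
import threading


class FullEmptyCell:
    """One memory cell guarded by a full/empty flag (software analogy only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._full = False  # zero-one-valued state: False = empty, True = full
        self._value = None

    def write(self, value):
        # A writer waits until the cell is empty, then fills it.
        with self._cond:
            while self._full:
                self._cond.wait()
            self._value = value
            self._full = True
            self._cond.notify_all()

    def read(self):
        # A reader waits until the cell is full, then empties it.
        with self._cond:
            while not self._full:
                self._cond.wait()
            value = self._value
            self._full = False
            self._cond.notify_all()
            return value


def run_demo(n_items=5):
    """Pass n_items integers from one producer thread to one consumer thread."""
    cell = FullEmptyCell()
    received = []

    def producer():
        for i in range(n_items):
            cell.write(i)

    def consumer():
        for _ in range(n_items):
            received.append(cell.read())

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

    With one producer and one consumer, each write must be matched by a read before the next write can proceed, so the values arrive in the order they were written.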

  2. Processing Demand and Short-Term Memory: The Response-Prefix Effect

    ERIC Educational Resources Information Center

    Jahnke, John C.; Nowaczyk, Ronald H.

    1977-01-01

    Seven-digit strings were presented for immediate recall. Before recall, subjects either read or retrieved from memory a single item (response prefix). Results were seen in terms of the sharing of the limited capacity of an active memory system by the memory series, the response prefix, and the operations to retrieve and emit the items. (Editor/RK)

  3. The Association between Auditory Memory Span and Speech Rate in Children from Kindergarten to Sixth Grade.

    ERIC Educational Resources Information Center

    Ferguson, Angela N.; Bowey, Judith A.; Tilley, Andrew

    2002-01-01

    Examined association between speech rate and memory span in children from kindergarten to sixth grade. Found that speech rate for word triples shared variance with memory span independent of speech rate for single words. Speech rate for word triples was largely redundant with age in explaining additional variation in memory span when effects of…

  4. Coordinating Shared Activities

    NASA Technical Reports Server (NTRS)

    Clement, Bradley

    2004-01-01

    Shared Activity Coordination (ShAC) is a computer program for planning and scheduling the activities of an autonomous team of interacting spacecraft and exploratory robots. ShAC could also be adapted to such terrestrial uses as helping multiple factory managers work toward competing goals while sharing such common resources as floor space, raw materials, and transports. ShAC iteratively invokes the Continuous Activity Scheduling Planning Execution and Replanning (CASPER) program to replan and propagate changes to other planning programs in an effort to resolve conflicts. A domain-expert specifies which activities and parameters thereof are shared and reports the expected conditions and effects of these activities on the environment. By specifying these conditions and effects differently for each planning program, the domain-expert subprogram defines roles that each spacecraft plays in a coordinated activity. The domain-expert subprogram also specifies which planning program has scheduling control over each shared activity. ShAC enables sharing of information, consensus over the scheduling of collaborative activities, and distributed conflict resolution. As the other planning programs incorporate new goals and alter their schedules in the changing environment, ShAC continually coordinates to respond to unexpected events.

  5. Mechanisms of Memory.

    ERIC Educational Resources Information Center

    Squire, Larry R.

    1986-01-01

    Focuses on the brain processes and brain systems involved in learning and memory from a neuropsychological perspective of analysis. Reports findings related to the locus of memory storage, types of memory and knowledge, and memory consolidation. Models of animal memory are also examined. An extensive reference list is included. (ML)

  6. Test Sequence Priming in Recognition Memory

    ERIC Educational Resources Information Center

    Johns, Elizabeth E.; Mewhort, D. J. K.

    2009-01-01

    The authors examined priming within the test sequence in 3 recognition memory experiments. A probe primed its successor whenever both probes shared a feature with the same studied item ("interjacent priming"), indicating that the study item, like the probe, is central to the decision. Interjacent priming occurred even when the 2 probes did not…

  7. Memory During Oral and Silent Reading.

    ERIC Educational Resources Information Center

    Perfetti, Charles A.; And Others

    Following reading and listening tasks, adult long-term memory is high in semantic information and low in syntactic and lexical information. Comprehension during reading and listening must depend to some extent, however, on short term retention of linguistic information that is less abstract and shares more features of the input than the semantic…

  8. Close Associations and Memory in Brainwriting Groups

    ERIC Educational Resources Information Center

    Coskun, Hamit

    2011-01-01

    The present experiment examined whether or not the type of associations (close (e.g. apple-pear) and distant (e.g. apple-fish) word associations) and memory instruction (paying attention to the ideas of others) had effects on the idea generation performances in the brainwriting paradigm in which all participants shared their ideas by using paper…

  9. The Developmental Influence of Primary Memory Capacity on Working Memory and Academic Achievement

    PubMed Central

    2015-01-01

    In this study, we investigate the development of primary memory capacity among children. Children between the ages of 5 and 8 completed 3 novel tasks (split span, interleaved lists, and a modified free-recall task) that measured primary memory by estimating the number of items in the focus of attention that could be spontaneously recalled in serial order. These tasks were calibrated against traditional measures of simple and complex span. Clear age-related changes in these primary memory estimates were observed. There were marked individual differences in primary memory capacity, but each novel measure was predictive of simple span performance. Among older children, each measure shared variance with reading and mathematics performance, whereas for younger children, the interleaved lists task was the strongest single predictor of academic ability. We argue that these novel tasks have considerable potential for the measurement of primary memory capacity and provide new, complementary ways of measuring the transient memory processes that predict academic performance. The interleaved lists task also shared features with interference control tasks, and our findings suggest that young children have a particular difficulty in resisting distraction and that variance in the ability to resist distraction is also shared with measures of educational attainment. PMID:26075630

  10. The developmental influence of primary memory capacity on working memory and academic achievement.

    PubMed

    Hall, Debbora; Jarrold, Christopher; Towse, John N; Zarandi, Amy L

    2015-08-01

    In this study, we investigate the development of primary memory capacity among children. Children between the ages of 5 and 8 completed 3 novel tasks (split span, interleaved lists, and a modified free-recall task) that measured primary memory by estimating the number of items in the focus of attention that could be spontaneously recalled in serial order. These tasks were calibrated against traditional measures of simple and complex span. Clear age-related changes in these primary memory estimates were observed. There were marked individual differences in primary memory capacity, but each novel measure was predictive of simple span performance. Among older children, each measure shared variance with reading and mathematics performance, whereas for younger children, the interleaved lists task was the strongest single predictor of academic ability. We argue that these novel tasks have considerable potential for the measurement of primary memory capacity and provide new, complementary ways of measuring the transient memory processes that predict academic performance. The interleaved lists task also shared features with interference control tasks, and our findings suggest that young children have a particular difficulty in resisting distraction and that variance in the ability to resist distraction is also shared with measures of educational attainment. PMID:26075630

  11. Multiparty quantum secret sharing

    SciTech Connect

    Zhang Zhanjun; Li Yong; Man Zhongxiao

    2005-04-01

    Based on a quantum secure direct communication (QSDC) protocol [Phys. Rev. A 69 052319 (2004)], we propose an (n,n)-threshold scheme of multiparty quantum secret sharing of classical messages (QSSCM) using only single photons. We take advantage of this multiparty QSSCM scheme to establish a scheme of multiparty secret sharing of quantum information (SSQI), in which the original qubit can be reconstructed only if all quantum information receivers collaborate. A general idea is also proposed for constructing multiparty SSQI schemes from any QSSCM scheme.
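    The (n,n)-threshold property, where every share is needed for reconstruction, can be illustrated with a purely classical analogue. The sketch below uses XOR over random byte strings; it is not the single-photon protocol of the paper, and the function names split_secret and reconstruct are hypothetical.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_secret(secret: bytes, n: int) -> list:
    """Split `secret` into n shares; all n are required to recover it."""
    if n < 2:
        raise ValueError("an (n,n) scheme needs at least two shares")
    # n - 1 shares are uniformly random; the last one is chosen so that
    # the XOR of all n shares equals the secret.
    random_shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    last_share = reduce(xor_bytes, random_shares, secret)
    return random_shares + [last_share]


def reconstruct(shares: list) -> bytes:
    """XOR all shares together to recover the secret."""
    return reduce(xor_bytes, shares)
```

    Because each of the first n - 1 shares is uniformly random, any proper subset of the shares is itself uniformly distributed and carries no information about the secret in this classical scheme.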

  12. Sharing the atom bomb

    SciTech Connect

    Chace, J.

    1996-01-01

    Shaken by the devastation of Hiroshima and Nagasaki and fearful that the American atomic monopoly would spark an arms race, Dean Acheson led a push in 1946 to place the bomb (indeed, all atomic energy) under international control. But as the memories of wartime collaboration faded, relations between the superpowers grew increasingly tense, and the confrontational atmosphere undid his proposal. Had Acheson succeeded, the Cold War might not have been. 2 figs.

  13. Memory effects in turbulence

    NASA Technical Reports Server (NTRS)

    Hinze, J. O.

    1979-01-01

    Experimental investigations of the wake flow of a hemisphere and cylinder show that such memory effects can be substantial and have a significant influence on momentum transport. Memory effects are described in terms of suitable memory functions.

  14. A parallel row-based algorithm with error control for standard-cell placement on a hypercube multiprocessor

    NASA Technical Reports Server (NTRS)

    Sargent, Jeff Scott

    1988-01-01

    A new row-based parallel algorithm for standard-cell placement targeted for execution on a hypercube multiprocessor is presented. Key features of this implementation include a dynamic simulated-annealing schedule, row-partitioning of the VLSI chip image, and two novel approaches to controlling error in parallel cell-placement algorithms: Heuristic Cell-Coloring and Adaptive (Parallel Move) Sequence Control. Heuristic Cell-Coloring identifies sets of noninteracting cells that can be moved repeatedly, and in parallel, with no buildup of error in the placement cost. Adaptive Sequence Control allows multiple parallel cell moves to take place between global cell-position updates. This feedback mechanism is based on an error bound derived analytically from the traditional annealing move-acceptance profile. Placement results are presented for real industry circuits, and the performance of an implementation on the Intel iPSC/2 Hypercube is summarized. This algorithm runs 5 to 16 times faster than a previous program developed for the Hypercube, while producing placements of equivalent quality. An integrated place and route program for the Intel iPSC/2 Hypercube is currently being developed.

  15. Nonpreemptive run-time scheduling issues on a multitasked, multiprogrammed multiprocessor with dependencies, bidimensional tasks, folding and dynamic graphs

    SciTech Connect

    Miller, Allan Ray

    1987-05-01

    Increases in high speed hardware have mandated studies in software techniques to exploit the parallel capabilities. This thesis examines the effects a run-time scheduler has on a multiprocessor. The model consists of directed, acyclic graphs, generated from serial FORTRAN benchmark programs by the parallel compiler Parafrase. A multitasked, multiprogrammed environment is created. Dependencies are generated by the compiler. Tasks are bidimensional, i.e., they may specify both time and processor requests. Processor requests may be folded into execution time by the scheduler. The graphs may arrive at arbitrary time intervals. The general case is NP-hard; thus, a variety of heuristics are examined by a simulator. Multiprogramming demonstrates a greater need for a run-time scheduler than does monoprogramming for a variety of reasons, e.g., greater stress on the processors, a larger number of independent control paths, more variety in the task parameters, etc. The dynamic critical path series of algorithms performs well. Dynamic critical volume did not add much. Unfortunately, dynamic critical path maximizes turnaround time as well as throughput. Two schedulers are presented which balance throughput and turnaround time. The first requires classification of jobs by type; the second requires selection of a ratio value which is dependent upon system parameters. 45 refs., 19 figs., 20 tabs.
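    As a rough illustration of the critical-path idea behind such heuristics, the sketch below computes, for each task in a directed acyclic graph, the length of the longest path from that task to an exit, and then orders tasks by that length. This is a generic static critical-path priority, not the thesis's dynamic critical path algorithms; the function names and the example graph are hypothetical, and folding, processor requests, and arrival times are not modelled.

```python
from functools import lru_cache


def critical_path_lengths(durations, successors):
    """Longest time from the start of each task to the end of the graph."""

    @lru_cache(maxsize=None)
    def length(task):
        # A task's critical path is its own duration plus the longest
        # critical path among its successors (0 if it has none).
        tail = max((length(s) for s in successors.get(task, ())), default=0)
        return durations[task] + tail

    return {task: length(task) for task in durations}


def priority_order(durations, successors):
    """Order tasks by descending critical-path length (ties broken by name)."""
    lengths = critical_path_lengths(durations, successors)
    return sorted(durations, key=lambda t: (-lengths[t], t))


# Hypothetical four-task graph: a precedes b and c, which both precede d.
example_durations = {"a": 2, "b": 3, "c": 1, "d": 4}
example_successors = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
```

    When all durations are positive, a task always has a strictly longer critical path than any of its successors, so this ordering never places a task ahead of one it depends on.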

  16. The Sharing Tree: Preschool Children Learn to Share.

    ERIC Educational Resources Information Center

    Wolf, Arlene; Fine, Elaine

    1996-01-01

    This article describes a learning activity in which preschool children learn cooperative skills and metacognitive strategies as they master sharing strategies guided by leaves on a "sharing tree." Leaf colors (red, yellow, green) cue the child to stop, slow down and think about sharing and playing with others, and go ahead with a sharing activity.…

  17. [Neuroscience and collective memory: memory schemas linking brain, societies and cultures].

    PubMed

    Legrand, Nicolas; Gagnepain, Pierre; Peschanski, Denis; Eustache, Francis

    2015-01-01

    During the last two decades, the effect of intersubjective relationships on cognition has been an emerging topic in cognitive neuroscience, leading through a so-called "social turn" to the formation of new domains integrating society and cultures into this research area. Such inquiry has recently been extended to collective memory studies. Collective memory refers to shared representations that are constitutive of the identity of a group and distributed among all its members connected by a common history. After briefly describing those evolutions in the study of human brain and behaviors, we review recent research that has brought together cognitive psychology, neuroscience and social sciences in collective memory studies. Using the reemerging concept of memory schema, we propose a theoretical framework that accounts for the formation of collective memories, with a specific focus on the encoding process of historical events. We suggest that (1) although the concept of schema has mainly been used to describe a rather passive framework of knowledge, such a structure may also be involved more actively in the understanding of significant collective events, and (2) although some schema research has been restricted to the individual level of inquiry, there is a strong coherence between memory and cultural frameworks. Integrating the neural basis and properties of memory schema into collective memory studies may pave the way toward a better understanding of the reciprocal interaction between individual memories and cultural resources such as media or education. PMID:26820833

  18. Synapsin Determines Memory Strength after Punishment- and Relief-Learning

    PubMed Central

    Niewalda, Thomas; Michels, Birgit; Jungnickel, Roswitha; Diegelmann, Sören; Kleber, Jörg; Kähne, Thilo

    2015-01-01

    Adverse life events can induce two kinds of memory with opposite valence, dependent on timing: “negative” memories for stimuli preceding them and “positive” memories for stimuli experienced at the moment of “relief.” Such punishment memory and relief memory are found in insects, rats, and man. For example, fruit flies (Drosophila melanogaster) avoid an odor after odor-shock training (“forward conditioning” of the odor), whereas after shock-odor training (“backward conditioning” of the odor) they approach it. Do these timing-dependent associative processes share molecular determinants? We focus on the role of Synapsin, a conserved presynaptic phosphoprotein regulating the balance between the reserve pool and the readily releasable pool of synaptic vesicles. We find that a lack of Synapsin leaves task-relevant sensory and motor faculties unaffected. In contrast, both punishment memory and relief memory scores are reduced. These defects reflect a true lessening of associative memory strength, as distortions in nonassociative processing (e.g., susceptibility to handling, adaptation, habituation, sensitization), discrimination ability, and changes in the time course of coincidence detection can be ruled out as alternative explanations. Reductions in punishment- and relief-memory strength are also observed upon an RNAi-mediated knock-down of Synapsin, and are rescued both by acutely restoring Synapsin and by locally restoring it in the mushroom bodies of mutant flies. Thus, both punishment memory and relief memory require the Synapsin protein and in this sense share genetic and molecular determinants. We note that corresponding molecular commonalities between punishment memory and relief memory in humans would constrain pharmacological attempts to selectively interfere with excessive associative punishment memories, e.g., after traumatic experiences. PMID:25972175

  19. Synapsin determines memory strength after punishment- and relief-learning.

    PubMed

    Niewalda, Thomas; Michels, Birgit; Jungnickel, Roswitha; Diegelmann, Sören; Kleber, Jörg; Kähne, Thilo; Gerber, Bertram

    2015-05-13

    Adverse life events can induce two kinds of memory with opposite valence, dependent on timing: "negative" memories for stimuli preceding them and "positive" memories for stimuli experienced at the moment of "relief." Such punishment memory and relief memory are found in insects, rats, and man. For example, fruit flies (Drosophila melanogaster) avoid an odor after odor-shock training ("forward conditioning" of the odor), whereas after shock-odor training ("backward conditioning" of the odor) they approach it. Do these timing-dependent associative processes share molecular determinants? We focus on the role of Synapsin, a conserved presynaptic phosphoprotein regulating the balance between the reserve pool and the readily releasable pool of synaptic vesicles. We find that a lack of Synapsin leaves task-relevant sensory and motor faculties unaffected. In contrast, both punishment memory and relief memory scores are reduced. These defects reflect a true lessening of associative memory strength, as distortions in nonassociative processing (e.g., susceptibility to handling, adaptation, habituation, sensitization), discrimination ability, and changes in the time course of coincidence detection can be ruled out as alternative explanations. Reductions in punishment- and relief-memory strength are also observed upon an RNAi-mediated knock-down of Synapsin, and are rescued both by acutely restoring Synapsin and by locally restoring it in the mushroom bodies of mutant flies. Thus, both punishment memory and relief memory require the Synapsin protein and in this sense share genetic and molecular determinants. We note that corresponding molecular commonalities between punishment memory and relief memory in humans would constrain pharmacological attempts to selectively interfere with excessive associative punishment memories, e.g., after traumatic experiences. PMID:25972175

  20. Bidirectional Quantum States Sharing

    NASA Astrophysics Data System (ADS)

    Peng, Jia-Yin; Bai, Ming-qiang; Mo, Zhi-Wen

    2016-05-01

    With the help of shared entanglement and LOCC, multidirectional quantum states sharing is considered. We first put forward a protocol for implementing four-party bidirectional quantum states sharing (BQSS) using an eight-qubit cluster state as the quantum channel. In order to extend BQSS, we generalize this protocol from four sharers to multiple sharers, utilizing two multi-qubit GHZ-type states as the channel, and propose two multi-party BQSS schemes. In addition, we generalize the three schemes from two senders to multiple senders, with multiple multi-qubit GHZ-type states as the quantum channel, and give a multidirectional quantum states sharing protocol. In our schemes, all receivers can reconstruct the original unknown single-qubit state if and only if all sharers cooperate. Only Pauli operations, Bell-state measurement and single-qubit measurement are used in our schemes, so these schemes are easily realized in physical experiments and their success probabilities are all one.

  1. Learning to Share

    ERIC Educational Resources Information Center

    Raths, David

    2010-01-01

    In the tug-of-war between researchers and IT for supercomputing resources, a centralized approach can help both sides get more bang for their buck. As 2010 began, the University of Washington was preparing to launch its first shared high-performance computing cluster, a 1,500-node system called Hyak, dedicated to research activities. Like other…

  2. Illegal File Sharing 101

    ERIC Educational Resources Information Center

    Wada, Kent

    2008-01-01

    Much of higher education's unease arises from the cost of dealing with illegal file sharing. Illinois State University, for example, calculated a cost of $76 to process a first claim of copyright infringement and $146 for a second. Responses range from simply passing along claims to elaborate programs architected with specific goals in mind.…

  3. Knowledge Sharing at Conferences

    ERIC Educational Resources Information Center

    De Vries, Bregje; Pieters, Jules

    2007-01-01

    To improve the quality of teaching and learning, opportunities need to be provided where practitioners and researchers meet and share visions, disseminate findings, co-construct ideas, and set research agendas together. Visiting a conference is one well-known and established way to do this. But are conferences effective? A survey was conducted among the…

  4. Sharing Research Results

    ERIC Educational Resources Information Center

    Ashbrook, Peggy

    2011-01-01

    There are many ways to share a collection of data and students' thinking about that data. Explaining the results of science inquiry is important--working scientists and amateurs both contribute information to the body of scientific knowledge. Students can collect data about an activity that is already happening in a classroom (e.g., the qualities…

  5. Sharing Teaching Ideas.

    ERIC Educational Resources Information Center

    Mathematics Teacher, 1981

    1981-01-01

    Three ideas are shared: using geometric figures in motivational practice of order operations with prealgebra students; constructing a test with a holiday theme to increase student interest; and coding greeting cards for students that can be solved mathematically through the use of previously learned concepts. (MP)

  6. Hints on Sharing Books.

    ERIC Educational Resources Information Center

    Dorsey, Mary E., Comp.; Horne, Ulysses G., Comp.

    Based on the realization that each child must be given the opportunity to develop as a unique individual and that exposure to books expands a child's world, stimulating his creative thinking and his desire for new experiences, this booklet presents in outline form a variety of suggestions for encouraging children to share the books they have read.…

  7. Transactive memory systems scale for couples: development and validation

    PubMed Central

    Hewitt, Lauren Y.; Roberts, Lynne D.

    2015-01-01

    People in romantic relationships can develop shared memory systems by pooling their cognitive resources, allowing each person access to more information but with less cognitive effort. Research examining such memory systems in romantic couples largely focuses on remembering word lists or performing lab-based tasks, but these types of activities do not capture the processes underlying couples’ transactive memory systems, and may not be representative of the ways in which romantic couples use their shared memory systems in everyday life. We adapted an existing measure of transactive memory systems for use with romantic couples (TMSS-C), and conducted an initial validation study. In total, 397 participants who each identified as being a member of a romantic relationship of at least 3 months duration completed the study. The data provided a good fit to the anticipated three-factor structure of the components of couples’ transactive memory systems (specialization, credibility and coordination), and there was reasonable evidence of both convergent and divergent validity, as well as strong evidence of test–retest reliability across a 2-week period. The TMSS-C provides a valuable tool that can quickly and easily capture the underlying components of romantic couples’ transactive memory systems. It has potential to help us better understand this intriguing feature of romantic relationships, and how shared memory systems might be associated with other important features of romantic relationships. PMID:25999873

  8. A Beginner's Guide to Memory.

    ERIC Educational Resources Information Center

    Hughes, Elizabeth M.

    1981-01-01

    This article is designed to equip the reader with the information needed to deal with questions of computer memory. Discussed are core memory; semiconductor memory; size of memory; expanding memory; charge-coupled device memories; magnetic bubble memory; and read-only and read-mostly memories. (KC)

  9. Arousal-biased competition in perception and memory

    PubMed Central

    Mather, Mara; Sutherland, Matthew R.

    2010-01-01

    Our everyday surroundings besiege us with information. The battle is for a share of our limited attention and memory, with the brain selecting the winners and discarding the losers. Previous research shows that both bottom-up and top-down factors bias competition in favor of high priority stimuli. We propose that arousal during an event increases this bias both in perception and in long-term memory of the event. Arousal-biased competition theory provides specific predictions about when arousal will enhance and when it will impair memory for events, accounting for some puzzling contradictions in the emotional memory literature. PMID:21660127

  10. Memory Retrieval and Interference: Working Memory Issues

    ERIC Educational Resources Information Center

    Radvansky, Gabriel A.; Copeland, David E.

    2006-01-01

    Working memory capacity has been suggested as a factor that is involved in long-term memory retrieval, particularly when that retrieval involves a need to overcome some sort of interference (Bunting, Conway, & Heitz, 2004; Cantor & Engle, 1993). Previous work has suggested that working memory is related to the acquisition of information during…

  11. Policy enabled information sharing system

    DOEpatents

    Jorgensen, Craig R.; Nelson, Brian D.; Ratheal, Steve W.

    2014-09-02

    A technique for dynamically sharing information includes executing a sharing policy indicating when to share a data object responsive to the occurrence of an event. The data object is created by formatting a data file to be shared with a receiving entity. The data object includes a file data portion and a sharing metadata portion. The data object is encrypted and then automatically transmitted to the receiving entity upon occurrence of the event. The sharing metadata portion includes metadata characterizing the data file and referenced in connection with the sharing policy to determine when to automatically transmit the data object to the receiving entity.

  12. Processor-Group Aware Runtime Support for Shared- and Global-Address Space Models

    SciTech Connect

    Krishnan, Manoj Kumar; Tipparaju, Vinod; Palmer, Bruce; Nieplocha, Jarek

    2004-12-07

    Exploiting multilevel parallelism using processor groups is becoming increasingly important for programming on high-end systems. This paper describes group-aware run-time support for shared-/global-address space programming models. The current effort has been undertaken in the context of the Aggregate Remote Memory Copy Interface (ARMCI) [5], a portable runtime system used as a communication layer for Global Arrays [6], Co-Array Fortran (CAF) [9], GPSHMEM [10], Co-Array Python [11], and also end-user applications. The paper describes the management of shared memory, integration of shared memory communication and RDMA on clusters with SMP nodes, and registration. These are all required for efficient multi-method and multi-protocol communication on modern systems. Focus is placed on techniques for supporting process groups while maximizing communication performance and efficiently managing global memory system-wide.

  13. Optical memory

    DOEpatents

    Mao, Samuel S; Zhang, Yanfeng

    2013-07-02

    Optical memory comprising: a semiconductor wire, a first electrode, a second electrode, a light source, a means for producing a first voltage at the first electrode, a means for producing a second voltage at the second electrode, and a means for determining the presence of an electrical voltage across the first electrode and the second electrode exceeding a predefined voltage. The first voltage, preferably less than 0 volts, is different from said second voltage. The semiconductor wire is optically transparent and has a bandgap less than the energy produced by the light source. The light source is optically connected to the semiconductor wire. The first electrode and the second electrode are electrically insulated from each other and said semiconductor wire.

  14. Grouping and binding in visual short-term memory.

    PubMed

    Quinlan, Philip T; Cohen, Dale J

    2012-09-01

    Findings of 2 experiments are reported that challenge the current understanding of visual short-term memory (VSTM). In both experiments, a single study display, containing 6 colored shapes, was presented briefly and then probed with a single colored shape. At stake is how VSTM retains a record of different objects that share common features: In the 1st experiment, 2 study items sometimes shared a common feature (either a shape or a color). The data revealed a color sharing effect, in which memory was much better for items that shared a common color than for items that did not. The 2nd experiment showed that the size of the color sharing effect depended on whether a single pair of items shared a common color or whether 2 pairs of items were so defined; memory for all items improved when 2 color groups were presented. In explaining performance, an account is advanced in which items compete for a fixed number of slots, but then memory recall for any given stored item is prone to error. A critical assumption is that items that share a common color are stored together in a slot as a chunk. The evidence provides further support for the idea that principles of perceptual organization may determine the manner in which items are stored in VSTM. PMID:22449133

  15. Elastomeric load sharing device

    NASA Technical Reports Server (NTRS)

    Isabelle, Charles J. (Inventor); Kish, Jules G. (Inventor); Stone, Robert A. (Inventor)

    1992-01-01

    An elastomeric load sharing device, interposed in combination between a driven gear and a central drive shaft to facilitate balanced torque distribution in split power transmission systems, includes a cylindrical elastomeric bearing and a plurality of elastomeric bearing pads. The elastomeric bearing and bearing pads comprise one or more layers, each layer including an elastomer having a metal backing strip secured thereto. The elastomeric bearing is configured to have a high radial stiffness and a low torsional stiffness and is operative to radially center the driven gear and to minimize torque transfer through the elastomeric bearing. The bearing pads are configured to have a low radial and torsional stiffness and a high axial stiffness and are operative to compressively transmit torque from the driven gear to the drive shaft. The elastomeric load sharing device has spring rates that compensate for mechanical deviations in the gear train assembly to provide balanced torque distribution between complementary load paths of split power transmission systems.

  16. Order-memory and association-memory.

    PubMed

    Caplan, Jeremy B

    2015-09-01

    Two highly studied memory functions are memory for associations (items presented in pairs, such as SALT-PEPPER) and memory for order (a list of items whose order matters, such as a telephone number). Order- and association-memory are at the root of many forms of behaviour, from wayfinding, to language, to remembering people's names. Most researchers have investigated memory for order separately from memory for associations. As exceptions to this, associative-chaining models build an ordered list from associations between pairs of items, quite literally understanding association- and order-memory together. Alternatively, positional-coding models have been used to explain order-memory as a completely distinct function from association-memory. Both classes of model have found empirical support and both have faced serious challenges. I argue that models that combine both associative chaining and positional coding are needed. One such hybrid model, which relies on brain-activity rhythms, is promising, but remains to be tested rigorously. I consider two relatively understudied memory behaviours that demand a combination of order- and association-information: memory for the order of items within associations (is it William James or James William?) and judgments of relative order (who left the party earlier, Hermann or William?). Findings from these underexplored procedures are already difficult to reconcile with existing association-memory and order-memory models. Further work with such intermediate experimental paradigms has the potential to provide powerful findings to constrain and guide models into the future, with the aim of explaining a large range of memory functions, encompassing both association- and order-memory. PMID:25894964

  17. Shared health governance.

    PubMed

    Ruger, Jennifer Prah

    2011-07-01

    Health and Social Justice (Ruger 2009a) developed the "health capability paradigm," a conception of justice and health in domestic societies. This idea undergirds an alternative framework of social cooperation called "shared health governance" (SHG). SHG puts forth a set of moral responsibilities, motivational aspirations, and institutional arrangements, and apportions roles for implementation in striving for health justice. This article develops further the SHG framework and explains its importance and implications for governing health domestically. PMID:21745082

  18. Shared Health Governance

    PubMed Central

    Ruger, Jennifer Prah

    2014-01-01

    Health and Social Justice (Ruger 2009a) developed the “health capability paradigm,” a conception of justice and health in domestic societies. This idea undergirds an alternative framework of social cooperation called “shared health governance” (SHG). SHG puts forth a set of moral responsibilities, motivational aspirations, and institutional arrangements, and apportions roles for implementation in striving for health justice. This article develops further the SHG framework and explains its importance and implications for governing health domestically. PMID:21745082

  19. Efficient quantum secret sharing

    NASA Astrophysics Data System (ADS)

    Qin, Huawang; Dai, Yuewei

    2016-05-01

    An efficient quantum secret sharing scheme is proposed, in which the dealer generates some single particles and then uses the operations of quantum-controlled-not and Hadamard gate to encode a determinate secret into these particles. The participants get their shadows by performing the single-particle measurements on their particles, and even the dealer cannot know their shadows. Compared to the existing schemes, our scheme is more practical within the present technologies.

  20. Programming parallel architectures - The BLAZE family of languages

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush

    1989-01-01

    This paper gives an overview of the various approaches to programming multiprocessor architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive, since they remove much of the burden of exploiting parallel architectures from the user. This paper also describes recent work in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described.

  1. Bonobos Share with Strangers

    PubMed Central

    Tan, Jingzhi; Hare, Brian

    2013-01-01

    Humans are thought to possess a unique proclivity to share with others – including strangers. This puzzling phenomenon has led many to suggest that sharing with strangers originates from human-unique language, social norms, warfare and/or cooperative breeding. However, bonobos, our closest living relative, are highly tolerant and, in the wild, are capable of having affiliative interactions with strangers. In four experiments, we therefore examined whether bonobos will voluntarily donate food to strangers. We show that bonobos will forego their own food for the benefit of interacting with a stranger. Their prosociality is in part driven by unselfish motivation, because bonobos will even help strangers acquire out-of-reach food when no desirable social interaction is possible. However, this prosociality has its limitations because bonobos will not donate food in their possession when a social interaction is not possible. These results indicate that other-regarding preferences toward strangers are not uniquely human. Moreover, language, social norms, warfare and cooperative breeding are unnecessary for the evolution of xenophilic sharing. Instead, we propose that prosociality toward strangers initially evolves due to selection for social tolerance, allowing the expansion of individual social networks. Human social norms and language may subsequently extend this ape-like social preference to the most costly contexts. PMID:23300956

  2. Toward worldwide data sharing

    NASA Astrophysics Data System (ADS)

    Walker, Raymond; Joy, Steven; King, Todd

    2012-07-01

    Over the past decade the nature of space science research has changed dramatically. Earlier investigators could carry out meaningful research by looking at observations from a single instrument on a single spacecraft. Today that is rapidly changing and researchers regularly use data from multiple instruments on multiple spacecraft as well as observations from ground observatories. Increasingly those observations come from missions flown by many countries. Recent advances in distributed data management have made it possible for researchers located around the world to access and use data from multiple nations. With virtual observatory technology it no longer matters where data are housed; they can be freely accessed wherever they reside. In this presentation we will discuss two initiatives designed to make space science data accessible worldwide. One is the International Planetary Data Alliance (IPDA) and the other is the Heliophysics Data and Model Consortium (HDMC). In both cases the key to worldwide data sharing is adopting common metadata standards. In this talk we will review how these two groups are addressing worldwide data sharing and their progress in achieving their goals. IPDA and HDMC are two of several efforts to promote broad-based data sharing. Talks in the remainder of the symposium will discuss this in more detail.

  3. Dynamic Load Balancing for Adaptive Computations on Distributed-Memory Machines

    NASA Technical Reports Server (NTRS)

    1999-01-01

    Dynamic load balancing is central to adaptive mesh-based computations on large-scale parallel computers. The principal investigator has investigated various issues on the dynamic load balancing problem under NASA JOVE and JAG grants. The major accomplishments of the project are two graph partitioning algorithms and a load balancing framework. The S-HARP dynamic graph partitioner is known to be the fastest among the known dynamic graph partitioners to date. It can partition a graph of over 100,000 vertices in 0.25 seconds on a 64-processor Cray T3E distributed-memory multiprocessor while maintaining the scalability of over 16-fold speedup. Other known and widely used dynamic graph partitioners take over a second or two while giving low scalability of a few fold speedup on 64 processors. These results have been published in journals and peer-reviewed flagship conferences.

  4. Feature-Based Memory-Driven Attentional Capture: Visual Working Memory Content Affects Visual Attention

    ERIC Educational Resources Information Center

    Olivers, Christian N. L.; Meijer, Frank; Theeuwes, Jan

    2006-01-01

    In 7 experiments, the authors explored whether visual attention (the ability to select relevant visual information) and visual working memory (the ability to retain relevant visual information) share the same content representations. The presence of singleton distractors interfered more strongly with a visual search task when it was accompanied by…

  5. Microsupercomputers: Design and implementation. Semi-annual technical progress report, April-October 1990

    SciTech Connect

    Hennessy, J.L.; Horowitz, M.A.

    1990-10-01

    Project Goals: (1) Investigate fundamental properties of parallel programs and the implications for multiprocessor architectures and parallel programming and compilers; (2) Explore architectural approaches that can be used to build scalable multiprocessors capable of supporting a shared memory programming paradigm; (3) Develop compiler and programming language technology that supports the construction of efficient, machine-independent parallel programs; (4) Investigate architectural approaches that will lead to substantial improvements in uniprocessor performance and that can be incorporated in scalable multiprocessors; (5) Develop Computer Aided Design tools and Very Large Scale Integration technology needed to construct high performance machines.

  6. Interference from mere thinking: mental rehearsal temporarily disrupts recall of motor memory.

    PubMed

    Yin, Cong; Wei, Kunlin

    2014-08-01

    Interference between successively learned tasks is widely investigated to study motor memory. However, how simultaneously learned motor memories interact with each other has rarely been studied despite its prevalence in daily life. Assuming that motor memory shares common neural mechanisms with the declarative memory system, we made the unintuitive prediction that mental rehearsal, as opposed to further practice, of one motor memory will temporarily impair the recall of another simultaneously learned memory. Subjects simultaneously learned two sensorimotor tasks, i.e., visuomotor rotation and gain. They retrieved one memory by either practice or mental rehearsal and then had their memory evaluated. We found that mental rehearsal, instead of execution, impaired the recall of unretrieved memory. This impairment was content-independent, i.e., retrieving either gain or rotation impaired the other memory. Hence, conscious recollection of one motor memory interferes with the recall of another memory. This is analogous to retrieval-induced forgetting in declarative memory, suggesting a common neural process across memory systems. Our findings indicate that motor imagery is sufficient to induce interference between motor memories. Mental rehearsal, currently widely regarded as beneficial for motor performance, negatively affects memory recall when it is exercised for a subset of memorized items. PMID:24805082

  7. Memory loss.

    PubMed

    Flicker, Leon A; Ford, Andrew H; Beer, Christopher D; Almeida, Osvaldo P

    2012-02-01

    Most older people with memory loss do not have dementia. Those with mild cognitive impairment are at increased risk of progressing to dementia, but no tests have been shown to enhance the accuracy of assessing this risk. Although no intervention has been convincingly shown to prevent dementia, data from cohort studies and randomised controlled trials are compelling in indicating that physical activity and treatment of hypertension decrease the risk of dementia. There is no evidence that pharmaceutical treatment will benefit people with mild cognitive impairment. In people with Alzheimer's disease, treatment with a cholinesterase inhibitor or memantine (an N-methyl-D-aspartate receptor antagonist) may provide symptomatic relief and enhance quality of life, but does not appear to alter progression of the illness. Non-pharmacological strategies are recommended as first-line treatments for behavioural and psychological symptoms of dementia, which are common in Alzheimer's disease. Atypical antipsychotics have modest benefit in reducing agitation and psychotic symptoms but increase the risk of cardiovascular events. The role of antidepressants in managing depressive symptoms in patients with mild cognitive impairment is uncertain and may increase the risk of delirium and falls. PMID:22304604

  8. Memory Metals

    NASA Technical Reports Server (NTRS)

    1995-01-01

    Under contract to NASA during preparations for the space station, Memry Technologies Inc. investigated shape memory effect (SME). SME is a characteristic of certain metal alloys that can change shape in response to temperature variations. In the late 1980s and early 1990s, Memry used its NASA-acquired expertise to produce a line of home and industrial safety products, and refined the technology in the mid-1990s. Among the new products they developed are three MemrySafe units which prevent scalding from faucets. Each system contains a small valve that reacts to temperature, not pressure. When the water reaches dangerous temperatures, the unit reduces the flow to a trickle; when the scalding temperature subsides, the unit restores normal flow. Other products are the FIRECHEK 2 and 4, heat-activated shutoff valves for industrial process lines, which sense excessive heat and cut off pneumatic pressure. The newest of these products is Memry's Demand Management Water Heater which shifts the electricity requirement from peak to off-peak demands, conserving energy and money.

  9. Shared clinical decision making

    PubMed Central

    AlHaqwi, Ali I.; AlDrees, Turki M.; AlRumayyan, Ahmad; AlFarhan, Ali I.; Alotaibi, Sultan S.; AlKhashan, Hesham I.; Badri, Motasim

    2015-01-01

    Objectives: To determine preferences of patients regarding their involvement in the clinical decision making process and the related factors in Saudi Arabia. Methods: This cross-sectional study was conducted in a major family practice center in King Abdulaziz Medical City, Riyadh, Saudi Arabia, between March and May 2012. Multivariate multinomial regression models were fitted to identify factors associated with patients' preferences. Results: The study included 236 participants. The most preferred decision-making style was shared decision-making (57%), followed by paternalistic (28%), and informed consumerism (14%). The preference for shared clinical decision making was significantly higher among male patients and those with higher level of education, whereas paternalism was significantly higher among older patients and those with chronic health conditions, and consumerism was significantly higher in younger age groups. In multivariate multinomial regression analysis, compared with the shared group, the consumerism group were more likely to be female [adjusted odds ratio (AOR) =2.87, 95% confidence interval [CI] 1.31-6.27, p=0.008] and non-dyslipidemic (AOR=2.90, 95% CI: 1.03-8.09, p=0.04), and the paternalism group were more likely to be older (AOR=1.03, 95% CI: 1.01-1.05, p=0.04), and female (AOR=2.47, 95% CI: 1.32-4.06, p=0.008). Conclusion: Preferences of patients for involvement in the clinical decision-making varied considerably. In our setting, underlying factors that influence these preferences identified in this study should be considered and tailored individually to achieve optimal treatment outcomes. PMID:26620990

  10. Fixed Access Network Sharing

    NASA Astrophysics Data System (ADS)

    Cornaglia, Bruno; Young, Gavin; Marchetta, Antonio

    2015-12-01

    Fixed broadband network deployments are moving inexorably to the use of Next Generation Access (NGA) technologies and architectures. These NGA deployments involve building fiber infrastructure increasingly closer to the customer in order to increase the proportion of fiber on the customer's access connection (Fibre-To-The-Home/Building/Door/Cabinet… i.e. FTTx). This increases the speed of services that can be sold and will be increasingly required to meet the demands of new generations of video services as we evolve from HDTV to "Ultra-HD TV" with 4k and 8k lines of video resolution. However, building fiber access networks is a costly endeavor. It requires significant capital in order to cover any significant geographic coverage. Hence many companies are forming partnerships and joint-ventures in order to share the NGA network construction costs. One form of such a partnership involves two companies agreeing to each build to cover a certain geographic area and then "cross-selling" NGA products to each other in order to access customers within their partner's footprint (NGA coverage area). This is tantamount to a bi-lateral wholesale partnership. The concept of Fixed Access Network Sharing (FANS) is to address the possibility of sharing infrastructure with a high degree of flexibility for all network operators involved. By providing greater configuration control over the NGA network infrastructure, the service provider has a greater ability to define the network and hence to define their product capabilities at the active layer. This gives the service provider partners greater product development autonomy plus the ability to differentiate from each other at the active network layer.

  11. Quantum state sharing

    NASA Astrophysics Data System (ADS)

    Lance, Andrew M.; Symul, Thomas; Bowen, Warwick P.; Sanders, Barry C.; Lam, Ping Koy

    2004-05-01

    We demonstrate a multipartite protocol that utilizes entanglement to securely distribute and reconstruct a quantum state. A secret quantum state is encoded into a tripartite entangled state and distributed to three players. By collaborating together, a majority of the players can reconstruct the state, whilst the remaining player obtains nothing. This (2,3) threshold quantum state sharing scheme is characterized in terms of fidelity (F), signal transfer (T) and reconstruction noise (V). We demonstrate a fidelity averaged over all reconstruction permutations of 0.73 +/- 0.04, a level achievable only using quantum resources.

  12. Tripartite quantum state sharing.

    PubMed

    Lance, Andrew M; Symul, Thomas; Bowen, Warwick P; Sanders, Barry C; Lam, Ping Koy

    2004-04-30

    We demonstrate a multipartite protocol to securely distribute and reconstruct a quantum state. A secret quantum state is encoded into a tripartite entangled state and distributed to three players. By collaborating, any two of the three players can reconstruct the state, while individual players obtain nothing. We characterize this (2,3) threshold quantum state sharing scheme in terms of fidelity, signal transfer, and reconstruction noise. We demonstrate a fidelity averaged over all reconstruction permutations of 0.73+/-0.04, a level achievable only using quantum resources. PMID:15169193

  13. Tripartite Quantum State Sharing

    NASA Astrophysics Data System (ADS)

    Lance, Andrew M.; Symul, Thomas; Bowen, Warwick P.; Sanders, Barry C.; Lam, Ping Koy

    2004-04-01

    We demonstrate a multipartite protocol to securely distribute and reconstruct a quantum state. A secret quantum state is encoded into a tripartite entangled state and distributed to three players. By collaborating, any two of the three players can reconstruct the state, while individual players obtain nothing. We characterize this (2,3) threshold quantum state sharing scheme in terms of fidelity, signal transfer, and reconstruction noise. We demonstrate a fidelity averaged over all reconstruction permutations of 0.73±0.04, a level achievable only using quantum resources.

  14. Can power be shared?

    PubMed

    Ten Pas, William S

    2013-01-01

    Dental insurance began with a partnership between dental service organizations and state dental associations with a view toward expanding the number of Americans receiving oral health care and as a means for permitting firms and other organizations to offer employee benefits. The goals have been achieved, but the alliance between dentistry and insurance has become strained. A lack of dialogue has fostered mutual misconceptions, some of which are reviewed in this paper. It is possible that the public, the profession, and the dental insurance industry can all be strengthened, but only through power-sharing around the original common objective. PMID:24761578

  15. Jim Thomas: A Collection of Memories

    SciTech Connect

    Wong, Pak C.

    2010-12-01

    Jim Thomas, a guest editor and a long-time associate editor of Information Visualization (IVS), died in Richland, WA, on August 6, 2010 due to complications from a brain tumor. His friends and colleagues from around the world have since expressed their sadness and paid tribute to a visionary scientist in multiple public forums. For those who didn't get the chance to know Jim, I share a collection of my own memories of Jim Thomas and memories from some of his colleagues.

  16. Shared materials management and warehousing: a feasibility study.

    PubMed

    Kowalski, J C; Dickow, J F

    1985-06-01

    In response to significant changes in the political, competitive, and economic environment of Massachusetts, five independent hospitals sought ways to strengthen their capabilities to deal with these new challenges through collaboration. These hospitals, located in suburban Boston, prepared and submitted a proposal to the Blue Cross/Massachusetts Hospital Association Fund for Cooperative Innovation, requesting participative funding to conduct a "Shared Materials Management and Warehousing Feasibility Study". The objective was to determine the feasibility of several independent hospitals, operating in the same geographic area, sharing materials management and related functions. The hospitals were: Framingham Union, Glover Memorial, Leonard Morse, Newton-Wellesley, and Marlborough. PMID:10272277

  17. 76 FR 55065 - Change in Bank Control Notices; Acquisitions of Shares of a Bank or Bank Holding Company

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-09-06

    ... Denney, Assistant Vice President) 1 Memorial Drive, Kansas City, Missouri 64198-0001: 1. Gregory J. Weed, Cheyenne Wells, Colorado; to acquire voting shares of Weed Investment Group, Inc., and thereby...

  18. Vaccines, our shared responsibility.

    PubMed

    Pagliusi, Sonia; Jain, Rishabh; Suri, Rajinder Kumar

    2015-05-01

    The Developing Countries Vaccine Manufacturers' Network (DCVMN) held its fifteenth annual meeting on October 27-29, 2014, in New Delhi, India. The DCVMN, together with the co-organizing institution Panacea Biotec, welcomed over 240 delegates representing high-profile governmental and nongovernmental global health organizations from 36 countries. Over the three-day meeting, attendees exchanged information about their efforts to achieve their shared goal of preventing death and disability from known and emerging infectious diseases. Special praise was extended to all stakeholders involved in the success of polio eradication in South East Asia, and challenges in vaccine supply for measles-rubella immunization over the coming decades were highlighted. Innovative vaccines and vaccine delivery technologies indicated creative solutions for achieving global immunization goals. Discussions focused on three major themes: regulatory challenges for developing countries that may be overcome with better communication; global collaborations and partnerships for leveraging investments and enabling an uninterrupted supply of affordable and suitable vaccines; and leading innovation in vaccines that are difficult to develop, such as dengue, Chikungunya, typhoid-conjugated and EV71, and in needle-free technologies that may speed up vaccine delivery. Moving further into the Decade of Vaccines, participants renewed their commitment to shared responsibility toward a world free of vaccine-preventable diseases. PMID:25749248

  19. Handling debugger breakpoints in a shared instruction system

    DOEpatents

    Gooding, Thomas Michael; Shok, Richard Michael

    2014-01-21

    A debugger debugs processes that execute shared instructions so that a breakpoint set for one process will not cause a breakpoint to occur in the other processes. A breakpoint is set by recording the original instruction at the desired location and writing a trap instruction to the shared instructions at that location. When a process encounters the breakpoint, the process passes control to the debugger for breakpoint processing if the breakpoint was set at that location for that process. If the trap was not set at that location for that process, the cacheline containing the trap is copied to a small scratchpad memory, and the virtual memory mappings are changed to translate the virtual address of the cacheline to the scratchpad. The original instruction is then written to replace the trap instruction in the scratchpad, so that the process can execute the instructions in the scratchpad, thereby avoiding the trap instruction.
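
    The mechanism in this abstract can be illustrated with a small simulation. The sketch below is not the patented implementation: the class name SharedText, the four-instruction cacheline size, the string-valued "instructions", and the "BREAK" return value are all assumptions made for illustration, and the virtual-memory remapping is modeled simply as a per-process dictionary lookup.

```python
TRAP = "TRAP"   # stand-in for a trap instruction (assumed encoding)
LINE = 4        # instructions per cacheline (assumed size)


class SharedText:
    """Toy model of instruction memory shared by several processes."""

    def __init__(self, instructions):
        self.shared = list(instructions)  # the shared instruction image
        self.saved = {}     # addr -> original instruction replaced by a trap
        self.owners = {}    # addr -> set of pids that set a breakpoint there
        self.scratch = {}   # (pid, line number) -> private copy of a cacheline

    def set_breakpoint(self, pid, addr):
        # Record the original instruction once, then write the trap.
        if addr not in self.saved:
            self.saved[addr] = self.shared[addr]
            self.shared[addr] = TRAP
        self.owners.setdefault(addr, set()).add(pid)

    def fetch(self, pid, addr):
        line_no, offset = divmod(addr, LINE)
        # A remapped cacheline, if present, shadows the shared one.
        line = self.scratch.get((pid, line_no))
        instr = line[offset] if line is not None else self.shared[addr]
        if instr != TRAP:
            return instr
        if pid in self.owners.get(addr, set()):
            return "BREAK"  # the trap belongs to this process: enter debugger
        # The trap belongs to another process: copy the cacheline to the
        # scratchpad (if not already copied) and restore the original there.
        if line is None:
            start = line_no * LINE
            line = self.shared[start:start + LINE]
            self.scratch[(pid, line_no)] = line
        line[offset] = self.saved[addr]
        return line[offset]


text = SharedText(["load", "add", "store", "jump", "nop", "ret"])
text.set_breakpoint(pid=1, addr=2)
print(text.fetch(1, 2))  # process 1 set the breakpoint
print(text.fetch(2, 2))  # process 2 did not
```

    In this simplified model the shared image keeps its trap, so process 1 still stops at address 2, while process 2 executes from its private copy of the cacheline. Cases the abstract does not describe, such as clearing breakpoints or reclaiming scratchpad space, are left out.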

  20. Model Sharing and Collaboration using HydroShare

    NASA Astrophysics Data System (ADS)

    Goodall, J. L.; Morsy, M. M.; Castronova, A. M.; Miles, B.; Merwade, V.; Tarboton, D. G.

    2015-12-01

    HydroShare is a web-based system funded by the National Science Foundation (NSF) for sharing hydrologic data and models as resources. Resources in HydroShare can either be assigned a generic type, meaning the resource only has Dublin Core metadata properties, or one of a growing number of specific resource types with enhanced metadata profiles defined by the HydroShare development team. Examples of specific resource types in the current release of HydroShare (http://www.hydroshare.org) include time series, geographic raster, multidimensional (NetCDF), model program, and model instance. Here we describe research and development efforts in the HydroShare project for model-related resource types. This work has included efforts to define metadata profiles for common modeling resources, execute models directly through the HydroShare user interface using Docker containers, and interoperate with the third-party application SWATShare for model execution and visualization. These examples demonstrate the benefit of HydroShare in supporting model sharing and addressing collaborative problems involving modeling. The presentation will conclude with plans for future modeling-related development in HydroShare, including supporting the publication of workflow resources, enhanced metadata for additional hydrologic models, and linking model resources with other resources in HydroShare to capture model provenance.

  1. Domestic Role Sharing in Sweden.

    ERIC Educational Resources Information Center

    Haas, Linda

    1981-01-01

    Investigated the extent to which Swedish couples (N=128) share domestic tasks using a mail survey. Suggests Swedish couples shared household chores more evenly than American couples. Results indicated variables measuring social exchange theory, family life-cycle stage, and socialization had the greatest influence on role sharing behavior.…

  2. Fractions: How to Fair Share

    ERIC Educational Resources Information Center

    Wilson, P. Holt; Edgington, Cynthia P.; Nguyen, Kenny H.; Pescosolido, Ryan S.; Confrey, Jere

    2011-01-01

    Children learn from a very early age what it means to get their "fair share." Whether it is candy or birthday cake, many children successfully create equal-size groups or parts of a collection or whole but later struggle to create fair shares of multiple wholes, such as fairly sharing four pies among a family of seven. Recent research suggests…

  3. Frameworks for Sharing Teaching Practices

    ERIC Educational Resources Information Center

    Carroll, John M.; Rosson, Mary Beth; Dunlap, Dan; Isenhour, Philip

    2005-01-01

    In many organizations, collaborating with peers, sharing resources, and codifying know-how are not typical facets of work activity. For such organizations, knowledge management support must help people identify and orient to opportunities for collaboration and sharing, articulate values and best practices, and assimilate sharing knowledge as an…

  4. Sharing Educational Services. PREP-13.

    ERIC Educational Resources Information Center

    Jongeward, Ray; Heesacker, Frank

    The focus of this report is on shared services in the rural setting. The kit contains three documents of useful information for any school planning a shared service activity to improve rural education. 13-A identifies 215 shared services in 50 states along with an indexing of each service by subject area and by state. 13-B is a series of 10…

  5. Memory beyond expression.

    PubMed

    Delorenzi, A; Maza, F J; Suárez, L D; Barreiro, K; Molina, V A; Stehberg, J

    2014-01-01

    The idea that memories are not invariable after the consolidation process has led to new perspectives about several mnemonic processes. In this framework, we review our studies on the modulation of memory expression during reconsolidation. We propose that during both memory consolidation and reconsolidation, neuromodulators can determine the probability of the memory trace to guide behavior, i.e. they can either increase or decrease its behavioral expressibility without affecting the potential of persistent memories to be activated and become labile. Our hypothesis is based on the findings that positive modulation of memory expression during reconsolidation occurs even if memories are behaviorally unexpressed. This review discusses the original approach taken in the studies of the crab Neohelice (Chasmagnathus) granulata, which was then successfully applied to test the hypothesis in rodent fear memory. The data presented offer a new way of thinking about both weak trainings and experimental amnesia: memory retrieval can be dissociated from memory expression. Furthermore, the strategy presented here allowed us to show in human declarative memory that the periods in which long-term memory can be activated and become labile during reconsolidation exceed the periods in which that memory is expressed, providing direct evidence that conscious access to memory is not needed for reconsolidation. Specific controls based on the constraints of reminders to trigger reconsolidation allow us to distinguish between obliterated and unexpressed but activated long-term memories after amnesic treatments, weak trainings and forgetting. In the hypothesis discussed, memory expressibility--the outcome of experience-dependent changes in the potential to behave--is considered as a flexible and modulable attribute of long-term memories. Expression seems to be just one of the possible fates of re-activated memories. PMID:25102126

  6. Memory bias for negative emotional words in recognition memory is driven by effects of category membership

    PubMed Central

    White, Corey N.; Kapucu, Aycan; Bruno, Davide; Rotello, Caren M.; Ratcliff, Roger

    2014-01-01

    Recognition memory studies often find that emotional items are more likely than neutral items to be labeled as studied. Previous work suggests this bias is driven by increased memory strength/familiarity for emotional items. We explored strength and bias interpretations of this effect with the conjecture that emotional stimuli might seem more familiar because they share features with studied items from the same category. Categorical effects were manipulated in a recognition task by presenting lists with a small, medium, or large proportion of emotional words. The liberal memory bias for emotional words was only observed when a medium or large proportion of categorized words were presented in the lists. Similar, though weaker, effects were observed with categorized words that were not emotional (animal names). These results suggest that liberal memory bias for emotional items may be largely driven by effects of category membership. PMID:24303902

  7. Detailed sensory memory, sloppy working memory.

    PubMed

    Sligte, Ilja G; Vandenbroucke, Annelinde R E; Scholte, H Steven; Lamme, Victor A F

    2010-01-01

    Visual short-term memory (VSTM) enables us to actively maintain information in mind for a brief period of time after stimulus disappearance. According to recent studies, VSTM consists of three stages - iconic memory, fragile VSTM, and visual working memory - with increasingly stricter capacity limits and progressively longer lifetimes. Still, the resolution (or amount of visual detail) of each VSTM stage has remained unexplored and we test this in the present study. We presented people with a change detection task that measures the capacity of all three forms of VSTM, and we added an identification display after each change trial that required people to identify the "pre-change" object. Accurate change detection plus pre-change identification requires subjects to have a high-resolution representation of the "pre-change" object, whereas change detection or identification only can be based on the hunch that something has changed, without exactly knowing what was presented before. We observed that people maintained 6.1 objects in iconic memory, 4.6 objects in fragile VSTM, and 2.1 objects in visual working memory. Moreover, when people detected the change, they could also identify the pre-change object on 88% of the iconic memory trials, on 71% of the fragile VSTM trials and merely on 53% of the visual working memory trials. This suggests that people maintain many high-resolution representations in iconic memory and fragile VSTM, but only one high-resolution object representation in visual working memory. PMID:21897823
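
    The closing inference of the abstract above (many high-resolution representations in iconic memory and fragile VSTM, but only about one in visual working memory) can be checked with rough arithmetic. The sketch below multiplies each reported capacity by the reported identification rate. That product is an assumption made here for illustration, not necessarily the computation used in the study, and the names `STAGES` and `high_resolution_estimate` are hypothetical.

```python
# Capacity estimates and identification rates quoted in the abstract above.
STAGES = {
    "iconic memory": (6.1, 0.88),
    "fragile VSTM": (4.6, 0.71),
    "visual working memory": (2.1, 0.53),
}


def high_resolution_estimate(capacity: float, identification_rate: float) -> float:
    """Assumed estimate: objects held multiplied by the fraction identified."""
    return capacity * identification_rate


estimates = {
    name: high_resolution_estimate(capacity, rate)
    for name, (capacity, rate) in STAGES.items()
}

for name, value in estimates.items():
    print(f"{name}: about {value:.1f} high-resolution objects")
```

    Under this simplification the three stages come out at roughly 5.4, 3.3 and 1.1 objects, which is in line with the "only one" figure the abstract gives for visual working memory.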

  8. An implicit spatial memory alignment effect.

    PubMed

    Cerles, Mélanie; Gomez, Alice; Rousset, Stéphane

    2015-09-01

    The memory alignment effect is the advantage of reasoning from a perspective which is aligned with the frame of reference used to encode an environment in memory. It usually occurs when participants have to consciously take a perspective to perform a spatial memory task. The present experiment assesses whether the memory alignment effect can occur without requiring participants to consciously take a given perspective, when the misaligned perspective is only perceptually provided. In other words, does the memory alignment effect still arise when it is only implicitly prompted? Thirty participants learned a sequence of four objects' positions in a room from a north-as-up survey perspective. During the testing phase, they had to point to the direction of a target object from another object ('the reference') with a fixed north-up orientation. The background behind the reference object displayed either a uniform color (control condition) or a misaligned ground-level perspective. The latter displayed a reference object's position information which was either congruent with the studied environment (congruent misaligned condition) or incongruent (incongruent misaligned condition). Mean pointing errors were higher in the congruent misaligned condition than in the control condition, whereas the incongruent misaligned condition did not differ from the control one. The present study shows that the memory alignment effect can arise without requiring conscious misaligned perspective taking. Moreover, the perceived misaligned perspective must share the same spatial content as the memorized spatial representation in order to induce an alignment effect. PMID:26233526

  9. Memories of AB

    NASA Astrophysics Data System (ADS)

    Vaks, V. G.

    2013-06-01

    I had the good fortune to be a student of A. B. Migdal - AB, as we called him in person or in his absence - and to work in the sector he headed at the Kurchatov Institute, along with his other students and my friends, including Vitya Galitsky, Spartak Belyayev and Tolya Larkin. I was especially close with AB in the second half of the 1950s, the years most important for my formation, and AB's contribution to this formation was very great. To this day, I've often quoted AB on various occasions, as it's hard to put things better or more precisely than he did; I tell friends stories heard from AB, because these stories enhance life as AB himself enhanced it; my daughter is named Tanya after AB's wife Tatyana Lvovna, and so on. In what follows, I'll recount a few episodes in my life in which AB played an important or decisive role, and then will share some other memories of AB...

  10. SHARED TECHNOLOGY TRANSFER PROGRAM

    SciTech Connect

    GRIFFIN, JOHN M.; HAUT, RICHARD C.

    2008-03-07

    The program established a collaborative process with domestic industries for the purpose of sharing Navy-developed technology. Private sector businesses were educated so as to increase their awareness of the vast amount of technologies that are available, with an initial focus on technology applications that are related to the Hydrogen, Fuel Cells and Infrastructure Technologies (Hydrogen) Program of the U.S. Department of Energy. Specifically, the project worked to increase industry awareness of the vast technology resources available to them that have been developed with taxpayer funding. NAVSEA-Carderock and the Houston Advanced Research Center teamed with Nicholls State University to catalog NAVSEA-Carderock unclassified technologies, rated the level of readiness of the technologies and established a web based catalog of the technologies. In particular, the catalog contains technology descriptions, including testing summaries and overviews of related presentations.

  11. Sharing a disparate landscape

    NASA Astrophysics Data System (ADS)

    Ali-Khan, Carolyne

    2010-06-01

    Working across boundaries of power, identity, and political geography is fraught with difficulties and contradictions. In Tali Tal and Iris Alkaher's "Collaborative environmental projects in a multicultural society: Working from within separate or mutual landscapes?" the authors describe their efforts to do this in the highly charged atmosphere of Israel. This forum article offers a response to their efforts. Writing from a framework of critical pedagogy, I use the concepts of space and time to anchor my analysis, as I examine the issue of power in this Jew/Arab collaborative environmental project. This response problematizes "sharing" in a landscape fraught with disparities. It also looks to further Tal and Alkaher's work by geographically and politically grounding it in the broader current conflict and by juxtaposing sustainability with equity.

  12. Ageing-related stereotypes in memory: When the beliefs come true.

    PubMed

    Bouazzaoui, Badiâa; Follenfant, Alice; Ric, François; Fay, Séverine; Croizet, Jean-Claude; Atzeni, Thierry; Taconnat, Laurence

    2016-05-01

    Age-related stereotypes concern culturally shared beliefs about the inevitable decline of memory with age. In this study, stereotype priming and stereotype threat manipulations were used to explore the impact of age-related stereotypes on metamemory beliefs and episodic memory performance. Ninety-two older participants who reported the same perceived memory functioning were divided into two groups: a threatened group and a non-threatened group (control). First, the threatened group was primed with an ageing stereotype questionnaire. Then, both groups were administered memory complaints and memory self-efficacy questionnaires to measure metamemory beliefs. Finally, both groups were administered the Logical Memory task to measure episodic memory; for the threatened group, the instructions were manipulated to enhance the stereotype threat. Results indicated that the threatened individuals reported more memory complaints and less memory efficacy, and had lower scores than the control group on the logical memory task. A multiple mediation analysis revealed that the stereotype threat effect on the episodic memory performance was mediated by both memory complaints and memory self-efficacy. This study revealed that stereotype threat impacts belief in one's own memory functioning, which in turn impairs episodic memory performance. PMID:26057336

  13. A Factor Analytic Approach to the Validation of the Word Memory Test and Test of Memory Malingering as Measures of Effort and Not Memory.

    PubMed

    Heyanka, Daniel J; Thaler, Nicholas S; Linck, John F; Pastorek, Nicholas J; Miller, Brian; Romesser, Jennifer; Sim, Anita H

    2015-08-01

    Research has demonstrated the utility of performance validity tests (PVTs) as a method of determining adequate effort during a neuropsychological evaluation. Although some studies affirm that forced-choice PVTs measure effort rather than memory, doubts remain in the literature. The purpose of the current study was to evaluate the relationship between effort and memory variables in a mild traumatic brain injury (TBI) sample (n = 160) by separating memory and effort as distinct factors while statistically controlling for the shared covariance between the variables. A two-factor solution was extracted such that the five PVT variables loaded on Factor 1 and the four memory variables loaded on Factor 2. The pattern matrix, which controls for the covariance between variables, provided clear support of two highly distinct factors with minimal cross-loadings. Our findings support assertions that PVTs measure effort independent of memory in veterans with mild TBI. PMID:25964105

  14. Scaling Linear Algebra Kernels using Remote Memory Access

    SciTech Connect

    Krishnan, Manoj Kumar; Lewis, Robert R.; Vishnu, Abhinav

    2010-09-13

    This paper describes the scalability of linear algebra kernels based on a remote memory access approach. The current approach differs from other linear algebra algorithms by the explicit use of shared memory and remote memory access (RMA) communication rather than message passing. It is suitable for clusters and scalable shared memory systems. The experimental results on large scale systems (Linux-Infiniband cluster, Cray XT) demonstrate consistent performance advantages over the ScaLAPACK suite, the leading implementation of parallel linear algebra algorithms used today. For example, on a Cray XT4 for a matrix size of 102400, our RMA-based matrix multiplication achieved over 55 teraflops while ScaLAPACK’s pdgemm measured close to 42 teraflops on 10000 processes.
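
    To put the figures quoted above in perspective, the following back-of-the-envelope sketch derives the relative speedup, the per-process throughput and an approximate run time. It assumes the conventional 2 * n**3 floating-point operation count for a dense n x n matrix product, treats "over 55" and "close to 42" teraflops as exactly 55 and 42, and assumes both measurements used 10000 processes. None of the derived numbers appear in the abstract itself.

```python
# Figures quoted in the abstract; the 2 * N**3 operation count is an assumption.
N = 102_400
PROCESSES = 10_000
RMA_TFLOPS = 55.0
SCALAPACK_TFLOPS = 42.0

flops = 2 * N**3  # multiplications plus additions for a dense N x N product
speedup = RMA_TFLOPS / SCALAPACK_TFLOPS
per_process_gflops = RMA_TFLOPS * 1e3 / PROCESSES
rma_seconds = flops / (RMA_TFLOPS * 1e12)
scalapack_seconds = flops / (SCALAPACK_TFLOPS * 1e12)

print(f"speedup over ScaLAPACK: {speedup:.2f}x")
print(f"RMA throughput per process: {per_process_gflops:.1f} GFLOPS")
print(f"estimated run time: {rma_seconds:.0f} s (RMA), {scalapack_seconds:.0f} s (ScaLAPACK)")
```

    Under these assumptions the RMA version is roughly 1.3 times faster, at about 5.5 GFLOPS per process, and a single multiplication takes on the order of 39 seconds instead of 51.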

  15. The Effects of Task Structure on Time-sharing Efficiency and Resource Allocation Optimality

    NASA Technical Reports Server (NTRS)

    Tsang, P. S.; Wickens, C. D.

    1984-01-01

    A distinction was made between two aspects of time sharing performance: time sharing efficiency and attention allocation optimality. A secondary task technique was employed to evaluate the effects of the task structures of the component time shared tasks on both aspects of the time sharing performance. Five pairs of dual tasks differing in their structural configurations were investigated. The primary task was a visual/manual tracking task which requires spatial processing. The secondary task was either another tracking task or a verbal memory task with one of four different input/output configurations. Consistent with a common finding, time-sharing efficiency was observed to decrease with an increasing overlap of resources utilized by the time shared tasks. Research also tends to support the hypothesis that resource allocation is more optimal when the time shared tasks place heavy demands on common processing resources than when they utilize separate resources.

  16. A report on the Sisal language project

    SciTech Connect

    Feo, J.T.; Cann, D.C.; Oldehoeft, R.R.

    1990-12-01

    Sisal (Streams and Iterations in Single Assignment Language) is a general-purpose applicative language intended for use on both conventional and novel multiprocessor systems. In this report the authors discuss the project's objectives, philosophy, and accomplishments and state their future plans. Four significant results of the Sisal project are compilation techniques for high-performance parallel applicative computation, a microtasking environment that supports dataflow on conventional shared-memory architectures, execution times comparable to those of Fortran, and cost-effective speed-up on shared-memory multiprocessors.

  17. Computer memory access technique

    NASA Technical Reports Server (NTRS)

    Zottarelli, L. J.

    1967-01-01

    Computer memory access commutator and steering gate configuration produces bipolar current pulses while still employing only the diodes and magnetic cores of the classic commutator, thereby appreciably reducing the complexity of the memory assembly.

  18. Understanding Memory Loss

    MedlinePlus

    ... memory problems—causes and treatments Help for serious memory problems What you need to know Where can I get more information? Words to know ... of Health U.S. Department of Health & Human Services USA.gov

  19. Optimizing Memory Transactions for Multicore Systems

    NASA Astrophysics Data System (ADS)

    Adl-Tabatabai, Ali-Reza; Kozyrakis, Christos; Saha, Bratin

    The shift to multicore architectures will require new programming technologies that enable mainstream developers to write parallel programs that can safely take advantage of the parallelism offered by multicore processors. One challenging aspect of shared memory parallel programming is synchronization. Programmers have traditionally used locks for synchronization, but lock-based synchronization has well-known pitfalls that make it hard to use for building thread-safe and scalable software components. Memory transactions have emerged as a promising alternative to lock-based synchronization because they promise to eliminate many of the problems associated with locks. Transactional programming constructs, however, have overheads and require optimizations to make them practical. Transactions can also benefit significantly from hardware support, and multicore processors with their large transistor budgets and on-chip memory hierarchies have the opportunity to provide this support.
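
    The contrast drawn above between lock-based synchronization and memory transactions can be illustrated with a minimal optimistic-transaction sketch. Everything in it (`TVar`, `atomically`, `transfer`) is a hypothetical name introduced for illustration, not an API from the work described. The sketch also uses one internal lock to validate and commit, which is a simplification; it shows the programming model (read, compute, validate, commit or retry), not the optimized or hardware-assisted implementations the abstract refers to.

```python
import threading


class TVar:
    """A shared cell with a version counter (hypothetical name)."""

    def __init__(self, value):
        self.value = value
        self.version = 0


_commit_lock = threading.Lock()  # internal to the sketch, hidden from callers


def atomically(update, *tvars):
    """Run update(*values) optimistically; commit only if no cell changed."""
    while True:
        # Read phase: take a consistent snapshot of values and versions.
        with _commit_lock:
            snapshot = [(t.value, t.version) for t in tvars]
        new_values = update(*(value for value, _ in snapshot))
        # Commit phase: validate the versions, then publish all writes together.
        with _commit_lock:
            if all(t.version == version for t, (_, version) in zip(tvars, snapshot)):
                for t, new in zip(tvars, new_values):
                    t.value = new
                    t.version += 1
                return new_values
        # Another transaction committed first: retry with fresh values.


def transfer(source, target, amount):
    """Move amount between two cells as a single transaction."""
    atomically(lambda s, t: (s - amount, t + amount), source, target)


a, b = TVar(100), TVar(0)
threads = [threading.Thread(target=transfer, args=(a, b, 1)) for _ in range(50)]
for th in threads:
    th.start()
for th in threads:
    th.join()
print(a.value, b.value)
```

    The caller of `transfer` never names a lock or worries about lock ordering; conflicting updates are detected at commit time and retried. The retry loop is also a visible source of the overhead that the abstract says needs optimization.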

  20. The science of sharing and the sharing of science

    PubMed Central

    Milkman, Katherine L.; Berger, Jonah

    2014-01-01

    Why do members of the public share some scientific findings and not others? What can scientists do to increase the chances that their findings will be shared widely among nonscientists? To address these questions, we integrate past research on the psychological drivers of interpersonal communication with a study examining the sharing of hundreds of recent scientific discoveries. Our findings offer insights into (i) how attributes of a discovery and the way it is described impact sharing, (ii) who generates discoveries that are likely to be shared, and (iii) which types of people are most likely to share scientific discoveries. The results described here, combined with a review of recent research on interpersonal communication, suggest how scientists can frame their work to increase its dissemination. They also provide insights about which audiences may be the best targets for the diffusion of scientific content. PMID:25225360

  1. The science of sharing and the sharing of science.

    PubMed

    Milkman, Katherine L; Berger, Jonah

    2014-09-16

    Why do members of the public share some scientific findings and not others? What can scientists do to increase the chances that their findings will be shared widely among nonscientists? To address these questions, we integrate past research on the psychological drivers of interpersonal communication with a study examining the sharing of hundreds of recent scientific discoveries. Our findings offer insights into (i) how attributes of a discovery and the way it is described impact sharing, (ii) who generates discoveries that are likely to be shared, and (iii) which types of people are most likely to share scientific discoveries. The results described here, combined with a review of recent research on interpersonal communication, suggest how scientists can frame their work to increase its dissemination. They also provide insights about which audiences may be the best targets for the diffusion of scientific content. PMID:25225360

  2. Emotional Memory Persists Longer than Event Memory

    ERIC Educational Resources Information Center

    Kuriyama, Kenichi; Soshi, Takahiro; Fujii, Takeshi; Kim, Yoshiharu

    2010-01-01

    The interaction between amygdala-driven and hippocampus-driven activities is expected to explain why emotion enhances episodic memory recognition. However, overwhelming behavioral evidence regarding the emotion-induced enhancement of immediate and delayed episodic memory recognition has not been obtained in humans. We found that the recognition…

  3. Data sharing in neuroimaging research

    PubMed Central

    Poline, Jean-Baptiste; Breeze, Janis L.; Ghosh, Satrajit; Gorgolewski, Krzysztof; Halchenko, Yaroslav O.; Hanke, Michael; Haselgrove, Christian; Helmer, Karl G.; Keator, David B.; Marcus, Daniel S.; Poldrack, Russell A.; Schwartz, Yannick; Ashburner, John; Kennedy, David N.

    2012-01-01

    Significant resources around the world have been invested in neuroimaging studies of brain function and disease. Easier access to this large body of work should have profound impact on research in cognitive neuroscience and psychiatry, leading to advances in the diagnosis and treatment of psychiatric and neurological disease. A trend toward increased sharing of neuroimaging data has emerged in recent years. Nevertheless, a number of barriers continue to impede momentum. Many researchers and institutions remain uncertain about how to share data or lack the tools and expertise to participate in data sharing. The use of electronic data capture (EDC) methods for neuroimaging greatly simplifies the task of data collection and has the potential to help standardize many aspects of data sharing. We review here the motivations for sharing neuroimaging data, the current data sharing landscape, and the sociological or technical barriers that still need to be addressed. The INCF Task Force on Neuroimaging Datasharing, in conjunction with several collaborative groups around the world, has started work on several tools to ease and eventually automate the practice of data sharing. It is hoped that such tools will allow researchers to easily share raw, processed, and derived neuroimaging data, with appropriate metadata and provenance records, and will improve the reproducibility of neuroimaging studies. By providing seamless integration of data sharing and analysis tools within a commodity research environment, the Task Force seeks to identify and minimize barriers to data sharing in the field of neuroimaging. PMID:22493576

  4. Make-believe memories.

    PubMed

    Loftus, Elizabeth F

    2003-11-01

    Research on memory distortion has shown that postevent suggestion can contaminate what a person remembers. Moreover, suggestion can lead to false memories being injected outright into the minds of people. These findings have implications for police investigation, clinical practice, and other settings in which memory reports are solicited. PMID:14609374

  5. Make-Believe Memories

    ERIC Educational Resources Information Center

    Loftus, Elizabeth F.

    2003-01-01

    Research on memory distortion has shown that postevent suggestion can contaminate what a person remembers. Moreover, suggestion can lead to false memories being injected outright into the minds of people. These findings have implications for police investigation, clinical practice, and other settings in which memory reports are solicited.

  6. Attending to auditory memory.

    PubMed

    Zimmermann, Jacqueline F; Moscovitch, Morris; Alain, Claude

    2016-06-01

    Attention to memory describes the process of attending to memory traces when the object is no longer present. It has been studied primarily for representations of visual stimuli with only a few studies examining attention to sound object representations in short-term memory. Here, we review the interplay of attention and auditory memory with an emphasis on 1) attending to auditory memory in the absence of related external stimuli (i.e., reflective attention) and 2) effects of existing memory on guiding attention. Attention to auditory memory is discussed in the context of change deafness, and we argue that failures to detect changes in our auditory environments are most likely the result of a faulty comparison system of incoming and stored information. Also, objects are the primary building blocks of auditory attention, but attention can also be directed to individual features (e.g., pitch). We review short-term and long-term memory guided modulation of attention based on characteristic features, location, and/or semantic properties of auditory objects, and propose that auditory attention-to-memory pathways emerge after sensory memory. A neural model for auditory attention to memory is developed, which comprises two separate pathways in the parietal cortex, one involved in attention to higher-order features and the other involved in attention to sensory information. This article is part of a Special Issue entitled SI: Auditory working memory. PMID:26638836

  7. Music, memory and emotion.

    PubMed

    Jäncke, Lutz

    2008-01-01

    Because emotions enhance memory processes and music evokes strong emotions, music could be involved in forming memories, either about pieces of music or about episodes and information associated with particular music. A recent study in BMC Neuroscience has given new insights into the role of emotion in musical memory. PMID:18710596

  8. Generation and Context Memory

    ERIC Educational Resources Information Center

    Mulligan, Neil W.; Lozito, Jeffrey P.; Rosner, Zachary A.

    2006-01-01

    Generation enhances memory for occurrence but may not enhance other aspects of memory. The present study further delineates the negative generation effect in context memory reported in N. W. Mulligan (2004). First, the negative generation effect occurred for perceptual attributes of the target item (its color and font) but not for extratarget…

  9. Memory and the Self

    ERIC Educational Resources Information Center

    Conway, Martin A.

    2005-01-01

    The Self-Memory System (SMS) is a conceptual framework that emphasizes the interconnectedness of self and memory. Within this framework memory is viewed as the data base of the self. The self is conceived as a complex set of active goals and associated self-images, collectively referred to as the "working self." The relationship between the…

  10. The Bush Memorial Library.

    ERIC Educational Resources Information Center

    Hamline University Bulletin, 1971

    1971-01-01

    The Bush Memorial Library was formally dedicated on October 9, 1971. As part of Hamline University in St. Paul, Minnesota, the Bush Memorial Library has a reading room, audio booths, and audio-visual classroom as well as an audio control room. The Bush Memorial Library is a member of the Cooperating Libraries in Consortium which is a cooperative…

  11. Associative Memory Acceptors.

    ERIC Educational Resources Information Center

    Card, Roger

    The properties of an associative memory are examined in this paper from the viewpoint of automata theory. A device called an associative memory acceptor is studied under real-time operation. The family "L" of languages accepted by real-time associative memory acceptors is shown to properly contain the family of languages accepted by one-tape,…

  12. Contexts as Shared Commitments

    PubMed Central

    García-Carpintero, Manuel

    2015-01-01

    Contemporary semantics assumes two influential notions of context: one coming from Kaplan (1989), on which contexts are sets of predetermined parameters, and another originating in Stalnaker (1978), on which contexts are sets of propositions that are “common ground.” The latter is deservedly more popular, given its flexibility in accounting for context-dependent aspects of language beyond manifest indexicals, such as epistemic modals, predicates of taste, and so on and so forth; in fact, properly dealing with demonstratives (perhaps ultimately all indexicals) requires that further flexibility. Even if we acknowledge Lewis (1980)'s point that, in a sense, Kaplanian contexts already include common ground contexts, it is better to be clear and explicit about what contexts constitutively are. Now, Stalnaker (1978, 2002, 2014) defines context-as-common-ground as a set of propositions, but recent work shows that this is not an accurate conception. The paper explains why, and provides an alternative. The main reason is that several phenomena (presuppositional treatments of pejoratives and predicates of taste, forces other than assertion) require that the common ground includes non-doxastic attitudes such as appraisals, emotions, etc. Hence the common ground should not be taken to include merely contents (propositions), but those together with attitudes concerning them: shared commitments, as I will defend. PMID:26733087

  13. Contexts as Shared Commitments.

    PubMed

    García-Carpintero, Manuel

    2015-01-01

    Contemporary semantics assumes two influential notions of context: one coming from Kaplan (1989), on which contexts are sets of predetermined parameters, and another originating in Stalnaker (1978), on which contexts are sets of propositions that are "common ground." The latter is deservedly more popular, given its flexibility in accounting for context-dependent aspects of language beyond manifest indexicals, such as epistemic modals, predicates of taste, and so on and so forth; in fact, properly dealing with demonstratives (perhaps ultimately all indexicals) requires that further flexibility. Even if we acknowledge Lewis (1980)'s point that, in a sense, Kaplanian contexts already include common ground contexts, it is better to be clear and explicit about what contexts constitutively are. Now, Stalnaker (1978, 2002, 2014) defines context-as-common-ground as a set of propositions, but recent work shows that this is not an accurate conception. The paper explains why, and provides an alternative. The main reason is that several phenomena (presuppositional treatments of pejoratives and predicates of taste, forces other than assertion) require that the common ground includes non-doxastic attitudes such as appraisals, emotions, etc. Hence the common ground should not be taken to include merely contents (propositions), but those together with attitudes concerning them: shared commitments, as I will defend. PMID:26733087

  14. Cooperative Data Sharing: Simple Support for Clusters of SMP Nodes

    NASA Technical Reports Server (NTRS)

    DiNucci, David C.; Balley, David H. (Technical Monitor)

    1997-01-01

    Libraries like PVM and MPI send typed messages to allow for heterogeneous cluster computing. Lower-level libraries, such as GAM, provide more efficient access to communication by removing the need to copy messages between the interface and user space in some cases. Still lower-level interfaces, such as UNET, get right down to the hardware level to provide maximum performance. However, these are all still interfaces for passing messages from one process to another, and have limited utility in a shared-memory environment, due primarily to the fact that message passing is just another term for copying. This drawback is made more pertinent by today's hybrid architectures (e.g. clusters of SMPs), where it is difficult to know beforehand whether two communicating processes will share memory. As a result, even portable language tools (like HPF compilers) must either map all interprocess communication into message passing, with the accompanying performance degradation in shared memory environments, or they must check each communication at run-time and implement the shared-memory case separately for efficiency. Cooperative Data Sharing (CDS) is a single user-level API which abstracts all communication between processes into the sharing and access coordination of memory regions, in a model which might be described as "distributed shared messages" or "large-grain distributed shared memory". As a result, the user programs to a simple latency-tolerant abstract communication specification which can be mapped efficiently to either a shared-memory or message-passing based run-time system, depending upon the available architecture. Unlike some distributed shared memory interfaces, the user still has complete control over the assignment of data to processors, the forwarding of data to its next likely destination, and the queuing of data until it is needed, so even the relatively high latency present in clusters can be accommodated. CDS does not require special use of an MMU, which

  15. Self-defining memories, scripts, and the life story: narrative identity in personality and psychotherapy.

    PubMed

    Singer, Jefferson A; Blagov, Pavel; Berry, Meredith; Oost, Kathryn M

    2013-12-01

    An integrative model of narrative identity builds on a dual memory system that draws on episodic memory and a long-term self to generate autobiographical memories. Autobiographical memories related to critical goals in a lifetime period lead to life-story memories, which in turn become self-defining memories when linked to an individual's enduring concerns. Self-defining memories that share repetitive emotion-outcome sequences yield narrative scripts, abstracted templates that filter cognitive-affective processing. The life story is the individual's overarching narrative that provides unity and purpose over the life course. Healthy narrative identity combines memory specificity with adaptive meaning-making to achieve insight and well-being, as demonstrated through a literature review of personality and clinical research, as well as new findings from our own research program. A clinical case study drawing on this narrative identity model is also presented with implications for treatment and research. PMID:22925032

  16. Memory: sins and virtues

    PubMed Central

    Schacter, Daniel L.

    2013-01-01

    Memory plays an important role in everyday life but does not provide an exact and unchanging record of experience: research has documented that memory is a constructive process that is subject to a variety of errors and distortions. Yet these memory “sins” also reflect the operation of adaptive aspects of memory. Memory can thus be characterized as an adaptive constructive process, which plays a functional role in cognition but produces distortions, errors, or illusions as a consequence of doing so. PMID:23909686

  17. A multiplexed quantum memory.

    PubMed

    Lan, S-Y; Radnaev, A G; Collins, O A; Matsukevich, D N; Kennedy, T A; Kuzmich, A

    2009-08-01

    A quantum repeater is a system for long-distance quantum communication that employs quantum memory elements to mitigate optical fiber transmission losses. The multiplexed quantum memory (O. A. Collins, S. D. Jenkins, A. Kuzmich, and T. A. B. Kennedy, Phys. Rev. Lett. 98, 060502 (2007)) has been shown theoretically to reduce quantum memory time requirements. We present an initial implementation of a multiplexed quantum memory element in a cold rubidium gas. We show that it is possible to create atomic excitations in arbitrary memory element pairs and demonstrate the violation of Bell's inequality for light fields generated during the write and read processes. PMID:19654771

  18. Information Sharing among Untrustworthy Entities

    NASA Astrophysics Data System (ADS)

    Tamura, Shinsuke; Yanase, Tatsuro

    Most current technologies that enable secure information sharing assume that the entities that share information are mutually trustworthy. However, in recent applications this assumption is not realistic. As applications become sophisticated, information systems are required to share information securely even among untrustworthy entities. This paper discusses two kinds of problems about information sharing among untrustworthy entities, i.e. secure statistical data gathering and anonymous authentication, and proposes their solutions. The former is the problem of calculating statistics while ensuring that raw data are not disclosed to any entity, including the ones that calculate the statistics, and the latter is the problem of authenticating entities while keeping their identities confidential.
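
    The abstract above does not say how the statistics are computed without disclosing raw data. As one well-known illustration of the goal (not the authors' scheme), the sketch below uses additive secret sharing to compute a sum. It assumes parties that follow the protocol and do not collude, which is a weaker threat model than fully untrustworthy entities. `MODULUS`, `split_into_shares` and `secure_sum` are hypothetical names, and the input values are made up for the example.

```python
import secrets

MODULUS = 2**61 - 1  # assumed public modulus, larger than any possible total


def split_into_shares(value: int, parties: int) -> list[int]:
    """Split value into random shares that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values: list[int]) -> int:
    """Simulate the protocol in one process: nobody sees another's raw value."""
    parties = len(private_values)
    # Row i holds the shares produced by party i; column j is sent to party j.
    share_matrix = [split_into_shares(v, parties) for v in private_values]
    # Each party adds the shares it received and publishes only that subtotal.
    subtotals = [
        sum(row[j] for row in share_matrix) % MODULUS for j in range(parties)
    ]
    return sum(subtotals) % MODULUS


values = [52_000, 61_500, 47_250]  # illustrative private inputs
total = secure_sum(values)
print(total, total / len(values))
```

    Each individual share is uniformly random, so a single party learns nothing about another party's input from the shares it receives; only the total (and hence the mean) becomes public.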

  19. Towards memory-aware services and browsing through lifelogging sensing.

    PubMed

    Arcega, Lorena; Font, Jaime; Cetina, Carlos

    2013-01-01

    Every day we receive lots of information through our senses that is lost forever, because it lacked the strength or the repetition needed to generate a lasting memory. Combining the emerging Internet of Things and lifelogging sensors, we believe it is possible to build up a Digital Memory (Dig-Mem) in order to complement the fallible memory of people. This work shows how to realize the Dig-Mem in terms of interactions, affinities, activities, goals and protocols. We also complement this Dig-Mem with memory-aware services and a Dig-Mem browser. Furthermore, we propose an RFID Tag-Sharing technique to speed up the adoption of Dig-Mem. Experimentation reveals an improvement in users' understanding of Dig-Mem as time passes, compared to natural memories where the level of detail decreases over time. PMID:24196436

  20. Parallel garbage collection on a virtual memory system

    SciTech Connect

    Abraham, S.G.; Patel, J.H.

    1987-01-01

    Since most artificial intelligence applications are programmed in list processing languages, it is important to design architectures to support efficient garbage collection. This paper presents an architecture and an associated algorithm for parallel garbage collection on a virtual memory system. All the previously proposed parallel algorithms attempt to collect cells released by the list processor during the garbage collection cycle. We do not attempt to collect such cells. As a consequence, the list processor incurs little overhead in the proposed scheme, since it need not synchronize with the collector. Most parallel algorithms are designed for shared memory machines which have certain implicit synchronization functions on variable access. The proposed algorithm is designed for virtual memory systems where both the list processor and the garbage collector have private memories. The enforcement of coherence between the two private memories can be expensive and is not necessary in our scheme. 15 refs., 3 figs.
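
    The idea of not collecting cells released during the collection cycle can be illustrated with a small sketch. It is an assumption-laden reading of the abstract, not the authors' architecture: the collector marks from its own private copy of the heap, and the function names and the dictionary heap model (cell id mapped to a list of child ids) are hypothetical. The sketch relies on one fact: a cell unreachable when the copy was taken cannot become reachable later, so it can be freed without synchronizing with the list processor.

```python
def reachable(heap, roots):
    """Return the ids of all cells reachable from roots.

    heap maps a cell id to the list of cell ids it points to.
    """
    marked = set()
    stack = list(roots)
    while stack:
        cell = stack.pop()
        if cell in marked or cell not in heap:
            continue
        marked.add(cell)
        stack.extend(heap[cell])
    return marked


def collect_from_snapshot(heap_snapshot, roots_snapshot):
    """Return cells that were already garbage when the snapshot was taken.

    Cells released after the snapshot are deliberately ignored; they are
    left for a later collection cycle.
    """
    return set(heap_snapshot) - reachable(heap_snapshot, roots_snapshot)
```

    Because the collector only reads its private copy, anything the list processor releases after the copy was taken is simply picked up in a later cycle.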