Science.gov

Sample records for shared memory multiprocessors

  1. Shared versus distributed memory multiprocessors

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.

    1991-01-01

    The question of whether multiprocessors should have shared or distributed memory has attracted a great deal of attention. Some researchers argue strongly for building distributed memory machines, while others argue just as strongly for programming shared memory multiprocessors. A great deal of research is underway on both types of parallel systems. Special emphasis is placed on systems with a very large number of processors for computation-intensive tasks, and research and implementation trends are considered. It appears that the two types of systems will likely converge to a common form for large scale multiprocessors.

  2. Dynamic Program Phase Detection in Distributed Shared-Memory Multiprocessors

    SciTech Connect

    Ipek, E; Martinez, J F; de Supinski, B R; McKee, S A; Schulz, M

    2006-03-06

    We present a novel hardware mechanism for dynamic program phase detection in distributed shared-memory (DSM) multiprocessors. We show that successful hardware mechanisms for phase detection in uniprocessors do not necessarily work well in DSM systems, since they lack the ability to incorporate the parallel application's global execution information and memory access behavior based on data distribution. We then propose a hardware extension to a well-known uniprocessor mechanism that significantly improves phase detection in the context of DSM multiprocessors. The resulting mechanism is modest in size and complexity, and is transparent to the parallel application.

  3. Cache-based error recovery for shared memory multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Wu, Kun-Lung; Fuchs, W. Kent; Patel, Janak H.

    1989-01-01

    A multiprocessor cache-based checkpointing and recovery scheme for recovering from transient processor errors in a shared-memory multiprocessor with private caches is presented. New implementation techniques that use checkpoint identifiers and recovery stacks to reduce performance degradation in processor utilization during normal execution are examined. This cache-based checkpointing technique prevents rollback propagation, provides for rapid recovery, and can be integrated into standard cache coherence protocols. An analytical model is used to estimate the relative performance of the scheme during normal execution. Extensions that take error latency into account are presented.

  4. Efficient ICCG on a shared memory multiprocessor

    NASA Technical Reports Server (NTRS)

    Hammond, Steven W.; Schreiber, Robert

    1989-01-01

    Different approaches are discussed for exploiting parallelism in the ICCG (Incomplete Cholesky Conjugate Gradient) method for solving large sparse symmetric positive definite systems of equations on a shared memory parallel computer. Techniques for efficiently solving triangular systems and computing sparse matrix-vector products are explored. Three methods for scheduling the tasks in solving triangular systems are implemented on the Sequent Balance 21000. Sample problems that are representative of a large class of problems solved using iterative methods are used. We show that a static analysis to determine data dependences in the triangular solve can greatly improve its parallel efficiency. We also show that ignoring symmetry and storing the whole matrix can reduce solution time substantially.
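
    The static dependence analysis mentioned in the abstract above can be illustrated with level scheduling, a common technique for parallel sparse triangular solves. The abstract does not say which three scheduling methods were implemented, so the following is only a minimal sketch under assumptions: the `deps` input format and the function names `compute_levels` and `group_by_level` are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: level scheduling for a sparse lower-triangular solve.
# The dependency format and function names are hypothetical, not from the paper.

def compute_levels(deps):
    """deps[i] lists the rows j < i whose solution x[j] row i needs.

    Returns level[i]; rows that share a level do not depend on each
    other and could be solved concurrently.
    """
    level = [0] * len(deps)
    for i, row_deps in enumerate(deps):
        # Row i must wait for the deepest row it depends on.
        level[i] = 1 + max((level[j] for j in row_deps), default=-1)
    return level


def group_by_level(level):
    """Collect row indices into per-level batches, in execution order."""
    batches = [[] for _ in range(max(level, default=-1) + 1)]
    for i, lv in enumerate(level):
        batches[lv].append(i)
    return batches


# Off-diagonal nonzero pattern of a small 5x5 lower-triangular matrix.
deps = [[], [], [0], [0, 1], [2]]
levels = compute_levels(deps)
batches = group_by_level(levels)
```

    Each batch can be handed to the processors as a group, with one synchronization point between consecutive batches; the analysis runs once and can be reused across iterations as long as the sparsity pattern does not change.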

  5. Multiprocessor system on chip with shared memory using crossbar topology

    NASA Astrophysics Data System (ADS)

    de Macedo Mourelle, Luiza; Nedjah, Nadia; Gonçalves Pessanha, Fábio

    2015-01-01

    Multiprocessor system on chip offers a set of processors, embedded in one single chip. A parallel application can, then, be scheduled to each processor, in order to accelerate its execution, using either shared memory or message passing for exchanging data. In this case, the use of a shared bus is no longer a viable solution, due to its high contention. In order to allow for non-blocking parallelism, we implemented the interconnection network based on the crossbar topology. In this kind of interconnection, processors have full access to their own memory module simultaneously, besides being able to address the whole memory. One processor accesses the memory module of another processor only when it needs to retrieve data generated by the latter. This paper presents the specification and modelling of an interconnection network based on the crossbar topology. The aim of this work is to investigate the performance characteristics of a parallel application running on this platform.

  6. Performance and scalability aspects of directory-based cache coherence in shared-memory multiprocessors

    SciTech Connect

    Picano, S.; Meyer, D.G.; Brooks, E.D. III; Hoag, J.E.

    1993-05-01

    We present a study that accentuates the performance and scalability aspects of directory-based cache coherence in multiprocessor systems. Using a multiprocessor with a software-based coherence scheme, efficient implementations rely heavily on the programmer's ability to explicitly manage the memory system, which is typically handled by hardware support on other bus-based, shared memory multiprocessors. We describe a scalable, shared memory, cache coherent multiprocessor and present simulation results obtained on three parallel programs. This multiprocessor configuration exhibits high performance at no additional parallel programming cost.

  7. A robot arm simulation with a shared memory multiprocessor machine

    NASA Technical Reports Server (NTRS)

    Kim, Sung-Soo; Chuang, Li-Ping

    1989-01-01

    A parallel processing scheme for a single chain robot arm is presented for high speed computation on a shared memory multiprocessor. A recursive formulation that is derived from a virtual work form of the d'Alembert equations of motion is utilized for robot arm dynamics. A joint drive system that consists of a motor rotor and gears is included in the arm dynamics model, in order to take into account gyroscopic effects due to the spinning of the rotor. The fine grain parallelism of mechanical and control subsystem models is exploited, based on independent computation associated with bodies, joint drive systems, and controllers. Efficiency and effectiveness of the parallel scheme are demonstrated through simulations of a telerobotic manipulator arm. Two different mechanical subsystem models, i.e., with and without gyroscopic effects, are compared, to show the trade-off between efficiency and accuracy.

  8. Dynamic programming on a shared-memory multiprocessor

    NASA Technical Reports Server (NTRS)

    Edmonds, Phil; Chu, Eleanor; George, Alan

    1993-01-01

    Three new algorithms for solving dynamic programming problems on a shared-memory parallel computer are described. All three algorithms attempt to balance work load, while keeping synchronization cost low. In particular, for a multiprocessor having p processors, an analysis of the best algorithm shows that the arithmetic cost is O(n^3/6p) and that the synchronization cost is O(|log_C n|) if p << n, where C = (2p-1)/(2p + 1) and n is the size of the problem. The low synchronization cost is important for machines where synchronization is expensive. Analysis and experiments show that the best algorithm is effective in balancing the work load and producing high efficiency.
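
    To get a feel for the two cost expressions quoted in the abstract above, the short sketch below simply evaluates them for given n and p. It is not the authors' algorithm, and the asymptotic bounds hide constant factors; the function names are hypothetical, and the arithmetic expression is read here as n cubed divided by (6 times p), which is an assumption about the abstract's notation.

```python
import math


def arithmetic_term(n, p):
    """Evaluate n^3 / (6p), the expression inside the arithmetic-cost bound."""
    return n**3 / (6 * p)


def sync_term(n, p):
    """Evaluate |log_C n| with C = (2p - 1) / (2p + 1).

    C lies strictly between 0 and 1 for p >= 1, so log_C n is negative
    for n > 1; the absolute value makes the term positive.
    """
    c = (2 * p - 1) / (2 * p + 1)
    return abs(math.log(n) / math.log(c))


# Example: tabulate both terms for a fixed problem size.
for p in (1, 2, 4, 8):
    print(p, arithmetic_term(1000, p), sync_term(1000, p))
```

    For large p, ln C is approximately -1/p, so the synchronization term behaves roughly like p ln n, while the arithmetic term shrinks in proportion to 1/p.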

  9. Access ordering and coherence in shared-memory multi-processors

    SciTech Connect

    Scheurich, C.E.

    1989-01-01

    Shared memory forms a convenient communication medium in a multitasking multiprocessor system. However, different multiprocessors can execute the same program in different manners, possibly yielding incorrect results because the machines adhere to different rules. Differences in behavior are due to the varying approaches of designers to attack the shared memory access latency problem in multiprocessors. In particular, the manner in which multiple copies of data are controlled and the manner in which memory accesses are sequenced, propagated, and buffered have an impact on the behavior of the multiprocessor. Three shared memory execution models, referred to as concurrency models, are defined. The precise properties of processors, memories, and interconnection networks are derived to comply with each of the concurrency models. The usefulness of these concurrency models is demonstrated by showing the simplicity with which their rules can be applied to allow buffering of memory accesses, implement combining networks, prove cache coherence protocols correct, and design lockup-free caches. Specific examples are provided, both of a cache-based multiprocessor potentially without bottlenecks and of a cache-based multiprocessor employing lockup-free caches which can continue to service the processor while concurrently servicing one of several access misses. The paradigms and associated conditions presented in this thesis form a set of powerful tools allowing multiprocessor designers to concentrate on functionality while being burdened less with side-effect analysis.

  10. Error recovery in shared memory multiprocessors using private caches

    NASA Technical Reports Server (NTRS)

    Wu, Kun-Lung; Fuchs, W. Kent; Patel, Janak H.

    1990-01-01

    The problem of recovering from processor transient faults in shared memory multiprocessor systems is examined. A user-transparent checkpointing and recovery scheme using private caches is presented. Processes can recover from errors due to faulty processors by restarting from the checkpointed computation state. Implementation techniques using checkpoint identifiers and recovery stacks are examined as a means of reducing performance degradation in processor utilization during normal execution. This cache-based checkpointing technique prevents rollback propagation, provides rapid recovery, and can be integrated into standard cache coherence protocols. An analytical model is used to estimate the relative performance of the scheme during normal execution. Extensions to take error latency into account are presented.

  11. Multiprocessor memory contention

    SciTech Connect

    Knadler, C.E. Jr.

    1989-01-01

    Caches are frequently incorporated in processor architectures to increase the effective memory speed and to reduce memory contention. However, task switches and the coherency problems of large n-way, mainframe-class multiprocessors lessen the effectiveness of cache architectures for general-purpose applications. A proposed alternative approach is to increase the effective memory bandwidth and decrease memory-access delays through instruction prefetch, operand buffering, highly interleaved memory, and multiple-word width processor-memory data paths. This approach was evaluated by comparing cache and noncache system performance, using discrete-event simulation. Since the performance of a multiprocessor architecture is a function of its operating environment as well as its design, the system workload was defined. General-purpose applications, running under multitasking operating systems, were characterized with respect to addressing patterns, paging rates, and frequency of input/output operations. The proposed noncache architecture was found to have performance comparable to that of the cache architectures and obviated the need to solve the cache coherency problem.

  12. Performance Modeling and Measurement of Parallelized Code for Distributed Shared Memory Multiprocessors

    NASA Technical Reports Server (NTRS)

    Waheed, Abdul; Yan, Jerry

    1998-01-01

    This paper presents a model to evaluate the performance and overhead of parallelizing sequential code using compiler directives for multiprocessing on distributed shared memory (DSM) systems. With the increasing popularity of shared address space architectures, it is essential to understand their performance impact on programs that benefit from shared memory multiprocessing. We present a simple model to characterize the performance of programs that are parallelized using compiler directives for shared memory multiprocessing. We parallelized the sequential implementation of NAS benchmarks using native Fortran77 compiler directives for an Origin2000, which is a DSM system based on a cache-coherent Non Uniform Memory Access (ccNUMA) architecture. We report measurement-based performance of these parallelized benchmarks from four perspectives: efficacy of the parallelization process; scalability; parallelization overhead; and comparison with hand-parallelized and -optimized versions of the same benchmarks. Our results indicate that sequential programs can conveniently be parallelized for DSM systems using compiler directives, but realizing performance gains as predicted by the performance model depends primarily on minimizing architecture-specific data locality overhead.

  13. Reader set encoding for directory of shared cache memory in multiprocessor system

    SciTech Connect

    Ahn, Daniel; Ceze, Luis H.; Gara, Alan; Ohmacht, Martin; Xiaotong, Zhuang

    2014-06-10

    In a parallel processing system with speculative execution, conflict checking occurs in a directory lookup of a cache memory that is shared by all processors. In each case, the same physical memory address will map to the same set of that cache, no matter which processor originated that access. The directory includes a dynamic reader set encoding, indicating what speculative threads have read a particular line. This reader set encoding is used in conflict checking. A bitset encoding is used to specify particular threads that have read the line.
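
    The bitset form of a reader set described above can be pictured with a small software model. This is only a sketch under assumptions, not the patented hardware: the class name `DirectoryLine`, its methods, and the rule that a write conflicts with every other thread that has already read the line are all hypothetical choices made for illustration.

```python
class DirectoryLine:
    """Toy model of one directory entry with a bitset reader encoding.

    Bit t of `readers` is set when speculative thread t has read the line.
    (Hypothetical model; names and conflict rule are assumptions.)
    """

    def __init__(self):
        self.readers = 0

    def record_read(self, thread_id):
        # Mark this thread as a reader of the line.
        self.readers |= 1 << thread_id

    def write_conflicts(self, writer_id):
        """Return the ids of all readers other than the writer itself."""
        others = self.readers & ~(1 << writer_id)
        return [t for t in range(others.bit_length()) if (others >> t) & 1]

    def clear_thread(self, thread_id):
        # Drop a thread from the reader set, e.g. after it commits or aborts.
        self.readers &= ~(1 << thread_id)


line = DirectoryLine()
line.record_read(0)
line.record_read(3)
conflicting = line.write_conflicts(3)  # thread 3 writes; thread 0 also read
```

    A bitset keeps the membership test and the conflict check down to a few bitwise operations, at the cost of one bit per thread per line.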

  14. Shared performance monitor in a multiprocessor system

    DOEpatents

    Chiu, George; Gara, Alan G; Salapura, Valentina

    2014-12-02

    A performance monitoring unit (PMU) and method for monitoring performance of events occurring in a multiprocessor system. The multiprocessor system comprises a plurality of processor devices units, each processor device for generating signals representing occurrences of events in the processor device, and, a single shared counter resource for performance monitoring. The performance monitor unit is shared by all processor cores in the multiprocessor system. The PMU is further programmed to monitor event signals issued from non-processor devices.

  15. Modeling and measuring multiprogramming and system overheads on a shared-memory multiprocessor - Case study

    NASA Technical Reports Server (NTRS)

    Dimpsey, Robert T.; Iyer, Ravi K.

    1991-01-01

    The present discussion of methods for quantifying multiprogramming (MP) overhead on a computer system illustrates two such techniques, respectively for quantifying MP overheads' lower bound and determining the MP overload of real workloads, in light of the percentage of parallel processing time that is consumed by MP overhead on Alliant multiprocessors. Kernel lock spinning is found to be a major factor in MP overhead, which accounts for more than half of total system overhead. It is noted that parallel environments' MP overhead is not statistically dependent on the number of parallel jobs undergoing multiprogramming.

  16. Shared performance monitor in a multiprocessor system

    DOEpatents

    Chiu, George; Gara, Alan G.; Salapura, Valentina

    2012-07-24

    A performance monitoring unit (PMU) and method for monitoring performance of events occurring in a multiprocessor system. The multiprocessor system comprises a plurality of processor devices units, each processor device for generating signals representing occurrences of events in the processor device, and, a single shared counter resource for performance monitoring. The performance monitor unit is shared by all processor cores in the multiprocessor system. The PMU comprises: a plurality of performance counters each for counting signals representing occurrences of events from one or more the plurality of processor units in the multiprocessor system; and, a plurality of input devices for receiving the event signals from one or more processor devices of the plurality of processor units, the plurality of input devices programmable to select event signals for receipt by one or more of the plurality of performance counters for counting, wherein the PMU is shared between multiple processing units, or within a group of processors in the multiprocessing system. The PMU is further programmed to monitor event signals issued from non-processor devices.

  17. A general model for memory interference in a multiprocessor system with memory hierarchy

    NASA Technical Reports Server (NTRS)

    Taha, Badie A.; Standley, Hilda M.

    1989-01-01

    The problem of memory interference in a multiprocessor system with a hierarchy of shared buses and memories is addressed. The behavior of the processors is represented by a sequence of memory requests with each followed by a determined amount of processing time. A statistical queuing network model for determining the extent of memory interference in multiprocessor systems with clusters of memory hierarchies is presented. The performance of the system is measured by the expected number of busy memory clusters. The results of the analytic model are compared with simulation results, and the correlation between them is found to be very high.

  18. Preliminary basic performance analysis of the Cedar multiprocessor memory system

    NASA Technical Reports Server (NTRS)

    Gallivan, K.; Jalby, W.; Turner, S.; Veidenbaum, A.; Wijshoff, H.

    1991-01-01

    Some preliminary basic results on the performance of the Cedar multiprocessor memory system are presented. Empirical results are presented and used to calibrate a memory system simulator which is then used to discuss the scalability of the system.

  19. Scalable Triadic Analysis of Large-Scale Graphs: Multi-Core vs. Multi-Processor vs. Multi-Threaded Shared Memory Architectures

    SciTech Connect

    Chin, George; Marquez, Andres; Choudhury, Sutanay; Feo, John T.

    2012-09-01

    Triadic analysis encompasses a useful set of graph mining methods that is centered on the concept of a triad, which is a subgraph of three nodes and the configuration of directed edges across the nodes. Such methods are often applied in the social sciences as well as many other diverse fields. Triadic methods commonly operate on a triad census that counts the number of triads of every possible edge configuration in a graph. Like other graph algorithms, triadic census algorithms do not scale well when graphs reach tens of millions to billions of nodes. To enable the triadic analysis of large-scale graphs, we developed and optimized a triad census algorithm to efficiently execute on shared memory architectures. We will retrace the development and evolution of a parallel triad census algorithm. Over the course of several versions, we continually adapted the code’s data structures and program logic to expose more opportunities to exploit parallelism on shared memory that would translate into improved computational performance. We will recall the critical steps and modifications that occurred during code development and optimization. Furthermore, we will compare the performances of triad census algorithm versions on three specific systems: Cray XMT, HP Superdome, and AMD multi-core NUMA machine. These three systems have shared memory architectures but with markedly different hardware capabilities to manage parallelism.

  20. Program partitioning for NUMA multiprocessor computer systems. [Nonuniform memory access]

    SciTech Connect

    Wolski, R.M.; Feo, J.T.

    1993-11-01

    Program partitioning and scheduling are essential steps in programming non-shared-memory computer systems. Partitioning is the separation of program operations into sequential tasks, and scheduling is the assignment of tasks to processors. To be effective, automatic methods require an accurate representation of the model of computation and the target architecture. Current partitioning methods assume today's most prevalent models -- macro dataflow and a homogeneous/two-level multicomputer system. Based on communication channels, neither model represents well the emerging class of NUMA multiprocessor computer systems consisting of hierarchical read/write memories. Consequently, the partitions generated by extant methods do not execute well on these systems. In this paper, the authors extend the conventional graph representation of the macro-dataflow model to enable mapping heuristics to consider the complex communication options supported by NUMA architectures. They describe two such heuristics. Simulated execution times of program graphs show that the model and heuristics generate higher quality program mappings than current methods for NUMA architectures.

  1. A VLSI chip set for a multiprocessor workstation; Part II: A memory management unit and cache controller

    SciTech Connect

    Jeong, D.K.; Wood, D.A.; Gibson, G.A.; Eggers, S.J.; Hodges, D.A.; Katz, R.H.; Patterson, D.A.

    1989-12-01

    This paper describes a memory management unit and a cache controller (MMU/CC) for a shared memory multiprocessor. The MMU/CC implements a novel memory management scheme, called in-cache address translation, that does not require a translation lookaside buffer (TLB). It also implements a snooping bus protocol to maintain data consistency across all caches in the system. Both chips are implemented in a 1.6-μm double-layer-metal CMOS technology, and are being used in a multiprocessor workstation (SPUR) successfully executing a UNIX-like network-based operating system called Sprite as well as many applications including LISP programs.

  2. Using Pin as a Memory Reference Generator for Multiprocessor Simulation

    SciTech Connect

    McCurdy, C

    2005-10-22

    In this paper we describe how we have used Pin to generate a multithreaded reference stream for simulation of a multiprocessor on a uniprocessor. We have taken special care to model as accurately as possible the effects of cache coherence protocol state, and lock and barrier synchronization on the performance of multithreaded applications running on multiprocessor hardware. We first describe a simplified version of the algorithm, which uses semaphores to synchronize instrumented application threads and the simulator on every memory reference. We then describe modifications to that algorithm to model the microarchitectural features of the Itanium2 that affect the timing of memory reference issue. An experimental evaluation determines that while cycle-accurate multithreaded simulation is possible using our approach, the use of semaphores has a negative impact on the performance of the simulator.

  3. Supernodal symbolic Cholesky factorization on a local-memory multiprocessor

    SciTech Connect

    Ng, E.

    1991-06-01

    In this paper, we consider the symbolic factorization step in computing the Cholesky factorization of a sparse symmetric positive definite matrix on distributed-memory multiprocessor systems. By exploiting the supernodal structure in the Cholesky factor, the performance of a previous parallel symbolic factorization algorithm is improved. Empirical tests demonstrate that there can be a drastic reduction in the execution time required by the new algorithm on an Intel iPSC/2 hypercube. 23 refs., 8 figs.

  4. Low Latency Messages on Distributed Memory Multiprocessors

    DOE PAGES

    Rosing, Matt; Saltz, Joel

    1995-01-01

    This article describes many of the issues in developing an efficient interface for communication on distributed memory machines. Although the hardware component of message latency is less than 1 μs on many distributed memory machines, the software latency associated with sending and receiving typed messages is on the order of 50 μs. The reason for this imbalance is that the software interface does not match the hardware. By changing the interface to match the hardware more closely, applications with fine grained communication can be put on these machines. This article describes several tests performed and many of the issues involved in supporting low latency messages on distributed memory machines.

  5. Low latency messages on distributed memory multiprocessors

    NASA Technical Reports Server (NTRS)

    Rosing, Matthew; Saltz, Joel

    1993-01-01

    Many of the issues in developing an efficient interface for communication on distributed memory machines are described and a portable interface is proposed. Although the hardware component of message latency is less than one microsecond on many distributed memory machines, the software latency associated with sending and receiving typed messages is on the order of 50 microseconds. The reason for this imbalance is that the software interface does not match the hardware. By changing the interface to match the hardware more closely, applications with fine grained communication can be put on these machines. Based on several tests that were run on the iPSC/860, an interface that will better match current distributed memory machines is proposed. The model used in the proposed interface consists of a computation processor and a communication processor on each node. Communication between these processors and other nodes in the system is done through a buffered network. Information that is transmitted is either data or procedures to be executed on the remote processor. The dual processor system is better suited for efficiently handling asynchronous communications compared to a single processor system. The ability to send either data or procedures gives flexibility in minimizing message latency, based on the type of communication being performed. The tests performed and the proposed interface are described.

  6. Software Coherence in Multiprocessor Memory Systems. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Bolosky, William Joseph

    1993-01-01

    Processors are becoming faster and multiprocessor memory interconnection systems are not keeping up. Therefore, it is necessary to have threads and the memory they access as near one another as possible. Typically, this involves putting memory or caches with the processors, which gives rise to the problem of coherence: if one processor writes an address, any other processor reading that address must see the new value. This coherence can be maintained by the hardware or with software intervention. Systems of both types have been built in the past; the hardware-based systems tended to outperform the software ones. However, the ratio of processor to interconnect speed is now so high that the extra overhead of the software systems may no longer be significant. This issue is explored both by implementing a software maintained system and by introducing and using the technique of offline optimal analysis of memory reference traces. It finds that in properly built systems, software maintained coherence can perform comparably to or even better than hardware maintained coherence. The architectural features necessary for efficient software coherence to be profitable include a small page size, a fast trap mechanism, and the ability to execute instructions while remote memory references are outstanding.

  7. Denelcor HEP multiprocessor simulator

    SciTech Connect

    Dunigan, T.H.

    1986-06-01

    The structure and use of a simulator for the Denelcor HEP multiprocessor are described. The simulator provides a multitasking environment for the development of parallel programs in C or FORTRAN using a library of subroutines that simulate the parallel programming constructs available on the HEP, a shared-memory multiprocessor. The simulator also provides a trace file that can be used for debugging, performance analysis, or graphical display. 7 refs., 4 figs.

  8. Generation-based memory synchronization in a multiprocessor system with weakly consistent memory accesses

    SciTech Connect

    Ohmacht, Martin

    2014-09-09

    In a multiprocessor system, a central memory synchronization module coordinates memory synchronization requests responsive to memory access requests in flight, a generation counter, and a reclaim pointer. The central module communicates via point-to-point communication. The module includes a global OR reduce tree for each memory access requesting device, for detecting memory access requests in flight. An interface unit is implemented associated with each processor requesting synchronization. The interface unit includes multiple generation completion detectors. The generation count and reclaim pointer do not pass one another.

  9. Optical RAM-enabled cache memory and optical routing for chip multiprocessors: technologies and architectures

    NASA Astrophysics Data System (ADS)

    Pleros, Nikos; Maniotis, Pavlos; Alexoudi, Theonitsa; Fitsios, Dimitris; Vagionas, Christos; Papaioannou, Sotiris; Vyrsokinos, K.; Kanellos, George T.

    2014-03-01

    The processor-memory performance gap, commonly referred to as the "Memory Wall" problem, is due to the speed mismatch between processor and electronic RAM clock frequencies, forcing current Chip Multiprocessor (CMP) configurations to consume more than 50% of the chip real-estate for caching purposes. In this article, we present our recent work spanning from Si-based integrated optical RAM cell architectures up to complete optical cache memory architectures for Chip Multiprocessor configurations. Moreover, we discuss e/o router subsystems with up to Tb/s routing capacity for cache interconnection purposes within CMP configurations, currently pursued within the FP7 PhoxTrot project.

  10. Vienna FORTRAN: A FORTRAN language extension for distributed memory multiprocessors

    NASA Technical Reports Server (NTRS)

    Chapman, Barbara; Mehrotra, Piyush; Zima, Hans

    1991-01-01

    Exploiting the performance potential of distributed memory machines requires a careful distribution of data across the processors. Vienna FORTRAN is a language extension of FORTRAN which provides the user with a wide range of facilities for such mapping of data structures. However, programs in Vienna FORTRAN are written using global data references. Thus, the user has the advantage of a shared memory programming paradigm while explicitly controlling the placement of data. The basic features of Vienna FORTRAN are presented along with a set of examples illustrating the use of these features.

  11. A single-assignment language in a distributed memory multiprocessor

    NASA Technical Reports Server (NTRS)

    Evripidou, P.; Najjar, W.; Gaudiot, J.-L.

    1989-01-01

    The implementation of the single-assignment programming language SISAL (McGraw et al., 1985) on a Symult 2010 parallel computer is described. The advantages of single-assignment languages over imperative languages in a multiprocessor environment are reviewed; the characteristics of SISAL are summarized; the program-graph generation and dynamic data partitioning procedures are explained; and the application of SISAL in constructing a concurrent iterative multigrid algorithm is discussed in detail and illustrated with diagrams.

  12. Conditional load and store in a shared memory

    SciTech Connect

    Blumrich, Matthias A; Ohmacht, Martin

    2015-02-03

    A method, system and computer program product for implementing load-reserve and store-conditional instructions in a multi-processor computing system. The computing system includes a multitude of processor units and a shared memory cache, and each of the processor units has access to the memory cache. In one embodiment, the method comprises providing the memory cache with a series of reservation registers, and storing in these registers addresses reserved in the memory cache for the processor units as a result of issuing load-reserve requests. In this embodiment, when one of the processor units makes a request to store data in the memory cache using a store-conditional request, the reservation registers are checked to determine if an address in the memory cache is reserved for that processor unit. If an address in the memory cache is reserved for that processor, the data are stored at this address.
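
    The reservation-register embodiment described above can be mimicked in a few lines of software. The sketch below is a simplified single-threaded model under assumptions, not the patented design: the class `SharedCache`, the helper `atomic_increment`, one reservation register per processor, and the rule that a successful store clears every reservation on the same address are all hypothetical illustration choices (the last one follows the usual load-reserve/store-conditional convention and is not stated in the abstract).

```python
class SharedCache:
    """Toy model of a shared cache with per-processor reservation registers."""

    def __init__(self, num_processors):
        self.data = {}                              # address -> value
        self.reservations = [None] * num_processors  # one register per processor

    def load_reserve(self, proc, addr):
        # Record the reserved address for this processor, then load.
        self.reservations[proc] = addr
        return self.data.get(addr, 0)

    def store_conditional(self, proc, addr, value):
        # The store succeeds only if this processor still holds the reservation.
        if self.reservations[proc] != addr:
            return False
        self.data[addr] = value
        # Assumption: a successful store invalidates all reservations on addr.
        for p, reserved in enumerate(self.reservations):
            if reserved == addr:
                self.reservations[p] = None
        return True


def atomic_increment(cache, proc, addr):
    """Retry loop typically built on load-reserve / store-conditional."""
    while True:
        old = cache.load_reserve(proc, addr)
        if cache.store_conditional(proc, addr, old + 1):
            return old + 1


cache = SharedCache(num_processors=2)
cache.load_reserve(0, 0x40)
cache.load_reserve(1, 0x40)
first = cache.store_conditional(0, 0x40, 5)   # processor 0 wins
second = cache.store_conditional(1, 0x40, 7)  # processor 1 lost its reservation
```

    The losing processor sees its store-conditional fail and retries, which is how a read-modify-write sequence appears atomic without holding a lock.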

  13. A cache-aided multiprocessor rollback recovery scheme

    NASA Technical Reports Server (NTRS)

    Wu, Kun-Lung; Fuchs, W. Kent

    1989-01-01

    This paper demonstrates how previous uniprocessor cache-aided recovery schemes can be applied to multiprocessor architectures, for recovering from transient processor failures, utilizing private caches and a global shared memory. As with cache-aided uniprocessor recovery, the multiprocessor cache-aided recovery scheme of this paper can be easily integrated into standard bus-based snoopy cache coherence protocols. A consistent shared memory state is maintained without the necessity of global check-pointing.

  14. Multiprocessor execution of functional programs

    SciTech Connect

    Goldberg, B.

    1988-10-01

    Functional languages have recently gained attention as vehicles for programming in a concise and elegant manner. In addition, it has been suggested that functional programming provides a natural methodology for programming multiprocessor computers. This paper describes research that was performed to demonstrate that multiprocessor execution of functional programs on current multiprocessors is feasible, and results in a significant reduction in their execution times. Two implementations of the functional language ALFL were built on commercially available multiprocessors. Alfalfa is an implementation on the Intel iPSC hypercube multiprocessor, and Buckwheat is an implementation on the Encore Multimax shared-memory multiprocessor. Each implementation includes a compiler that performs automatic decomposition of ALFL programs and a run-time system that supports their execution. The compiler is responsible for detecting the inherent parallelism in a program, and decomposing the program into a collection of tasks, called serial combinators, that can be executed in parallel. The abstract machine model supported by Alfalfa and Buckwheat is called heterogeneous graph reduction, which is a hybrid of graph reduction and conventional stack-oriented execution. This model supports parallelism, lazy evaluation, and higher order functions while at the same time making efficient use of the processors in the system. The Alfalfa and Buckwheat run-time systems support dynamic load balancing, interprocessor communication (if required), and storage management. A large number of experiments were performed on Alfalfa and Buckwheat for a variety of programs. The results of these experiments, as well as the conclusions drawn from them, are presented.

  15. Memory access in shared virtual memory

    SciTech Connect

    Berrendorf, R.

    1992-09-01

    Shared virtual memory (SVM) is a virtual memory layer with a single address space on top of a distributed real memory on parallel computers. We examine the behavior and performance of SVM running a parallel program with medium-grained, loop-level parallelism on top of it. A simulator for the underlying parallel architecture can be used to examine the behavior of SVM more deeply. The influence of several parameters, such as the number of processors, page size, cold or warm start, and restricted page replication, is studied.

  16. Memory access in shared virtual memory

    SciTech Connect

    Berrendorf, R.

    1992-01-01

    Shared virtual memory (SVM) is a virtual memory layer with a single address space on top of a distributed real memory on parallel computers. We examine the behavior and performance of SVM running a parallel program with medium-grained, loop-level parallelism on top of it. A simulator for the underlying parallel architecture can be used to examine the behavior of SVM more deeply. The influence of several parameters, such as the number of processors, page size, cold or warm start, and restricted page replication, is studied.

  17. Shared Memory Parallelization of an Implicit ADI-type CFD Code

    NASA Technical Reports Server (NTRS)

    Hauser, Th.; Huang, P. G.

    1999-01-01

    A parallelization study designed for ADI-type algorithms is presented using the OpenMP specification for shared-memory multiprocessor programming. Details of optimizations specifically addressed to cache-based computer architectures are described and performance measurements for the single and multiprocessor implementation are summarized. The paper demonstrates that optimization of memory access on a cache-based computer architecture controls the performance of the computational algorithm. A hybrid MPI/OpenMP approach is proposed for clusters of shared memory machines to further enhance the parallel performance. The method is applied to develop a new LES/DNS code, named LESTool. A preliminary DNS calculation of a fully developed channel flow at a Reynolds number of Re_τ = 180 has shown good agreement with existing data.

  18. Multiprogramming performance degradation - Case study on a shared memory multiprocessor

    NASA Technical Reports Server (NTRS)

    Dimpsey, R. T.; Iyer, R. K.

    1989-01-01

    The performance degradation due to multiprogramming overhead is quantified for a parallel-processing machine. Measurements of real workloads were taken, and it was found that there is a moderate correlation between the completion time of a program and the amount of system overhead measured during program execution. Experiments in controlled environments were then conducted to calculate a lower bound on the performance degradation of parallel jobs caused by multiprogramming overhead. The results show that the multiprogramming overhead of parallel jobs consumes at least 4 percent of the processor time. When two or more serial jobs are introduced into the system, this amount increases to 5.3 percent.

  19. Solution of large nonlinear quasistatic structural mechanics problems on distributed-memory multiprocessor computers

    SciTech Connect

    Blanford, M.

    1997-12-31

    Most commercially-available quasistatic finite element programs assemble element stiffnesses into a global stiffness matrix, then use a direct linear equation solver to obtain nodal displacements. However, for large problems (greater than a few hundred thousand degrees of freedom), the memory size and computation time required for this approach become prohibitive. Moreover, direct solution does not lend itself to the parallel processing needed for today's multiprocessor systems. This talk gives an overview of the iterative solution strategy of JAS3D, the nonlinear large-deformation quasistatic finite element program. Because its architecture is derived from an explicit transient-dynamics code, it does not ever assemble a global stiffness matrix. The author describes the approach he used to implement the solver on multiprocessor computers, and shows examples of problems run on hundreds of processors and more than a million degrees of freedom. Finally, he describes some of the work he is presently doing to address the challenges of iterative convergence for ill-conditioned problems.

  20. A multiprocessor computer simulation model employing a feedback scheduler/allocator for memory space and bandwidth matching and TMR processing

    NASA Technical Reports Server (NTRS)

    Bradley, D. B.; Irwin, J. D.

    1974-01-01

    A computer simulation model for a multiprocessor computer is developed that is useful for studying the problem of matching a multiprocessor's memory space, memory bandwidth, and numbers and speeds of processors with aggregate job set characteristics. The model assumes an input work load of a set of recurrent jobs. The model includes a feedback scheduler/allocator which attempts to improve system performance through higher memory bandwidth utilization by matching individual job requirements for space and bandwidth with space availability and estimates of bandwidth availability at the times of memory allocation. The simulation model includes provisions for specifying precedence relations among the jobs in a job set, and provisions for specifying precedence execution of TMR (Triple Modular Redundant) and SIMPLEX (non-redundant) jobs.

  1. Message-passing multiprocessor simulator

    SciTech Connect

    Dunigan, T.H.

    1986-05-01

    The structure and use of a message-passing multiprocessor simulator are described. The simulator provides a multitasking environment for the development of algorithms for parallel processors using either shared or local memories. The simulator may be used from C or FORTRAN and provides a library of subroutines for task control and message passing. The simulator produces a trace file that can be used for debugging, performance analysis, or graphical display. 9 refs., 7 figs.

  2. Compiler-directed cache management in multiprocessors

    NASA Technical Reports Server (NTRS)

    Cheong, Hoichi; Veidenbaum, Alexander V.

    1990-01-01

    The necessity of finding alternatives to hardware-based cache coherence strategies for large-scale multiprocessor systems is discussed. Three different software-based strategies sharing the same goals and general approach are presented. They consist of a simple invalidation approach, a fast selective invalidation scheme, and a version control scheme. The strategies are suitable for shared-memory multiprocessor systems with interconnection networks and a large number of processors. Results of trace-driven simulations conducted on numerical benchmark routines to compare the performance of the three schemes are presented.

  3. Experimental evaluation of multiprocessor cache-based error recovery

    NASA Technical Reports Server (NTRS)

    Janssens, Bob; Fuchs, W. K.

    1991-01-01

    Several variations of cache-based checkpointing for rollback error recovery in shared-memory multiprocessors have been recently developed. By modifying the cache replacement policy, these techniques use the inherent redundancy in the memory hierarchy to periodically checkpoint the computation state. Three schemes, different in the manner in which they avoid rollback propagation, are evaluated. By simulation with address traces from parallel applications running on an Encore Multimax shared-memory multiprocessor, the performance effect of integrating the recovery schemes in the cache coherence protocol is evaluated. The results indicate that the cache-based schemes can provide checkpointing capability with low performance overhead but uncontrollable high variability in the checkpoint interval.

  4. A simple modern correctness condition for a space-based high-performance multiprocessor

    NASA Technical Reports Server (NTRS)

    Probst, David K.; Li, Hon F.

    1992-01-01

    A number of U.S. national programs, including space-based detection of ballistic missile launches, envisage putting significant computing power into space. Given sufficient progress in low-power VLSI, multichip-module packaging and liquid-cooling technologies, we will see design of high-performance multiprocessors for individual satellites. In very high speed implementations, performance depends critically on tolerating large latencies in interprocessor communication; without latency tolerance, performance is limited by the vastly differing time scales in processor and data-memory modules, including interconnect times. The modern approach to tolerating remote-communication cost in scalable, shared-memory multiprocessors is to use a multithreaded architecture, and alter the semantics of shared memory slightly, at the price of forcing the programmer either to reason about program correctness in a relaxed consistency model or to agree to program in a constrained style. The literature on multiprocessor correctness conditions has become increasingly complex, and sometimes confusing, which may hinder its practical application. We propose a simple modern correctness condition for a high-performance, shared-memory multiprocessor; the correctness condition is based on a simple interface between the multiprocessor architecture and the parallel programming system.

  5. Firefly: A multiprocessor workstation

    SciTech Connect

    Thacker, C.P.; Stewart, L.C.; Satterthwaite, E.H.

    1988-08-01

    Firefly is a shared memory multiprocessor workstation developed at the Digital Equipment Corporation Systems Research Center (SRC). A Firefly system consists of from one to nine VLSI VAX processors, each with a floating point accelerator and a cache. The caches are coherent, so that all processors see a consistent view of main memory. The Firefly runs a software system that emulates the Ultrix system call interface, and in addition provides support for multiprocessing through multiple threads of control in a single address space. Communication is provided uniformly through the use of remote procedure call. The authors describe the goals, hardware, software system, and performance of the Firefly, and discuss the extent to which SRC has been successful in providing software to take advantage of multi-processing.

  6. Multiprocessor execution of functional programs

    SciTech Connect

    Goldberg, B.F.

    1988-01-01

    Functional languages have recently gained attention as vehicles for programming in a concise and elegant manner. In addition, it has been suggested that functional programming provides a natural methodology for programming multiprocessor computers. This dissertation demonstrates that multiprocessor execution of functional programs is feasible, and results in a significant reduction in their execution times. Two implementations of the functional language ALFL were built on commercially available multiprocessors. Alfalfa is an implementation on the Intel iPSC hypercube multiprocessor, and Buckwheat is an implementation on the Encore Multimax shared-memory multiprocessor. Each implementation includes a compiler that performs automatic decomposition of ALFL programs. The compiler is responsible for detecting the inherent parallelism in a program, and decomposing the program into a collection of tasks, called serial combinators, that can be executed in parallel. One of the primary goals of the compiler is to generate serial combinators exhibiting the coarsest granularity possible without sacrificing useful parallelism. This dissertation describes the algorithms used by the compiler to analyze, decompose, and optimize functional programs. The abstract machine model supported by Alfalfa and Buckwheat is called heterogeneous graph reduction, which is a hybrid of graph reduction and conventional stack-oriented execution. This model supports parallelism, lazy evaluation, and higher order functions while at the same time making efficient use of the processors in the system. The Alfalfa and Buckwheat run-time systems support dynamic load balancing, interprocessor communication (if required) and storage management. A large number of experiments were performed on Alfalfa and Buckwheat for a variety of programs. The results of these experiments, as well as the conclusions drawn from them, are presented.

  7. Performing an allreduce operation using shared memory

    SciTech Connect

    Archer, Charles J; Dozsa, Gabor; Ratterman, Joseph D; Smith, Brian E

    2014-06-10

    Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
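
    The work-unit scheme in this record can be sketched with threads standing in for cores. The sketch assumes a sum reduction, equal-sized chunks as work units, and a lock-protected counter as the job status object; the names `JobStatus`, `claim_next`, and `shared_memory_allreduce` are hypothetical and do not come from the patent.

```python
import threading


class JobStatus:
    """Hypothetical job status object: tracks which work unit is next."""

    def __init__(self, num_units):
        self.num_units = num_units
        self.next_unit = 0
        self.lock = threading.Lock()

    def claim_next(self):
        """Return the next unclaimed work unit index, or None when none remain."""
        with self.lock:
            if self.next_unit >= self.num_units:
                return None
            unit = self.next_unit
            self.next_unit += 1
            return unit


def shared_memory_allreduce(contributions, num_units):
    """Sum one vector per core element-wise into a buffer visible to all cores."""
    length = len(contributions[0])
    result = [0] * length                        # shared result buffer
    job = JobStatus(num_units)                   # established once for the operation
    chunk = (length + num_units - 1) // num_units

    def core_loop():
        # Each available core keeps claiming and performing work units.
        while True:
            unit = job.claim_next()
            if unit is None:
                return
            start = unit * chunk
            end = min(start + chunk, length)
            for i in range(start, end):
                result[i] = sum(vec[i] for vec in contributions)

    cores = [threading.Thread(target=core_loop) for _ in contributions]
    for core in cores:
        core.start()
    for core in cores:
        core.join()
    return result
```

    Because work units are claimed dynamically, a core that finishes early simply picks up the next unit instead of idling.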

  8. Performing an allreduce operation using shared memory

    DOEpatents

    Archer, Charles J.; Dozsa, Gabor; Ratterman, Joseph D.; Smith, Brian E.

    2012-04-17

    Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.

  9. Multiprocessor architectural study

    NASA Technical Reports Server (NTRS)

    Kosmala, A. L.; Stanten, S. F.; Vandever, W. H.

    1972-01-01

    An architectural design study was made of a multiprocessor computing system intended to meet functional and performance specifications appropriate to a manned space station application. Intermetrics' previous experience and accumulated knowledge of the multiprocessor field are used to generate a baseline philosophy for the design of a future SUMC* multiprocessor. Interrupts are defined and the crucial questions of interrupt structure, such as processor selection and response time, are discussed. Memory hierarchy and performance are discussed extensively with particular attention to the design approach which utilizes a cache memory associated with each processor. The ability of an individual processor to approach its theoretical maximum performance is then analyzed in terms of a hit ratio. Memory management is envisioned as a virtual memory system implemented either through segmentation or paging. Addressing is discussed in terms of various register designs adopted by current computers and those of advanced design.
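
    A hit-ratio analysis of this kind is often illustrated with the textbook effective-access-time model. The sketch below assumes that simple two-level model (a hit costs `t_cache`, a miss costs `t_memory`); it is not the model used in the report, and the function names are illustrative.

```python
def effective_access_time(hit_ratio, t_cache, t_memory):
    """Average access time for a two-level hierarchy (assumed textbook model)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must lie in [0, 1]")
    return hit_ratio * t_cache + (1.0 - hit_ratio) * t_memory


def fraction_of_peak(hit_ratio, t_cache, t_memory):
    """Ratio of the all-hits access time to the effective access time."""
    return t_cache / effective_access_time(hit_ratio, t_cache, t_memory)
```

    Under this assumed model, a processor reaches its theoretical maximum only when the hit ratio is 1; with a cache ten times faster than main memory, a hit ratio of 0.9 already cuts it to roughly half of peak.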

  10. Recoverable distributed shared virtual memory - Memory coherence and storage structures

    NASA Technical Reports Server (NTRS)

    Wu, Kun-Lung; Fuchs, W. Kent

    1989-01-01

    This paper examines the problem of implementing rollback recovery in multicomputer distributed shared virtual memory environments, in which the shared memory is implemented in software and exists only virtually. A user-transparent checkpointing recovery scheme and new twin-page disk storage management are presented to implement a recoverable distributed shared virtual memory. The checkpointing scheme is integrated with the shared virtual memory management. The twin-page disk approach allows incremental checkpointing without an explicit undo at the time of recovery. A single consistent checkpoint state is maintained on stable disk storage. The recoverable distributed shared virtual memory allows the system to restart computation from a previous checkpoint due to a processor failure without a global restart.

  11. Compiling Lisp for evaluation on a tightly coupled multiprocessor

    SciTech Connect

    Harrison, W.L. III

    1986-03-20

    The problem of compiling Lisp for efficient evaluation on a large, tightly coupled, shared memory multiprocessor is investigated. A representation for s-expressions which facilitates parallel evaluation is proposed, along with a sequence of transformations, to be applied to the functions comprising a Lisp program, which reveal and exploit parallelism. 26 refs., 170 figs.

  12. Expanding symmetric multiprocessor capability through gang scheduling

    SciTech Connect

    Jette, M.A.

    1998-03-01

    Symmetric Multiprocessor (SMP) systems normally provide both space-sharing and time-sharing to ensure high system utilization and good responsiveness. However, the prevailing lack of concurrent scheduling for parallel programs precludes SMP use in addressing many large-scale problems. Tightly synchronized communications are impractical and normal time-sharing reduces the benefit of cache memory. Evidence gathered at Lawrence Livermore National Laboratory (LLNL) indicates that gang scheduling can increase the capability of SMP systems and parallel program performance without adverse impact upon system utilization or responsiveness.

  13. Direct access inter-process shared memory

    SciTech Connect

    Brightwell, Ronald B; Pedretti, Kevin; Hudson, Trammell B

    2013-10-22

    A technique for directly sharing physical memory between processes executing on processor cores is described. The technique includes loading a plurality of processes into the physical memory for execution on a corresponding plurality of processor cores sharing the physical memory. An address space is mapped to each of the processes by populating a first entry in a top level virtual address table for each of the processes. The address space of each of the processes is cross-mapped into each of the processes by populating one or more subsequent entries of the top level virtual address table with the first entry in the top level virtual address table from other processes.
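
    The cross-mapping step can be pictured with a toy model of the top-level tables. This is a sketch only: string labels stand in for real table entries, the ordering of the subsequent entries is an assumption, and `build_top_level_tables` is a hypothetical name.

```python
def build_top_level_tables(num_processes):
    """Return one toy top-level virtual address table per process."""
    # A label stands in for the first entry that maps each process's own address space.
    own_entries = [f"addrspace-{pid}" for pid in range(num_processes)]
    tables = []
    for pid in range(num_processes):
        table = [own_entries[pid]]                 # first entry: own address space
        for other in range(num_processes):
            if other != pid:
                table.append(own_entries[other])   # cross-map the other processes
        tables.append(table)
    return tables
```

    In this model every table ends up referencing every address space, so any process can reach another process's memory through its own top-level table.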

  14. A shared memory environment for hypercubes

    SciTech Connect

    Agarwala, A.; Das, C.R.

    1994-12-31

    This paper describes the design and implementation of a shared virtual memory (SVM) system for the nCUBE 2 hypercube multicomputer. The SVM system provides the user a single coherent address space across all nodes. It is implemented at the user level in a C programming environment using high level constructs to support data sharing. Shared variables are treated as objects rather than pages. We have improved upon an existing algorithm for maintaining coherency in the SVM system, thus achieving a reduction in the number of inter-node messages required in coherency maintenance. Detailed timing analysis is conducted to analyze the feasibility of this shared environment. Experimental results indicate that parallel programs running under the SVM system show linear speedup, suggesting that SVM systems could provide an effective programming environment for the next generation of distributed memory parallel computers. A bottleneck of this implementation seems to be the expensive interrupt handling by the nCUBE 2 kernel.

  15. The fault-tolerant multiprocessor computer

    SciTech Connect

    Smith, T.B. III; Lala, J.H.; Goldberg, J.; Kautz, W.H.; Melliar-Smith, P.M.; Green, M.W.; Levitt, K.N.; Schwartz, R.L.; Weinstock, C.B.; Palumbo, D.; Butler, R.W.

    1986-01-01

    This book presents studies of two fault-tolerant computer systems designed to meet the extreme reliability requirements for safety-critical functions in advanced NASA vehicles, plus a study of potential architectures for future flight control fault-tolerant systems, which might succeed the current generation of computers. While it is understood that these studies were done for NASA, they also have practical commercial applicability. The fault-tolerant multiprocessor (FTMP) architecture is a high reliability computer concept. The basic organization of the FTMP is that of a general purpose homogeneous multiprocessor. Three processors operate on a shared system (memory and I/O) bus. Replication and tight synchronization of all elements and hardware voting are employed to detect and correct any single fault. Reconfiguration is then employed to "repair" a fault. Multiple faults may be tolerated as a sequence of single faults with repair between fault occurrences.

  16. C-MOS array design techniques: SUMC multiprocessor system study

    NASA Technical Reports Server (NTRS)

    Clapp, W. A.; Helbig, W. A.; Merriam, A. S.

    1972-01-01

    The current capabilities of LSI techniques for speed and reliability, plus the possibilities of assembling large configurations of LSI logic and storage elements, have demanded the study of multiprocessors and multiprocessing techniques, problems, and potentialities. Evaluated are three previous systems studies for a space ultrareliable modular computer multiprocessing system, and a new multiprocessing system is proposed that is flexibly configured with up to four central processors, four I/O processors, and 16 main memory units, plus auxiliary memory and peripheral devices. This multiprocessor system features a multilevel interrupt, qualified S/360 compatibility for ground-based generation of programs, virtual memory management of a storage hierarchy through I/O processors, and multiport access to multiple and shared memory units.

  17. Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 1: FTMP principles of operation

    NASA Technical Reports Server (NTRS)

    Smith, T. B., Jr.; Lala, J. H.

    1983-01-01

    The basic organization of the fault-tolerant multiprocessor (FTMP) is that of a general purpose homogeneous multiprocessor. Three processors operate on a shared system (memory and I/O) bus. Replication and tight synchronization of all elements and hardware voting are employed to detect and correct any single fault. Reconfiguration is then employed to repair a fault. Multiple faults may be tolerated as a sequence of single faults with repair between fault occurrences.
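
    The voting step can be sketched in software as a majority vote over replicated outputs. FTMP performs its voting in hardware, so this is only an analogy; the function name `vote` and the choice to report the disagreeing replicas (as candidates for reconfiguration) are assumptions made for illustration.

```python
from collections import Counter


def vote(outputs):
    """Majority vote over replicated outputs; returns (value, faulty_indices)."""
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        # No strict majority: more than a single fault cannot be masked.
        raise RuntimeError("no majority among replicas")
    faulty = [i for i, out in enumerate(outputs) if out != value]
    return value, faulty
```

    With three replicas, one disagreeing output is outvoted and identified, which is the point at which a reconfiguration step could swap in a spare.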

  18. Parallel Navier-Stokes computations on shared and distributed memory architectures

    NASA Technical Reports Server (NTRS)

    Hayder, M. Ehtesham; Jayasimha, D. N.; Pillay, Sasi Kumar

    1995-01-01

    We study a high order finite difference scheme to solve the time accurate flow field of a jet using the compressible Navier-Stokes equations. As part of our ongoing efforts, we have implemented our numerical model on three parallel computing platforms to study the computational, communication, and scalability characteristics. The platforms chosen for this study are a cluster of workstations connected through fast networks (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and a distributed memory multiprocessor (the IBM SP1). Our focus in this study is on the LACE testbed. We present some results for the Cray YMP and the IBM SP1 mainly for comparison purposes. On the LACE testbed, we study: (1) the communication characteristics of Ethernet, FDDI, and the ALLNODE networks and (2) the overheads induced by the PVM message passing library used for parallelizing the application. We demonstrate that clustering of workstations is effective and has the potential to be computationally competitive with supercomputers at a fraction of the cost.

  19. Checkpointing Shared Memory Programs at the Application-level

    SciTech Connect

    Bronevetsky, G; Schulz, M; Szwed, P; Marques, D; Pingali, K

    2004-09-08

    Trends in high-performance computing are making it necessary for long-running applications to tolerate hardware faults. The most commonly used approach is checkpoint and restart (CPR): the state of the computation is saved periodically on disk, and when a failure occurs, the computation is restarted from the last saved state. At present, it is the responsibility of the programmer to instrument applications for CPR. Our group is investigating the use of compiler technology to instrument codes to make them self-checkpointing and self-restarting, thereby providing an automatic solution to the problem of making long-running scientific applications resilient to hardware faults. Our previous work focused on message-passing programs. In this paper, we describe such a system for shared-memory programs running on symmetric multiprocessors. The system has two components: (i) a pre-compiler for source-to-source modification of applications, and (ii) a runtime system that implements a protocol for coordinating CPR among the threads of the parallel application. For the sake of concreteness, we focus on a non-trivial subset of OpenMP that includes barriers and locks. One of the advantages of this approach is that the ability to tolerate faults becomes embedded within the application itself, so applications become self-checkpointing and self-restarting on any platform. We demonstrate this by showing that our transformed benchmarks can checkpoint and restart on three different platforms (Windows/x86, Linux/x86, and Tru64/Alpha). Our experiments show that the overhead introduced by this approach is usually quite small; they also suggest ways in which the current implementation can be tuned to reduce overheads further.
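
    Coordinated checkpointing among threads can be sketched with a barrier whose action saves the shared state. This is a plain blocking scheme written for illustration, not the protocol described in the paper; it keeps checkpoints in memory rather than on disk, and the function `run` and its parameters are hypothetical.

```python
import pickle
import threading


def run(num_threads, total_steps, checkpoint_every, saved=None):
    """Run a toy computation; return (final_state, list_of_pickled_checkpoints)."""
    if saved is not None:
        state = pickle.loads(saved)              # restart from a saved state
    else:
        state = {"step": 0, "counters": [0] * num_threads}
    checkpoints = []
    first_step = state["step"]

    def save_state():
        # Barrier action: runs in exactly one thread while the others still wait.
        checkpoints.append(pickle.dumps(state))

    barrier = threading.Barrier(num_threads, action=save_state)

    def worker(tid):
        for step in range(first_step, total_steps):
            state["counters"][tid] += 1          # this thread's share of the work
            if tid == 0:
                state["step"] = step + 1
            if (step + 1) % checkpoint_every == 0:
                barrier.wait()                   # all threads pause, one thread saves

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state, checkpoints
```

    Passing one of the returned checkpoints back in as `saved` resumes the computation from that step, which mimics a restart after a failure.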

  20. Exploring Shared Memory Protocols in FLASH

    SciTech Connect

    Horowitz, Mark; Kunz, Robert; Hall, Mary; Lucas, Robert; Chame, Jacqueline

    2007-04-01

    The goal of this project was to improve the performance of large scientific and engineering applications through collaborative hardware and software mechanisms to manage the memory hierarchy of non-uniform memory access time (NUMA) shared-memory machines, as well as their component individual processors. In spite of the programming advantages of shared-memory platforms, obtaining good performance for large scientific and engineering applications on such machines can be challenging. Because communication between processors is managed implicitly by the hardware, rather than expressed by the programmer, application performance may suffer from unintended communication – communication that the programmer did not consider when developing his/her application. In this project, we developed and evaluated a collection of hardware, compiler, languages and performance monitoring tools to obtain high performance on scientific and engineering applications on NUMA platforms by managing communication through alternative coherence mechanisms. Alternative coherence mechanisms have often been discussed as a means for reducing unintended communication, although architecture implementations of such mechanisms are quite rare. This report describes an actual implementation of a set of coherence protocols that support coherent, non-coherent and write-update accesses for a CC-NUMA shared-memory architecture, the Stanford FLASH machine. Such an approach has the advantages of using alternative coherence only where it is beneficial, and also provides an evolutionary migration path for improving application performance. We present data on two computations, RandomAccess from the HPC Challenge benchmarks and a forward solver derived from LS-DYNA, showing the performance advantages of the alternative coherence mechanisms. For RandomAccess, the non-coherent and write-update versions can outperform the coherent version by factors of 5 and 2.5, respectively. In LS-DYNA, we obtain

  1. Shared memory for a fault-tolerant computer

    NASA Technical Reports Server (NTRS)

    Gilley, G. C. (Inventor)

    1976-01-01

    A system is described for sharing a memory in a fault-tolerant computer. The memory is under the direct control and monitoring of error detecting and error diagnostic units in the fault-tolerant computer. This computer verifies that data to and from the memory is legally encoded and verifies that words read from the memory at a desired address are, in fact, actually delivered from that desired address. Means are provided for a second processor, which is independent of the direct control and monitoring of the error checking and diagnostic units of the fault-tolerant computer, to share the memory of the fault-tolerant computer. Circuitry is included to verify that: (1) the processor has properly accessed a desired memory location in the memory; (2) a data word read-out from the memory is properly coded; and (3) no inactive memory was erroneously outputting data onto the shared memory bus.

  2. Reducing communication costs in the conjugate gradient algorithm on distributed memory multiprocessors

    SciTech Connect

    D'Azevedo, E.F.; Romine, C.H.

    1992-09-01

    The standard formulation of the conjugate gradient algorithm involves two inner product computations. The results of these two inner products are needed to update the search direction and the computed solution. In a distributed memory parallel environment, the computation and subsequent distribution of these two values requires two separate communication and synchronization phases. In this paper, we present a mathematically equivalent rearrangement of the standard algorithm that reduces the number of communication phases. We give a second derivation of the modified conjugate gradient algorithm in terms of the natural relationship with the underlying Lanczos process. We also present empirical evidence of the stability of this modified algorithm.
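
    To make the two inner products concrete, here is a minimal sketch of the standard (unmodified) conjugate gradient iteration. It is the textbook formulation, not the authors' rearranged algorithm, and the names dot, matvec, and conjugate_gradient are illustrative assumptions. In a distributed-memory setting each call marked as an inner product would be a global reduction, which is where the two synchronization phases come from.

```python
def dot(u, v):
    """Inner product; on a distributed-memory machine this is a global reduction."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product, with A given as a list of rows."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Textbook CG for a symmetric positive definite A (illustrative sketch only)."""
    x = [0.0] * len(b)
    r = list(b)                # residual b - A x for the initial guess x = 0
    p = list(r)                # initial search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)            # inner product 1: step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Ap)]
        rr_new = dot(r, r)                 # inner product 2: new residual norm
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x
```

    The second inner product cannot start until the first has been reduced and used to update the residual, so the two reductions cannot be merged without rearranging the recurrence.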

  3. Verifying a multiprocessor cache controller using random case generation

    SciTech Connect

    Wood, D.A.; Gibson, G.A.; Katz, R.H.

    1989-01-01

    The newest generation of cache controller chips provides coherency to support multiprocessor systems, i.e., the controllers coordinate access to the cache memories to guarantee a single global view of memory. The cache coherency protocols they implement complicate the controller design, making design verification difficult. In the design of the cache controller for SPUR, a shared memory multiprocessor designed and built at U.C. Berkeley, the authors developed a random tester to generate and verify the complex interactions between multiple processors in the functional simulation. Replacing the CPU model, the tester generates memory references by randomly selecting from a script of actions and checks. The checks verify correct completion of their corresponding actions. The tester was easy to develop, and detected over half of the functional bugs uncovered during simulation. They used an assembly language version of the random tester to verify the prototype hardware. A multiprocessor system is operational; it runs the Sprite operating system and is being used for experiments in parallel programming.
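
    The idea of randomly chosen actions paired with checks can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy write-through, write-invalidate model stands in for a real controller and is not the SPUR protocol, and the class and function names are hypothetical. Writes are the actions, reads are the checks, and a flat reference array gives the expected value.

```python
import random


class ToyCoherentSystem:
    """Write-through, write-invalidate caches over one shared memory (toy model)."""

    def __init__(self, n_cpus, n_addrs):
        self.memory = [0] * n_addrs
        self.caches = [dict() for _ in range(n_cpus)]

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:              # miss: fill the line from memory
            cache[addr] = self.memory[addr]
        return cache[addr]

    def write(self, cpu, addr, value):
        self.memory[addr] = value          # write through to memory
        self.caches[cpu][addr] = value
        for other, cache in enumerate(self.caches):
            if other != cpu:
                cache.pop(addr, None)      # invalidate every other copy


class BuggySystem(ToyCoherentSystem):
    """Same model with a deliberately planted bug: no invalidation on write."""

    def write(self, cpu, addr, value):
        self.memory[addr] = value
        self.caches[cpu][addr] = value


def random_test(system, n_cpus, n_addrs, steps, seed=0):
    """Issue random reads and writes; return the first failing step, or None."""
    rng = random.Random(seed)
    reference = [0] * n_addrs              # single global view of memory
    for step in range(steps):
        cpu = rng.randrange(n_cpus)
        addr = rng.randrange(n_addrs)
        if rng.random() < 0.5:
            value = rng.randrange(1, 1 << 16)
            system.write(cpu, addr, value)     # action
            reference[addr] = value
        elif system.read(cpu, addr) != reference[addr]:   # check
            return step
    return None
```

    Running random_test against ToyCoherentSystem should find no mismatch, while BuggySystem fails as soon as a processor rereads a line that another processor has overwritten.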

  4. Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT

    SciTech Connect

    Secchi, Simone; Tumeo, Antonino; Villa, Oreste

    2011-07-27

    Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems where a large virtually-shared address space is mapped on a network of physically distributed memories. High memory latency and network contention are two of the main factors that limit performance scaling of such architectures. Modern high-performance computing DSM systems have evolved toward exploitation of massive hardware multi-threading and fine-grained memory hashing to tolerate irregular latencies, avoid network hot-spots and enable high scaling. In order to model the performance of such large-scale machines, parallel simulation has proved to be a promising approach to achieve good accuracy in reasonable times. One of the most critical factors in solving the simulation speed-accuracy trade-off is network modeling. The Cray XMT is a massively multi-threaded supercomputing architecture that belongs to the DSM class, since it implements a globally-shared address space abstraction on top of a physically distributed memory substrate. In this paper, we discuss the development of a contention-aware network model intended to be integrated in a full-system XMT simulator. We start by measuring the effects of network contention in a 128-processor XMT machine and then investigate the trade-off that exists between simulation accuracy and speed, by comparing three network models which operate at different levels of accuracy. The comparison and model validation are performed by executing a string-matching algorithm on the full-system simulator and on the XMT, using three datasets that generate noticeably different contention patterns.

  5. Impact of Load Balancing on Unstructured Adaptive Grid Computations for Distributed-Memory Multiprocessors

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak; Simon, Horst D.; Sohn, Andrew

    1996-01-01

    The computational requirements for an adaptive solution of unsteady problems change as the simulation progresses. This causes workload imbalance among processors on a parallel machine which, in turn, requires significant data movement at runtime. We present a new dynamic load-balancing framework, called JOVE, that balances the workload across all processors with a global view. Whenever the computational mesh is adapted, JOVE is activated to eliminate the load imbalance. JOVE has been implemented on an IBM SP2 distributed-memory machine in MPI for portability. Experimental results for two model meshes demonstrate that mesh adaption with load balancing gives more than a sixfold improvement over one without load balancing. We also show that JOVE gives a 24-fold speedup on 64 processors compared to sequential execution.

  6. Shared virtual memory and generalized speedup

    NASA Technical Reports Server (NTRS)

    Sun, Xian-He; Zhu, Jianping

    1994-01-01

    Generalized speedup is defined as parallel speed over sequential speed. The generalized speedup and its relation with other existing performance metrics, such as traditional speedup, efficiency, scalability, etc., are carefully studied. In terms of the introduced asymptotic speed, it was shown that the difference between the generalized speedup and the traditional speedup lies in the definition of the efficiency of uniprocessor processing, which is a very important issue in shared virtual memory machines. A scientific application was implemented on a KSR-1 parallel computer. Experimental and theoretical results show that the generalized speedup is distinct from the traditional speedup and provides a more reasonable measurement. In the study of different speedups, various causes of superlinear speedup are also presented.
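
    The definition above can be written out as a short sketch. It assumes that "speed" means work divided by execution time, which is an interpretation added here rather than something stated in the abstract; the function and parameter names are hypothetical.

```python
def traditional_speedup(t_seq, t_par):
    """Sequential time over parallel time for the same problem."""
    return t_seq / t_par


def generalized_speedup(work_seq, t_seq, work_par, t_par):
    """Parallel speed over sequential speed, with speed taken as work / time.

    The work units (e.g. floating-point operations) are an assumption of
    this sketch; only the ratio of the two speeds matters.
    """
    parallel_speed = work_par / t_par
    sequential_speed = work_seq / t_seq
    return parallel_speed / sequential_speed
```

    When both runs perform the same amount of work the two measures coincide, for example generalized_speedup(100, 10, 100, 2) and traditional_speedup(10, 2) both give 5.0. They differ once the parallel run solves a larger problem, or when the sequential speed is depressed because the problem does not fit in one processor's memory.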

  7. Final Report: Programming Models for Shared Memory Clusters

    SciTech Connect

    May, J.; de Supinski, B.; Pudliner, B.; Taylor, S.; Baden, S.

    2000-01-04

    Most large parallel computers now built use a hybrid architecture called a shared memory cluster. In this design, a computer consists of several nodes connected by an interconnection network. Each node contains a pool of memory and multiple processors that share direct access to it. Because shared memory clusters combine architectural features of shared memory computers and distributed memory computers, they support several different styles of parallel programming or programming models. (Further information on the design of these systems and their programming models appears in Section 2.) The purpose of this project was to investigate the programming models available on these systems and to answer three questions: (1) How easy to use are the different programming models in real applications? (2) How do the hardware and system software on different computers affect the performance of these programming models? (3) What are the performance characteristics of different programming models for typical LLNL applications on various shared memory clusters?

  8. A parallel numerical simulation for supersonic flows using zonal overlapped grids and local time steps for common and distributed memory multiprocessors

    SciTech Connect

    Patel, N.R.; Sturek, W.B.; Hiromoto, R.

    1989-01-01

    Parallel Navier-Stokes codes are developed to solve both two-dimensional and three-dimensional flow fields in and around ramjet and nose tip configurations. A multi-zone overlapped grid technique is used to extend an explicit finite-difference method to more complicated geometries. Parallel implementations are developed for execution on both distributed and common-memory multiprocessor architectures. For the steady-state solutions, the use of the local time-step method has the inherent advantage of reducing the communications overhead commonly incurred by parallel implementations. Computational results of the codes are given for a series of test problems. The parallel partitioning of computational zones is also discussed. 5 refs., 18 figs.

  9. Solution of the Euler and Navier-Stokes equations on MIMD distributed memory multiprocessors using cyclic reduction

    SciTech Connect

    Curchitser, E.N.; Pelz, R.B.; Marconi, F. (Grumman Aerospace Corp., Bethpage, NY)

    1992-01-01

    The Euler and Navier-Stokes equations are solved for the steady, two-dimensional flow over a NACA 0012 airfoil using a 1024 node nCUBE/2 multiprocessor. Second-order, upwind-discretized difference equations are solved implicitly using ADI factorization. Parallel cyclic reduction is employed to solve the block tridiagonal systems. For realistic problems, communication times are negligible compared to calculation times. The processors are tightly synchronized, and their loads are well balanced. When the flux Jacobians are frozen, the wall-clock time for one implicit timestep is about equal to that of a multistage explicit scheme. 10 refs.

  10. The hierarchical spatial decomposition of three-dimensional particle-in-cell plasma simulations on MIMD distributed memory multiprocessors

    SciTech Connect

    Walker, D.W.

    1992-07-01

    The hierarchical spatial decomposition method is a promising approach to decomposing the particles and computational grid in parallel particle-in-cell application codes, since it is able to maintain approximate dynamic load balance while keeping communication costs low. In this paper we investigate issues in implementing a hierarchical spatial decomposition on a hypercube multiprocessor. Particular attention is focused on the communication needed to update guard ring data, and on the load balancing method. The hierarchical approach is compared with other dynamic load balancing schemes.

  11. Validation of multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Siewiorek, D. P.; Segall, Z.; Kong, T.

    1982-01-01

    Experiments that can be used to validate fault free performance of multiprocessor systems in aerospace systems integrating flight controls and avionics are discussed. Engineering prototypes for two fault tolerant multiprocessors are tested.

  13. Working memory resources are shared across sensory modalities.

    PubMed

    Salmela, V R; Moisala, M; Alho, K

    2014-10-01

    A common assumption in the working memory literature is that the visual and auditory modalities have separate and independent memory stores. Recent evidence on visual working memory has suggested that resources are shared between representations, and that the precision of representations sets the limit for memory performance. We tested whether memory resources are also shared across sensory modalities. Memory precision for two visual (spatial frequency and orientation) and two auditory (pitch and tone duration) features was measured separately for each feature and for all possible feature combinations. Thus, only the memory load was varied, from one to four features, while keeping the stimuli similar. In Experiment 1, two gratings and two tones, both containing two varying features, were presented simultaneously. In Experiment 2, two gratings and two tones, each containing only one varying feature, were presented sequentially. The memory precision (delayed discrimination threshold) for a single feature was close to the perceptual threshold. However, as the number of features to be remembered was increased, the discrimination thresholds increased more than twofold. Importantly, the decrease in memory precision did not depend on the modality of the other feature(s), or on whether the features were in the same or in separate objects. Hence, simultaneously storing one visual and one auditory feature had an effect on memory precision equal to those of simultaneously storing two visual or two auditory features. The results show that working memory is limited by the precision of the stored representations, and that working memory can be described as a resource pool that is shared across modalities. PMID:24935809

  14. Parallel implementation and evaluation of motion estimation system algorithms on a distributed memory multiprocessor using knowledge based mappings

    NASA Technical Reports Server (NTRS)

    Choudhary, Alok Nidhi; Leung, Mun K.; Huang, Thomas S.; Patel, Janak H.

    1989-01-01

    Several techniques for static and dynamic load balancing in vision systems are presented. These techniques are novel in the sense that they capture the computational requirements of a task by examining the data when it is produced. Furthermore, they can be applied to many vision systems because many algorithms in different systems are either the same, or have similar computational characteristics. These techniques are evaluated by applying them on a parallel implementation of the algorithms in a motion estimation system on a hypercube multiprocessor system. The motion estimation system consists of the following steps: (1) extraction of features; (2) stereo match of images in one time instant; (3) time match of images from different time instants; (4) stereo match to compute final unambiguous points; and (5) computation of motion parameters. It is shown that the performance gains when these data decomposition and load balancing techniques are used are significant and the overhead of using these techniques is minimal.

  15. Shared memory, cache, and frontwidth considerations in multifrontal algorithm development

    SciTech Connect

    Benner, R.E.

    1986-01-23

    A concurrent, multifrontal algorithm (Benner and Weigand 1986) for solution of finite element equations was modified to better use the cache and shared memories on the ELXSI 6400, and to achieve better load balancing between 'child' processes via frontwidth reduction. The changes were also tailored to use distributed memory machines efficiently by making most data local to individual processors. The test code initially used 8 Mbytes of uncached shared memory and 155 cp (concurrent processor) sec (a speedup of 1.4) when run on 4 processors. The changes left only 50 Kbytes of uncached, and 470 Kbytes of cached, shared memory, plus 530 Kbytes of data local to each 'child' process. Total cp time was reduced to 57 sec and speedup increased to 2.8 on 4 processors. Based on those results an addition to the ELXSI multitasking software, asynchronous I/O between processes, is proposed that would further decrease the shared memory requirements of the algorithm and make the ELXSI look like a distributed memory machine as far as algorithm development is concerned. This would make the ELXSI an extremely useful tool for further development of special-purpose, finite element computations. 16 refs., 8 tabs.

  16. Utilizing a multiprocessor architecture - The performance of MIDAS

    SciTech Connect

    Maples, C.; Logan, D.; Meng, J.; Rathbun, W.; Weaver, D.

    1983-10-01

    The MIDAS architecture organizes multiple CPUs into clusters called distributed subsystems. Each subsystem consists of an array of processors controlled by a supervisory CPU. The multiprocessor array is composed of commercial CPUs (with floating point hardware) and specialized processing elements. Interprocessor communication within the array may occur either through switched memory modules or common shared memory. The architecture permits multiple processors to be focused on single problems. A distributed subsystem has been constructed and tested. It currently consists of a supervisor CPU; 16 blocks of independently switchable memory; 9 general purpose, VAX-class CPUs; and 2 specialized pipelined processors to handle I/O. Results on a variety of problems indicate that the subsystem performs 8 to 15 times faster than a standard computer with an identical CPU. The difference in performance represents the effect of differing CPU and I/O requirements.

  17. Parallel processing and medium-scale multiprocessors

    SciTech Connect

    Wouk, A.

    1989-01-01

    For some time, the community interested in large-scale scientific computing has been attempting to come to terms with parallel computation using a number of processors sufficient to make their concurrent utilization interesting, challenging, and, in the long run, beneficial. Unexpected consequences of parallelization have been discovered. It is possible to obtain reduced performance, both relative and absolute, from an increased number of processors, as a result of inappropriate use of resources in a multiprocessor environment. This exemplifies one of the paradoxes which result from our cultural bias towards sequential thought processes. As a consequence there is a bias for sequential styles of program development in a multiprocessor environment. The authors have learned that the problem of automatic optimization in compilation of parallel programs is computationally hard. Early hopes that automatic, optimal parallelization of sequentially conceived programs would be as achievable as earlier automatic vectorization had been, have been dashed. The authors lack the insights and folklore which are needed to develop useful methodologies and heuristics in the area of parallel computation. The authors are embarked on a voyage of exploration of this new territory, and the work described in this volume can provide helpful guidance. The authors have to explore fully the differences between distributed memory systems, shared memory systems, and combinations, as well as the relative applicability of SIMD and MIMD architectures. Based on the information obtained in such exploration, useful steps towards efficient utilization of many processors should become possible. This paper covers several areas: systems programming, parallel/language/programming systems, and applications programming.

  18. Supporting shared data structures on distributed memory architectures

    NASA Technical Reports Server (NTRS)

    Koelbel, Charles; Mehrotra, Piyush; Vanrosendale, John

    1990-01-01

    Programming nonshared memory systems is more difficult than programming shared memory systems, since there is no support for shared data structures. Current programming languages for distributed memory architectures force the user to decompose all data structures into separate pieces, with each piece owned by one of the processors in the machine, and with all communication explicitly specified by low-level message-passing primitives. A new programming environment is presented for distributed memory architectures, providing a global name space and allowing direct access to remote parts of data values. The analysis and program transformations required to implement this environment are described, and the efficiency of the resulting code on the NCUBE/7 and IPSC/2 hypercubes is described.

  19. A comparison of distributed memory and virtual shared memory parallel programming models

    SciTech Connect

    Keane, J.A.; Grant, A.J.; Xu, M.Q.

    1993-04-01

    The virtues of the different parallel programming models, shared memory and distributed memory, have been much debated. Conventionally the debate could be reduced to programming convenience on the one hand, and high scalability factors on the other. More recently the debate has become somewhat blurred with the provision of virtual shared memory models built on machines with physically distributed memory. The intention of such models/machines is to provide scalable shared memory, i.e. to provide both programmer convenience and high scalability. In this paper, the different models are considered from experiences gained with a number of systems ranging from applications in both commerce and science to languages and operating systems. Case studies are introduced as appropriate.

  20. Analysis and comparison of cache coherence protocols for a packet-switched multiprocessor

    SciTech Connect

    Yang, Q.; Bhuyan, L.N.; Liu, B.C.

    1989-08-01

    The use of private caches in a multiprocessor system causes inconsistency of the shared data among the caches and among caches and the main memory. A large number of protocols have been proposed to solve this coherence problem. In this paper, the authors develop analytical models for seven existing cache protocols, namely: Write-Once, Write-Through, Synapse, Berkeley, Illinois, Firefly, and Dragon. The protocols are implemented on a multiprocessor with a packet-switched shared bus. The models are based on queueing networks that consist of both open and closed classes of customers. The models incorporate the requests for invalidation signals, write-through, and write-back operations and the solution is based on the mean value analysis (MVA) algorithm. Performance comparison among these protocols under various system parameters is carried out based on our models.
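
    For readers unfamiliar with mean value analysis, the following is a minimal sketch of the exact single-class MVA recursion for a closed network of queueing stations. It is a simplification and an assumption of this note: the paper's models mix open and closed customer classes and cover protocol-specific traffic, none of which is reproduced here. The function name, parameters, and example demands are hypothetical.

```python
def mva(demands, n_customers, think_time=0.0):
    """Exact MVA for one closed customer class.

    demands[k] is the total service demand at queueing station k.
    Returns (throughput, mean queue lengths) at population n_customers.
    """
    queue = [0.0] * len(demands)       # queue lengths at population 0
    throughput = 0.0
    for n in range(1, n_customers + 1):
        # An arriving customer sees the queue left by a population of n - 1.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        throughput = n / (think_time + sum(residence))
        queue = [throughput * r for r in residence]    # Little's law per station
    return throughput, queue


if __name__ == "__main__":
    # Illustrative numbers only: two stations, e.g. a bus and a memory module.
    x, q = mva([1.0, 1.0], n_customers=2)
    print(x, q)
```

    The recursion builds the solution one customer at a time, so its cost grows linearly with the population and the number of stations, which is what makes MVA attractive for comparing many parameter settings.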

  1. Distributed simulation using a real-time shared memory network

    NASA Technical Reports Server (NTRS)

    Simon, Donald L.; Mattern, Duane L.; Wong, Edmond; Musgrave, Jeffrey L.

    1993-01-01

    The Advanced Control Technology Branch of the NASA Lewis Research Center performs research in the area of advanced digital controls for aeronautic and space propulsion systems. This work requires the real-time implementation of both control software and complex dynamical models of the propulsion system. We are implementing these systems in a distributed, multi-vendor computer environment. Therefore, a need exists for real-time communication and synchronization between the distributed multi-vendor computers. A shared memory network is a potential solution which offers several advantages over other real-time communication approaches. A candidate shared memory network was tested for basic performance. The shared memory network was then used to implement a distributed simulation of a ramjet engine. The accuracy and execution time of the distributed simulation were measured and compared to the performance of the non-partitioned simulation. The ease of partitioning the simulation, the minimal time required to develop communication between the processors, and the resulting execution time all indicate that the shared memory network is a real-time communication technique worthy of serious consideration.

  2. Comparison of two paradigms for distributed shared memory

    SciTech Connect

    Levelt, W.G.; Kaashoek, M.F.; Bal, H.E.; Tanenbaum, A.S.

    1990-08-01

    The paper compares two paradigms for Distributed Shared Memory on loosely coupled computing systems: the shared data-object model as used in Orca, a programming language specially designed for loosely coupled computing systems and the Shared Virtual Memory model. For both paradigms the authors have implemented two systems, one using only point-to-point messages, the other using broadcasting as well. They briefly describe these two paradigms and their implementations. Then they compare their performance on four applications: the traveling salesman problem, alpha-beta search, matrix multiplication and the all pairs shortest paths problem. The measurements show that both paradigms can be used efficiently for programming large-grain parallel applications. Significant speedups were obtained on all applications. The unstructured Shared Virtual Memory paradigm achieves the best absolute performance, although this is largely due to the preliminary nature of the Orca compiler used. The structured shared data-object model achieves the highest speedups and is much easier to program and to debug.

  3. Performance Evaluation of Remote Memory Access (RMA) Programming on Shared Memory Parallel Computers

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Jost, Gabriele; Biegel, Bryan A. (Technical Monitor)

    2002-01-01

    The purpose of this study is to evaluate the feasibility of remote memory access (RMA) programming on shared memory parallel computers. We discuss different RMA based implementations of selected CFD application benchmark kernels and compare them to corresponding message passing based codes. For the message-passing implementation we use MPI point-to-point and global communication routines. For the RMA based approach we consider two different libraries supporting this programming model. One is a shared memory parallelization library (SMPlib) developed at NASA Ames, the other is the MPI-2 extensions to the MPI Standard. We give timing comparisons for the different implementation strategies and discuss the performance.

  4. Multi-processor including data flow accelerator module

    DOEpatents

    Davidson, George S.; Pierce, Paul E.

    1990-01-01

    An accelerator module for a data flow computer includes an intelligent memory. The module is added to a multiprocessor arrangement and uses a shared tagged memory architecture in the data flow computer. The intelligent memory module assigns locations for holding data values in correspondence with arcs leading to a node in a data dependency graph. Each primitive computation is associated with a corresponding memory cell, including a number of slots for operands needed to execute a primitive computation, a primitive identifying pointer, and linking slots for distributing the result of the cell computation to other cells requiring that result as an operand. Circuitry is provided for utilizing tag bits to determine automatically when all operands required by a processor are available and for scheduling the primitive for execution in a queue. Each memory cell of the module may be associated with any of the primitives, and the particular primitive to be executed by the processor associated with the cell is identified by providing an index, such as the cell number for the primitive, to the primitive lookup table of starting addresses. The module thus serves to perform functions previously performed by a number of sections of data flow architectures and coexists with conventional shared memory therein. A multiprocessing system including the module operates in a hybrid mode, wherein the same processing modules are used to perform some processing in a sequential mode, under immediate control of an operating system, while performing other processing in a data flow mode.
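
    The firing rule described above, in which a cell is scheduled once the tag bits show that all of its operands have arrived, can be sketched in software. This is only an illustration under stated assumptions: the Cell layout, the PRIMITIVES table, and the store and run functions are hypothetical names, and the patent describes hardware circuitry rather than this code.

```python
from collections import deque
import operator

# Hypothetical primitive lookup table (the patent indexes starting addresses).
PRIMITIVES = {"add": operator.add, "mul": operator.mul}


class Cell:
    """One memory cell: operand slots, tag bits, a primitive, and link slots."""

    def __init__(self, primitive, n_operands, links):
        self.primitive = primitive
        self.operands = [None] * n_operands
        self.tags = [False] * n_operands   # one "operand present" bit per slot
        self.links = links                 # (destination cell id, slot) pairs


def store(cells, ready, cell_id, slot, value):
    """Write an operand; queue the cell when every tag bit is set."""
    cell = cells[cell_id]
    cell.operands[slot] = value
    cell.tags[slot] = True
    if all(cell.tags):
        ready.append(cell_id)


def run(cells, inputs):
    """Deliver the initial operands, then fire cells until the queue drains."""
    ready = deque()
    results = {}
    for cell_id, slot, value in inputs:
        store(cells, ready, cell_id, slot, value)
    while ready:
        cell_id = ready.popleft()
        cell = cells[cell_id]
        value = PRIMITIVES[cell.primitive](*cell.operands)
        results[cell_id] = value
        cell.tags = [False] * len(cell.tags)       # clear tags after firing
        for dest, slot in cell.links:              # distribute the result
            store(cells, ready, dest, slot, value)
    return results
```

    As a usage example, the expression (2 + 3) * 4 maps to two cells: cell 0 is an "add" whose link feeds slot 0 of cell 1, a "mul". Calling run({0: Cell("add", 2, [(1, 0)]), 1: Cell("mul", 2, [])}, [(0, 0, 2), (0, 1, 3), (1, 1, 4)]) fires cell 0 first and cell 1 only after the sum arrives.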

  5. Visual Tutoring System for Programming Multiprocessor Networks.

    ERIC Educational Resources Information Center

    Trichina, Elena

    1996-01-01

    Describes a visual tutoring system for programming distributed-memory multiprocessor networks. Highlights include difficulties of parallel programming, and three instructional modes in the system, including a hypertext-like lecture, a question-answer mode, and an expert aid mode. (Author/LRW)

  6. Rollback Hardware For Time Warp Multiprocessor Systems

    NASA Technical Reports Server (NTRS)

    Robb, Michael J.; Buzzell, Calvin A.

    1996-01-01

    Rollback Chip (RBC) module is computer circuit board containing special-purpose memory circuits for use in multiprocessor computer system. Designed to help realize speedup potential of parallel processing for simulation of discrete events by use of Time Warp operating system.

  7. Neural networks and MIMD-multiprocessors

    NASA Technical Reports Server (NTRS)

    Vanhala, Jukka; Kaski, Kimmo

    1990-01-01

    Two artificial neural network models are compared. They are the Hopfield Neural Network Model and the Sparse Distributed Memory model. Distributed algorithms for both of them are designed and implemented. The run time characteristics of the algorithms are analyzed theoretically and tested in practice. The storage capacities of the networks are compared. Implementations are done using a distributed multiprocessor system.

  8. The SPUR instruction unit: An on-chip instruction cache memory for a high performance VLSI multiprocessor

    SciTech Connect

    Duncombe, R.R.

    1987-01-01

    On-chip instruction caches reduce this contention problem by supplying many of the instructions executed by the microprocessor. The SPUR instruction unit is a direct mapped cache with 512 bytes or 128 instructions. It is organized in sub-blocks to provide efficient instruction fetching and prefetching from the external memory. The SPUR instruction unit is controlled by two finite state machines: one for instruction fetching and one for instruction prefetching. These control functions are implemented using PLAs and standard logic cells. The standard cells are implemented in domino logic to meet speed and area constraints. SPICE simulations indicate that the slowest signal delay path in the instruction unit is 14.7 ns.

  9. Performance Analysis of Multilevel Parallel Applications on Shared Memory Architectures

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Jin, Haoqiang; Labarta, Jesus; Gimenez, Judit; Caubet, Jordi; Biegel, Bryan A. (Technical Monitor)

    2002-01-01

    In this paper we describe how to apply powerful performance analysis techniques to understand the behavior of multilevel parallel applications. We use the Paraver/OMPItrace performance analysis system for our study. This system consists of two major components: The OMPItrace dynamic instrumentation mechanism, which allows the tracing of processes and threads and the Paraver graphical user interface for inspection and analyses of the generated traces. We describe how to use the system to conduct a detailed comparative study of a benchmark code implemented in five different programming paradigms applicable for shared memory

  10. A garbage collection algorithm for shared memory parallel processors

    SciTech Connect

    Crammond, J.

    1988-12-01

    This paper describes a technique for adapting the Morris sliding garbage collection algorithm to execute on parallel machines with shared memory. The algorithm is described within the framework of an implementation of the parallel logic language Parlog. However, the algorithm is a general one and can easily be adapted to parallel Prolog systems and to other languages. The performance of the algorithm executing a few simple Parlog benchmarks is analyzed. Finally, it is shown how the technique for parallelizing the sequential algorithm can be adapted for a semi-space copying algorithm.

  11. Efficient partitioning and assignment on programs for multiprocessor execution

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1993-01-01

    The general problem studied is that of segmenting or partitioning programs for distribution across a multiprocessor system. Efficient partitioning and the assignment of program elements are of great importance since the time consumed in this overhead activity may easily dominate the computation, effectively eliminating any gains made by the use of the parallelism. In this study, the partitioning of sequentially structured programs (written in FORTRAN) is evaluated. Heuristics developed for similar applications are examined. Finally, a model for queueing networks with finite queues is developed which may be used to analyze multiprocessor system architectures with a shared memory approach to the problem of partitioning. The properties of sequentially written programs form obstacles to large scale (at the procedure or subroutine level) parallelization. Data dependencies of even the minutest nature, reflecting the sequential development of the program, severely limit parallelism. The design of heuristic algorithms is tied to the experience gained in the parallel splitting. Parallelism obtained through the physical separation of data has seen some success, especially at the data element level. Data parallelism on a grander scale requires models that accurately reflect the effects of blocking caused by finite queues. A model for the approximation of the performance of finite queueing networks is developed. This model makes use of the decomposition approach combined with the efficiency of product form solutions.

  12. A VME multiprocessor data acquisition system combining a UNIX workstation and real-time microprocessors

    SciTech Connect

    Barome, N.; Bossu, Y.; Douet, R.; Harroch, H.; Tran-Khanh, T.

    1990-08-01

    A data acquisition system combining a UNIX workstation and one or several real-time microprocessors has been designed and built for the Tandem accelerator at IPN. The hardware and software options chosen for reading, processing, storing, and displaying real-time experimental data are detailed. The fixed components of the hardware architecture are the VME bus for data processing and the CMAC system for transferring digital data. A multitask multiprocessor software based on shared memory and message passing has been developed around a mixed UNIX/pSOS kernel.

  13. Performance degradation due to multiprogramming and system overheads in real workloads - Case study on a shared memory multiprocessor

    NASA Technical Reports Server (NTRS)

    Dimpsey, R. T.; Iyer, R. K.

    1990-01-01

    The performance degradation due to the multiprogramming (MP) overhead in a parallel execution environment is quantified. Total system overhead is also measured. A methodology that estimates the MP overhead present in real workloads is illustrated with real measurements. It is found that MP overhead usually consumes between 10 and 23 percent of the processing power available to parallel programs. The mean MP overhead is determined to be 16 percent, which is well over half the total system overhead executed on the system (the mean system overhead is determined to be 24 percent of the processing power). It is found that MP overhead, total system overhead, and application completion time are all moderately correlated.

  14. A Parallel Saturation Algorithm on Shared Memory Architectures

    NASA Technical Reports Server (NTRS)

    Ezekiel, Jonathan; Siminiceanu

    2007-01-01

    Symbolic state-space generators are notoriously hard to parallelize. However, the Saturation algorithm implemented in the SMART verification tool differs from other sequential symbolic state-space generators in that it exploits the locality of firing events in asynchronous system models. This paper explores whether event locality can be utilized to efficiently parallelize Saturation on shared-memory architectures. Conceptually, we propose to parallelize the firing of events within a decision diagram node, which is technically realized via a thread pool. We discuss the challenges involved in our parallel design and conduct experimental studies on its prototypical implementation. On a dual-processor, dual-core PC, our studies show speed-ups for several example models, e.g., of up to 50% for a Kanban model, when compared to running our algorithm only on a single core.

  15. Ensuring correct rollback recovery in distributed shared memory systems

    NASA Technical Reports Server (NTRS)

    Janssens, Bob; Fuchs, W. Kent

    1995-01-01

    Distributed shared memory (DSM) implemented on a cluster of workstations is an increasingly attractive platform for executing parallel scientific applications. Checkpointing and rollback techniques can be used in such a system to allow the computation to progress in spite of the temporary failure of one or more processing nodes. This paper presents the design of an independent checkpointing method for DSM that takes advantage of DSM's specific properties to reduce error-free and rollback overhead. The scheme reduces the dependencies that need to be considered for correct rollback to those resulting from transfers of pages. Furthermore, in-transit messages can be recovered without the use of logging. We extend the scheme to a DSM implementation using lazy release consistency, where the frequency of dependencies is further reduced.

  16. Translation techniques for distributed-shared memory programming models

    SciTech Connect

    Fuller, Douglas James

    2005-01-01

    The high performance computing community has experienced an explosive improvement in distributed-shared memory hardware. Driven by increasing real-world problem complexity, this explosion has ushered in vast numbers of new systems. Each new system presents new challenges to programmers and application developers. Part of the challenge is adapting to new architectures with new performance characteristics. Different vendors release systems with widely varying architectures that perform differently in different situations. Furthermore, since vendors need only provide a single performance number (total MFLOPS, typically for a single benchmark), they only have strong incentive initially to optimize the API of their choice. Consequently, only a fraction of the available APIs are well optimized on most systems. This causes issues porting and writing maintainable software, let alone issues for programmers burdened with mastering each new API as it is released. Also, programmers wishing to use a certain machine must choose their API based on the underlying hardware instead of the application. This thesis argues that a flexible, extensible translator for distributed-shared memory APIs can help address some of these issues. For example, a translator might take as input code in one API and output an equivalent program in another. Such a translator could provide instant porting for applications to new systems that do not support the application's library or language natively. While open-source APIs are abundant, they do not perform optimally everywhere. A translator would also allow performance testing using a single base code translated to a number of different APIs. Most significantly, this type of translator frees programmers to select the most appropriate API for a given application based on the application (and developer) itself instead of the underlying hardware.

  17. Coupled cluster algorithms for networks of shared memory parallel processors

    NASA Astrophysics Data System (ADS)

    Bentz, Jonathan L.; Olson, Ryan M.; Gordon, Mark S.; Schmidt, Michael W.; Kendall, Ricky A.

    2007-05-01

    As the popularity of using SMP systems as the building blocks for high performance supercomputers increases, so too increases the need for applications that can utilize the multiple levels of parallelism available in clusters of SMPs. This paper presents a dual-layer distributed algorithm, using both shared-memory and distributed-memory techniques to parallelize a very important algorithm (often called the "gold standard") used in computational chemistry, the single and double excitation coupled cluster method with perturbative triples, i.e. CCSD(T). The algorithm is presented within the framework of the GAMESS (General Atomic and Molecular Electronic Structure System) program suite [M.W. Schmidt, K.K. Baldridge, J.A. Boatz, S.T. Elbert, M.S. Gordon, J.J. Jensen, S. Koseki, N. Matsunaga, K.A. Nguyen, S. Su, T.L. Windus, M. Dupuis, J.A. Montgomery, General atomic and molecular electronic structure system, J. Comput. Chem. 14 (1993) 1347-1363] and the Distributed Data Interface (DDI) [M.W. Schmidt, G.D. Fletcher, B.M. Bode, M.S. Gordon, The distributed data interface in GAMESS, Comput. Phys. Comm. 128 (2000) 190]; however, the essential features of the algorithm (data distribution, load-balancing and communication overhead) can be applied to more general computational problems. Timing and performance data for our dual-level algorithm are presented on several large-scale clusters of SMPs.

  18. Multiprocessor programming environment

    SciTech Connect

    Smith, M.B.; Fornaro, R.

    1988-12-01

    Programming tools and techniques have been well developed for traditional uniprocessor computer systems. The focus of this research project is on the development of a programming environment for a high speed real time heterogeneous multiprocessor system, with special emphasis on languages and compilers. The new tools and techniques will allow a smooth transition for programmers with experience only on single processor systems.

  19. Distributed parallel messaging for multiprocessor systems

    SciTech Connect

    Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M; Steinmacher-Burow, Burkhard; Sugawara, Yutaka

    2013-06-04

    A method and apparatus for distributed parallel messaging in a parallel computing system. The apparatus includes, at each node of a multiprocessor network, multiple injection messaging engine units and reception messaging engine units, each implementing a DMA engine and each supporting both multiple packet injection into and multiple reception from a network, in parallel. The reception side of the messaging unit (MU) includes a switch interface enabling writing of data of a packet received from the network to the memory system. The transmission side of the messaging unit includes a switch interface for reading from the memory system when injecting packets into the network.

  20. Power-Aware Compiler Controllable Chip Multiprocessor

    NASA Astrophysics Data System (ADS)

    Shikano, Hiroaki; Shirako, Jun; Wada, Yasutaka; Kimura, Keiji; Kasahara, Hironori

    A power-aware compiler controllable chip multiprocessor (CMP) is presented and its performance and power consumption are evaluated with the optimally scheduled advanced multiprocessor (OSCAR) parallelizing compiler. The CMP is equipped with power control registers that change clock frequency and power supply voltage to functional units including processor cores, memories, and an interconnection network. The OSCAR compiler carries out coarse-grain task parallelization of programs and reduces power consumption using architectural power control support and the compiler's power saving scheme. The performance evaluation shows that MPEG-2 encoding on the proposed CMP with four CPUs results in 82.6% power reduction in real-time execution mode with a deadline constraint on its sequential execution time. Furthermore, MP3 encoding on a heterogeneous CMP with four CPUs and four accelerators results in 53.9% power reduction at 21.1-fold speed-up in performance against its sequential execution in the fastest execution mode.

  1. Is sharing specific autobiographical memories a distinct form of self-disclosure?

    PubMed

    Beike, Denise R; Brandon, Nicole R; Cole, Holly E

    2016-04-01

    Theories of autobiographical memory posit a social function, meaning that recollecting and sharing memories of specific discrete events creates and maintains relationship intimacy. Eight studies with 1,271 participants tested whether sharing specific autobiographical memories in conversations increases feelings of closeness among conversation partners, relative to sharing other self-related information. The first 2 studies revealed that conversations in which specific autobiographical memories were shared were also accompanied by feelings of closeness among conversation partners. The next 5 studies experimentally introduced specific autobiographical memories versus general information about the self into conversations between mostly unacquainted pairs of participants. Discussing specific autobiographical memories led to greater closeness among conversation partners than discussing nonself-related topics, but no greater closeness than discussing other, more general self-related information. In the final study unacquainted pairs in whom feelings of closeness had been experimentally induced through shared humor were more likely to discuss specific autobiographical memories than unacquainted control participant pairs. We conclude that sharing specific autobiographical memories may express more than create relationship closeness, and discuss how relationship closeness may afford sharing of specific autobiographical memories by providing common ground, a social display, or a safety signal.

  2. Low latency memory access and synchronization

    DOEpatents

    Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Heidelberger, Philip; Hoenicke, Dirk; Ohmacht, Martin; Steinmacher-Burow, Burkhard D.; Takken, Todd E.; Vranas, Pavlos M.

    2010-10-19

    A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch, rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
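
    The single-load lock acquisition described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the `LockDevice` class, its `load` and `release` methods, and the use of an internal Python lock to stand in for hardware atomicity are all hypothetical, and the abstract does not say how a lock is released.

```python
import threading
import time


class LockDevice:
    """Hypothetical software model of a hardware locking device."""

    def __init__(self, num_locks):
        self._owned = [False] * num_locks   # lock state kept inside the device
        self._guard = threading.Lock()      # stands in for hardware atomicity

    def load(self, lock_id):
        """Model one load operation: True means the caller now owns the lock.

        Both the read and the follow-up write happen inside the device,
        so the caller never issues a store to acquire the lock.
        """
        with self._guard:
            if self._owned[lock_id]:
                return False                # lock is held by someone else
            self._owned[lock_id] = True     # device-side write
            return True

    def release(self, lock_id):
        """Assumed release operation (not described in the abstract)."""
        with self._guard:
            self._owned[lock_id] = False


def worker(device, lock_id, counter, rounds):
    for _ in range(rounds):
        while not device.load(lock_id):     # retry the load until granted
            time.sleep(0)                   # let another thread run
        counter[0] += 1                     # access the shared resource
        device.release(lock_id)


device = LockDevice(num_locks=4)
counter = [0]
threads = [threading.Thread(target=worker, args=(device, 0, counter, 1000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter[0])
```

    Because the shared counter is only touched while the lock is owned, the four workers together should perform exactly 4 × 1000 increments.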

  3. Low latency memory access and synchronization

    DOEpatents

    Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Heidelberger, Philip; Hoenicke, Dirk; Ohmacht, Martin; Steinmacher-Burow, Burkhard D.; Takken, Todd E.; Vranas, Pavlos M.

    2007-02-06

    A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch, rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
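
    The pointer-guided prefetching mentioned in this abstract can be sketched as a toy model. The class name `PointerPrefetchMemory`, the one-entry prefetch buffer, and in particular the training rule (each line's pointer is set to whichever line was accessed next) are assumptions made for illustration; the abstract does not state how the pointers are maintained.

```python
class PointerPrefetchMemory:
    """Hypothetical model: every memory line carries data plus a pointer to another line."""

    def __init__(self, num_lines):
        self.data = [0] * num_lines
        self.next_ptr = [None] * num_lines  # per-line prefetch pointer
        self.prefetched = None              # line held in a one-entry prefetch buffer
        self.last = None                    # previously accessed line
        self.hits = 0
        self.misses = 0

    def access(self, line):
        if line == self.prefetched:
            self.hits += 1
        else:
            self.misses += 1
        # Assumed training rule: record which line followed the previous one.
        if self.last is not None:
            self.next_ptr[self.last] = line
        self.last = line
        # Prefetch the line this line's pointer names (None means no prefetch).
        self.prefetched = self.next_ptr[line]
        return self.data[line]


mem = PointerPrefetchMemory(num_lines=16)
pattern = [3, 11, 5, 14]                    # non-contiguous but repetitive
for _ in range(3):
    for line in pattern:
        mem.access(line)
print(mem.hits, mem.misses)                 # 7 5
```

    In this model the first pass over the pattern misses on every access while the pointers are being set; once the cycle has been seen, each access finds its line already prefetched.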

  4. Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments

    SciTech Connect

    Jin, Shuangshuang; Chen, Yousu; Wu, Di; Diao, Ruisheng; Huang, Zhenyu

    2015-12-09

    Power system dynamic simulation computes the system response to a sequence of large disturbances, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operations. It consists of a large set of differential and algebraic equations, which is computationally intensive and challenging to solve using single-processor based dynamic simulation solutions. High-performance computing (HPC) based parallel computing is a very promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation using Open Multi-processing (OpenMP) on a shared-memory platform and Message Passing Interface (MPI) on distributed-memory clusters, respectively. The differences between the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performance in running parallel dynamic simulation is compared and demonstrated.

  5. Shared Memory Parallelism for 3D Cartesian Discrete Ordinates Solver

    NASA Astrophysics Data System (ADS)

    Moustafa, Salli; Dutka-Malen, Ivan; Plagne, Laurent; Ponçot, Angélique; Ramet, Pierre

    2014-06-01

    This paper describes the design and the performance of DOMINO, a 3D Cartesian SN solver that implements two nested levels of parallelism (multicore+SIMD) on shared memory computation nodes. DOMINO is written in C++, a multi-paradigm programming language that enables the use of powerful and generic parallel programming tools such as Intel TBB and Eigen. These two libraries allow us to combine multi-thread parallelism with vector operations in an efficient and yet portable way. As a result, DOMINO can exploit the full power of modern multi-core processors and is able to tackle very large simulations, that usually require large HPC clusters, using a single computing node. For example, DOMINO solves a 3D full core PWR eigenvalue problem involving 26 energy groups, 288 angular directions (S16), 46 × 10^6 spatial cells and 1 × 10^12 DoFs within 11 hours on a single 32-core SMP node. This represents a sustained performance of 235 GFlops and 40.74% of the SMP node peak performance for the DOMINO sweep implementation. The very high Flops/Watt ratio of DOMINO makes it a very interesting building block for a future many-nodes nuclear simulation tool.
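
    As a worked check on the figures quoted in this abstract, the node peak performance implied by the sustained rate can be back-computed. This assumes the stated fraction is read as 40.74% and that the peak is spread evenly over the 32 cores; the symbols are introduced here for illustration only.

```latex
\begin{align*}
P_{\text{peak}} &= \frac{P_{\text{sustained}}}{\text{fraction of peak}}
                 = \frac{235\ \text{GFlops}}{0.4074}
                 \approx 576.8\ \text{GFlops} \\
P_{\text{peak, per core}} &\approx \frac{576.8\ \text{GFlops}}{32}
                 \approx 18.0\ \text{GFlops}
\end{align*}
```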

  6. Predicting multiprocessing efficiency on the Cray multiprocessors in a (CTSS) time-sharing environment/application to a 3-D magnetohydrodynamics code

    SciTech Connect

    Mirin, A.A.

    1988-07-01

    A formula is derived for predicting multiprocessing efficiency on Cray supercomputers equipped with the Cray Time-Sharing System (CTSS). The model is applicable to an intensive time-sharing environment. The actual efficiency estimate depends on three factors: the code size, task length, and job mix. The implementation of multitasking in a three-dimensional plasma magnetohydrodynamics (MHD) code, TEMCO, is discussed. TEMCO solves the primitive one-fluid compressible MHD equations and includes resistive and Hall effects in Ohm's law. Virtually all segments of the main time-integration loop are multitasked. The multiprocessing efficiency model is applied to TEMCO. Excellent agreement is obtained between the actual multiprocessing efficiency and the theoretical prediction.

  7. Matrix factorization on a hypercube multiprocessor

    SciTech Connect

    Geist, G.A.; Heath, M.T.

    1985-08-01

    This paper is concerned with parallel algorithms for matrix factorization on distributed-memory, message-passing multiprocessors, with special emphasis on the hypercube. Both Cholesky factorization of symmetric positive definite matrices and LU factorization of nonsymmetric matrices using partial pivoting are considered. The use of the resulting triangular factors to solve systems of linear equations by forward and back substitutions is also considered. Efficiencies of various parallel computational approaches are compared in terms of empirical results obtained on an Intel iPSC hypercube. 19 refs., 6 figs., 2 tabs.
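
    For readers unfamiliar with the computations this abstract names, the following is a minimal sequential sketch of Cholesky factorization followed by forward and back substitution. It is not the paper's parallel hypercube algorithm, only the underlying arithmetic; the function names and the small test matrix are chosen here for illustration.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry of column j.
        l[j][j] = math.sqrt(a[j][j] - sum(l[j][k] ** 2 for k in range(j)))
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return l


def forward_substitution(l, b):
    """Solve L y = b for lower-triangular L."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i])
    return y


def back_substitution(l, y):
    """Solve L^T x = y using the same lower-triangular factor L."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


a = [[4.0, 12.0, -16.0],
     [12.0, 37.0, -43.0],
     [-16.0, -43.0, 98.0]]
b = [1.0, 2.0, 3.0]
factor = cholesky(a)                        # [[2, 0, 0], [6, 1, 0], [-8, 5, 3]]
x = back_substitution(factor, forward_substitution(factor, b))
print(factor)
print(x)
```

    In a distributed-memory setting, the rows or columns of the matrix would be spread across processors and the inner sums would require message passing; that part is deliberately left out here.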

  8. Programmable partitioning for high-performance coherence domains in a multiprocessor system

    DOEpatents

    Blumrich, Matthias A.; Salapura, Valentina

    2011-01-25

    A multiprocessor computing system and a method of logically partitioning a multiprocessor computing system are disclosed. The multiprocessor computing system comprises a multitude of processing units, and a multitude of snoop units. Each of the processing units includes a local cache, and the snoop units are provided for supporting cache coherency in the multiprocessor system. Each of the snoop units is connected to a respective one of the processing units and to all of the other snoop units. The multiprocessor computing system further includes a partitioning system for using the snoop units to partition the multitude of processing units into a plurality of independent, memory-consistent, adjustable-size processing groups. Preferably, when the processor units are partitioned into these processing groups, the partitioning system also configures the snoop units to maintain cache coherency within each of said groups.

  9. Innovative design methodology for implementing heterogeneous multiprocessor architectures in VLSI

    SciTech Connect

    Tientien Li

    1983-01-01

    Considering the design cost of today's VLSI systems, advanced VLSI technology may not be cost-effective for implementing complex computer systems. In the paper, an innovative design approach which can drastically reduce the cost of implementing heterogeneous multiprocessor architectures in VLSI is presented. The author introduces high-level architectural design tools for assisting the design of multiprocessor systems with distributed memory modules and communication networks, and presents a logic/firmware synthesis scheme for automatically implementing multitasking structures and system service functions for multiprocessor architectures. Furthermore, the importance of the firmware synthesis aspect of VLSI system design is emphasized. Most logic of complex VLSI systems can be implemented very easily in firmware using the design approach introduced here. 10 references.

  10. Implementation of a parallel unstructured Euler solver on shared and distributed memory architectures

    NASA Technical Reports Server (NTRS)

    Mavriplis, D. J.; Das, Raja; Saltz, Joel; Vermeland, R. E.

    1992-01-01

    An efficient three dimensional unstructured Euler solver is parallelized on a Cray Y-MP C90 shared memory computer and on an Intel Touchstone Delta distributed memory computer. This paper relates the experiences gained and describes the software tools and hardware used in this study. Performance comparisons between two differing architectures are made.

  11. Working Memory Span Development: A Time-Based Resource-Sharing Model Account

    ERIC Educational Resources Information Center

    Barrouillet, Pierre; Gavens, Nathalie; Vergauwe, Evie; Gaillard, Vinciane; Camos, Valerie

    2009-01-01

    The time-based resource-sharing model (P. Barrouillet, S. Bernardin, & V. Camos, 2004) assumes that during complex working memory span tasks, attention is frequently and surreptitiously switched from processing to reactivate decaying memory traces before their complete loss. Three experiments involving children from 5 to 14 years of age…

  12. A shared neural ensemble links distinct contextual memories encoded close in time.

    PubMed

    Cai, Denise J; Aharoni, Daniel; Shuman, Tristan; Shobe, Justin; Biane, Jeremy; Song, Weilin; Wei, Brandon; Veshkini, Michael; La-Vu, Mimi; Lou, Jerry; Flores, Sergio E; Kim, Isaac; Sano, Yoshitake; Zhou, Miou; Baumgaertel, Karsten; Lavi, Ayal; Kamata, Masakazu; Tuszynski, Mark; Mayford, Mark; Golshani, Peyman; Silva, Alcino J

    2016-05-23

    Recent studies suggest that a shared neural ensemble may link distinct memories encoded close in time. According to the memory allocation hypothesis, learning triggers a temporary increase in neuronal excitability that biases the representation of a subsequent memory to the neuronal ensemble encoding the first memory, such that recall of one memory increases the likelihood of recalling the other memory. Here we show in mice that the overlap between the hippocampal CA1 ensembles activated by two distinct contexts acquired within a day is higher than when they are separated by a week. Several findings indicate that this overlap of neuronal ensembles links two contextual memories. First, fear paired with one context is transferred to a neutral context when the two contexts are acquired within a day but not across a week. Second, the first memory strengthens the second memory within a day but not across a week. Older mice, known to have lower CA1 excitability, do not show the overlap between ensembles, the transfer of fear between contexts, or the strengthening of the second memory. Finally, in aged mice, increasing cellular excitability and activating a common ensemble of CA1 neurons during two distinct context exposures rescued the deficit in linking memories. Taken together, these findings demonstrate that contextual memories encoded close in time are linked by directing storage into overlapping ensembles. Alteration of these processes by ageing could affect the temporal structure of memories, thus impairing efficient recall of related information.

  13. A shared neural ensemble links distinct contextual memories encoded close in time.

    PubMed

    Cai, Denise J; Aharoni, Daniel; Shuman, Tristan; Shobe, Justin; Biane, Jeremy; Song, Weilin; Wei, Brandon; Veshkini, Michael; La-Vu, Mimi; Lou, Jerry; Flores, Sergio E; Kim, Isaac; Sano, Yoshitake; Zhou, Miou; Baumgaertel, Karsten; Lavi, Ayal; Kamata, Masakazu; Tuszynski, Mark; Mayford, Mark; Golshani, Peyman; Silva, Alcino J

    2016-06-01

    Recent studies suggest that a shared neural ensemble may link distinct memories encoded close in time. According to the memory allocation hypothesis, learning triggers a temporary increase in neuronal excitability that biases the representation of a subsequent memory to the neuronal ensemble encoding the first memory, such that recall of one memory increases the likelihood of recalling the other memory. Here we show in mice that the overlap between the hippocampal CA1 ensembles activated by two distinct contexts acquired within a day is higher than when they are separated by a week. Several findings indicate that this overlap of neuronal ensembles links two contextual memories. First, fear paired with one context is transferred to a neutral context when the two contexts are acquired within a day but not across a week. Second, the first memory strengthens the second memory within a day but not across a week. Older mice, known to have lower CA1 excitability, do not show the overlap between ensembles, the transfer of fear between contexts, or the strengthening of the second memory. Finally, in aged mice, increasing cellular excitability and activating a common ensemble of CA1 neurons during two distinct context exposures rescued the deficit in linking memories. Taken together, these findings demonstrate that contextual memories encoded close in time are linked by directing storage into overlapping ensembles. Alteration of these processes by ageing could affect the temporal structure of memories, thus impairing efficient recall of related information. PMID:27251287

  14. A shared neural ensemble links distinct contextual memories encoded close in time

    NASA Astrophysics Data System (ADS)

    Cai, Denise J.; Aharoni, Daniel; Shuman, Tristan; Shobe, Justin; Biane, Jeremy; Song, Weilin; Wei, Brandon; Veshkini, Michael; La-Vu, Mimi; Lou, Jerry; Flores, Sergio E.; Kim, Isaac; Sano, Yoshitake; Zhou, Miou; Baumgaertel, Karsten; Lavi, Ayal; Kamata, Masakazu; Tuszynski, Mark; Mayford, Mark; Golshani, Peyman; Silva, Alcino J.

    2016-06-01

    Recent studies suggest that a shared neural ensemble may link distinct memories encoded close in time. According to the memory allocation hypothesis, learning triggers a temporary increase in neuronal excitability that biases the representation of a subsequent memory to the neuronal ensemble encoding the first memory, such that recall of one memory increases the likelihood of recalling the other memory. Here we show in mice that the overlap between the hippocampal CA1 ensembles activated by two distinct contexts acquired within a day is higher than when they are separated by a week. Several findings indicate that this overlap of neuronal ensembles links two contextual memories. First, fear paired with one context is transferred to a neutral context when the two contexts are acquired within a day but not across a week. Second, the first memory strengthens the second memory within a day but not across a week. Older mice, known to have lower CA1 excitability, do not show the overlap between ensembles, the transfer of fear between contexts, or the strengthening of the second memory. Finally, in aged mice, increasing cellular excitability and activating a common ensemble of CA1 neurons during two distinct context exposures rescued the deficit in linking memories. Taken together, these findings demonstrate that contextual memories encoded close in time are linked by directing storage into overlapping ensembles. Alteration of these processes by ageing could affect the temporal structure of memories, thus impairing efficient recall of related information.

  15. Aerodynamic Shape Optimization Using A Combined Distributed/Shared Memory Paradigm

    NASA Technical Reports Server (NTRS)

    Cheung, Samson; Holst, Terry

    1999-01-01

    Current parallel computational approaches involve distributed and shared memory paradigms. In the distributed memory paradigm, each processor has its own independent memory. Message passing typically uses a function library such as MPI or PVM. In the shared memory paradigm, such as that used on the SGI Origin 2000 machine, compiler directives are used to instruct the compiler to schedule multiple threads to perform calculations. In this paradigm, it must be assured that processors (threads) do not simultaneously access regions of memory in such a way that errors would occur. This paper utilizes the latest version of the SGI MPI function library to combine the two parallelization paradigms to perform aerodynamic shape optimization of a generic wing/body.
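
    The shared-memory requirement noted in this abstract, that threads must not update the same memory region at the same time, can be sketched as follows. The abstract refers to compiler directives; this sketch swaps in explicit Python threads and a lock instead, and the names `shared`, `accumulate`, and the sample data are hypothetical.

```python
import threading

shared = {"total": 0}                       # memory visible to every thread
total_lock = threading.Lock()


def accumulate(chunk):
    """Sum a chunk privately, then update the shared total under a lock."""
    partial = sum(chunk)                    # private work, nothing shared
    with total_lock:                        # one thread at a time updates the total
        shared["total"] += partial


data = list(range(1, 1001))
chunks = [data[i::4] for i in range(4)]     # four interleaved slices of the data
threads = [threading.Thread(target=accumulate, args=(c,)) for c in chunks]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared["total"])                      # 500500
```

    Keeping the bulk of the work on private data and protecting only the final update is a common way to limit the time threads spend waiting on each other.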

  16. Method and apparatus for single-stepping coherence events in a multiprocessor system under software control

    DOEpatents

    Blumrich, Matthias A.; Salapura, Valentina

    2010-11-02

    An apparatus and method are disclosed for single-stepping coherence events in a multiprocessor system under software control in order to monitor the behavior of a memory coherence mechanism. Single-stepping coherence events in a multiprocessor system is made possible by adding one or more step registers. By accessing these step registers, one or more coherence requests are processed by the multiprocessor system. The step registers determine if the snoop unit will operate by proceeding in a normal execution mode, or operate in a single-step mode.

  17. A system for simulating shared memory in heterogeneous distributed-memory networks with specialization for robotics applications

    SciTech Connect

    Jones, J.P.; Bangs, A.L.; Butler, P.L.

    1991-01-01

    Hetero Helix is a programming environment which simulates shared memory on a heterogeneous network of distributed-memory computers. The machines in the network may vary with respect to their native operating systems and internal representation of numbers. Hetero Helix presents a simple programming model to developers, and also considers the needs of designers, system integrators, and maintainers. The key software technology underlying Hetero Helix is the use of a "compiler" which analyzes the data structures in shared memory and automatically generates code which translates data representations from the format native to each machine into a common format, and vice versa. The design of Hetero Helix was motivated in particular by the requirements of robotics applications. Hetero Helix has been used successfully in an integration effort involving 27 CPUs in a heterogeneous network and a body of software totaling roughly 100,00 lines of code. 25 refs., 6 figs.
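
    The idea of translating between each machine's native representation and a common format can be sketched with Python's standard `struct` module. This is an illustration only, not Hetero Helix's generated code: the record layout, the format constants, and the helpers `to_common` and `from_common` are all assumptions.

```python
import struct

# Hypothetical record layout: one 32-bit integer and one 64-bit float.
LITTLE_ENDIAN = "<id"   # stands in for one machine's native format
BIG_ENDIAN = ">id"      # stands in for another machine's native format
COMMON = "!id"          # assumed common (network byte order) format

def to_common(native_bytes, native_format):
    """Translate a record from a machine's native layout to the common one."""
    fields = struct.unpack(native_format, native_bytes)
    return struct.pack(COMMON, *fields)

def from_common(common_bytes, native_format):
    """Translate a record from the common layout to a machine's native one."""
    fields = struct.unpack(COMMON, common_bytes)
    return struct.pack(native_format, *fields)

# A record written on the little-endian machine, read on the big-endian one.
written = struct.pack(LITTLE_ENDIAN, 27, 0.5)
shared = to_common(written, LITTLE_ENDIAN)
read_back = struct.unpack(BIG_ENDIAN, from_common(shared, BIG_ENDIAN))
```

    Routing every exchange through one common format means each machine needs only two translators, rather than one per pair of machine types.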

  18. DiFX: A Software Correlator for Very Long Baseline Interferometry Using Multiprocessor Computing Environments

    NASA Astrophysics Data System (ADS)

    Deller, A. T.; Tingay, S. J.; Bailes, M.; West, C.

    2007-03-01

    We describe the development of an FX-style correlator for very long baseline interferometry (VLBI), implemented in software and intended to run in multiprocessor computing environments, such as large clusters of commodity machines (Beowulf clusters) or computers specifically designed for high-performance computing, such as multiprocessor shared-memory machines. We outline the scientific and practical benefits for VLBI correlation, these chiefly being due to the inherent flexibility of software and the fact that the highly parallel and scalable nature of the correlation task is well suited to a multiprocessor computing environment. We suggest scientific applications where such an approach to VLBI correlation is most suited and will give the best returns. We report detailed results from the Distributed FX (DiFX) software correlator running on the Swinburne supercomputer (a Beowulf cluster of ~300 commodity processors), including measures of the performance of the system. For example, to correlate all Stokes products for a 10 antenna array with an aggregate bandwidth of 64 MHz per station, and using typical time and frequency resolution, currently requires an order of 100 desktop-class compute nodes. Due to the effect of Moore's law on commodity computing performance, the total number and cost of compute nodes required to meet a given correlation task continues to decrease rapidly with time. We show detailed comparisons between DiFX and two existing hardware-based correlators: the Australian Long Baseline Array S2 correlator and the NRAO Very Long Baseline Array correlator. In both cases, excellent agreement was found between the correlators. Finally, we describe plans for the future operation of DiFX on the Swinburne supercomputer for both astrophysical and geodetic science.

  19. Principles for problem aggregation and assignment in medium scale multiprocessors

    NASA Technical Reports Server (NTRS)

    Nicol, David M.; Saltz, Joel H.

    1987-01-01

    One of the most important issues in parallel processing is the mapping of workload to processors. This paper considers a large class of problems having a high degree of potential fine grained parallelism, and execution requirements that are either not predictable, or are too costly to predict. The main issues in mapping such a problem onto medium scale multiprocessors are those of aggregation and assignment. We study a method of parameterized aggregation that makes few assumptions about the workload. The mapping of aggregate units of work onto processors is uniform, and exploits locality of workload intensity to balance the unknown workload. In general, a finer aggregate granularity leads to a better balance at the price of increased communication/synchronization costs; the aggregation parameters can be adjusted to find a reasonable granularity. The effectiveness of this scheme is demonstrated on three model problems: an adaptive one-dimensional fluid dynamics problem with message passing, a sparse triangular linear system solver on both a shared memory and a message-passing machine, and a two-dimensional time-driven battlefield simulation employing message passing. Using the model problems, the tradeoffs are studied between balanced workload and the communication/synchronization costs. Finally, an analytical model is used to explain why the method balances workload and minimizes the variance in system behavior.
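
    The trade-off between aggregate granularity and load balance can be made concrete with a small Python sketch. It is a toy model under stated assumptions, not the authors' method: the wrapped (round-robin) assignment, the workload `costs`, and the functions `aggregate` and `imbalance` are hypothetical, and communication cost is represented only by the number of aggregates.

```python
def aggregate(costs, granularity):
    """Group consecutive work units into aggregates of `granularity` units."""
    return [
        sum(costs[i:i + granularity])
        for i in range(0, len(costs), granularity)
    ]

def imbalance(costs, granularity, num_procs):
    """Ratio of the heaviest processor load to the mean load (1.0 is ideal)."""
    loads = [0] * num_procs
    for index, work in enumerate(aggregate(costs, granularity)):
        loads[index % num_procs] += work      # wrapped (round-robin) assignment
    return max(loads) / (sum(loads) / num_procs)

# Hypothetical workload: 48 cheap units followed by a 16-unit hot spot.
costs = [1] * 48 + [10] * 16
coarse = imbalance(costs, granularity=16, num_procs=4)   # 4 aggregates
fine = imbalance(costs, granularity=4, num_procs=4)      # 16 aggregates
```

    With the coarse setting the whole hot spot lands on one processor. The finer setting spreads it evenly across all four, at the price of four times as many aggregates to communicate and synchronize.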

  20. Automatic Generation of Directive-Based Parallel Programs for Shared Memory Parallel Systems

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Yan, Jerry; Frumkin, Michael

    2000-01-01

    The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. Due to its ease of programming and its good performance, the technique has become very popular. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate directive-based, OpenMP, parallel programs. We outline techniques used in the implementation of the tool and present test results on the NAS parallel benchmarks and ARC3D, a CFD application. This work demonstrates the great potential of using computer-aided tools to quickly port parallel programs and also achieve good performance.

  1. Parallel-access memory management using fast-fits

    SciTech Connect

    Johnson, T.

    1994-12-01

    The two most common approaches to managing shared-access memory, free lists and buddy systems, have significant drawbacks. Free list algorithms have poor memory access characteristics, and buddy systems utilize their space inefficiently. In this paper, we present an alternative approach to parallel-access memory management based on the fast-fits algorithm. A fast-fits memory manager stores free blocks in a tree structure, providing fast access and efficient space use. Since the fast-fits algorithm accesses fewer blocks than a free list algorithm, it reduces the amount of cache invalidation overhead due to the memory manager. Our performance experiments show that the parallel-access fast-fits memory manager allows significantly greater access rates than a serial-access fast-fits memory manager does. We note that shared-memory multiprocessor systems need efficient dynamic storage allocators, both for system purposes and to support parallel programs.

  2. On the design of parallel numerical methods in message passing and shared memory environments

    SciTech Connect

    Saad, Y.

    1987-02-01

    This paper presents a comparative view of the methodologies used in parallel methods for scientific computing. Our goal is to contrast the approaches taken when developing scientific software for two broad classes of parallel machines, namely shared memory machines and distributed memory machines. We will illustrate our discussion with two specific algorithms, namely the Alternating Directions Implicit method and the Preconditioned Conjugate Gradient method.

  3. HyperForest: A high performance multi-processor architecture for real-time intelligent systems

    SciTech Connect

    Garcia, P. Jr.; Rebeil, J.P.; Pollard, H.

    1997-04-01

    Intelligent Systems are characterized by the intensive use of computer power. The computer revolution of the last few years is what has made possible the development of the first generation of Intelligent Systems. Software for second generation Intelligent Systems will be more complex and will require more powerful computing engines in order to meet real-time constraints imposed by new robots, sensors, and applications. A multiprocessor architecture was developed that merges the advantages of message-passing and shared-memory structures: expandability and real-time compliance. The HyperForest architecture will provide an expandable real-time computing platform for computationally intensive Intelligent Systems and open the doors for the application of these systems to more complex tasks in environmental restoration and cleanup projects, flexible manufacturing systems, and DOE's own production and disassembly activities.

  4. Socially shared mourning: construction and consumption of collective memory

    NASA Astrophysics Data System (ADS)

    Harju, Anu

    2015-04-01

    Social media, such as YouTube, is increasingly a site of collective remembering where personal tributes to celebrity figures become sites of public mourning. YouTube, especially, is rife with celebrity commemorations. Examining fans' online mourning practices on YouTube, this paper examines video tributes dedicated to the late Steve Jobs, with a focus on collective remembering and collective construction of memory. Combining netnography with critical discourse analysis, the analysis focuses on the user comments where the past unfolds in interaction and meanings are negotiated and contested. The paper argues that celebrity death may, for avid fans, be a source of disenfranchised grief, a type of grief characterised by inadequate social support, usually arising from lack of empathy for the loss. The paper sheds light on the functions digital memorials have for mourning fans (and fandom) and argues that social media sites have come to function as spaces of negotiation, legitimisation and alleviation of disenfranchised grief. It is also suggested that when it comes to disenfranchised grief, and grief work generally, the concept of community be widened to include communities of weak ties, a typical form of communal belonging on social media.

  5. A new shared-memory programming paradigm for molecular dynamics simulations on the Intel Paragon

    SciTech Connect

    D'Azevedo, E.F.; Romine, C.H.

    1994-12-01

    This report describes the use of shared memory emulation with DOLIB (Distributed Object Library) to simplify parallel programming on the Intel Paragon. A molecular dynamics application is used as an example to illustrate the use of the DOLIB shared memory library. SOTON-PAR, a parallel molecular dynamics code with explicit message-passing using a Lennard-Jones 6-12 potential, is rewritten using DOLIB primitives. The resulting code has no explicit message primitives and resembles a serial code. The new code can perform dynamic load balancing and achieves better performance than the original parallel code with explicit message-passing.

  6. A New Shared-Memory Programming Paradigm for Molecular Dynamics Simulations on the Intel Paragon

    SciTech Connect

    D'Azevedo, E.F.

    1995-01-01

    This report describes the use of shared memory emulation with DOLIB (Distributed Object Library) to simplify parallel programming on the Intel Paragon. A molecular dynamics application is used as an example to illustrate the use of the DOLIB shared memory library. SOTON PAR, a parallel molecular dynamics code with explicit message-passing using a Lennard-Jones 6-12 potential, is rewritten using DOLIB primitives. The resulting code has no explicit message primitives and resembles a serial code. The new code can perform dynamic load balancing and achieves better performance than the original parallel code with explicit message-passing.

  7. High Performance Programming Using Explicit Shared Memory Model on Cray T3D1

    NASA Technical Reports Server (NTRS)

    Simon, Horst D.; Saini, Subhash; Grassi, Charles

    1994-01-01

    The Cray T3D system is the first-phase system in Cray Research, Inc.'s (CRI) three-phase massively parallel processing (MPP) program. This system features a heterogeneous architecture that closely couples DEC's Alpha microprocessors and CRI's parallel-vector technology, i.e., the Cray Y-MP and Cray C90. An overview of the Cray T3D hardware and available programming models is presented. Under the Cray Research adaptive Fortran (CRAFT) model, four programming methods (data parallel, work sharing, message-passing using PVM, and explicit shared memory model) are available to the users. However, at this time data parallel and work sharing programming models are not available to the user community. The differences between standard PVM and CRI's PVM are highlighted with performance measurements such as latencies and communication bandwidths. We have found that the performance of neither standard PVM nor CRI's PVM exploits the hardware capabilities of the T3D. The reasons for the bad performance of PVM as a native message-passing library are presented. This is illustrated by the performance of NAS Parallel Benchmarks (NPB) programmed in the explicit shared memory model on the Cray T3D. In general, the performance of standard PVM is about 4 to 5 times lower than that obtained by using the explicit shared memory model. This degradation in performance is also seen on the CM-5, where the performance of applications using the native message-passing library CMMD is also about 4 to 5 times lower than that obtained using data parallel methods. The issues involved (such as barriers, synchronization, invalidating data cache, aligning data cache etc.) while programming in the explicit shared memory model are discussed. Comparative performance of NPB using the explicit shared memory programming model on the Cray T3D and other highly parallel systems such as the TMC CM-5, Intel Paragon, Cray C90, IBM-SP1, etc. is presented.

  8. Embedded Multiprocessor Technology for VHSIC Insertion

    NASA Technical Reports Server (NTRS)

    Hayes, Paul J.

    1990-01-01

    Viewgraphs on embedded multiprocessor technology for VHSIC insertion are presented. The objective was to develop multiprocessor system technology providing user-selectable fault tolerance, increased throughput, and ease of application representation for concurrent operation. The approach was to develop graph management mapping theory for proper performance, model multiprocessor performance, and demonstrate performance in selected hardware systems.

  9. Shared Representations in Language Processing and Verbal Short-Term Memory: The Case of Grammatical Gender

    ERIC Educational Resources Information Center

    Schweppe, Judith; Rummer, Ralf

    2007-01-01

    The general idea of language-based accounts of short-term memory is that retention of linguistic materials is based on representations within the language processing system. In the present sentence recall study, we address the question whether the assumption of shared representations holds for morphosyntactic information (here: grammatical gender…

  10. Functions of Memory Sharing and Mother-Child Reminiscing Behaviors: Individual and Cultural Variations

    ERIC Educational Resources Information Center

    Kulkofsky, Sarah; Wang, Qi; Koh, Jessie Bee Kim

    2009-01-01

    This study examined maternal beliefs about the functions of memory sharing and the relations between these beliefs and mother-child reminiscing behaviors in a cross-cultural context. Sixty-three European American and 47 Chinese mothers completed an open-ended questionnaire concerning their beliefs about the functions of parent-child memory…

  11. Sharing memories and telling stories: American and Chinese mothers and their 3-year-olds.

    PubMed

    Wang, Q; Leichtman, M D; Davies, K I

    2000-05-01

    American and Chinese mothers were asked to talk with their 3-year-old children at home about two shared past events and a story (41 mother-child dyads). Results revealed between-culture variation in the content and style of mother-child conversations when sharing memories and telling stories. American mothers and children showed a high-elaborative, independently oriented conversational style in which they co-constructed their memories and stories by elaborating on each other's responses and focusing on the child's personal predilections and opinions. In contrast, Chinese mother-child dyads employed a low-elaborative, interdependently oriented conversational style where mothers frequently posed and repeated factual questions and showed great concern with moral rules and behavioural standards with their children. Findings suggest that children's early social-linguistic environments shape autobiographical remembering and contribute to cultural differences in the age and content of earliest childhood memories.

  12. Using memory in the Cedar system

    SciTech Connect

    McGrath, R.E.; Emrath, P.

    1987-01-01

    The design of the virtual memory system for the Cedar multiprocessor under construction at the University of Illinois is discussed. The Cedar architecture features a hierarchy of memory, some shared by all processors, and some shared by subsets of processors. The Xylem operating system is based on Alliant Computer Systems CONCENTRIX(TM) operating system, which is based on 4.2BSD UNIX(TM). Xylem supports multi-tasking and demand paging of parts of the memory hierarchy into a linear virtual address space. Memory may be private to a task or shared between all the tasks. The locality and attributes of a page may be modified during the execution of a program. Examples of how these mechanisms can be used are discussed. 14 figs.

  13. Optical Shared Memory Computing and Multiple Access Protocols for Photonic Networks

    NASA Astrophysics Data System (ADS)

    Li, Kuang-Yu.

    In this research we investigate potential applications of optics in massively parallel computer systems, especially focusing on design issues in three-dimensional optical data storage and free-space photonic networks. An optical implementation of a shared memory uses a single photorefractive crystal and can realize the set of memory modules in a digital shared memory computer. A complete instruction set consists of READ, WRITE, SELECTIVE ERASE, and REFRESH, which can be applied to any memory module independent of (and in parallel with) instructions to the other memory modules. In addition, a memory module can execute a sequence of READ operations simultaneously with the execution of a WRITE operation to accommodate differences in optical recording and readout times common to optical volume storage media. An experimental shared memory system is demonstrated and its projected performance is analyzed. A multiplexing technique is presented to significantly reduce both grating- and beam-degeneracy crosstalk in volume holographic systems, by incorporating space, angle, and wavelength as the multiplexing parameters. In this approach, each hologram, which results from the interference between a single input node and an object array, partially overlaps with the other holograms in its neighborhood. This technique can offer improved interconnection density, optical throughput, signal fidelity, and space-bandwidth product utilization. Design principles and numerical simulation results are presented. A free-space photonic cellular hypercube parallel computer, with emphasis on the design of a collisionless multiple access protocol, is presented. This design incorporates wavelength-, space-, and time-multiplexing to achieve multiple access, wavelength reuse, dense connectivity, collisionless communications, and a simple control mechanism. Analytic models based on semi-Markov processes are employed to analyze this protocol. The performance of the

  14. Visual and Spatial Working Memory Are Not that Dissociated after All: A Time-Based Resource-Sharing Account

    ERIC Educational Resources Information Center

    Vergauwe, Evie; Barrouillet, Pierre; Camos, Valerie

    2009-01-01

    Examinations of interference between visual and spatial materials in working memory have suggested domain- and process-based fractionations of visuo-spatial working memory. The present study examined the role of central time-based resource sharing in visuo-spatial working memory and assessed its role in obtained interference patterns. Visual and…

  15. Parallel calculations on shared memory, NUMA-based computers using MATLAB

    NASA Astrophysics Data System (ADS)

    Krotkiewski, Marcin; Dabrowski, Marcin

    2014-05-01

    Achieving satisfactory computational performance in numerical simulations on modern computer architectures can be a complex task. Multi-core design makes it necessary to parallelize the code. Efficient parallelization on NUMA (Non-Uniform Memory Access) shared memory architectures necessitates explicit placement of the data in the memory close to the CPU that uses it. In addition, using more than 8 CPUs (~100 cores) requires a cluster solution of interconnected nodes, which involves (expensive) communication between the processors. It takes significant effort to overcome these challenges even when programming in low-level languages, which give the programmer full control over data placement and work distribution. Instead, many modelers use high-level tools such as MATLAB, which severely limit the optimization/tuning options available. Nonetheless, the advantage of programming simplicity and a large available code base can tip the scale in favor of MATLAB. We investigate whether MATLAB can be used for efficient, parallel computations on modern shared memory architectures. A common approach to performance optimization of MATLAB programs is to identify a bottleneck and migrate the corresponding code block to a MEX file implemented in, e.g., C. Instead, we aim at achieving a scalable parallel performance of MATLAB's core functionality. Some of MATLAB's internal functions (e.g., bsxfun, sort, BLAS3, operations on vectors) are multi-threaded. Achieving high parallel efficiency of those may potentially improve the performance of a significant portion of MATLAB's code base. Since we do not have MATLAB's source code, our performance tuning relies on the tools provided by the operating system alone. Most importantly, we use custom memory allocation routines, thread to CPU binding, and memory page migration. The performance tests are carried out on multi-socket shared memory systems (2- and 4-way Intel-based computers), as well as a Distributed Shared Memory machine with 96 CPU

  16. Shared mushroom body circuits underlie visual and olfactory memories in Drosophila

    PubMed Central

    Vogt, Katrin; Schnaitmann, Christopher; Dylla, Kristina V; Knapek, Stephan; Aso, Yoshinori; Rubin, Gerald M; Tanimoto, Hiromu

    2014-01-01

    In nature, animals form memories associating reward or punishment with stimuli from different sensory modalities, such as smells and colors. It is unclear, however, how distinct sensory memories are processed in the brain. We established appetitive and aversive visual learning assays for Drosophila that are comparable to the widely used olfactory learning assays. These assays share critical features, such as reinforcing stimuli (sugar reward and electric shock punishment), and allow direct comparison of the cellular requirements for visual and olfactory memories. We found that the same subsets of dopamine neurons drive formation of both sensory memories. Furthermore, distinct yet partially overlapping subsets of mushroom body intrinsic neurons are required for visual and olfactory memories. Thus, our results suggest that distinct sensory memories are processed in a common brain center. Such centralization of related brain functions is an economical design that avoids the repetition of similar circuit motifs. DOI: http://dx.doi.org/10.7554/eLife.02395.001 PMID:25139953

  17. Fault tolerant onboard packet switch architecture for communication satellites: Shared memory per beam approach

    NASA Technical Reports Server (NTRS)

    Shalkhauser, Mary Jo; Quintana, Jorge A.; Soni, Nitin J.

    1994-01-01

    The NASA Lewis Research Center is developing a multichannel communication signal processing satellite (MCSPS) system which will provide low data rate, direct to user, commercial communications services. The focus of current space segment developments is a flexible, high-throughput, fault tolerant onboard information switching processor. This information switching processor (ISP) is a destination-directed packet switch which performs both space and time switching to route user information among numerous user ground terminals. Through both industry study contracts and in-house investigations, several packet switching architectures were examined. A contention-free approach, the shared memory per beam architecture, was selected for implementation. The shared memory per beam architecture, fault tolerance insertion, implementation, and demonstration plans are described.

  18. IMPACC: A Tightly Integrated MPI+OpenACC Framework Exploiting Shared Memory Parallelism

    SciTech Connect

    Lee, Seyong; Vetter, Jeffrey S

    2016-01-01

    We propose IMPACC, an MPI+OpenACC framework for heterogeneous accelerator clusters. IMPACC tightly integrates MPI and OpenACC, while exploiting the shared memory parallelism in the target system. IMPACC dynamically adapts the input MPI+OpenACC applications on the target heterogeneous accelerator clusters to fully exploit target system-specific features. IMPACC provides the programmers with the unified virtual address space, automatic NUMA-friendly task-device mapping, efficient integrated communication routines, seamless streamlining of asynchronous executions, and transparent memory sharing. We have implemented IMPACC and evaluated its performance using three heterogeneous accelerator systems, including Titan supercomputer. Results show that IMPACC can achieve easier programming, higher performance, and better scalability than the current MPI+OpenACC model.

  19. Data traffic reduction schemes for Cholesky factorization on asynchronous multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Naik, Vijay K.; Patrick, Merrell L.

    1989-01-01

    Communication requirements of Cholesky factorization of dense and sparse symmetric, positive definite matrices are analyzed. The communication requirement is characterized by the data traffic generated on multiprocessor systems with local and shared memory. Lower bound proofs are given to show that when the load is uniformly distributed the data traffic associated with factoring an n x n dense matrix using n to the alpha power (alpha less than or equal to 2) processors is omega(n to the 2 + alpha/2 power). For n x n sparse matrices representing a square root of n x square root of n regular grid graph the data traffic is shown to be omega(n to the 1 + alpha/2 power), alpha less than or equal to 1. Partitioning schemes that are variations of the block assignment scheme are described and it is shown that the data traffic generated by these schemes is asymptotically optimal. The schemes allow efficient use of up to O(n to the 2nd power) processors in the dense case and up to O(n) processors in the sparse case before the total data traffic reaches the maximum value of O(n to the 3rd power) and O(n to the 3/2 power), respectively. It is shown that the block based partitioning schemes allow a better utilization of the data accessed from shared memory and thus reduce the data traffic more than those based on column-wise wrap around assignment schemes.
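
    In symbols, the lower bounds quoted in words above appear to read as follows. This is a transcription under assumptions: the abstract spells the formulas out verbally, and the symbol p for the processor count and the reading of the sparse case as also using p = n^alpha processors are not stated explicitly in it.

```latex
\begin{align*}
\text{dense, } p = n^{\alpha},\ \alpha \le 2: &\quad \text{data traffic} = \Omega\!\left(n^{2+\alpha/2}\right) \\
\text{sparse, } p = n^{\alpha},\ \alpha \le 1: &\quad \text{data traffic} = \Omega\!\left(n^{1+\alpha/2}\right)
\end{align*}
```

    As a consistency check on this reading, setting alpha = 2 in the dense bound gives n^3 and setting alpha = 1 in the sparse bound gives n^(3/2), which match the maximum traffic values of O(n^3) and O(n^(3/2)) quoted for n^2 and n processors respectively.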

  20. A multiprocessor operating system simulator

    SciTech Connect

    Johnston, G.M.; Campbell, R.H. (Dept. of Computer Science)

    1988-01-01

    This paper describes a multiprocessor operating system simulator that was developed by the authors in the Fall of 1987. The simulator was built in response to the need to provide students with an environment in which to build and test operating system concepts as part of the coursework of a third-year undergraduate operating systems course. Written in C++, the simulator uses the co-routine style task package that is distributed with the AT&T C++ Translator to provide a hierarchy of classes that represents a broad range of operating system software and hardware components. The class hierarchy closely follows that of the Choices family of operating systems for loosely and tightly coupled multiprocessors. During an operating system course, these classes are refined and specialized by students in homework assignments to facilitate experimentation with different aspects of operating system design and policy decisions. The current implementation runs on the IBM RT PC under 4.3bsd UNIX.

  1. A Multiprocessor Operating System Simulator

    NASA Technical Reports Server (NTRS)

    Johnston, Gary M.; Campbell, Roy H.

    1988-01-01

    This paper describes a multiprocessor operating system simulator that was developed by the authors in the Fall semester of 1987. The simulator was built in response to the need to provide students with an environment in which to build and test operating system concepts as part of the coursework of a third-year undergraduate operating systems course. Written in C++, the simulator uses the co-routine style task package that is distributed with the AT&T C++ Translator to provide a hierarchy of classes that represents a broad range of operating system software and hardware components. The class hierarchy closely follows that of the 'Choices' family of operating systems for loosely- and tightly-coupled multiprocessors. During an operating system course, these classes are refined and specialized by students in homework assignments to facilitate experimentation with different aspects of operating system design and policy decisions. The current implementation runs on the IBM RT PC under 4.3bsd UNIX.

  2. Reproducibility in a multiprocessor system

    DOEpatents

    Bellofatto, Ralph A; Chen, Dong; Coteus, Paul W; Eisley, Noel A; Gara, Alan; Gooding, Thomas M; Haring, Rudolf A; Heidelberger, Philip; Kopcsay, Gerard V; Liebsch, Thomas A; Ohmacht, Martin; Reed, Don D; Senger, Robert M; Steinmacher-Burow, Burkhard; Sugawara, Yutaka

    2013-11-26

    Fixing a problem is usually greatly aided if the problem is reproducible. To ensure reproducibility of a multiprocessor system, the following aspects are proposed: a deterministic system start state, a single system clock, phase alignment of clocks in the system, system-wide synchronization events, reproducible execution of system components, deterministic chip interfaces, zero-impact communication with the system, precise stop of the system, and a scan of the system state.

  3. Fault-tolerant multiprocessor computer

    SciTech Connect

    Smith, T.B. III; Lala, J.H.; Goldberg, J.; Kautz, W.H.; Melliar-Smith, P.M.; Green, M.W.; Levitt, K.N.; Schwartz, R.L.; Weinstock, C.B.; Palumbo, D.L.

    1986-01-01

    The development and evaluation of fault-tolerant computer architectures and software-implemented fault tolerance (SIFT) for use in advanced NASA vehicles and potentially in flight-control systems are described in a collection of previously published reports prepared for NASA. Topics addressed include the principles of fault-tolerant multiprocessor (FTMP) operation; processor and slave regional designs; FTMP executive, facilities, acceptance-test/diagnostic, applications, and support software; FTMP reliability and availability models; SIFT hardware design; and SIFT validation and verification.

  4. Multiprocessor computer overset grid method and apparatus

    DOEpatents

    Barnette, Daniel W.; Ober, Curtis C.

    2003-01-01

    A multiprocessor computer overset grid method and apparatus comprises associating points in each overset grid with processors and using mapped interpolation transformations to communicate intermediate values between processors assigned base and target points of the interpolation transformations. The method allows a multiprocessor computer to operate with effective load balance on overset grid applications.

  5. Modeling working memory: a computational implementation of the Time-Based Resource-Sharing theory.

    PubMed

    Oberauer, Klaus; Lewandowsky, Stephan

    2011-02-01

    Working memory is a core concept in cognition, predicting about 50% of the variance in IQ and reasoning tasks. A popular test of working memory is the complex span task, in which encoding of memoranda alternates with processing of distractors. A recent model of complex span performance, the Time-Based-Resource-Sharing (TBRS) model of Barrouillet and colleagues, has seemingly accounted for several crucial findings, in particular the intricate trade-off between deterioration and restoration of memory in the complex span task. According to the TBRS, memory traces decay during processing of the distractors, and they are restored by attentional refreshing during brief pauses in between processing steps. However, to date, the theory has been formulated only at a verbal level, which renders it difficult to test and to be certain of its intuited predictions. We present a computational instantiation of the TBRS and show that it can handle most of the findings on which the verbal model was based. We also show that there are potential challenges to the model that await future resolution. This instantiated model, TBRS*, is the first comprehensive computational model of performance in the complex span paradigm. The Matlab model code is available as a supplementary material of this article. PMID:21327362

  6. Coscheduling Technique for Symmetric Multiprocessor Clusters

    SciTech Connect

    Yoo, A B; Jette, M A

    2000-09-18

    Coscheduling is essential for obtaining good performance in a time-shared symmetric multiprocessor (SMP) cluster environment. However, the most common technique, gang scheduling, has limitations such as poor scalability and vulnerability to faults, mainly due to explicit synchronization between its components. A decentralized approach called dynamic coscheduling (DCS) has been shown to be effective for networks of workstations (NOW), but this technique is not suitable for the workloads on a very large SMP-cluster with thousands of processors. Furthermore, its implementation can be prohibitively expensive for such a large-scale machine. In this paper, they propose a novel coscheduling technique based on the DCS approach which can achieve coscheduling on very large SMP-clusters in a scalable, efficient, and cost-effective way. In the proposed technique, each local scheduler achieves coscheduling based upon message traffic between the components of parallel jobs. Message trapping is carried out at the user-level, eliminating the need for unsupported hardware or device-level programming. A sending process attaches its status to outgoing messages so local schedulers on remote nodes can make more intelligent scheduling decisions. Once scheduled, processes are guaranteed some minimum period of time to execute. This provides an opportunity to synchronize the parallel job's components across all nodes and achieve good program performance. The results from a performance study reveal that the proposed technique is a promising approach that can reduce response time significantly over uncoordinated time-sharing and batch scheduling.
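
    The message-driven idea in this abstract can be illustrated with a small sketch. The abstract does not give the actual decision rule, so the policy below (switch to the destination process only when the sender reports that it is running and the current process has used up its guaranteed minimum time) is an assumption made for illustration, and the names `LocalScheduler`, `on_message`, and `min_quantum` are hypothetical.

```python
class LocalScheduler:
    """Toy per-node scheduler driven by trapped messages (hypothetical policy)."""

    def __init__(self, min_quantum):
        self.min_quantum = min_quantum  # guaranteed run time once scheduled
        self.running = None             # pid currently on the CPU
        self.scheduled_at = None        # time at which that pid was scheduled

    def on_message(self, dest_pid, sender_running, now):
        """Handle a trapped message and return the pid that runs afterwards.

        sender_running is the status the sending process attached to the
        message; now is the current time in the same unit as min_quantum.
        """
        if dest_pid == self.running:
            return self.running
        # The running process may not be preempted inside its minimum period.
        protected = (
            self.running is not None
            and now - self.scheduled_at < self.min_quantum
        )
        # Assumed rule: a message from a running sender suggests that the job
        # is active on the remote node, so schedule its local component too.
        if sender_running and not protected:
            self.running = dest_pid
            self.scheduled_at = now
        return self.running
```

    In this sketch no node needs global knowledge: each scheduler reacts only to the messages that reach it, which is the property the abstract contrasts with the explicit synchronization of gang scheduling.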

  7. Shared and Distributed Memory Parallel Security Analysis of Large-Scale Source Code and Binary Applications

    SciTech Connect

    Quinlan, D; Barany, G; Panas, T

    2007-08-30

    Many forms of security analysis on large scale applications can be substantially automated but the size and complexity can exceed the time and memory available on conventional desktop computers. Most commercial tools are understandably focused on such conventional desktop resources. This paper presents research work on the parallelization of security analysis of both source code and binaries within our Compass tool, which is implemented using the ROSE source-to-source open compiler infrastructure. We have focused on both shared and distributed memory parallelization of the evaluation of rules implemented as checkers for a wide range of secure programming rules, applicable to desktop machines, networks of workstations and dedicated clusters. While Compass as a tool focuses on source code analysis and reports violations of an extensible set of rules, the binary analysis work uses the exact same infrastructure but is less well developed into an equivalent final tool.

  8. Parallel Reduction of Large Radar Interferometry Scenes on a Mid-scale, Symmetric Multiprocessor Mainframe Computer

    NASA Astrophysics Data System (ADS)

    Harcke, L. J.; Zebker, H. A.

    2006-12-01

    We report on experiences in processing repeat-orbit interferometry data sets on a mid-scale multiprocessor mainframe computer. Newer applications of interferometric and polarimetric data processing, such as permanent scatterer deformation monitoring, require the generation of many tens of repeat-pass interferometry data pairs, perhaps 30 to 50, to provide sufficient input to the deformation model. Moving existing radar processing techniques toward massively parallel computation provides a path to coping with such large data sets, which can consist of 30 to 50 gigabytes (GB) of raw data. In June 2006, the Stanford School of Earth Sciences dedicated a new computation center for general research use. Two large machines compose the center: a single-node, symmetric multiprocessor (SMP) machine with 48 processor cores and a single 192 GB memory, and a 64 node distributed cluster containing 128 processor cores with at least 2 GB of memory per node. Distributed processing of the matched filter for synthetic aperture radar image formation requires a high communication-to-computation ratio. Experiments performed over a decade ago on distributed memory supercomputers, and repeated a half-decade ago on commodity workstation clusters, both demonstrated saturation of inter-node communication links. For this reason, we chose to parallelize the interferometric processor on the shared memory computer using the OpenMP programming standard. We find, not unexpectedly, that the input/output stage of processing standard 100-by-100 kilometer ERS-1 scenes quickly dominates the total computation time, and that only modest improvements in processing time are achieved after 8 to 16 processor cores are brought to bear on a single data set. The input and output data sit in single, serially accessed disk files, creating a bottleneck for overall throughput. This points to a scheme for efficient partitioning of mid-size (24 to 48 core) machines for reducing large Earth science data sets, where 3 to

  9. Improved multiprocessor garbage collection algorithms

    SciTech Connect

    Newman, I.A.; Stallard, R.P.; Woodward, M.C.

    1983-01-01

    Outlines the results of an investigation of existing multiprocessor garbage collection algorithms and introduces two new algorithms which significantly improve some aspects of the performance of their predecessors. The two algorithms arise from different starting assumptions. One considers the case where the algorithm will terminate successfully whatever list structure is being processed and assumes that the extra data space should be minimised. The other seeks a very fast garbage collection time for list structures that do not contain loops. Results of both theoretical and experimental investigations are given to demonstrate the efficacy of the algorithms. 7 references.

  10. A high-performance MPI implementation on a shared-memory vector supercomputer.

    SciTech Connect

    Gropp, W.; Lusk, E.; Mathematics and Computer Science

    1997-01-01

    In this article we recount the sequence of steps by which MPICH, a high-performance, portable implementation of the Message-Passing Interface (MPI) standard, was ported to the NEC SX-4, a high-performance parallel supercomputer. Each step in the sequence raised issues that are important for shared-memory programming in general and shed light on both MPICH and the SX-4. The result is a low-latency, very high bandwidth implementation of MPI for the NEC SX-4. In the process, MPICH was also improved in several general ways.

  11. Set of FORTRAN routines for multitask timing analysis on shared memory machines

    SciTech Connect

    Montry, G.R.

    1986-03-31

    A set of FORTRAN-based timing routines has been written for shared memory parallel processors. These routines are designed to measure the performance of multitasking codes on machines with different hardware configurations. Complete run-time histories of all executing tasks are provided in a postmortem summary. The package is able to provide both processing time and elapsed time statistics for each task subject to functional hardware constraints. Source code for the ELXSI 6400 version of the package is included in an Appendix. 3 refs., 4 figs.

  12. Multiprocessor system with multiple concurrent modes of execution

    SciTech Connect

    Ahn, Daniel; Ceze, Luis H; Chen, Dong; Gara, Alan; Heidelberger, Philip; Ohmacht, Martin

    2013-12-31

    A multiprocessor system supports multiple concurrent modes of speculative execution. Speculation identification numbers (IDs) are allocated to speculative threads from a pool of available numbers. The pool is divided into domains, with each domain being assigned to a mode of speculation. Modes of speculation include TM, TLS, and rollback. Allocation of the IDs is carried out with respect to a central state table and using hardware pointers. The IDs are used for writing different versions of speculative results in different ways of a set in a cache memory.
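
    The ID-pool organization described above can be pictured with a short software sketch. The patent abstract describes a hardware scheme with a central state table and hardware pointers; the version below is only a software analogy that assumes contiguous domains and per-domain free queues. The class and method names are hypothetical.

```python
from collections import deque


class SpeculationIdPool:
    """Toy allocator: a fixed pool of IDs split into one domain per mode."""

    def __init__(self, total_ids, domain_sizes):
        # domain_sizes maps a mode name (e.g. "TM") to the number of IDs it owns.
        if sum(domain_sizes.values()) != total_ids:
            raise ValueError("domain sizes must partition the pool exactly")
        self._free = {}   # mode -> queue of free IDs in that domain
        self._owner = {}  # ID -> mode whose domain contains it
        next_id = 0
        for mode, size in domain_sizes.items():
            ids = range(next_id, next_id + size)
            self._free[mode] = deque(ids)
            for spec_id in ids:
                self._owner[spec_id] = mode
            next_id += size

    def allocate(self, mode):
        """Return a free ID from the mode's domain, or None if it is exhausted."""
        free = self._free[mode]
        return free.popleft() if free else None

    def release(self, spec_id):
        """Return an ID to the domain it was taken from."""
        self._free[self._owner[spec_id]].append(spec_id)

    def mode_of(self, spec_id):
        """Report which speculation mode an ID belongs to."""
        return self._owner[spec_id]
```

    Because each ID belongs to exactly one domain, the mode of a speculative thread can be recovered from its ID alone, and exhausting one mode's domain does not take IDs away from the others.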

  13. Exploring the use of I/O nodes for computation in a MIMD multiprocessor

    NASA Technical Reports Server (NTRS)

    Kotz, David; Cai, Ting

    1995-01-01

    As parallel systems move into the production scientific-computing world, the emphasis will be on cost-effective solutions that provide high throughput for a mix of applications. Cost effective solutions demand that a system make effective use of all of its resources. Many MIMD multiprocessors today, however, distinguish between 'compute' and 'I/O' nodes, the latter having attached disks and being dedicated to running the file-system server. This static division of responsibilities simplifies system management but does not necessarily lead to the best performance in workloads that need a different balance of computation and I/O. Of course, computational processes sharing a node with a file-system service may receive less CPU time, network bandwidth, and memory bandwidth than they would on a computation-only node. In this paper we begin to examine this issue experimentally. We found that high performance I/O does not necessarily require substantial CPU time, leaving plenty of time for application computation. There were some complex file-system requests, however, which left little CPU time available to the application. (The impact on network and memory bandwidth still needs to be determined.) For applications (or users) that cannot tolerate an occasional interruption, we recommend that they continue to use only compute nodes. For tolerant applications needing more cycles than those provided by the compute nodes, we recommend that they take full advantage of both compute and I/O nodes for computation, and that operating systems should make this possible.

  14. An implementation of SISAL for distributed-memory architectures

    SciTech Connect

    Beard, P.C.

    1995-06-01

    This thesis describes a new implementation of the implicitly parallel functional programming language SISAL, for massively parallel processor supercomputers. The Optimizing SISAL Compiler (OSC), developed at Lawrence Livermore National Laboratory, was originally designed for shared-memory multiprocessor machines and has been adapted to distributed-memory architectures. OSC has been relatively portable between shared-memory architectures, because they are architecturally similar, and OSC generates portable C code. However, distributed-memory architectures are not standardized -- each has a different programming model. Distributed-memory SISAL depends on a layer of software that provides a portable, distributed, shared-memory abstraction. This layer is provided by Split-C, a dialect of the C programming language developed at U.C. Berkeley, which has demonstrated good performance on distributed-memory architectures. Split-C provides important capabilities for good performance: support for program-specific distributed data structures, and split-phase memory operations. Distributed data structures help achieve good memory locality, while split-phase memory operations help tolerate the longer communication latencies inherent in distributed-memory architectures. The distributed-memory SISAL compiler and run-time system take advantage of these capabilities. The result of these efforts is a compiler that runs identically on the Thinking Machines Connection Machine (CM-5) and the Meiko Computing Surface (CS-2).

  15. Optically Connected Multiprocessors For Simulating Artificial Neural Networks

    NASA Astrophysics Data System (ADS)

    Ghosh, Joydeep; Hwang, Kai

    1988-05-01

    This paper investigates the architectural requirements in simulating large neural networks using a highly parallel multiprocessor with distributed memory and optical interconnects. First, we model the structure of a neural network and the functional behavior of individual cells. These models are used to estimate the volume of messages that need to be exchanged among physical processors to simulate the weighted connections of the neural network. The distributed processor/memory organization is tailored to an electronic implementation for greater versatility and flexibility. Optical interconnects are used to satisfy the interprocessor communication bandwidth demands. The hybrid implementation attempts to balance the processing, memory and bandwidth demands in simulating asynchronous, value-passing models for cooperative parallel computation with self-learning capabilities.

  16. Sorting large files on a backend multiprocessor

    SciTech Connect

    Beck, M.; Bitton, D.; Wilkinson, W.K.

    1988-07-01

    A fundamental measure of processing power in a database management system is the performance of the sort utility it provides. When sorting a large data file on a serial computer, performance is limited by factors involving processor speed, memory capacity, and I/O bandwidth. In this paper, the authors investigate the feasibility and efficiency of a parallel sort-merge algorithm through implementation on the JASMIN prototype, a backend multiprocessor built around a fast packet bus. The authors describe the design and implementation of a parallel sort utility. They then present and analyze the results of measurements corresponding to a range of file sizes and processor configurations. Their results show that using current, off-the-shelf technology coupled with a streamlined distributed operating system, three- and five-microprocessor configurations provide a very cost-effective sort of large files. The three-processor configuration sorts a 100 Mbyte file in 1 h, which compares well to commercial sort packages available on high-performance mainframes. In additional experiments, the authors investigate a model to tune their sort software and scale their results to higher processor and network capabilities.
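
    A parallel sort-merge of the general kind named in this abstract can be sketched in a few lines. This is not the JASMIN implementation, whose details the abstract does not give: threads stand in for the backend processors, an in-memory list stands in for the file, and the function name and `workers` parameter are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_sort_merge(records, workers=3):
    """Sort records by sorting equal-sized runs concurrently, then merging."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Ceiling division so that at most `workers` runs are produced.
    run_len = max(1, -(-len(records) // workers))
    runs = [records[i:i + run_len] for i in range(0, len(records), run_len)]
    # Sort phase: each worker sorts one run independently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_runs = list(pool.map(sorted, runs))
    # Merge phase: a k-way merge of the sorted runs.
    return list(heapq.merge(*sorted_runs))
```

    The sort phase needs no communication between workers, while the merge phase is where data must move between them, which is one reason interconnect speed matters for this kind of algorithm.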

  17. Pushing away the communication bottleneck with optical interconnects in symmetric multiprocessors

    NASA Astrophysics Data System (ADS)

    Hlayhel, Wissam; Collet, Jacques H.; Rochange, Christine; Litaize, Daniel

    2000-05-01

    We analyze the bandwidth needed for transmitting the addresses in future symmetric multiprocessor machines (SMP), constructed around a shared bus due to the critical obligation to preserve the coherence of the memory hierarchy. We show that an address-transaction bandwidth as high as several hundreds of Gbit/s will be necessary not to slow down the execution of most applications in large SMP's. This communication bandwidth seems incompatible with the operation constraints of shared electrical busses, making necessary the search for other implementations of the address transmission network. We consider the introduction of optical interconnects (OI) in this context. We review several solutions, in the ascending order of complexity of the optical subsystems, as one critical issue concerns the degree of sophistication of the optical solutions and their cost. We first consider simple point to point OI's for an SMP chipset. The interest for OI's comes from the low energy consumption and from the possibility, in the future, to integrate several thousands of optical input/outputs per electronic chip. Then we consider the implementation of an optical bus, that is, a multipoint optical line involving more optical functionality. We discuss the possibility of multiple accesses to the bus, and the constraints related to the necessity to maintain the coherence of caches.

  18. An experimental distributed microprocessor implementation with a shared memory communications and control medium

    NASA Technical Reports Server (NTRS)

    Mejzak, R. S.

    1980-01-01

    The distributed processing concept is defined in terms of control primitives, variables, and structures and their use in performing a decomposed discrete Fourier transform (DFT) application function. The design assumes interprocessor communications to be anonymous. In this scheme, all processors can access an entire common database by employing control primitives. Access to selected areas within the common database is random, enforced by a hardware lock, and determined by task and subtask pointers. This enables the number of processors to be varied in the configuration without any modifications to the control structure. Decompositional elements of the DFT application function in terms of tasks and subtasks are also described. The experimental hardware configuration consists of IMSAI 8080 chassis which are independent, 8 bit microcomputer units. These chassis are linked together to form a multiple processing system by means of a shared memory facility. This facility consists of hardware which provides a bus structure to enable up to six microcomputers to be interconnected. It provides polling and arbitration logic so that only one processor has access to shared memory at any one time.
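
    The control structure described here, in which anonymous processors claim work through a lock-protected pointer in shared memory, can be sketched as follows. This is only an analogy under stated assumptions: a `threading.Lock` stands in for the hardware lock, threads stand in for the microcomputers, and each subtask is taken to be one output bin of a direct DFT, which may differ from the decomposition used in the report. The function name is hypothetical.

```python
import cmath
import threading


def dft_shared(samples, num_workers):
    """Compute a DFT with workers that claim output bins via a shared pointer."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    n = len(samples)
    out = [0j] * n             # shared result area
    next_task = [0]            # shared task pointer (index of next unclaimed bin)
    lock = threading.Lock()    # stand-in for the hardware lock

    def worker():
        while True:
            # Claim the next subtask; only one worker touches the pointer at a time.
            with lock:
                k = next_task[0]
                if k >= n:
                    return
                next_task[0] = k + 1
            # Compute bin k outside the lock; bins are independent of each other.
            out[k] = sum(
                x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(samples)
            )

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return out
```

    Because workers never address one another and only consult the shared pointer, `num_workers` can be changed without touching the control logic, which mirrors the property the abstract attributes to the design.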

  19. Hybrid image classification and parameter selection using a shared memory parallel algorithm

    NASA Astrophysics Data System (ADS)

    Phillips, Rhonda D.; Watson, Layne T.; Wynne, Randolph H.

    2007-07-01

    This work presents a shared memory parallel version of the hybrid classification algorithm IGSCR (iterative guided spectral class rejection) to facilitate the transition from serial to parallel processing. This transition is motivated by a demonstrated need for more computing power driven by the increasing size of remote sensing data sets due to higher resolution sensors, larger study regions, and the like. Parallel IGSCR was developed to produce fast and portable code using Fortran 95, OpenMP, and the Hierarchical Data Format version 5 (HDF5) and accompanying data access library. The intention of this work is to provide an efficient implementation of the established IGSCR classification algorithm. The applicability of the faster parallel IGSCR algorithm is demonstrated by classifying Landsat data covering most of Virginia, USA into forest and non-forest classes with approximately 90% accuracy. Parallel results are given using the SGI Altix 3300 shared memory computer and the SGI Altix 3700 with as many as 64 processors reaching speedups of almost 77. Parallel IGSCR allows an analyst to perform and assess multiple classifications to refine parameters. As an example, parallel IGSCR was used for a factorial analysis consisting of 42 classifications of a 1.2 GB image to select the number of initial classes (70) and class purity (70%) used for the remaining two images.

  20. Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing

    NASA Astrophysics Data System (ADS)

    Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide

    2015-09-01

    The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.

  1. Multiprocessor performance modeling with ADAS

    NASA Technical Reports Server (NTRS)

    Hayes, Paul J.; Andrews, Asa M.

    1989-01-01

    A graph managing strategy referred to as the Algorithm to Architecture Mapping Model (ATAMM) appears useful for the time-optimized execution of application algorithm graphs in embedded multiprocessors and for the performance prediction of graph designs. This paper reports the modeling of ATAMM in the Architecture Design and Assessment System (ADAS) to make an independent verification of ATAMM's performance prediction capability and to provide a user framework for the evaluation of arbitrary algorithm graphs. Following an overview of ATAMM and its major functional rules are descriptions of the ADAS model of ATAMM, methods to enter an arbitrary graph into the model, and techniques to analyze the simulation results. The performance of a 7-node graph example is evaluated using the ADAS model and verifies the ATAMM concept by substantiating previously published performance results.

  2. ATAMM enhancement and multiprocessor performance evaluation

    NASA Technical Reports Server (NTRS)

    Stoughton, John W.; Mielke, Roland R.; Som, Sukhamoy; Obando, Rodrigo; Malekpour, Mahyar R.; Jones, Robert L., III; Mandala, Brij Mohan V.

    1991-01-01

    ATAMM (Algorithm To Architecture Mapping Model) enhancement and multiprocessor performance evaluation are discussed. The following topics are included: the ATAMM model; ATAMM enhancement; ADM (Advanced Development Model) implementation of ATAMM; and ATAMM support tools.

  3. Real-time topological image smoothing on shared memory parallel machines

    NASA Astrophysics Data System (ADS)

    Mahmoudi, Ramzi; Akil, Mohamed

    2011-03-01

    Smoothing filters are the method of choice for image preprocessing and pattern recognition. We present a new concurrent method for smoothing 2D objects in the binary case. The proposed method provides parallel computation while preserving topology by using homotopic transformations. We introduce an adapted parallelization strategy called split, distribute and merge (SDM), which allows efficient parallelization of a large class of topological operators including, mainly, smoothing, skeletonization, and watershed algorithms. To achieve a good speedup, we paid particular attention to task scheduling. The distributed work during the smoothing process is done by a variable number of threads. Tests on a 2D binary image (512*512), using a shared memory parallel machine (SMPM) with 8 CPU cores (2× Xeon E5405 running at a frequency of 2 GHz), showed an enhancement of 5.2; thus a rate of 32 images per second is achieved.

  4. Testing and operating a multiprocessor chip with processor redundancy

    DOEpatents

    Bellofatto, Ralph E; Douskey, Steven M; Haring, Rudolf A; McManus, Moyra K; Ohmacht, Martin; Schmunkamp, Dietmar; Sugavanam, Krishnan; Weatherford, Bryan J

    2014-10-21

    A system and method for improving the yield rate of a multiprocessor semiconductor chip that includes primary processor cores and one or more redundant processor cores. A first tester conducts a first test on one or more processor cores, and encodes results of the first test in an on-chip non-volatile memory. A second tester conducts a second test on the processor cores, and encodes results of the second test in an external non-volatile storage device. An override bit of a multiplexer is set if a processor core fails the second test. In response to the override bit, the multiplexer selects a physical-to-logical mapping of processor IDs according to one of: the encoded results in the memory device or the encoded results in the external storage device. On-chip logic configures the processor cores according to the selected physical-to-logical mapping.
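
    The selection step in this abstract can be illustrated with a short sketch. The abstract does not say how test results are encoded or how the mapping is derived from them, so the sketch assumes each test result is simply a set of failed physical core IDs and that logical IDs are assigned to healthy cores in ascending order. All function and parameter names are hypothetical.

```python
def logical_mapping(num_physical, num_logical, failed):
    """Map logical IDs 0..num_logical-1 onto healthy physical cores in order."""
    healthy = [core for core in range(num_physical) if core not in failed]
    if len(healthy) < num_logical:
        raise ValueError("not enough healthy cores for the requested mapping")
    return dict(enumerate(healthy[:num_logical]))


def select_mapping(on_chip_failed, external_failed, override,
                   num_physical, num_logical):
    """Pick the result set the way a two-input multiplexer would.

    override=False uses the results stored on chip (first test);
    override=True uses the results from external storage (second test).
    """
    failed = external_failed if override else on_chip_failed
    return logical_mapping(num_physical, num_logical, failed)
```

    For example, with six physical cores, four logical IDs, core 1 failing the first test and cores 1 and 3 failing the second, `override=False` yields the mapping 0→0, 1→2, 2→3, 3→4, while `override=True` yields 0→0, 1→2, 2→4, 3→5.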

  5. Parallel Fock matrix construction with distributed shared memory model for the FMO-MO method.

    PubMed

    Umeda, Hiroaki; Inadomi, Yuichi; Watanabe, Toshio; Yagi, Toru; Ishimoto, Takayoshi; Ikegami, Tsutomu; Tadano, Hiroto; Sakurai, Tetsuya; Nagashima, Umpei

    2010-10-01

    A parallel Fock matrix construction program for the FMO-MO method has been developed with the distributed shared memory model. To construct a large-sized Fock matrix during FMO-MO calculations, a distributed parallel algorithm was designed to make full use of local memory to reduce communication, and was implemented on the Global Array toolkit. A benchmark calculation for a small system indicates that the parallelization efficiency of the matrix construction portion is as high as 93% at 1,024 processors. A large FMO-MO application on the epidermal growth factor receptor (EGFR) protein (17,246 atoms and 96,234 basis functions) was also carried out at the HF/6-31G level of theory, with the frontier orbitals being extracted by a Sakurai-Sugiura eigensolver. It takes 11.3 h for the FMO calculation, 49.1 h for the Fock matrix construction, and 10 min to extract 94 eigen-components on a PC cluster system using 256 processors.

  6. Performance and Application of Parallel OVERFLOW Codes on Distributed and Shared Memory Platforms

    NASA Technical Reports Server (NTRS)

    Djomehri, M. Jahed; Rizk, Yehia M.

    1999-01-01

    The presentation discusses recent studies on the performance of the two parallel versions of the aerodynamics CFD code, OVERFLOW_MPI and _MLP. Developed at NASA Ames, the serial version, OVERFLOW, is a multidimensional Navier-Stokes flow solver based on overset (Chimera) grid technology. The code has recently been parallelized in two ways. One is based on the explicit message-passing interface (MPI) across processors and uses the _MPI communication package. This approach is primarily suited for distributed memory systems and workstation clusters. The second, termed the multi-level parallel (MLP) method, is simple and uses shared memory for all communications. The _MLP code is suitable for distributed-shared memory systems. For both methods, the message passing takes place across the processors or processes at the advancement of each time step. This procedure is, in effect, the Chimera boundary conditions update, which is done in an explicit "Jacobi" style. In contrast, the update in the serial code is done in more of the "Gauss-Seidel" fashion. The programming effort for the _MPI code is greater than for the _MLP code; the former requires modification of the outer and some inner shells of the serial code, whereas the latter focuses only on the outer shell of the code. The _MPI version offers a great deal of flexibility in distributing grid zones across a specified number of processors in order to achieve load balancing. The approach is capable of partitioning zones across multiple processors or sending each zone and/or cluster of several zones into a single processor. The message passing across the processors consists of Chimera boundary and/or an overlap of "halo" boundary points for each partitioned zone. The MLP version is a new coarse-grain parallel concept at the zonal and intra-zonal levels. A grouping strategy is used to distribute zones into several groups forming sub-processes which will run in parallel. The total volume of grid points in each

  7. Multiprocessor architecture to handle TJ-II VXI-based digitization channels

    NASA Astrophysics Data System (ADS)

    Crémy, C.; Vega, J.; Sánchez, E.; Dulya, C. M.; Portas, A.

    1999-01-01

    The data acquisition system (DAS) of the TJ-II stellarator provides up to 300 digitization channels integrated in register-based VXI modules designed in CIEMAT Laboratories. The modules are embedded into six 13-slot VXI chassis connected to the TJ-II DAS central computer by means of a dual LAN topology. During normal operation, remote control of the VXI systems and channel setup are accomplished through an Ethernet LAN, while two FDDI rings are dedicated to postdischarge fast data transfer. The former network link is performed by the bus controller whereas the latter one is provided through a FDDI node controller installed in the mainframe, thus creating a multiprocessor architecture. Dedicated software, running on the VxWorks operating system, has been developed to provide handling of the VXI systems including the following facilities: mainframe information readout, channel setup, real time digitization handling, and data transfer. This software, implemented in C++, is distributed over the two CPUs. Interprocessor communication for synchronization purposes is based on a backplane shared memory pool.

  8. Linear solvers on multiprocessor machines

    SciTech Connect

    Kalogerakis, M.A.

    1986-01-01

    Two new methods are introduced for the parallel solution of banded linear systems on multiprocessor machines. Moreover, some new techniques are obtained as variations of the two methods that are applicable to special instances of the problem. Comparisons with the best known methods are performed, from which it is concluded that the two methods are superior, while their variations for special instances are, in general, competitive and in some cases best. In the process, some new results on the parallel prefix problem are obtained and a new design for this problem is presented that is suitable for VLSI implementation. Furthermore, a general model is introduced for the analysis and classification of methods that are based on row transformations of matrices. It is seen that most known methods are included in this model. It is demonstrated that this model may be used as a basis for the analysis as well as the generation of important aspects of those methods, such as their arithmetic complexity and interprocessor communication requirements.

  9. Multiprocessor smalltalk: Implementation, performance, and analysis

    SciTech Connect

    Pallas, J.I.

    1990-01-01

    Multiprocessor Smalltalk demonstrates the value of object-oriented programming on a multiprocessor. Its implementation and analysis shed light on three areas: concurrent programming in an object-oriented language without special extensions, implementation techniques for adapting to multiprocessors, and performance factors in the resulting system. Adding parallelism to Smalltalk code is easy, because programs already use control abstractions like iterators. Smalltalk's basic control and concurrency primitives (lambda expressions, processes and semaphores) can be used to build parallel control abstractions, including parallel iterators, parallel objects, atomic objects, and futures. Language extensions for concurrency are not required. This implementation demonstrates that it is possible to build an efficient parallel object-oriented programming system and illustrates techniques for doing so. Three modification tools (serialization, replication, and reorganization) adapted the Berkeley Smalltalk interpreter to the Firefly multiprocessor. Multiprocessor Smalltalk's performance shows that the combination of multiprocessing and object-oriented programming can be effective: speedups (relative to the original serial version) exceed 2.0 for five processors on all the benchmarks; the median efficiency is 48%. Analysis shows both where performance is lost and how to improve and generalize the experimental results. Changes in the interpreter to support concurrency add at most 12% overhead; better access to per-process variables could eliminate much of that. Changes in the user code to express concurrency add as much as 70% overhead; this overhead could be reduced to 54% if blocks (lambda expressions) were reentrant. Performance is also lost when the program cannot keep all five processors busy.
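
    The claim that futures and parallel iterators can be built from lambda expressions, processes, and semaphores alone can be illustrated outside Smalltalk. The sketch below uses Python threads in place of Smalltalk processes and a counting semaphore as the only synchronization primitive. It is an analogy, not the thesis's code, and the names `Future` and `parallel_collect` are hypothetical.

```python
import threading


class Future:
    """A value computed by another thread and claimed later by its consumer."""

    def __init__(self, block, *args):
        self._ready = threading.Semaphore(0)  # signalled once the value exists
        self._value = None
        self._error = None
        threading.Thread(
            target=self._run, args=(block, args), daemon=True
        ).start()

    def _run(self, block, args):
        try:
            self._value = block(*args)
        except Exception as exc:  # keep the error so value() can re-raise it
            self._error = exc
        finally:
            self._ready.release()

    def value(self):
        """Block until the computation finishes, then return its result."""
        self._ready.acquire()
        self._ready.release()  # leave the signal set for any later caller
        if self._error is not None:
            raise self._error
        return self._value


def parallel_collect(items, block):
    """Parallel iterator: apply block to every item, preserving order."""
    futures = [Future(block, item) for item in items]
    return [future.value() for future in futures]
```

    A call such as `parallel_collect([1, 2, 3], lambda x: x * x)` returns `[1, 4, 9]`, and the caller's code looks like an ordinary collection iterator, which is the sense in which no language extension is needed.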

  10. Shared Etiology of Phonological Memory and Vocabulary Deficits in School-Age Children

    PubMed Central

    Peterson, Robin L.; Pennington, Bruce F.; Samuelsson, Stefan; Byrne, Brian; Olson, Richard K.

    2012-01-01

    Purpose: The goal of this study was to investigate the etiologic basis for the association between deficits in phonological memory (PM) and vocabulary in school-age children. Method: Children with deficits in PM or vocabulary were identified within the International Longitudinal Twin Study (ILTS). The ILTS includes 1,045 twin pairs from the United States, Australia, and Scandinavia aged 5 to 8 years. We applied the DeFries-Fulker regression method to determine whether problems in PM and vocabulary tend to co-occur because of overlapping genes, overlapping environmental risk factors, or both. Results: Among children with isolated PM deficits, we found significant bivariate heritability of PM and vocabulary weaknesses both within and across time. However, when probands were selected for a vocabulary deficit, there was no evidence for bivariate heritability. In this case, the PM-vocabulary relationship appeared to owe to common shared environmental experiences. Conclusions: The findings are consistent with previous research on the heritability of specific language impairment and suggest that there are etiologic subgroups of children with poor vocabulary for different reasons, one more influenced by genes and another more influenced by environment. PMID:23275423

  11. Parallel computational steering for HPC applications using HDF5 files in distributed shared memory.

    PubMed

    Biddiscombe, John; Soumagne, Jerome; Oger, Guillaume; Guibert, David; Piccinali, Jean-Guillaume

    2012-06-01

    Interfacing a GUI-driven visualization/analysis package to an HPC application enables a supercomputer to be used as an interactive instrument. We achieve this by replacing the IO layer in the HDF5 library with a custom driver which transfers data in parallel between simulation and analysis. Our implementation, using ParaView as the interface, allows a flexible combination of parallel simulation, concurrent parallel analysis, and GUI client, either on the same or separate machines. Each MPI job may use different core counts or hardware configurations, allowing fine tuning of the amount of resources dedicated to each part of the workload. By making use of a distributed shared memory file, one may read data from the simulation, modify it using ParaView pipelines, and write it back to be reused by the simulation (or vice versa). This allows not only simple parameter changes, but complete remeshing of grids, or operations involving regeneration of field values over the entire domain. To avoid the problem of manually customizing the GUI for each application that is to be steered, we make use of XML templates that describe outputs from the simulation (and inputs back to it) to automatically generate GUI controls for manipulation of the simulation. PMID:22350196

  12. MLP: A Parallel Programming Alternative to MPI for New Shared Memory Parallel Systems

    NASA Technical Reports Server (NTRS)

    Taft, James R.

    1999-01-01

    Recent developments at the NASA Ames Research Center's NAS Division have demonstrated that the new generation of NUMA-based Symmetric Multi-Processing systems (SMPs), such as the Silicon Graphics Origin 2000, can successfully execute legacy vector-oriented CFD production codes at sustained rates far exceeding processing rates possible on dedicated 16-CPU Cray C90 systems. This high level of performance is achieved via shared-memory-based Multi-Level Parallelism (MLP). This programming approach, developed at NAS and outlined below, is distinct from the message passing paradigm of MPI. It offers parallelism at both the fine and coarse grained level, with communication latencies that are approximately 50-100 times lower than typical MPI implementations on the same platform. Such latency reductions offer the promise of performance scaling to very large CPU counts. The method draws on, but is also distinct from, the newly defined OpenMP specification, which uses compiler directives to support a limited subset of multi-level parallel operations. The NAS MLP method is general, and applicable to a large class of NASA CFD codes.

  14. A combined PLC and CPU approach to multiprocessor control

    SciTech Connect

    Harris, J.J.; Broesch, J.D.; Coon, R.M.

    1995-10-01

    A sophisticated multiprocessor control system has been developed for use in the E-Power Supply System Integrated Control (EPSSIC) on the DIII-D tokamak. EPSSIC provides control and interlocks for the ohmic heating coil power supply and its associated systems. Of particular interest is the architecture of this system: both a Programmable Logic Controller (PLC) and a Central Processor Unit (CPU) have been combined on a standard VME bus. The PLC and CPU input and output signals are routed through signal conditioning modules, which provide the necessary voltage and ground isolation. Additionally, these modules adapt the signal levels to those of the VME I/O boards. One set of I/O signals is shared between the two processors. The resulting multiprocessor system provides a number of advantages: redundant operation for mission-critical situations, flexible communications using conventional TCP/IP protocols, the simplicity of ladder logic programming for the majority of the control code, and an easily maintained and expandable non-proprietary system.

  15. Multiprocessor switch with selective pairing

    DOEpatents

    Gara, Alan; Gschwind, Michael K; Salapura, Valentina

    2014-03-11

    System, method and computer program product for a multiprocessing system to offer selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessor or processor cores to provide one highly reliable thread (or thread group). Each pair of microprocessors or processor cores that provides one highly reliable thread connects with system components such as a memory "nest" (or memory hierarchy), an optional system controller, an optional interrupt controller, optional I/O or peripheral devices, etc. The memory nest is attached to the selective pairing facility via a switch or a bus.

  16. Efficient implementation techniques for gracefully degradable multiprocessor systems

    SciTech Connect

    Liu, J.C.; Shin, K.G.

    1995-04-01

    We propose the dynamic reconfiguration network (DRN) and a monitoring-at-transmission (MAT) bus to support dynamic reconfiguration of an N-modular redundancy (NMR) multiprocessor system. In the reconfiguration process, a maximal number of processor triads are guaranteed to be formed on each processor cluster, thus supporting gracefully degradable operations. This is made possible by dynamically routing the control and clock signals of processors on the DRN so as to synchronize fault-free processors. The MAT bus is an efficient way to implement a triple modular redundant (TMR) pipeline voter (PV), which is a special case of the voting network proposed previously. Extensive experimental results support our design concept, and the performance of different cache memory organizations is evaluated through an analytic model. 22 refs.
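
    The voting step of a triple modular redundant (TMR) triad can be sketched in a few lines of Python. This is a minimal illustration that assumes a plain bitwise majority vote over three integer outputs; it does not model the pipeline voter, the DRN, or the MAT bus, and the function names are hypothetical.

```python
def tmr_vote(a, b, c):
    """Bitwise majority of three redundant module outputs (integers)."""
    return (a & b) | (a & c) | (b & c)

def faulty_module(a, b, c):
    """Return the index of the single module that disagrees, or None."""
    voted = tmr_vote(a, b, c)
    mismatches = [i for i, out in enumerate((a, b, c)) if out != voted]
    return mismatches[0] if len(mismatches) == 1 else None

# Module 2 has a flipped high bit; the vote masks the error.
print(bin(tmr_vote(0b1011, 0b1011, 0b0011)))   # 0b1011
print(faulty_module(0b1011, 0b1011, 0b0011))   # 2
```

    Identifying the disagreeing module is the kind of information a reconfiguration mechanism would need in order to regroup the remaining fault-free processors into new triads.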

  17. Spaceborne VHSIC multiprocessor system for AI applications

    NASA Technical Reports Server (NTRS)

    Lum, Henry, Jr.; Shrobe, Howard E.; Aspinall, John G.

    1988-01-01

    A multiprocessor system, under design for space-station applications, makes use of the latest generation symbolic processor and packaging technology. The result will be a compact, space-qualified system two to three orders of magnitude more powerful than present-day symbolic processing systems.

  18. Energy efficient low power shared-memory Fast Fourier Transform (FFT) processor with dynamic voltage scaling

    NASA Astrophysics Data System (ADS)

    Fitrio, D.; Singh, J.; Stojcevski, A.

    2005-12-01

    Reduction of power dissipation in CMOS circuits needs to be addressed for portable battery devices. Selection of an appropriate transistor library to minimise leakage current, implementation of low power design architectures, power management implementation, and the choice of chip packaging all have an impact on power dissipation and are important considerations in the design and implementation of integrated circuits for low power applications. Energy-efficient architecture is highly desirable for battery-operated systems, which operate under a wide variety of operating scenarios. Energy-efficient design aims to reconfigure its own architectures to scale down energy consumption depending upon the throughput and quality requirement. An energy-efficient system should be able to decide its minimum power requirements by dynamically scaling its own operating frequency, supply voltage or the threshold voltage according to a variety of operating scenarios. The increasing product demand for application-specific integrated circuits or processors for independent portable devices has influenced designers to implement dedicated processors with ultra low power requirements. One of these dedicated processors is a Fast Fourier Transform (FFT) processor, which is widely used in signal processing for numerous applications such as wireless telecommunication and biomedical applications, where the demand for extended battery life is extremely high. This paper presents the design and performance analysis of a low power shared memory FFT processor incorporating dynamic voltage scaling. Dynamic voltage scaling enables power supply scaling into various supply voltage levels. The concept behind the proposed solution is that the speed of the main logic core can be adjusted according to the input load or the amount of computation required of the processor, "just enough" to meet the requirement. The design was implemented using 0.12 μm ST-Microelectronic 6-metal layer CMOS dual-process technology in Cadence Analogue

  19. Thread mapping using system-level model for shared memory multicores

    NASA Astrophysics Data System (ADS)

    Mitra, Reshmi

    Exploring thread-to-core mapping options for a parallel application on a multicore architecture is computationally very expensive. For the same algorithm, the mapping strategy (MS) with the best response time may change with data size and thread counts. The primary challenge is to design a fast, accurate and automatic framework for exploring these MSs for large data-intensive applications. This is to ensure that users can explore the design space within reasonable machine hours, without a thorough understanding of how the code interacts with the platform. Response time is related to the cycles per instruction retired (CPI), taking into account both active and sleep states of the pipeline. This work establishes a hybrid approach, based on a Markov Chain Model (MCM) and a Model Tree (MT), for system-level steady state CPI prediction. It is designed for shared memory multicore processors with coarse-grained multithreading. The thread status is represented by the MCM states. The program characteristics are modeled as the transition probabilities, representing the system moving between active and suspended thread states. The MT model extrapolates these probabilities for the actual application size (AS) from the smaller AS performance. This aspect of the framework, along with the use of mathematical expressions for the actual AS performance information, results in a tremendous reduction in the CPI prediction time. The framework is validated using an electromagnetics application. The average performance prediction error for steady state CPI results with 12 different MSs is less than 1%. The total run time of the model is on the order of minutes, whereas the actual application execution time is on the order of days.
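
    The role of the Markov chain can be illustrated with a deliberately small Python sketch. It assumes a single thread with only two states (active and suspended) and assumes that instructions retire only while the thread is active, so that the effective CPI is the active-state CPI divided by the steady-state fraction of active cycles. The transition probabilities, the function names, and this CPI relation are illustrative assumptions, not the model from the abstract.

```python
def steady_state(p_suspend, p_resume, iterations=10_000):
    """Stationary distribution of a two-state chain by power iteration.

    p_suspend: probability of moving from active to suspended in one step.
    p_resume:  probability of moving from suspended to active in one step.
    """
    active, suspended = 1.0, 0.0
    for _ in range(iterations):
        active, suspended = (
            active * (1 - p_suspend) + suspended * p_resume,
            active * p_suspend + suspended * (1 - p_resume),
        )
    return active, suspended

def effective_cpi(cpi_active, p_suspend, p_resume):
    """Assumed relation: cycles spent suspended retire no instructions."""
    active, _ = steady_state(p_suspend, p_resume)
    return cpi_active / active

active, suspended = steady_state(p_suspend=0.1, p_resume=0.3)
print(round(active, 6), round(suspended, 6))
print(round(effective_cpi(1.5, 0.1, 0.3), 6))
```

    For a two-state chain the steady-state active fraction also has the closed form p_resume / (p_suspend + p_resume), which is a convenient check on the iteration.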

  20. Fault-tolerant interconnection networks for multiprocessor systems

    SciTech Connect

    Nassar, H.M.

    1989-01-01

    Interconnection networks represent the backbone of multiprocessor systems. A failure in the network, therefore, could seriously degrade the system performance. For this reason, fault tolerance has been regarded as a major consideration in interconnection network design. This thesis presents two novel techniques to provide fault tolerance capabilities to three major networks: the Baseline network, the Benes network, and the Clos network. First, the Simple Fault Tolerance Technique (SFT) is presented. The SFT technique is in fact the result of merging two widely known interconnection mechanisms: a normal interconnection network and a shared bus. This technique is most suitable for networks with small switches, such as the Baseline network and the Benes network. For the Clos network, whose switches may be too large for the SFT, another technique is developed to produce the Fault-Tolerant Clos (FTC) network. In the FTC, one switch is added to each stage. The two techniques are described and thoroughly analyzed.

  1. Satisfiability Test with Synchronous Simulated Annealing on the Fujitsu AP1000 Massively-Parallel Multiprocessor

    NASA Technical Reports Server (NTRS)

    Sohn, Andrew; Biswas, Rupak

    1996-01-01

    Solving the hard Satisfiability Problem is time consuming even for modest-sized problem instances. Solving the Random L-SAT Problem is especially difficult due to the ratio of clauses to variables. This report presents a parallel synchronous simulated annealing method for solving the Random L-SAT Problem on a large-scale distributed-memory multiprocessor. In particular, we use a parallel synchronous simulated annealing procedure, called Generalized Speculative Computation, which guarantees the same decision sequence as sequential simulated annealing. To demonstrate the performance of the parallel method, we have selected problem instances varying in size from 100-variables/425-clauses to 5000-variables/21,250-clauses. Experimental results on the AP1000 multiprocessor indicate that our approach can satisfy 99.9 percent of the clauses while giving almost a 70-fold speedup on 500 processors.
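
    For orientation, a sequential simulated annealing loop for a random satisfiability instance might look like the Python sketch below. It is a plain textbook version and does not implement Generalized Speculative Computation or any parallelism. The clause length of 3, the cooling schedule, and all names are assumptions chosen for illustration; the instance size keeps the 4.25 clause-to-variable ratio implied by the problem sizes quoted in the abstract.

```python
import math
import random

def random_instance(num_vars, num_clauses, literals_per_clause, rng):
    """Random clauses; a literal is (variable index, wanted truth value)."""
    return [
        [(v, rng.random() < 0.5)
         for v in rng.sample(range(num_vars), literals_per_clause)]
        for _ in range(num_clauses)
    ]

def count_satisfied(clauses, assignment):
    """Number of clauses with at least one literal made true."""
    return sum(
        any(assignment[v] == wanted for v, wanted in clause)
        for clause in clauses
    )

def anneal(clauses, num_vars, rng, steps=5_000, t_start=2.0, t_end=0.05):
    assignment = [rng.random() < 0.5 for _ in range(num_vars)]
    score = count_satisfied(clauses, assignment)
    best, best_score = assignment[:], score
    for step in range(steps):
        # Geometric cooling schedule from t_start down towards t_end.
        temperature = t_start * (t_end / t_start) ** (step / steps)
        v = rng.randrange(num_vars)
        assignment[v] = not assignment[v]        # propose a single flip
        new_score = count_satisfied(clauses, assignment)
        delta = new_score - score
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            score = new_score                    # accept the move
            if score > best_score:
                best, best_score = assignment[:], score
        else:
            assignment[v] = not assignment[v]    # reject: undo the flip
    return best, best_score

rng = random.Random(0)
clauses = random_instance(20, 85, 3, rng)        # 85 / 20 = 4.25
best, best_score = anneal(clauses, 20, rng)
print(best_score, "of", len(clauses), "clauses satisfied")
```

    Because each accept-or-reject decision depends on the outcome of the previous one, the loop is inherently sequential, which is why preserving the same decision sequence in a parallel version is a notable property.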

  2. Development and Validation of a Hierarchical Memory Model Incorporating CPU- and Memory-Operation Overlap

    SciTech Connect

    Lubeck, Olaf M.; Luo, Yong; Wasserman, Harvey J.; Bassetti, Federico

    1997-12-31

    Distributed shared memory architectures (DSMs) such as the Origin 2000 are being implemented which extend the concept of single-processor cache hierarchies across an entire physically-distributed multiprocessor machine. The scalability of a DSM machine is inherently tied to memory hierarchy performance, including such issues as latency hiding techniques in the architecture, global cache-coherence protocols, memory consistency models and, of course, the inherent locality of reference in algorithms of interest. In this paper, we characterize application performance with a "memory-centric" view. Using a simple mean value analysis (MVA) strategy and empirical performance data, we infer the contribution of each level in the memory system to the application's overall cycles per instruction (cpi). We account for the overlap of processor execution with memory accesses - a key parameter which is not directly measurable on the Origin systems. We infer the separate contributions of three major architecture features in the memory subsystem of the Origin 2000: cache size, outstanding loads-under-miss, and memory latency.

  3. Bibliography On Multiprocessors And Distributed Processing

    NASA Technical Reports Server (NTRS)

    Miya, Eugene N.

    1988-01-01

    Multiprocessor and Distributed Processing Bibliography package consists of large machine-readable bibliographic data base, which in addition to usual keyword searches, used for producing citations, indexes, and cross-references. Data base contains UNIX(R) "refer" -formatted ASCII data and implemented on any computer running under UNIX(R) operating system. Easily convertible to other operating systems. Requires approximately one megabyte of secondary storage. Bibliography compiled in 1985.

  4. Frontal optimization algorithms for multiprocessor computers

    SciTech Connect

    Sergienko, I.V.; Gulyanitskii, L.F.

    1981-11-01

    The authors describe one of the approaches to the construction of locally optimal optimization algorithms on multiprocessor computers. Algorithms of this type, called frontal, have been realized previously on single-processor computers, although this configuration does not fully exploit the specific features of their computational scheme. Experience with a number of practical discrete optimization problems confirms that the frontal algorithms are highly successful even with single-processor computers. 9 references.

  5. Associative-memory representations emerge as shared spatial patterns of theta activity spanning the primate temporal cortex.

    PubMed

    Nakahara, Kiyoshi; Adachi, Ken; Kawasaki, Keisuke; Matsuo, Takeshi; Sawahata, Hirohito; Majima, Kei; Takeda, Masaki; Sugiyama, Sayaka; Nakata, Ryota; Iijima, Atsuhiko; Tanigawa, Hisashi; Suzuki, Takafumi; Kamitani, Yukiyasu; Hasegawa, Isao

    2016-01-01

    Highly localized neuronal spikes in primate temporal cortex can encode associative memory; however, whether memory formation involves area-wide reorganization of ensemble activity, which often accompanies rhythmicity, or just local microcircuit-level plasticity, remains elusive. Using high-density electrocorticography, we capture local-field potentials spanning the monkey temporal lobes, and show that the visual pair-association (PA) memory is encoded in spatial patterns of theta activity in areas TE, 36, and, partially, in the parahippocampal cortex, but not in the entorhinal cortex. The theta patterns elicited by learned paired associates are distinct between pairs, but similar within pairs. This pattern similarity, emerging through novel PA learning, allows a machine-learning decoder trained on theta patterns elicited by a particular visual item to correctly predict the identity of those elicited by its paired associate. Our results suggest that the formation and sharing of widespread cortical theta patterns via learning-induced reorganization are involved in the mechanisms of associative memory representation. PMID:27282247

  6. Associative-memory representations emerge as shared spatial patterns of theta activity spanning the primate temporal cortex

    PubMed Central

    Nakahara, Kiyoshi; Adachi, Ken; Kawasaki, Keisuke; Matsuo, Takeshi; Sawahata, Hirohito; Majima, Kei; Takeda, Masaki; Sugiyama, Sayaka; Nakata, Ryota; Iijima, Atsuhiko; Tanigawa, Hisashi; Suzuki, Takafumi; Kamitani, Yukiyasu; Hasegawa, Isao

    2016-01-01

    Highly localized neuronal spikes in primate temporal cortex can encode associative memory; however, whether memory formation involves area-wide reorganization of ensemble activity, which often accompanies rhythmicity, or just local microcircuit-level plasticity, remains elusive. Using high-density electrocorticography, we capture local-field potentials spanning the monkey temporal lobes, and show that the visual pair-association (PA) memory is encoded in spatial patterns of theta activity in areas TE, 36, and, partially, in the parahippocampal cortex, but not in the entorhinal cortex. The theta patterns elicited by learned paired associates are distinct between pairs, but similar within pairs. This pattern similarity, emerging through novel PA learning, allows a machine-learning decoder trained on theta patterns elicited by a particular visual item to correctly predict the identity of those elicited by its paired associate. Our results suggest that the formation and sharing of widespread cortical theta patterns via learning-induced reorganization are involved in the mechanisms of associative memory representation. PMID:27282247

  7. Multi-core and Many-core Shared-memory Parallel Raycasting Volume Rendering Optimization and Tuning

    SciTech Connect

    Howison, Mark

    2012-01-31

    Given the computing industry trend of increasing processing capacity by adding more cores to a chip, the focus of this work is tuning the performance of a staple visualization algorithm, raycasting volume rendering, for shared-memory parallelism on multi-core CPUs and many-core GPUs. Our approach is to vary tunable algorithmic settings, along with known algorithmic optimizations and two different memory layouts, and measure performance in terms of absolute runtime and L2 memory cache misses. Our results indicate there is a wide variation in runtime performance on all platforms, as much as 254% for the tunable parameters we test on multi-core CPUs and 265% on many-core GPUs, and the optimal configurations vary across platforms, often in a non-obvious way. For example, our results indicate the optimal configurations on the GPU occur at a crossover point between those that maintain good cache utilization and those that saturate computational throughput. This result is likely to be extremely difficult to predict with an empirical performance model for this particular algorithm because it has an unstructured memory access pattern that varies locally for individual rays and globally for the selected viewpoint. Our results also show that optimal parameters on modern architectures are markedly different from those in previous studies run on older architectures. And, given the dramatic performance variation across platforms for both optimal algorithm settings and performance results, there is a clear benefit for production visualization and analysis codes to adopt a strategy for performance optimization through auto-tuning. These benefits will likely become more pronounced in the future as the number of cores per chip and the cost of moving data through the memory hierarchy both increase.
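
    The auto-tuning strategy argued for here amounts to measuring every candidate configuration and keeping the fastest. A minimal Python sketch of that loop follows; the tiled summation is only a stand-in workload, not a raycaster, and the parameter names `block_size` and `row_major` are hypothetical examples of tunable settings.

```python
import itertools
import time

def autotune(kernel, parameter_space, repeats=3):
    """Time kernel(**params) for every combination; return the fastest."""
    names = list(parameter_space)
    timings = {}
    for combo in itertools.product(*(parameter_space[n] for n in names)):
        params = dict(zip(names, combo))
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(**params)
            best = min(best, time.perf_counter() - start)
        timings[combo] = best  # keep the best of several noisy runs
    fastest = min(timings, key=timings.get)
    return dict(zip(names, fastest)), timings

def blocked_sum(block_size, row_major):
    """Stand-in workload: sum a 2-D grid tile by tile."""
    n = 128
    grid = [[1] * n for _ in range(n)]
    total = 0
    for i0 in range(0, n, block_size):
        for j0 in range(0, n, block_size):
            for a in range(i0, min(i0 + block_size, n)):
                for b in range(j0, min(j0 + block_size, n)):
                    total += grid[a][b] if row_major else grid[b][a]
    return total

space = {"block_size": [8, 32, 128], "row_major": [True, False]}
best_params, timings = autotune(blocked_sum, space)
print(best_params)
```

    The winning configuration depends on the machine that runs the sweep, which is the behavior the abstract reports across platforms.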

  8. Shared and distinct contributions of rostrolateral prefrontal cortex to analogical reasoning and episodic memory retrieval.

    PubMed

    Westphal, Andrew J; Reggente, Nicco; Ito, Kaori L; Rissman, Jesse

    2016-03-01

    Rostrolateral prefrontal cortex (RLPFC) is widely appreciated to support higher cognitive functions, including analogical reasoning and episodic memory retrieval. However, these tasks have typically been studied in isolation, and thus it is unclear whether they involve common or distinct RLPFC mechanisms. Here, we introduce a novel functional magnetic resonance imaging (fMRI) task paradigm to compare brain activity during reasoning and memory tasks while holding bottom-up perceptual stimulation and response demands constant. Univariate analyses on fMRI data from twenty participants identified a large swath of left lateral prefrontal cortex, including RLPFC, that showed common engagement on reasoning trials with valid analogies and memory trials with accurately retrieved source details. Despite broadly overlapping recruitment, multi-voxel activity patterns within left RLPFC reliably differentiated these two trial types, highlighting the presence of at least partially distinct information processing modes. Functional connectivity analyses demonstrated that while left RLPFC showed consistent coupling with the fronto-parietal control network across tasks, its coupling with other cortical areas varied in a task-dependent manner. During the memory task, this region strengthened its connectivity with the default mode and memory retrieval networks, whereas during the reasoning task it coupled more strongly with a nearby left prefrontal region (BA 45) associated with semantic processing, as well as with a superior parietal region associated with visuospatial processing. Taken together, these data suggest a domain-general role for left RLPFC in monitoring and/or integrating task-relevant knowledge representations and showcase how its function cannot solely be attributed to episodic memory or analogical reasoning computations.

  9. Shared Etiology of Phonological Memory and Vocabulary Deficits in School-Age Children

    ERIC Educational Resources Information Center

    Peterson, Robin L.; Pennington, Bruce F.; Samuelsson, Stefan; Byrne, Brian; Olson, Richard K.

    2013-01-01

    Purpose: The goal of this study was to investigate the etiologic basis for the association between deficits in phonological memory (PM) and vocabulary in school-age children. Method: Children with deficits in PM or vocabulary were identified within the International Longitudinal Twin Study (ILTS; Samuelsson et al., 2005). The ILTS includes 1,045…

  10. Multiprocessor sort-merge join algorithm for relational data bases

    SciTech Connect

    Thompson, W.C. III; Ries, D.R.

    1981-01-01

    Using multiprocessor systems for rapid processing of relational operations in relational databases is currently a topic of some interest. This paper presents a new multiprocessor algorithm for merge joins of relations. Considerable gains in speed in comparison with existing algorithms are exhibited by this algorithm.

  11. Multiprocessor sort-merge join algorithm for relational databases

    SciTech Connect

    Thompson, W.C. III; Ries, D.R.

    1981-12-01

    Using multiprocessor systems for rapid processing of relational operations in relational databases is currently a topic of some interest. This paper presents a new multiprocessor algorithm for merge joins of relations. Considerable gains in speed in comparison with existing algorithms are exhibited by this algorithm.
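
    As background, the sequential sort-merge join that such algorithms build on can be sketched in Python as follows. This is the standard single-processor technique, not the multiprocessor algorithm of the paper; one common way to parallelize it, assumed here only as context, is to partition the key range across processors and run the same merge on each partition.

```python
def sort_merge_join(left, right, key):
    """Equi-join two lists of dicts on `key` by sorting and then merging."""
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key value.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row with that key against the whole run.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    result.append({**left[i], **r})
                i += 1
            j = j_end
    return result

staff = [{"dept": 2, "name": "Ann"}, {"dept": 1, "name": "Bob"},
         {"dept": 2, "name": "Cy"}]
depts = [{"dept": 1, "title": "Ops"}, {"dept": 2, "title": "R&D"},
         {"dept": 3, "title": "HR"}]
for row in sort_merge_join(staff, depts, "dept"):
    print(row)
```

    After sorting, each relation is scanned once, apart from rescanning runs of equal keys, so the sort dominates the cost for inputs with few duplicates.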

  13. Neural substrates of shared attention as social memory: A hyperscanning functional magnetic resonance imaging study.

    PubMed

    Koike, Takahiko; Tanabe, Hiroki C; Okazaki, Shuntaro; Nakagawa, Eri; Sasaki, Akihiro T; Shimada, Koji; Sugawara, Sho K; Takahashi, Haruka K; Yoshihara, Kazufumi; Bosch-Bayard, Jorge; Sadato, Norihiro

    2016-01-15

    During a dyadic social interaction, two individuals can share visual attention through gaze, directed to each other (mutual gaze) or to a third person or an object (joint attention). Shared attention is fundamental to dyadic face-to-face interaction, but how attention is shared, retained, and neurally represented in a pair-specific manner has not been well studied. Here, we conducted a two-day hyperscanning functional magnetic resonance imaging study in which pairs of participants performed a real-time mutual gaze task followed by a joint attention task on the first day, and mutual gaze tasks several days later. The joint attention task enhanced eye-blink synchronization, which is believed to be a behavioral index of shared attention. When the same participant pairs underwent mutual gaze without joint attention on the second day, enhanced eye-blink synchronization persisted, and this was positively correlated with inter-individual neural synchronization within the right inferior frontal gyrus. Neural synchronization was also positively correlated with enhanced eye-blink synchronization during the previous joint attention task session. Consistent with the Hebbian association hypothesis, the right inferior frontal gyrus had been activated both by initiating and responding to joint attention. These results indicate that shared attention is represented and retained by pair-specific neural synchronization that cannot be reduced to the individual level. PMID:26514295

  14. Using multiprocessor systems in scientific applications

    SciTech Connect

    Maples, C.; Logan, D.

    1985-08-01

    The MIDAS multiprocessor system is a multi-level, hierarchical structure developed at the Advanced Computer Architecture Laboratory of the University of California's Lawrence Berkeley Laboratory. A two-stage, 11-processor system has been operational for about 18 months. It has been employed to investigate techniques for decomposing a variety of problems and algorithms into a parallel processor environment. The performance results for a number of different applications are discussed. These include scientific data analysis, Monte Carlo calculations, solutions to partial differential equations (using finite-difference methods), and problems in accelerator design. Language extensions and programming techniques for the data-flow architecture are also presented.

  15. Partitioning of regular computation on multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Lee, Fung Fung

    1988-01-01

    Problem partitioning of regular computation over two dimensional meshes on multiprocessor systems is examined. The regular computation model considered involves repetitive evaluation of values at each mesh point with local communication. The computational workload and the communication pattern are the same at each mesh point. The regular computation model arises in numerical solutions of partial differential equations and simulations of cellular automata. Given a communication pattern, a systematic way to generate a family of partitions is presented. The influence of various partitioning schemes on performance is compared on the basis of computation to communication ratio.
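
    The computation-to-communication ratio used as the basis for comparison can be worked through for two familiar partitions of an n x n mesh. The Python sketch below assumes a nearest-neighbour (5-point) communication pattern, an interior processor, and values of n and p that divide evenly; these assumptions and the function names are illustrative and are not taken from the report.

```python
import math

def strip_ratio(n, p):
    """Computation/communication for horizontal strips of n/p rows."""
    computation = (n // p) * n      # mesh points owned by one processor
    communication = 2 * n           # boundary rows sent to two neighbours
    return computation / communication

def block_ratio(n, p):
    """Computation/communication for square blocks of side n/sqrt(p)."""
    side = n // math.isqrt(p)       # assumes p is a perfect square
    computation = side * side
    communication = 4 * side        # one edge per neighbouring block
    return computation / communication

print(strip_ratio(1024, 16))  # 32.0
print(block_ratio(1024, 16))  # 64.0
```

    Under these assumptions the ratio is n/(2p) for strips and n/(4*sqrt(p)) for square blocks, so blocks compare more favourably as the processor count grows.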

  16. SHARE and Share Alike

    ERIC Educational Resources Information Center

    Baird, Jeffrey Marshall

    2006-01-01

    This article describes a reading comprehension program adopted at J. E. Cosgriff Memorial Catholic School in Salt Lake City, Utah. The program is called SHARE: Students Helping Achieve Reading Excellence, and involves seventh and eighth grade students teaching first and second graders reading comprehension strategies learned in middle school…

  17. LDRD final report : managing shared memory data distribution in hybrid HPC applications.

    SciTech Connect

    Merritt, Alexander M.; Pedretti, Kevin Thomas Tauke

    2010-09-01

    MPI is the dominant programming model for distributed memory parallel computers, and is often used as the intra-node programming model on multi-core compute nodes. However, application developers are increasingly turning to hybrid models that use threading within a node and MPI between nodes. In contrast to MPI, most current threaded models do not require application developers to deal explicitly with data locality. With increasing core counts and deeper NUMA hierarchies seen in the upcoming LANL/SNL 'Cielo' capability supercomputer, data distribution poses an upper boundary on intra-node scalability within threaded applications. Data locality therefore has to be identified at runtime using static memory allocation policies such as first-touch or next-touch, or specified by the application user at launch time. We evaluate several existing techniques for managing data distribution using micro-benchmarks on an AMD 'Magny-Cours' system with 24 cores among 4 NUMA domains and argue for the adoption of a dynamic runtime system implemented at the kernel level, employing a novel page table replication scheme to gather per-NUMA domain memory access traces.

  18. Autobiographical Memory Sharing in Everyday Life: Characteristics of a Good Story

    ERIC Educational Resources Information Center

    Baron, Jacqueline M.; Bluck, Susan

    2009-01-01

    Storytelling is a ubiquitous human activity that occurs across the lifespan as part of everyday life. Studies from three disparate literatures suggest that older adults (as compared to younger adults) are (a) less likely to recall story details, (b) more likely to go off-target when sharing stories, and, in contrast, (c) more likely to receive…

  19. Memory

    MedlinePlus

    ... it has to decide what is worth remembering. Memory is the process of storing and then remembering this information. There are different types of memory. Short-term memory stores information for a few ...

  20. A multiprocessor airborne lidar data system

    NASA Technical Reports Server (NTRS)

    Wright, C. W.; Bailey, S. A.; Heath, G. E.; Piazza, C. R.

    1988-01-01

    A new multiprocessor data acquisition system was developed for the existing Airborne Oceanographic Lidar (AOL). This implementation simultaneously utilizes five single board 68010 microcomputers, the UNIX System V operating system, and the real time executive VRTX. The original data acquisition system was implemented on a Hewlett Packard HP 21-MX 16 bit minicomputer using a multi-tasking real time operating system and a mixture of assembly and FORTRAN languages. The present collection of data sources produces data at widely varied rates and requires varied amounts of burdensome real time processing and formatting. It was decided to replace the aging HP 21-MX minicomputer with a multiprocessor system. A new and flexible recording format was devised and implemented to accommodate the constantly changing sensor configuration. A central feature of this data system is the minimization of non-remote sensing bus traffic. Therefore, it is highly desirable that each micro be capable of functioning as much as possible on-card or via private peripherals. The bus is used primarily for the transfer of remote sensing data to or from the buffer queue.

  1. SIERA: A Multiprocessor System For Robotics

    NASA Astrophysics Data System (ADS)

    Kazanzides, Peter; Wasti, Hamid; Wolovich, W. A.

    1987-03-01

    This paper describes SIERA (System for Implementing and Evaluating Robotic Algorithms), which has been developed at the Laboratory for Engineering Man/Machine Systems (LEMS) at Brown University. SIERA was created to satisfy the requirement for a multiprocessor-based development system flexible enough to be used for research into new robotic algorithms, especially those that utilize externally sensed information, such as vision and force. A multiprocessor architecture has been developed that incorporates a tightly coupled bus-based system for real-time servoing and a loosely coupled point-to-point network for less time-critical operations. SIERA is capable of controlling many types of commercially available robots since all input and output is done via general-purpose I/O boards. Suitably constructed robot interface boards are used to condition all feedback signals and to amplify the control outputs to the proper drive levels. We have constructed robot interface boards for the IBM 7565 and PUMA 560 manipulators in LEMS, and have controlled both robots using SIERA. The operating system used for SIERA has been designed to provide maximum flexibility in implementing new robotic algorithms. The concept of programming levels has been introduced to classify the different ways SIERA can be utilized: for simple robot control, for robotics research, and for system enhancements. The main benefit of SIERA is that it is now possible to experimentally implement and evaluate a variety of algorithms in areas such as compliant control, visual servoing, and inverse kinematics.

  2. Developmental improvements in the resolution and capacity of visual working memory share a common source.

    PubMed

    Simmering, Vanessa R; Miller, Hilary E

    2016-08-01

    The nature of visual working memory (VWM) representations is currently a source of debate between characterizations as slot-like versus a flexibly-divided pool of resources. Recently, a dynamic neural field model has been proposed as an alternative account that focuses more on the processes by which VWM representations are formed, maintained, and used in service of behavior. This dynamic model has explained developmental increases in VWM capacity and resolution through strengthening excitatory and inhibitory connections. Simulations of developmental improvements in VWM resolution suggest that one important change is the accuracy of comparisons between items held in memory and new inputs. Thus, the ability to detect changes is a critical component of developmental improvements in VWM performance across tasks, leading to the prediction that capacity and resolution should correlate during childhood. Comparing 5- to 8-year-old children's performance across color discrimination and change detection tasks revealed the predicted correlation between estimates of VWM capacity and resolution, supporting the hypothesis that increasing connectivity underlies improvements in VWM during childhood. These results demonstrate the importance of formalizing the processes that support the use of VWM, rather than focusing solely on the nature of representations. We conclude by considering our results in the broader context of VWM development. PMID:27329264

  3. Shared representations for working memory and mental imagery in early visual cortex.

    PubMed

    Albers, Anke Marit; Kok, Peter; Toni, Ivan; Dijkerman, H Chris; de Lange, Floris P

    2013-08-01

    Early visual areas contain specific information about visual items maintained in working memory, suggesting a role for early visual cortex in more complex cognitive functions [1-4]. It is an open question, however, whether these areas also underlie the ability to internally generate images de novo (i.e., mental imagery). Research on mental imagery has to this point focused mostly on whether mental images activate early sensory areas, with mixed results [5-7]. Recent studies suggest that multivariate pattern analysis of neural activity patterns in visual regions can reveal content-specific representations during cognitive processes, even though overall activation levels are low [1-4]. Here, we used this approach [8, 9] to study item-specific activity patterns in early visual areas (V1-V3) when these items are internally generated. We could reliably decode stimulus identity from neural activity patterns in early visual cortex during both working memory and mental imagery. Crucially, these activity patterns resembled those evoked by bottom-up visual stimulation, suggesting that mental images are indeed "perception-like" in nature. These findings suggest that the visual cortex serves as a dynamic "blackboard" [10, 11] that is used during both bottom-up stimulus processing and top-down internal generation of mental content.

  4. Memory.

    ERIC Educational Resources Information Center

    McKean, Kevin

    1983-01-01

    Discusses current research (including that involving amnesiacs and snails) into the nature of the memory process, differentiating between and providing examples of "fact" memory and "skill" memory. Suggests that three brain parts (thalamus, fornix, mammillary body) are involved in the memory process. (JN)

  5. Design of software for distributed/multiprocessor systems

    SciTech Connect

    Mckelvey, T.R.; Agrawal, D.P.

    1982-01-01

    Software design methodologies for distributed/multiprocessor systems are investigated. Parallelism and multitasking are considered as key issues in the design process. Petri-Nets and precedence graphs are presented as techniques for the modeling of a problem for implementation on a computer system. Techniques using the Petri-Net and precedence graph to decompose the problem model into subsets that may be executed on a distributed/multiprocessor system are presented. These techniques offer a systematic design methodology for the design of distributed/multiprocessor system software. 8 references.
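
    As a rough illustration of decomposing a precedence graph into subsets that can execute concurrently, the sketch below groups tasks into stages in which no task depends on another task of the same stage. The function name, the dictionary representation, and the example task names are assumptions made for illustration; the record does not describe the authors' actual technique at this level of detail.

```python
def parallel_stages(precedence):
    """Group tasks into stages that can run in parallel.

    `precedence` maps every task to the set of tasks that must finish
    before it starts. Every task must appear as a key.
    """
    remaining = {task: set(deps) for task, deps in precedence.items()}
    stages = []
    while remaining:
        # Tasks with no unfinished predecessors are ready to run together.
        ready = sorted(t for t, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("precedence graph contains a cycle")
        stages.append(ready)
        for task in ready:
            del remaining[task]
        for deps in remaining.values():
            deps.difference_update(ready)
    return stages

graph = {
    "read": set(),
    "fft": {"read"},
    "filter": {"read"},
    "merge": {"fft", "filter"},
}
print(parallel_stages(graph))  # [['read'], ['fft', 'filter'], ['merge']]
```

    Each inner list could be handed to separate processors, with a synchronization point between consecutive stages.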

  6. Insertion of coherence requests for debugging a multiprocessor

    DOEpatents

    Blumrich, Matthias A.; Salapura, Valentina

    2010-02-23

    A method and system are disclosed to insert coherence events in a multiprocessor computer system, and to present those coherence events to the processors of the multiprocessor computer system for analysis and debugging purposes. The coherence events are inserted in the computer system by adding one or more special insert registers. By writing into the insert registers, coherence events are inserted in the multiprocessor system as if they were generated by the normal coherence protocol. Once these coherence events are processed, the processing of coherence events can continue in the normal operation mode.

  7. Directions in parallel programming: HPF, shared virtual memory and object parallelism in pC++

    NASA Technical Reports Server (NTRS)

    Bodin, Francois; Priol, Thierry; Mehrotra, Piyush; Gannon, Dennis

    1994-01-01

    Fortran and C++ are the dominant programming languages used in scientific computation. Consequently, extensions to these languages are the most popular for programming massively parallel computers. We discuss two such approaches to parallel Fortran and one approach to C++. The High Performance Fortran Forum has designed HPF with the intent of supporting data parallelism on Fortran 90 applications. HPF works by asking the user to help the compiler distribute and align the data structures with the distributed memory modules in the system. Fortran-S takes a different approach in which the data distribution is managed by the operating system and the user provides annotations to indicate parallel control regions. In the case of C++, we look at pC++ which is based on a concurrent aggregate parallel model.

  8. Mapping of H.264 decoding on a multiprocessor architecture

    NASA Astrophysics Data System (ADS)

    van der Tol, Erik B.; Jaspers, Egbert G.; Gelderblom, Rob H.

    2003-05-01

    Due to the increasing significance of development costs in the competitive domain of high-volume consumer electronics, generic solutions are required to enable reuse of the design effort and to increase the potential market volume. As a result, Systems-on-Chip (SoCs) contain a growing number of fully programmable media processing devices as opposed to application-specific systems, which offered the most attractive solutions due to a high performance density. The following motivates this trend. First, SoCs are increasingly dominated by their communication infrastructure and embedded memory, thereby making the cost of the functional units less significant. Moreover, the continuously growing design costs require generic solutions that can be applied over a broad product range. Hence, powerful programmable SoCs are becoming increasingly attractive. However, to enable power-efficient designs that are also scalable over the advancing VLSI technology, parallelism should be fully exploited. Both task-level and instruction-level parallelism can be provided by means of e.g. a VLIW multiprocessor architecture. To provide the above-mentioned scalability, we propose to partition the data over the processors, instead of traditional functional partitioning. An advantage of this approach is the inherent locality of data, which is extremely important for communication-efficient software implementations. Consequently, a software implementation is discussed, enabling e.g. SD resolution H.264 decoding with a two-processor architecture, whereas High-Definition (HD) decoding can be achieved with an eight-processor system, executing the same software. Experimental results show that data communication is reduced considerably, by up to 65%, directly improving the overall performance. Apart from considerable improvement in memory bandwidth, this novel concept of partitioning offers a natural approach for optimally balancing the load of all processors, thereby further improving the

  9. A class-hierarchical, object-oriented approach to virtual memory management

    NASA Technical Reports Server (NTRS)

    Russo, Vincent F.; Campbell, Roy H.; Johnston, Gary M.

    1989-01-01

    The Choices family of operating systems exploits class hierarchies and object-oriented programming to facilitate the construction of customized operating systems for shared memory and networked multiprocessors. The software is being used in the Tapestry laboratory to study the performance of algorithms, mechanisms, and policies for parallel systems. Described here are the architectural design and class hierarchy of the Choices virtual memory management system. The software and hardware mechanisms and policies of a virtual memory system implement a memory hierarchy that exploits the trade-off between response times and storage capacities. In Choices, the notion of a memory hierarchy is captured by abstract classes. Concrete subclasses of those abstractions implement a virtual address space, segmentation, paging, physical memory management, secondary storage, and remote (that is, networked) storage. Captured in the notion of a memory hierarchy are classes that represent memory objects. These classes provide a storage mechanism that contains encapsulated data and have methods to read or write the memory object. Each of these classes provides specializations to represent the memory hierarchy.

  10. A New coscheduling technique for a cluster of symmetric multiprocessors

    SciTech Connect

    Yoo, A B; Jette, M A

    2000-04-17

    Coscheduling is essential for obtaining good performance in a time-shared symmetric multiprocessor (SMP) cluster environment. However, the most common technique, gang scheduling, has limitations such as poor scalability and vulnerability to faults mainly due to explicit synchronization between its components. A decentralized approach called dynamic coscheduling (DCS) has been shown to be effective for network of workstations (NOW), but this technique is not suitable for the workloads on a very large SMP-cluster with thousands of processors. Furthermore, its implementation can be prohibitively expensive for such a large-scale machine. In this paper, we propose a novel coscheduling technique based on the DCS approach which can achieve coscheduling on very large SMP-clusters in a scalable, efficient, and cost-effective way. In the proposed technique, each local scheduler achieves coscheduling based upon message traffic between the components of parallel jobs. Message trapping is carried out at the user-level, eliminating the need for unsupported hardware or device-level programming. A sending process attaches its status to outgoing messages so local schedulers on remote nodes can make more intelligent scheduling decisions. Once scheduled, processes are guaranteed some minimum period of time to execute. This provides an opportunity to synchronize the parallel job's components across all nodes and achieve good program performance. The results from a performance study reveal that the proposed technique is a promising approach that can reduce response time significantly over uncoordinated scheduling.

  11. Trajectory optimization on multiprocessors - A comparison of three implementation strategies

    NASA Astrophysics Data System (ADS)

    Summerset, Twain K.; Chowkwanyun, Raymond M.

    The optimization of atmospheric flight vehicle trajectories can require the simulation of several thousand individual trajectories. Such a task can be extremely time consuming if simulating each trajectory requires numerically integrating a set of nonlinear differential equations. This traditional approach, which may require many hours' worth of analysis on a time-shared computer facility, is a bottleneck in space mission planning and limits the number of trajectory design options a mission planner can evaluate. To achieve marked reductions in trajectory design solution times, parallel optimization techniques are proposed. In this paper, three strategies for implementing trajectory optimization methods on multiprocessors will be compared. The comparisons will be illustrated through four trajectory design examples. In the first two examples, maximum reentry downrange and crossrange optimal control problems are posed for a generic maneuvering aerodynamic space vehicle. The third example is Troesch's problem, while the fourth example is the classic Brachistochrone problem. Each example is posed as a two-point boundary value problem whose solution can be expressed as the solution to a set of nonlinear equations.

  12. Memory Benchmarks for SMP-Based High Performance Parallel Computers

    SciTech Connect

    Yoo, A B; de Supinski, B; Mueller, F; McKee, S A

    2001-11-20

    As the speed gap between CPU and main memory continues to grow, memory accesses increasingly dominate the performance of many applications. The problem is particularly acute for symmetric multiprocessor (SMP) systems, where the shared memory may be accessed concurrently by a group of threads running on separate CPUs. Unfortunately, several key issues governing memory system performance in current systems are not well understood. Complex interactions between the levels of the memory hierarchy, buses or switches, DRAM back-ends, system software, and application access patterns can make it difficult to pinpoint bottlenecks and determine appropriate optimizations, and the situation is even more complex for SMP systems. To partially address this problem, we formulated a set of multi-threaded microbenchmarks for characterizing and measuring the performance of the underlying memory system in SMP-based high-performance computers. We report our use of these microbenchmarks on two important SMP-based machines. This paper has four primary contributions. First, we introduce a microbenchmark suite to systematically assess and compare the performance of different levels in SMP memory hierarchies. Second, we present a new tool based on hardware performance monitors to determine a wide array of memory system characteristics, such as cache sizes, quickly and easily; by using this tool, memory performance studies can be targeted to the full spectrum of performance regimes with many fewer data points than is otherwise required. Third, we present experimental results indicating that the performance of applications with large memory footprints remains largely constrained by memory. Fourth, we demonstrate that thread-level parallelism further degrades memory performance, even for the latest SMPs with hardware prefetching and switch-based memory interconnects.

  13. Clocking and synchronization circuits in multiprocessor systems

    SciTech Connect

    Jeong, D.K.

    1989-01-01

    Microprocessors based on RISC (Reduced Instruction Set Computer) concepts have demonstrated an ability to provide more computing power at a given level of integration than conventional microprocessors. The next step is multiprocessors composed of RISC processing elements. Communication bandwidth among such microprocessors is critical in achieving efficient hardware utilization. This thesis focuses on the communication capability of VLSI circuits and presents new circuit techniques as a guide to build an interconnection network of VLSI microprocessors. Circuit techniques for PLL-based clock generation are described along with stability criteria. The main objective of the circuit is to realize a zero delay buffer. Experimental results show the feasibility of such circuits in VLSI. Synchronizer circuit configurations in both bipolar and MOS technology that best utilize each device, or overcome the technology limit using a bandwidth doubling technique are shown. Interface techniques including handshake mechanisms in such a system are also described.

  14. Efficient hierarchical interconnection for multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Wei, Sizheng; Levy, Saul

    1992-01-01

    The authors present a novel approach to the design of a class of hierarchical interconnection networks for multiprocessor systems. This approach, based on an architecture providing separate networks for each level, gives a general and flexible way to construct efficient hierarchical networks. The performance and cost-effectiveness of the resulting networks are analyzed and compared in detail, using both unbuffered and buffered network models. It is shown that, if the design parameters are determined based on the degree of locality, the cost-effectiveness of a hierarchical network can be significantly improved. In addition, the authors investigate how to construct a cost-effective hierarchical network by determining appropriate design parameters. Two associated algorithms are developed for this purpose.

  15. High throughput network for multiprocessor interconnections

    NASA Astrophysics Data System (ADS)

    Raatikainen, Pertti; Zidbeck, Juha

    1993-05-01

    Multiprocessor architectures are needed to support modern broadband applications, since traditional bus structures are not capable of providing high throughput. New bus structures are needed, especially in the area of network components and terminals. A study to find an efficient and cost-effective interconnection topology for future high-speed products is presented. The most common bus topologies are introduced, and their characteristics are estimated to decide which one of them offers the best performance and the lowest implementation cost. The ring topology is chosen to be studied in more detail. Four competing bus access schemes for the high throughput ring are introduced, as well as simulation models for each of them. Using transfer delay and throughput results, and keeping the implementation point of view in mind, the best candidate is selected for further study and experimentation in the succeeding research project.

  16. VME rollback hardware for time warp multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Robb, Michael J.; Buzzell, Calvin A.

    1992-01-01

    The purpose of the research effort is to develop and demonstrate innovative hardware to implement specific rollback and timing functions required for efficient queue management and precision timekeeping in multiprocessor discrete event simulations. The previously completed phase 1 effort demonstrated the technical feasibility of building hardware modules which eliminate the state saving overhead of the Time Warp paradigm used in distributed simulations on multiprocessor systems. The current phase 2 effort will build multiple pre-production rollback hardware modules integrated with a network of Sun workstations, and the integrated system will be tested by executing a Time Warp simulation. The rollback hardware will be designed to interface with the greatest number of multiprocessor systems possible. The authors believe that the rollback hardware will provide for significant speedup of large scale discrete event simulation problems and allow multiprocessors using Time Warp to dramatically increase performance.
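
    For context, the state saving and rollback that such hardware is meant to take over can be sketched in software. The following is a minimal, simplified illustration only: the class name, the tuple-based state, and the event format are assumptions, and anti-messages, which a full Time Warp implementation needs, are omitted.

```python
class LogicalProcess:
    """Toy Time Warp logical process with software state saving."""

    def __init__(self):
        self.state = ()           # ordered labels of the events applied so far
        self.lvt = 0              # local virtual time
        self.saved = [(0, ())]    # snapshots: (timestamp, state after that event)
        self.processed = []       # events applied so far: (timestamp, label)

    def _apply(self, ts, label):
        self.state = self.state + (label,)
        self.lvt = ts
        self.processed.append((ts, label))
        self.saved.append((ts, self.state))   # state saving on every event

    def receive(self, ts, label):
        if ts >= self.lvt:
            self._apply(ts, label)            # event arrives in timestamp order
            return
        # Straggler: roll back to the last snapshot not later than ts,
        # apply the straggler, then re-execute the undone events.
        redo = [e for e in self.processed if e[0] > ts]
        self.processed = [e for e in self.processed if e[0] <= ts]
        while self.saved[-1][0] > ts:
            self.saved.pop()
        self.lvt, self.state = self.saved[-1]
        self._apply(ts, label)
        for event in redo:
            self._apply(*event)

lp = LogicalProcess()
lp.receive(10, "a")
lp.receive(30, "c")
lp.receive(20, "b")   # arrives late and triggers a rollback
print(lp.state)       # ('a', 'b', 'c')
```

    The snapshot taken on every event is the overhead in question: in this sketch it costs a copy per event even when no rollback ever occurs.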

  17. Hardware for a real-time multiprocessor simulator

    NASA Technical Reports Server (NTRS)

    Blech, R. A.; Arpasi, D. J.

    1984-01-01

    The hardware for a real time multiprocessor simulator (RTMPS) developed at the NASA Lewis Research Center is described. The RTMPS is a multiple microprocessor system used to investigate the application of parallel processing concepts to real time simulation. It is designed to provide flexible data exchange paths between processors by using off the shelf microcomputer boards and minimal customized interfacing. A dedicated operator interface allows easy setup of the simulator and quick interpreting of simulation data. Simulations for the RTMPS are coded in a NASA designed real time multiprocessor language (RTMPL). This language is high level and geared to the multiprocessor environment. A real time multiprocessor operating system (RTMPOS) has also been developed that provides a user friendly operator interface. The RTMPS and supporting software are currently operational and are being evaluated at Lewis. The results of this evaluation will be used to specify the design of an optimized parallel processing system for real time simulation of dynamic systems.

  18. Exploration of SMP-Aware DAO Memory Performance Issues-Final Report 2002

    SciTech Connect

    de Supinski, B R; Yoo, A; McKee, S A; Schulz, M; Mohan, T

    2003-02-04

    The performance of many LLNL applications is dominated by the cost of main memory accesses. Worse, many current trends in computer architecture will lead to substantial degradation of the percentage of peak performance obtained by these codes. This project yields novel techniques that alleviate this problem in SMP-based systems, which are common at LLNL. Further, our techniques will complement other emerging mechanisms for improving memory system performance, such as processor-in-memory. The exploration of existing dynamic access ordering (DAO) mechanisms adapted to SMPs and the development of new memory performance optimization techniques will lead to significant improvements in run times for LLNL applications on future computing platforms, effectively increasing the size of the platform. In this project, we have focused on a range of techniques to overcome the performance bottleneck of current multiprocessor systems and to increase the single-node efficiency. These efforts include the design and implementation of a toolset to analyze memory access patterns of applications, the exploration of regularity metrics and their use to classify code behavior, and a set of microbenchmarks to assess and quantify the performance of SMP memory systems. We will make these tools available to the general laboratory user community to help the evaluation and optimization of LLNL applications. In addition, we explored the use of Dynamic Access Ordering (DAO) techniques in the realm of shared memory multiprocessors. The most critical part of the latter is the need to maintain coherence among reordered accesses due to possible aliasing. We have worked on several design alternatives to guarantee consistency in such systems without changing the user environment. This guarantees that such novel memory systems will be directly applicable for existing and future HPC codes at LLNL.

  19. File-System Workload on a Scientific Multiprocessor

    NASA Technical Reports Server (NTRS)

    Kotz, David; Nieuwejaar, Nils

    1995-01-01

    Many scientific applications have intense computational and I/O requirements. Although multiprocessors have permitted astounding increases in computational performance, the formidable I/O needs of these applications cannot be met by current multiprocessors and their I/O subsystems. To prevent I/O subsystems from forever bottlenecking multiprocessors and limiting the range of feasible applications, new I/O subsystems must be designed. The successful design of computer systems (both hardware and software) depends on a thorough understanding of their intended use. A system designer optimizes the policies and mechanisms for the cases expected to be most common in the user's workload. In the case of multiprocessor file systems, however, designers have been forced to build file systems based only on speculation about how they would be used, extrapolating from file-system characterizations of general-purpose workloads on uniprocessor and distributed systems or scientific workloads on vector supercomputers (see sidebar on related work). To help these system designers, in June 1993 we began the Charisma Project, so named because the project sought to characterize I/O in scientific multiprocessor applications from a variety of production parallel computing platforms and sites. The Charisma project is unique in recording individual read and write requests in live, multiprogramming, parallel workloads (rather than from selected or nonparallel applications). In this article, we present the first results from the project: a characterization of the file-system workload on an iPSC/860 multiprocessor running production, parallel scientific applications at NASA's Ames Research Center.

  1. Selection in spatial working memory is independent of perceptual selective attention, but they interact in a shared spatial priority map.

    PubMed

    Hedge, Craig; Oberauer, Klaus; Leonards, Ute

    2015-11-01

    We examined the relationship between the attentional selection of perceptual information and of information in working memory (WM) through four experiments, using a spatial WM-updating task. Participants remembered the locations of two objects in a matrix and worked through a sequence of updating operations, each mentally shifting one dot to a new location according to an arrow cue. Repeatedly updating the same object in two successive steps is typically faster than switching to the other object; this object switch cost reflects the shifting of attention in WM. In Experiment 1, the arrows were presented in random peripheral locations, drawing perceptual attention away from the selected object in WM. This manipulation did not eliminate the object switch cost, indicating that the mechanisms of perceptual selection do not underlie selection in WM. Experiments 2a and 2b corroborated the independence of selection observed in Experiment 1, but showed a benefit to reaction times when the placement of the arrow cue was aligned with the locations of relevant objects in WM. Experiment 2c showed that the same benefit also occurs when participants are not able to mark an updating location through eye fixations. Together, these data can be accounted for by a framework in which perceptual selection and selection in WM are separate mechanisms that interact through a shared spatial priority map. PMID:26341873

  2. A fault-tolerant multiprocessor architecture for aircraft, volume 1. [autopilot configuration]

    NASA Technical Reports Server (NTRS)

    Smith, T. B.; Hopkins, A. L.; Taylor, W.; Ausrotas, R. A.; Lala, J. H.; Hanley, L. D.; Martin, J. H.

    1978-01-01

    A fault-tolerant multiprocessor architecture is reported. This architecture, together with a comprehensive information system architecture, has important potential for future aircraft applications. A preliminary definition and assessment of a suitable multiprocessor architecture for such applications is developed.

  3. Multitasking runtime systems for the Cedar Multiprocessor

    SciTech Connect

    Guzzi, M.D.

    1986-07-01

    The programming of a MIMD machine is more complex than for SISD and SIMD machines. The multiple computational resources of the machine must be made available to the programming language compiler and to the programmer so that multitasking programs may be written. This thesis will explore the additional complexity of programming a MIMD machine, the Cedar Multiprocessor specifically, and the multitasking runtime system necessary to provide multitasking resources to the user. First, the problem will be well defined: the Cedar machine, its operating system, the programming language, and multitasking concepts will be described. Second, a solution to the problem, called macrotasking, will be proposed. This solution provides multitasking facilities to the programmer at a very coarse level with many visible machine dependencies. Third, an alternate solution, called microtasking, will be proposed. This solution provides multitasking facilities of a much finer grain. This solution does not depend so rigidly on the specific architecture of the machine. Finally, the two solutions will be compared for effectiveness. 12 refs., 16 figs.

  4. Multiprocessor scheduling problem with machine constraints

    NASA Astrophysics Data System (ADS)

    He, Yong; Tan, Zhiyi

    2001-09-01

    This paper investigates multiprocessor scheduling with machine constraints, which has many applications in flexible manufacturing systems and in VLSI chip design. Machines have different starting times and each machine can schedule at most k jobs in a period. The objective is to minimize the makespan. For this strongly NP-hard problem, it is important to design near-optimal approximation algorithms. It is known that the Modified LPT algorithm has a worst-case ratio of 3/2 - 1/(2m) for k = 2, where m is the number of machines. For k > 2, no good algorithm has been reported in the literature. In this paper, we prove that the worst-case ratio of Modified LPT is less than 2. We further present an approximation algorithm, Matching, and show that it has a worst-case ratio of 2 - 1/m for every k > 2. By introducing parameters, we obtain two better worst-case ratios, which show that the Matching algorithm is near optimal for two special cases.
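
    The problem setting above (machines with different starting times, at most k jobs per machine, makespan objective) can be illustrated with a plain greedy rule. The sketch below is not the paper's Modified LPT or Matching algorithm, whose details the abstract does not give; it is ordinary longest-processing-time-first scheduling with a job cap, and the function name and sample data are hypothetical.

```python
import heapq

def lpt_with_job_cap(jobs, start_times, k):
    """Greedy LPT sketch: take jobs longest first and place each on the
    machine that becomes free earliest and still holds fewer than k jobs.
    Illustrative only; not the Modified LPT analyzed in the paper."""
    if len(jobs) > k * len(start_times):
        raise ValueError("more jobs than the k * m available slots")
    # Heap entries: (time the machine becomes free, machine index, jobs held).
    heap = [(t, i, 0) for i, t in enumerate(start_times)]
    heapq.heapify(heap)
    assignment = [[] for _ in start_times]
    finish = list(start_times)
    for p in sorted(jobs, reverse=True):
        free_at, i, count = heapq.heappop(heap)
        assignment[i].append(p)
        finish[i] = free_at + p
        if count + 1 < k:  # a machine at its cap is not offered again
            heapq.heappush(heap, (finish[i], i, count + 1))
    # Makespan: latest completion time over machines that received work.
    makespan = max((finish[i] for i, a in enumerate(assignment) if a), default=0)
    return makespan, assignment

makespan, assignment = lpt_with_job_cap([7, 5, 4, 3, 2], start_times=[0, 2], k=3)
print(makespan, assignment)
```

    The cap is enforced by simply not returning a full machine to the heap, so the greedy choice is always made among machines that can still accept a job.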

  5. Prefetching in file systems for MIMD multiprocessors

    NASA Technical Reports Server (NTRS)

    Kotz, David F.; Ellis, Carla Schlatter

    1990-01-01

    The question of whether prefetching blocks of the file into the block cache can effectively reduce overall execution time of a parallel computation, even under favorable assumptions, is considered. Experiments have been conducted with an interleaved file system testbed on the Butterfly Plus multiprocessor. Results of these experiments suggest that (1) the hit ratio, the accepted measure in traditional caching studies, may not be an adequate measure of performance when the workload consists of parallel computations and parallel file access patterns, (2) caching with prefetching can significantly improve the hit ratio and the average time to perform an I/O (input/output) operation, and (3) an improvement in overall execution time has been observed in most cases. In spite of these gains, prefetching sometimes results in increased execution times (a negative result, given the optimistic nature of the study). The authors explore why it is not trivial to translate savings on individual I/O requests into consistently better overall performance and identify the key problems that need to be addressed in order to improve the potential of prefetching techniques in this environment.
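
    To make the hit-ratio measure in point (2) concrete, the following is a toy single-process model, not the authors' testbed: an LRU block cache with one-block lookahead, both of which are assumptions made here for illustration. It models no I/O time at all and treats a prefetched block as immediately available, so it says nothing about execution time, which is the distinction the abstract draws.

```python
from collections import OrderedDict

def hit_ratio(accesses, cache_blocks, prefetch=False):
    """Fraction of block accesses served from a small LRU cache.
    With prefetch=True, block b+1 is brought in after every access to b
    (hypothetical one-block lookahead policy)."""
    accesses = list(accesses)
    if not accesses:
        return 0.0
    cache = OrderedDict()  # keys are block numbers, ordered oldest first
    hits = 0

    def bring_in(block):
        cache[block] = True
        cache.move_to_end(block)          # mark as most recently used
        while len(cache) > cache_blocks:  # evict least recently used
            cache.popitem(last=False)

    for block in accesses:
        if block in cache:
            hits += 1
        bring_in(block)
        if prefetch:
            bring_in(block + 1)
    return hits / len(accesses)

sequential = range(10)
print(hit_ratio(sequential, cache_blocks=4))
print(hit_ratio(sequential, cache_blocks=4, prefetch=True))
```

    On a purely sequential scan the cache alone never hits, while lookahead turns every access after the first into a hit; whether that shortens the run depends on timing effects this sketch leaves out.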

  6. Interrupt handling in a multiprocessor computing system

    SciTech Connect

    D'Amico, L.W.; Guyer, J.M.

    1989-01-03

    A multiprocessor computing system is described comprising: a system bus, including an address bus for carrying an address phase of an instruction and a data bus for carrying a data phase of an instruction; a plurality of processing units connected to the system bus, each processing unit including means for generating broadcast interrupt origin request instructions on the system bus; asynchronous input/output channel controllers connected to the system bus, each of the input/output channel controllers including means for generating a synchronizing signal in response to completion of an address phase of a broadcast instruction on the system bus, and corresponding to a different one of the processing units connected through each of the input/output channel controllers, the input/output channel controllers being arranged on the priority lines in order of priority, the priority lines being gated in an input/output channel controller so that priority is asserted over all lower priority input/output channel controllers on a priority line by an input/output channel controller if the input/output channel controller has an interrupt pending in the input/output channel controller for the processing unit corresponding to the priority line.

  7. FTMP (Fault Tolerant Multiprocessor) programmer's manual

    NASA Technical Reports Server (NTRS)

    Feather, F. E.; Liceaga, C. A.; Padilla, P. A.

    1986-01-01

    The Fault Tolerant Multiprocessor (FTMP) computer system was constructed using the Rockwell/Collins CAPS-6 processor. It is installed in the Avionics Integration Research Laboratory (AIRLAB) of NASA Langley Research Center. It is hosted by AIRLAB's System 10, a VAX 11/750, for the loading of programs and experimentation. The FTMP support software includes a cross compiler for a high level language called Automated Engineering Design (AED) System, an assembler for the CAPS-6 processor assembly language, and a linker. Access to this support software is through an automated remote access facility on the VAX which relieves the user of the burden of learning how to use the IBM 4381. This manual is a compilation of information about the FTMP support environment. It explains the FTMP software and support environment along with many of the finer points of running programs on FTMP. This will be helpful to the researcher trying to run an experiment on FTMP and even to the person probing FTMP with fault injections. Much of the information in this manual can be found in other sources; we are only attempting to bring together the basic points in a single source. If the reader should need points clarified, there is a list of support documentation in the back of this manual.

  8. Domain-general involvement of the posterior frontolateral cortex in time-based resource-sharing in working memory: An fMRI study.

    PubMed

    Vergauwe, Evie; Hartstra, Egbert; Barrouillet, Pierre; Brass, Marcel

    2015-07-15

    Working memory is often defined in cognitive psychology as a system devoted to the simultaneous processing and maintenance of information. In line with the time-based resource-sharing model of working memory (TBRS; Barrouillet and Camos, 2015; Barrouillet et al., 2004), there is accumulating evidence that, when memory items have to be maintained while performing a concurrent activity, memory performance depends on the cognitive load of this activity, independently of the domain involved. The present study used fMRI to identify regions in the brain that are sensitive to variations in cognitive load in a domain-general way. More precisely, we aimed at identifying brain areas that activate during maintenance of memory items as a direct function of the cognitive load induced by both verbal and spatial concurrent tasks. Results show that the right IFJ and bilateral SPL/IPS are the only areas showing an increased involvement as cognitive load increases and do so in a domain general manner. When correlating the fMRI signal with the approximated cognitive load as defined by the TBRS model, it was shown that the main focus of the cognitive load-related activation is located in the right IFJ. The present findings indicate that the IFJ makes domain-general contributions to time-based resource-sharing in working memory and allowed us to generate the novel hypothesis by which the IFJ might be the neural basis for the process of rapid switching. We argue that the IFJ might be a crucial part of a central attentional bottleneck in the brain because of its inability to upload more than one task rule at once.

  9. Clocking and synchronization circuits in multiprocessor systems

    SciTech Connect

    Jeong, Deog-Kyoon.

    1989-01-01

    Microprocessors based on RISC (Reduced Instruction Set Computer) concepts have demonstrated an ability to provide more computing power at a given level of integration than conventional microprocessors. The next step is multiprocessors composed of RISC processing elements. Communication bandwidth among such microprocessors is critical in achieving efficient hardware utilization. This thesis focuses on the communication capability of VLSI circuits and presents new circuit techniques as a guide to build an interconnection network of VLSI microprocessors. Two of the most prominent problems in a synchronous system, which most of the current computer systems are based on, have been clock skew and synchronization failure. A new concept called self-timed systems solves such problems but has not been accepted in microprocessor implementations yet because of its complex design procedure and increased overhead. With this in mind, this thesis concentrates on a system in which individual synchronous subsystems are connected asynchronously. Synchronous subsystems operate with a better control over clock skew using a phase locked loop (PLL) technique. Communication among subsystems is done asynchronously with a controlled synchronization failure rate. One advantage is that conventional VLSI design methodologies which are more efficient can still be applied. Circuit techniques for PLL-based clock generation are described along with stability criteria. The main objective of the circuit is to realize a zero delay buffer. Experimental results show the feasibility of such circuits in VLSI. Synchronizer circuit configurations in both bipolar and MOS technology that best utilize each device, or overcome the technology limit using a bandwidth doubling technique are shown. Interface techniques including handshake mechanisms in such a system are also described.

  10. Real-Time Multiprocessor Programming Language (RTMPL) user's manual

    NASA Technical Reports Server (NTRS)

    Arpasi, D. J.

    1985-01-01

    A real-time multiprocessor programming language (RTMPL) has been developed to provide for high-order programming of real-time simulations on systems of distributed computers. RTMPL is a structured, engineering-oriented language. The RTMPL utility supports a variety of multiprocessor configurations and types by generating assembly language programs according to user-specified targeting information. Many programming functions are assumed by the utility (e.g., data transfer and scaling) to reduce the programming chore. This manual describes RTMPL from a user's viewpoint. Source generation, applications, utility operation, and utility output are detailed. An example simulation is generated to illustrate many RTMPL features.

  11. Development and evaluation of a Fault-Tolerant Multiprocessor (FTMP) computer. Volume 2: FTMP software

    NASA Technical Reports Server (NTRS)

    Lala, J. H.; Smith, T. B., III

    1983-01-01

    The software developed for the Fault-Tolerant Multiprocessor (FTMP) is described. The FTMP executive is a timer-interrupt driven dispatcher that schedules iterative tasks which run at 3.125, 12.5, and 25 Hz. Major tasks which run under the executive include system configuration control, flight control, and display. The flight control task includes autopilot and autoland functions for a jet transport aircraft. System Displays include status displays of all hardware elements (processors, memories, I/O ports, buses), failure log displays showing transient and hard faults, and an autopilot display. All software is in a higher order language (AED, an ALGOL derivative). The executive is a fully distributed general purpose executive which automatically balances the load among available processor triads. Provisions for graceful performance degradation under processing overload are an integral part of the scheduling algorithms.
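
    The three task rates quoted above divide evenly into one another (25 Hz is 2 x 12.5 Hz and 8 x 3.125 Hz), which suggests how a timer-driven dispatcher can serve all three from one periodic interrupt. The sketch below assumes a 25 Hz base frame and simple frame counting; the abstract does not describe the FTMP executive at this level, and the task names are hypothetical.

```python
BASE_RATE_HZ = 25.0  # assumed base frame rate (one frame every 40 ms)

def frame_divisor(rate_hz, base_rate_hz=BASE_RATE_HZ):
    """Number of base frames between successive runs of a task."""
    ratio = base_rate_hz / rate_hz
    if ratio < 1 or ratio != int(ratio):
        raise ValueError("task rate must evenly divide the base rate")
    return int(ratio)

def tasks_due(frame, task_rates):
    """Names of the tasks to dispatch on the given frame number."""
    return [name for name, rate in task_rates.items()
            if frame % frame_divisor(rate) == 0]

# Hypothetical task set using the three rates from the abstract.
task_rates = {"fast": 25.0, "medium": 12.5, "slow": 3.125}

for frame in range(8):
    print(frame, tasks_due(frame, task_rates))
```

    Over any eight consecutive frames the fast task runs eight times, the medium task four times, and the slow task once, which reproduces the 25, 12.5, and 3.125 Hz rates.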

  12. Job-mix modeling and system analysis of an aerospace multiprocessor.

    NASA Technical Reports Server (NTRS)

    Mallach, E. G.

    1972-01-01

    An aerospace guidance computer organization, consisting of multiple processors and memory units attached to a central time-multiplexed data bus, is described. A job mix for this type of computer is obtained by analysis of Apollo mission programs. Multiprocessor performance is then analyzed using: 1) queuing theory, under certain 'limiting case' assumptions; 2) Markov process methods; and 3) system simulation. Results of the analyses indicate: 1) Markov process analysis is a useful and efficient predictor of simulation results; 2) efficient job execution is not seriously impaired even when the system is so overloaded that new jobs are inordinately delayed in starting; 3) job scheduling is significant in determining system performance; and 4) a system having many slow processors may or may not perform better than a system of equal power having few fast processors, but will not perform significantly worse.

  13. Dynamically Reconfigurable Multiprocessor System For Scene Segmentation In Histopathology

    NASA Astrophysics Data System (ADS)

    Shoemaker, Richard L.; Stucky, Oliver; Maenner, Reinhard; Thompson, Deborah B.; Griswold, W. G.; Bartels, Peter H.

    1989-06-01

    The Heidelberg Polyp multiprocessor and its application to scene segmentation problems in histopathology are discussed, including ways in which the architecture can be utilized to support expert system-guided scene segmentation software, the system's current performance, and some major improvements currently being made to the system.

  14. Fault tree models for fault tolerant hypercube multiprocessors

    NASA Technical Reports Server (NTRS)

    Boyd, Mark A.; Tuazon, Jezus O.

    1991-01-01

    Three candidate fault tolerant hypercube architectures are modeled, their reliability analyses are compared, and the resulting implications of these methods of incorporating fault tolerance into hypercube multiprocessors are discussed. In the course of performing the reliability analyses, the use of HARP and fault trees in modeling sequence dependent system behaviors is demonstrated.

  15. Memories.

    ERIC Educational Resources Information Center

    Brand, Judith, Ed.

    1998-01-01

    This theme issue of the journal "Exploring" covers the topic of "memories" and describes an exhibition at San Francisco's Exploratorium that ran from May 22, 1998 through January 1999 and that contained over 40 hands-on exhibits, demonstrations, artworks, images, sounds, smells, and tastes that demonstrated and depicted the biological,…

  16. Characterizing parallel file-access patterns on a large-scale multiprocessor

    NASA Technical Reports Server (NTRS)

    Purakayastha, Apratim; Ellis, Carla Schlatter; Kotz, David; Nieuwejaar, Nils; Best, Michael

    1994-01-01

    Rapid increases in the computational speeds of multiprocessors have not been matched by corresponding performance enhancements in the I/O subsystem. To satisfy the large and growing I/O requirements of some parallel scientific applications, we need parallel file systems that can provide high-bandwidth and high-volume data transfer between the I/O subsystem and thousands of processors. Design of such high-performance parallel file systems depends on a thorough grasp of the expected workload. So far there have been no comprehensive usage studies of multiprocessor file systems. Our CHARISMA project intends to fill this void. The first results from our study involve an iPSC/860 at NASA Ames. This paper presents results from a different platform, the CM-5 at the National Center for Supercomputing Applications. The CHARISMA studies are unique because we collect information about every individual read and write request and about the entire mix of applications running on the machines. The results of our trace analysis lead to recommendations for parallel file system design. First, the file system should support efficient concurrent access to many files, and I/O requests from many jobs under varying load conditions. Second, it must efficiently manage large files kept open for long periods. Third, it should expect to see small requests, predominantly sequential access patterns, application-wide synchronous access, no concurrent file-sharing between jobs, appreciable byte and block sharing between processes within jobs, and strong interprocess locality. Finally, the trace data suggest that node-level write caches and collective I/O request interfaces may be useful in certain environments.

  17. Queueing analysis of a canonical model of real-time multiprocessors

    NASA Technical Reports Server (NTRS)

    Krishna, C. M.; Shin, K. G.

    1983-01-01

    A logical classification of multiprocessor structures from the point of view of control applications is presented. A computation of the response time distribution for a canonical model of a real time multiprocessor is presented. The multiprocessor is approximated by a blocking model. Two separate models are derived: one created from the system's point of view, and the other from the point of view of an incoming task.

  18. Analysis of a Multiprocessor Guidance Computer. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Mallach, E. G.

    1969-01-01

    The design of the next generation of spaceborne digital computers is described. It analyzes a possible multiprocessor computer configuration. For the analysis, a set of representative space computing tasks was abstracted from the Lunar Module Guidance Computer programs as executed during the lunar landing, from the Apollo program. This computer performs at this time about 24 concurrent functions, with iteration rates from 10 times per second to once every two seconds. These jobs were tabulated in a machine-independent form, and statistics of the overall job set were obtained. It was concluded, based on a comparison of simulation and Markov results, that the Markov process analysis is accurate in predicting overall trends and in configuration comparisons, but does not provide useful detailed information in specific situations. Using both types of analysis, it was determined that the job scheduling function is a critical one for efficiency of the multiprocessor. It is recommended that research into the area of automatic job scheduling be performed.

  19. VME multiprocessor system for data acquisition at OSIRIS

    SciTech Connect

    Ziem, P.; Kiehne, T.; Beschorner, C.; Drescher, B.; Zahn, J.

    1987-08-01

    A VME multiprocessor system for data acquisition and data display utilizing several MC68XXX based CPUs and VMEbuses is described. The design of the VME system was stimulated by the data handling requirements of experiments using the anti-Compton spectrometer OSIRIS, i.e. data storage on optical disks and on-line accumulation of large 2-dimensional histograms (4096 x 4096 channels). Due to the general approach, the VME system is easily applicable to other nuclear physics experiments.

  20. Plasma physics modeling and the Cray-2 multiprocessor

    SciTech Connect

    Killeen, J.

    1985-01-01

    The importance of computer modeling in the magnetic fusion energy research program is discussed. The need for the most advanced supercomputers is described. To meet the demand for more powerful scientific computers to solve larger and more complicated problems, the computer industry is developing multiprocessors. The role of the Cray-2 in plasma physics modeling is discussed with some examples. 28 refs., 2 figs., 1 tab.

  1. Fault-free performance validation of fault-tolerant multiprocessors

    NASA Technical Reports Server (NTRS)

    Czeck, Edward W.; Feather, Frank E.; Grizzaffi, Ann Marie; Segall, Zary Z.; Siewiorek, Daniel P.

    1987-01-01

    A validation methodology for testing the performance of fault-tolerant computer systems was developed and applied to the Fault-Tolerant Multiprocessor (FTMP) at NASA-Langley's AIRLAB facility. This methodology was claimed to be general enough to apply to any ultrareliable computer system. The goal of this research was to extend the validation methodology and to demonstrate the robustness of the validation methodology by its more extensive application to NASA's Fault-Tolerant Multiprocessor System (FTMP) and to the Software Implemented Fault-Tolerance (SIFT) Computer System. Furthermore, the performance of these two multiprocessors was compared by conducting similar experiments. An analysis of the results shows high level language instruction execution times for both SIFT and FTMP were consistent and predictable, with SIFT having greater throughput. At the operating system level, FTMP consumes 60% of the throughput for its real-time dispatcher and 5% on fault-handling tasks. In contrast, SIFT consumes 16% of its throughput for the dispatcher, but consumes 66% in fault-handling software overhead.

  2. Modelling parallel programs and multiprocessor architectures with AXE

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Fineman, Charles E.

    1991-01-01

    AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user-interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior are described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.

  3. Modeling and measurement of fault-tolerant multiprocessors

    NASA Technical Reports Server (NTRS)

    Shin, K. G.; Woodbury, M. H.; Lee, Y. H.

    1985-01-01

    The workload effects on computer performance are addressed first for a highly reliable unibus multiprocessor used in real-time control. As an approach to studying these effects, a modified Stochastic Petri Net (SPN) is used to describe the synchronous operation of the multiprocessor system. From this model the vital components affecting performance can be determined. However, because of the complexity in solving the modified SPN, a simpler model, i.e., a closed priority queuing network, is constructed that represents the same critical aspects. The use of this model for a specific application requires the partitioning of the workload into job classes. It is shown that the steady state solution of the queuing model directly produces useful results. The use of this model in evaluating an existing system, the Fault Tolerant Multiprocessor (FTMP) at the NASA AIRLAB, is outlined with some experimental results. Also addressed is the technique of measuring fault latency, an important microscopic system parameter. Most related works have assumed no or a negligible fault latency and then performed approximate analyses. To eliminate this deficiency, a new methodology for indirectly measuring fault latency is presented.

  4. SYMNET: an optical interconnection network for scalable high-performance symmetric multiprocessors.

    PubMed

    Louri, Ahmed; Kodi, Avinash Karanth

    2003-06-10

    We address the primary limitation of the bandwidth to satisfy the demands for address transactions in future cache-coherent symmetric multiprocessors (SMPs). It is widely known that the bus speed and the coherence overhead limit the snoop/address bandwidth needed to broadcast address transactions to all processors. As a solution, we propose a scalable address subnetwork called symmetric multiprocessor network (SYMNET) in which address requests and snoop responses of SMPs are implemented optically. SYMNET not only has the ability to pipeline address requests, but also multiple address requests from different processors can propagate through the address subnetwork simultaneously. This is in contrast with all electrical bus-based SMPs, where only a single request is broadcast on the physical address bus at any given point in time. The simultaneous propagation of multiple address requests in SYMNET increases the available address bandwidth and lowers the latency of the network, but the preservation of cache coherence can no longer be maintained with the usual fast snooping protocols. A modified snooping cache-coherence protocol, coherence in SYMNET (COSYM) is introduced to solve the coherence problem. We evaluated SYMNET with a subset of Splash-2 benchmarks and compared it with the electrical bus-based MOESI (modified, owned, exclusive, shared, invalid) protocol. Our simulation studies have shown a 5-66% improvement in execution time for COSYM as compared with MOESI for various applications. Simulations have also shown that the average latency for a transaction to complete by use of COSYM protocol was 5-78% better than the MOESI protocol. SYMNET can scale up to hundreds of processors while still using fast snooping-based cache-coherence protocols, and additional performance gains may be attained with further improvement in optical device technology.

  5. Shared Attention.

    PubMed

    Shteynberg, Garriy

    2015-09-01

    Shared attention is extremely common. In stadiums, public squares, and private living rooms, people attend to the world with others. Humans do so across all sensory modalities, sharing the sights, sounds, tastes, smells, and textures of everyday life with one another. The potential for attending with others has grown considerably with the emergence of mass media technologies, which allow for the sharing of attention in the absence of physical co-presence. In the last several years, studies have begun to outline the conditions under which attending together is consequential for human memory, motivation, judgment, emotion, and behavior. Here, I advance a psychological theory of shared attention, defining its properties as a mental state and outlining its cognitive, affective, and behavioral consequences. I review empirical findings that are uniquely predicted by shared-attention theory and discuss the possibility of integrating shared-attention, social-facilitation, and social-loafing perspectives. Finally, I reflect on what shared-attention theory implies for living in the digital world. PMID:26385997

  6. Input/output system for multiprocessors

    SciTech Connect

    Bernick, D.L.; Chan, K.K.; Chan, W.M.; Dan, Y.F.; Hoang, D.M.; Hussain, Z.; Iswandhi, G.I.; Korpi, J.E.; Sanner, M.W.; Zwangerman, J.A.

    1989-04-11

    A device controller is described, comprising: a first port-input/output controller coupled to a first input/output channel bus; and a second port-input/output controller coupled to a second input/output channel bus; each of the first and second port-input/output controllers having: a first ownership latch means for granting shared ownership of the device controller to a first host processor to provide a first data path on a first I/O channel through the first port I/O controller between the first host processor and any peripheral, and at least a second ownership latch means operative independently of the first ownership latch means for granting shared ownership of the device controller to a second host processor independently of the first port input/output controller to provide a second data path on a second I/O channel through the second port I/O controller between the second host processor and any peripheral devices coupled to the device controller.

  7. Transactional memories: A new abstraction for parallel processing

    SciTech Connect

    Fasel, J.H.; Lubeck, O.M.; Agrawal, D.; Bruno, J.L.; El Abbadi, A.

    1997-12-01

    This is the final report of a three-year, Laboratory Directed Research and Development (LDRD) project at Los Alamos National Laboratory (LANL). Current distributed memory multiprocessor computer systems make the development of parallel programs difficult. From a programmer's perspective, it would be most desirable if the underlying hardware and software could provide the programming abstraction commonly referred to as sequential consistency--a single address space and multiple threads; but enforcement of sequential consistency limits opportunities for architectural and operating system performance optimizations, leading to poor performance. Recently, Herlihy and Moss have introduced a new abstraction called transactional memories for parallel programming. The programming model is shared memory with multiple threads. However, data consistency is obtained through the use of transactions rather than mutual exclusion based on locking. The transaction approach permits the underlying system to exploit the potential parallelism in transaction processing. The authors explore the feasibility of designing parallel programs using the transaction paradigm for data consistency and a barrier type of thread synchronization.
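
    The transaction-based programming model described above can be sketched in software. The example below is a minimal optimistic scheme written for illustration, not the Herlihy and Moss hardware design and not the authors' system: every name in it (TVar, Transaction, atomically) is hypothetical, and the commit step itself uses one short internal lock to validate and publish atomically. Application code holds no locks while it computes; a transaction that read stale data is simply retried.

```python
import threading

class Conflict(Exception):
    """Raised when a transaction observed data that has since changed."""

class TVar:
    """A shared cell with a version number bumped on every committed write."""
    def __init__(self, value):
        self.value = value
        self.version = 0

_commit_lock = threading.Lock()  # internal to the sketch, held only briefly

class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version first observed
        self.writes = {}  # TVar -> value to publish at commit

    def read(self, tvar):
        if tvar in self.writes:          # read-your-own-writes
            return self.writes[tvar]
        with _commit_lock:               # consistent (value, version) pair
            value, version = tvar.value, tvar.version
        if self.reads.setdefault(tvar, version) != version:
            raise Conflict()
        return value

    def write(self, tvar, value):
        self.writes[tvar] = value        # buffered until commit

    def commit(self):
        with _commit_lock:
            for tvar, version in self.reads.items():
                if tvar.version != version:
                    raise Conflict()
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1

def atomically(body):
    """Run body(tx) until it commits without a conflict."""
    while True:
        tx = Transaction()
        try:
            result = body(tx)
            tx.commit()
            return result
        except Conflict:
            continue

counter = TVar(0)

def increment(tx):
    tx.write(counter, tx.read(counter) + 1)

threads = [threading.Thread(target=lambda: [atomically(increment) for _ in range(200)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)
```

    Because each increment is validated against the version it read, concurrent updates are never lost; conflicting transactions pay with a retry instead of waiting on a lock held by application code.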

  8. Development and evaluation of a Fault-Tolerant Multiprocessor (FTMP) computer. Volume 3: FTMP test and evaluation

    NASA Technical Reports Server (NTRS)

    Lala, J. H.; Smith, T. B., III

    1983-01-01

    The experimental test and evaluation of the Fault-Tolerant Multiprocessor (FTMP) is described. Major objectives of this exercise include expanding validation envelope, building confidence in the system, revealing any weaknesses in the architectural concepts and in their execution in hardware and software, and in general, stressing the hardware and software. To this end, pin-level faults were injected into one LRU of the FTMP and the FTMP response was measured in terms of fault detection, isolation, and recovery times. A total of 21,055 stuck-at-0, stuck-at-1 and invert-signal faults were injected in the CPU, memory, bus interface circuits, Bus Guardian Units, and voters and error latches. Of these, 17,418 were detected. At least 80 percent of undetected faults are estimated to be on unused pins. The multiprocessor identified all detected faults correctly and recovered successfully in each case. Total recovery time for all faults averaged a little over one second. This can be reduced to half a second by including appropriate self-tests.
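
    The detection figures quoted above can be turned into a coverage estimate with a few lines of arithmetic. The sketch below uses only the counts from the abstract; the variable names are illustrative, and the bound on faults affecting pins in use simply applies the stated "at least 80 percent" estimate.

```python
# Fault-injection counts quoted in the abstract.
injected = 21_055
detected = 17_418

undetected = injected - detected
coverage = detected / injected

# The abstract estimates that at least 80 percent of undetected faults
# sit on unused pins, so at most 20 percent could affect pins in use.
max_on_used_pins = 0.20 * undetected

print(f"undetected faults: {undetected}")
print(f"raw detection coverage: {coverage:.1%}")
print(f"undetected faults on used pins (upper bound): {max_on_used_pins:.0f}")
```

    Under these figures the raw coverage is a little under 83 percent, and it would be higher if faults on unused pins were excluded from the count.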

  9. Combined shared and distributed memory ab-initio computations of molecular-hydrogen systems in the correlated state: Process pool solution and two-level parallelism

    NASA Astrophysics Data System (ADS)

    Biborski, Andrzej; Kądzielawa, Andrzej P.; Spałek, Józef

    2015-12-01

    An efficient computational scheme devised for investigations of ground state properties of the electronically correlated systems is presented. As an example, (H2)n chain is considered with the long-range electron-electron interactions taken into account. The implemented procedure covers: (i) single-particle Wannier wave-function basis construction in the correlated state, (ii) microscopic parameters calculation, and (iii) ground state energy optimization. The optimization loop is based on highly effective process-pool solution - specific root-workers approach. The hierarchical, two-level parallelism was applied: both shared (by use of Open Multi-Processing) and distributed (by use of Message Passing Interface) memory models were utilized. We discuss in detail the feature that such approach results in a substantial increase of the calculation speed reaching factor of 300 for the fully parallelized solution. The scheme elaborated in detail reflects the situation in which the most demanding task is the single-particle basis optimization.

  10. Fast decision tree-based method to index large DNA-protein sequence databases using hybrid distributed-shared memory programming model.

    PubMed

    Jaber, Khalid Mohammad; Abdullah, Rosni; Rashid, Nur'Aini Abdul

    2014-01-01

    In recent times, the size of biological databases has increased significantly, with the continuous growth in the number of users and rate of queries; such that some databases have reached the terabyte size. There is therefore, the increasing need to access databases at the fastest rates possible. In this paper, the decision tree indexing model (PDTIM) was parallelised, using a hybrid of distributed and shared memory on resident database; with horizontal and vertical growth through Message Passing Interface (MPI) and POSIX Thread (PThread), to accelerate the index building time. The PDTIM was implemented using 1, 2, 4 and 5 processors on 1, 2, 3 and 4 threads respectively. The results show that the hybrid technique improved the speedup, compared to a sequential version. It could be concluded from results that the proposed PDTIM is appropriate for large data sets, in terms of index building time.

  11. Communication Efficient Multi-processor FFT

    NASA Astrophysics Data System (ADS)

    Lennart Johnsson, S.; Jacquemin, Michel; Krawitz, Robert L.

    1992-10-01

    Computing the fast Fourier transform on a distributed memory architecture by a direct pipelined radix-2, a bi-section, or a multisection algorithm, all yield the same communications requirement, if communication for all FFT stages can be performed concurrently, the input data is in normal order, and the data allocation is consecutive. With a cyclic data allocation, or bit-reversed input data and a consecutive allocation, multi-sectioning offers a reduced communications requirement by approximately a factor of two. For a consecutive data allocation, normal input order, a decimation-in-time FFT requires that P/N + d - 2 twiddle factors be stored for P elements distributed evenly over N processors, and the axis that is subject to transformation be distributed over 2^d processors. No communication of twiddle factors is required. The same storage requirements hold for a decimation-in-frequency FFT, bit-reversed input order, and consecutive data allocation. The opposite combination of FFT type and data ordering requires a factor of log_2 N more storage for N processors. The peak performance for a Connection Machine system CM-200 implementation is 12.9 Gflops/s in 32-bit precision, and 10.7 Gflops/s in 64-bit precision for unordered transforms local to each processor. The corresponding execution rates for ordered transforms are 11.1 Gflops/s and 8.5 Gflops/s, respectively. For distributed one- and two-dimensional transforms the peak performance for unordered transforms exceeds 5 Gflops/s in 32-bit precision and 3 Gflops/s in 64-bit precision. Three-dimensional transforms execute at a slightly lower rate. Distributed ordered transforms execute at a rate of about 1/2 to 2/3 of the unordered transforms.

  12. Fault-free validation of a fault-tolerant multiprocessor: Baseline experiments and workload implementation

    NASA Technical Reports Server (NTRS)

    Feather, F.; Siewiorek, D.; Segall, Z.

    1986-01-01

    In the future, aircraft employing active control technology must use highly reliable multiprocessors in order to achieve flight safety. Such computers must be experimentally validated before they are deployed. This project outlines a methodology for doing fault-free validation of reliable multiprocessors. The methodology begins with baseline experiments, which test single phenomena. As experiments progress, tools for performance testing are developed. This report presents the results of interrupt baseline experiments performed on the Fault-Tolerant Multiprocessor (FTMP) at NASA-Langley's AIRLAB. Interrupt-causing exception conditions were tested, and several were found to have unimplemented interrupt handling software while one had an unimplemented interrupt vector. A synthetic workload model for realtime multiprocessors is then developed as an application level performance analysis tool. Details of the workload implementation and calibration are presented. Both the experimental methodology and the synthetic workload model are general enough to be applicable to reliable multi-processors besides FTMP.

  13. Reconfigurable high-speed optoelectronic interconnect technology for multiprocessor computers

    NASA Astrophysics Data System (ADS)

    Cheng, Julian

    1995-06-01

    We describe a compact optoelectronic switching technology for interconnecting multiple computer processors and shared memory modules together through dynamically reconfigurable optical paths to provide simultaneous, high speed communication amongst different nodes. Each switch provides an optical link to other nodes as well as electrical access to an individual processor, and it can perform optical and optoelectronic switching to convert digital data between various electrical and optical input/output formats. This multifunctional switching technology is based on the monolithic integration of arrays of vertical-cavity surface-emitting lasers with photodetectors and heterojunction bipolar transistors. The various digital switching and routing functions, as well as optically cascaded multistage operation, have been experimentally demonstrated.

  14. Software for the ACP (Advanced Computer Program) multiprocessor system

    SciTech Connect

    Biel, J.; Areti, H.; Atac, R.; Cook, A.; Fischler, M.; Gaines, I.; Kaliher, C.; Hance, R.; Husby, D.; Nash, T.

    1987-02-02

    Software has been developed for use with the Fermilab Advanced Computer Program (ACP) multiprocessor system. The software was designed to make a system of a hundred independent node processors as easy to use as a single, powerful CPU. Subroutines have been developed by which a user's host program can send data to and get results from the program running in each of his ACP node processors. Utility programs make it easy to compile and link host and node programs, to debug a node program on an ACP development system, and to submit a debugged program to an ACP production system.

  15. Cache directory look-up re-use as conflict check mechanism for speculative memory requests

    DOEpatents

    Ohmacht, Martin

    2013-09-10

    In a cache memory, energy and other efficiencies can be realized by saving a result of a cache directory lookup for sequential accesses to a same memory address. Where the cache is a point of coherence for speculative execution in a multiprocessor system, with directory lookups serving as the point of conflict detection, such saving becomes particularly advantageous.

  16. Scalable Multiprocessor for High-Speed Computing in Space

    NASA Technical Reports Server (NTRS)

    Lux, James; Lang, Minh; Nishimoto, Kouji; Clark, Douglas; Stosic, Dorothy; Bachmann, Alex; Wilkinson, William; Steffke, Richard

    2004-01-01

    A report discusses the continuing development of a scalable multiprocessor computing system for hard real-time applications aboard a spacecraft. "Hard realtime applications" signifies applications, like real-time radar signal processing, in which the data to be processed are generated at "hundreds" of pulses per second, each pulse "requiring" millions of arithmetic operations. In these applications, the digital processors must be tightly integrated with analog instrumentation (e.g., radar equipment), and data input/output must be synchronized with analog instrumentation, controlled to within fractions of a microsecond. The scalable multiprocessor is a cluster of identical commercial-off-the-shelf generic DSP (digital-signal-processing) computers plus generic interface circuits, including analog-to-digital converters, all controlled by software. The processors are computers interconnected by high-speed serial links. Performance can be increased by adding hardware modules and correspondingly modifying the software. Work is distributed among the processors in a parallel or pipeline fashion by means of a flexible master/slave control and timing scheme. Each processor operates under its own local clock; synchronization is achieved by broadcasting master time signals to all the processors, which compute offsets between the master clock and their local clocks.
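
    The synchronization step described in the last sentence can be sketched as follows. This is a minimal illustration, not the flight software: the `LocalClock` class, its tick values, and the assumption of negligible broadcast delay are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LocalClock:
    """Hypothetical per-processor clock that free-runs on its own oscillator."""
    ticks: int          # current local time, in arbitrary ticks
    offset: int = 0     # master_time - local_time, learned from the last broadcast

    def on_master_broadcast(self, master_ticks: int) -> None:
        # Record how far the local clock is from the master clock.
        # Assumes the broadcast delay is negligible (an assumption of this sketch).
        self.offset = master_ticks - self.ticks

    def to_master_time(self, local_ticks: int) -> int:
        # Translate a local timestamp into the master time base.
        return local_ticks + self.offset


clocks = [LocalClock(ticks=1_000), LocalClock(ticks=1_250), LocalClock(ticks=980)]
for clock in clocks:
    clock.on_master_broadcast(master_ticks=5_000)

# After the broadcast, every processor maps its own "now" to the same master time.
master_views = [c.to_master_time(c.ticks) for c in clocks]
```

    Keeping the local clocks free-running and storing only an offset means no processor has to adjust its oscillator; timestamps are translated when they are compared.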

  17. Experience with a Genetic Algorithm Implemented on a Multiprocessor Computer

    NASA Technical Reports Server (NTRS)

    Plassman, Gerald E.; Sobieszczanski-Sobieski, Jaroslaw

    2000-01-01

    Numerical experiments were conducted to find out the extent to which a Genetic Algorithm (GA) may benefit from a multiprocessor implementation, considering, on one hand, that analyses of individual designs in a population are independent of each other so that they may be executed concurrently on separate processors, and, on the other hand, that there are some operations in a GA that cannot be so distributed. The algorithm experimented with was based on a gaussian distribution rather than bit exchange in the GA reproductive mechanism, and the test case was a hub frame structure of up to 1080 design variables. The experimentation engaging up to 128 processors confirmed expectations of radical elapsed time reductions comparing to a conventional single processor implementation. It also demonstrated that the time spent in the non-distributable parts of the algorithm and the attendant cross-processor communication may have a very detrimental effect on the efficient utilization of the multiprocessor machine and on the number of processors that can be used effectively in a concurrent manner. Three techniques were devised and tested to mitigate that effect, resulting in efficiency increasing to exceed 99 percent.
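
    The effect of the non-distributable parts can be illustrated with Amdahl's law, which is a standard model and not necessarily the analysis used in the report. In the sketch below, `serial_fraction` is a hypothetical parameter; only the processor count of 128 comes from the abstract.

```python
def speedup(serial_fraction: float, processors: int) -> float:
    """Amdahl's law: speedup when a fixed fraction of the work cannot be distributed."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def efficiency(serial_fraction: float, processors: int) -> float:
    """Fraction of the ideal (linear) speedup that is actually achieved."""
    return speedup(serial_fraction, processors) / processors


# Hypothetical serial fractions, evaluated at the 128 processors used in the study.
for s in (0.05, 0.01, 0.0001):
    print(f"serial fraction {s:>7.4%}: efficiency {efficiency(s, 128):.1%}")
```

    Under this simplified model, efficiency above 99 percent on 128 processors requires a serial fraction below roughly 0.008 percent, which suggests why even small non-distributable portions matter at this scale.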

  18. Addressing, routing, and broadcasting in hexagonal mesh multiprocessors

    SciTech Connect

    Chen, M.S.; Shin, K.G.; Kandlur, D.D.

    1990-01-01

    A family of 6-regular graphs, called hexagonal meshes or H-meshes, is considered as a multiprocessor interconnection network. Processing nodes on the periphery of an H-mesh are first wrapped around to achieve regularity and homogeneity. The diameter of a wrapped H-mesh is shown to be of O(p^(1/2)), where p is the number of nodes in the H-mesh. An elegant, distributed routing scheme is developed for wrapped H-meshes so that each node in an H-mesh can compute shortest paths from itself to any other node with a straightforward algorithm of O(1) using the addresses of the source-destination pair only, i.e., independent of the network's size. This is in sharp contrast with those previously known algorithms that rely on using routing tables. The authors also develop an efficient point-to-point broadcasting algorithm for the H-meshes which is shown to be optimal in the number of required communication steps. The wrapped H-meshes are compared against some other existing multiprocessor interconnection networks, such as hypercubes, trees, and square meshes. The comparison reinforces the attractiveness of the H-mesh architecture.

  19. Instrumentation, performance visualization, and debugging tools for multiprocessors

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Fineman, Charles E.; Hontalas, Philip J.

    1991-01-01

    The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessor architectures. However, without effective means to monitor (and visualize) program execution, debugging and tuning parallel programs becomes intractably difficult as program complexity increases with the number of processors. Research on performance evaluation tools for multiprocessors is being carried out at ARC. Besides investigating new techniques for instrumenting, monitoring, and presenting the state of parallel program execution in a coherent and user-friendly manner, prototypes of software tools are being incorporated into the run-time environments of various hardware testbeds to evaluate their impact on user productivity. Our current tool set, the Ames Instrumentation Systems (AIMS), incorporates features from various software systems developed in academia and industry. The execution of FORTRAN programs on the Intel iPSC/860 can be automatically instrumented and monitored. Performance data collected in this manner can be displayed graphically on workstations supporting X-Windows. We have successfully compared various parallel algorithms for computational fluid dynamics (CFD) applications in collaboration with scientists from the Numerical Aerodynamic Simulation Systems Division. By performing these comparisons, we show that performance monitors and debuggers such as AIMS are practical and can illuminate the complex dynamics that occur within parallel programs.

  20. Multi-processor developments in the United States for future high energy physics experiments and accelerators

    SciTech Connect

    Gaines, I.

    1988-03-01

    The use of multi-processors for analysis and high-level triggering in High Energy Physics experiments, pioneered by the early emulator systems, has reached maturity, in particular with the multiple microprocessor systems in use at Fermilab. It is widely acknowledged that such systems will fulfill the major portion of the computing needs of future large experiments. Recent developments at Fermilab's Advanced Computer Program will make such systems even more powerful, cost-effective, and easier to use than they are at present. The next generation of microprocessors, already available, will provide CPU power of about one VAX 780 equivalent/$300, while supporting most VMS FORTRAN extensions and large (>8MB) amounts of memory. Low cost high density mass storage devices (based on video tape cartridge technology) will allow parallel I/O to remove potential I/O bottlenecks in systems of over 1000 VAX-equivalent processors. New interconnection schemes and system software will allow more flexible topologies and extremely high data bandwidth, especially for on-line systems. This talk will summarize the work at the Advanced Computer Program and the rest of the US in this field. 3 refs., 4 figs.

  1. Fault recovery characteristics of the fault tolerant multi-processor

    NASA Technical Reports Server (NTRS)

    Padilla, Peter A.

    1990-01-01

    The fault handling performance of the fault tolerant multiprocessor (FTMP) was investigated. Fault handling errors detected during fault injection experiments were characterized. In these fault injection experiments, the FTMP disabled a working unit instead of the faulted unit once every 500 faults, on the average. System design weaknesses allow active faults to exercise a part of the fault management software that handles byzantine or lying faults. It is pointed out that these weak areas in the FTMP's design increase the probability that, for any hardware fault, a good LRU (line replaceable unit) is mistakenly disabled by the fault management software. It is concluded that fault injection can help detect and analyze the behavior of a system in the ultra-reliable regime. Although fault injection testing cannot be exhaustive, it has been demonstrated that it provides a unique capability to unmask problems and to characterize the behavior of a fault-tolerant system.

  2. Dynamically reconfigurable optical interconnect architecture for parallel multiprocessor systems

    NASA Astrophysics Data System (ADS)

    Girard, Mary M.; Husbands, Charles R.; Antoszewska, Reza

    1991-12-01

    The progress in parallel processing technology in recent years has resulted in increased requirements to process large amounts of data in real time. The massively parallel architectures proposed for these applications require the use of a high speed interconnect system to achieve processor-to-processor connectivity without incurring excessive delays. The characteristics of optical components permit high speed operation while the nonconductive nature of the optical medium eliminates ground loop and transmission line problems normally associated with a conductive medium. The MITRE Corp. is evaluating an optical wavelength division multiple access interconnect network design to improve interconnectivity within parallel processor systems and to allow reconfigurability of processor communication paths. This paper describes the architecture and control of and highlights the results from an 8-channel multiprocessor prototype with effective throughput of 3.2 Gigabits per second (Gbps).

  3. On-line data acquisition using a multiprocessor architecture

    SciTech Connect

    DeAsmundis, R.; Spadaccini, G.; Terrasi, F.

    1989-10-01

    In this paper a system for on-line data processing based on a multiprocessor architecture and a PC/AT is presented. Many improvements with respect to a previously published architecture have been introduced in order to speed up the processing and the display features of the system. A new CAMAC module for event identification and triggering has been designed and assembled. Collected data are now recorded on a magnetic tape unit, and integrated data (1-dim. and 2-dim. spectra) are handled by a PC/AT linked directly to the system. Although almost the whole code is stored in EPROM, the PC/AT also supports mass-storage and development features for debug phases of the system software.

  4. Parallel algorithm of VLBI software correlator under multiprocessor environment

    NASA Astrophysics Data System (ADS)

    Zheng, Weimin; Zhang, Dong

    2007-11-01

    The correlator is the key signal processing equipment of a Very Long Baseline Interferometry (VLBI) synthetic aperture telescope. It receives the mass data collected by the VLBI observatories and produces the visibility function of the target, which can be used for spacecraft positioning, baseline length measurement, synthesis imaging, and other scientific applications. VLBI data correlation is a data-intensive and computation-intensive task. This paper presents the algorithms of two parallel software correlators under multiprocessor environments. A near real-time correlator for spacecraft tracking adopts the pipelining and thread-parallel technology, and runs on the SMP (Symmetric Multiple Processor) servers. Another high speed prototype correlator using the mixed Pthreads and MPI (Message Passing Interface) parallel algorithm is realized on a small Beowulf cluster platform. Both correlators have a flexible structure, are scalable, and can correlate data from 10 stations.

  5. A measurement-based performability model for a multiprocessor system

    NASA Technical Reports Server (NTRS)

    Hsueh, M. C.; Iyer, Ravi K.; Trivedi, K. S.

    1987-01-01

    A measurement-based performability model based on real error-data collected on a multiprocessor system is described. Model development from the raw error-data to the estimation of cumulative reward is described. Both normal and failure behavior of the system are characterized. The measured data show that the holding times in key operational and failure states are not simple exponential and that a semi-Markov process is necessary to model the system behavior. A reward function, based on the service rate and the error rate in each state, is then defined in order to estimate the performability of the system and to depict the cost of different failure types and recovery procedures.

  6. Multiprocessor system with daisy-chained processor selection

    SciTech Connect

    Yamanaka, K.

    1988-09-27

    A multiprocessor system is described comprising: a bus, a master operation processing unit connected to the bus for generating on the bus data to be processed and a command for processing the data, slave operation processing units connected to the bus for receiving the data and the command from the master operation processing unit. The slave operation processing unit includes a priority discriminator which sequentially selects the slave operation processing units in a preset priority sequence. The priority discriminator also includes means for determining conditions of each of the slave operation processing units and means for initiating execution of the command in the first slave operation processing unit selected in the preset priority sequence which has its determined conditions meeting preselected conditions. The initiated slave operation processing unit includes means for processing the data in accordance with the command.
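
    The selection rule in this abstract can be sketched in software as a walk along a fixed priority chain. This is an illustration only, not the patented hardware: the `SlaveUnit` class, the `select_slave` function, and the reading of "preselected conditions" as "not busy" are assumptions of the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class SlaveUnit:
    """Hypothetical model of one slave operation processing unit."""
    name: str
    busy: bool = False

    def execute(self, command: Callable[[int], int], data: int) -> int:
        # Process the data in accordance with the command.
        return command(data)


def select_slave(chain: Sequence[SlaveUnit]) -> Optional[SlaveUnit]:
    """Walk the chain in its preset priority order and return the first idle unit."""
    for unit in chain:
        if not unit.busy:   # "preselected condition" assumed here to mean "idle"
            return unit
    return None


chain = [SlaveUnit("S0", busy=True), SlaveUnit("S1"), SlaveUnit("S2")]
chosen = select_slave(chain)
result = chosen.execute(lambda x: x * 2, 21) if chosen else None
```

    Here the highest-priority unit is busy, so the command goes to the next unit in the chain; if every unit were busy, `select_slave` would return `None` and the master would have to retry.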

  7. An embedded multiprocessor computer for proof-of-principle testing of exploratory systems concepts

    SciTech Connect

    Borgman, C.R.; Dalton, L.J.

    1987-01-01

    This paper discusses the SANDAC V multiprocessor embedded computer hardware and software. Its expandable design provides adequate computing power for testing of various proof-of-principle (POP) exploratory system concepts. It is built from state-of-the-art integrated circuits with ASIC glue chips. A powerful software development system, multiprocessor on-board debugger, and a multitasking operating system kernel provide a user friendly software environment to complement the hardware.

  8. Hawk: An operating system kernel for a real-time embedded multiprocessor. [SANDAC V]

    SciTech Connect

    Holmes, V.P.; Harris, D.L.; Piorkowksi, K.W.; Davidson, G.S.

    1987-01-01

    The Hawk operating system is a real-time multitasking environment for the SANDAC V, a Motorola 68020-based multiprocessor. This system is presented with respect to general target applications, the underlying multiprocessor hardware and the programming abstractions useful for real-time problems. The design principles, system features and user interface are discussed, together with an evaluation of the success of the Hawk effort and its usefulness in fielded application.

  9. A Tensor Product Formulation of Strassen's Matrix Multiplication Algorithm with Memory Reduction

    DOE PAGES

    Kumar, B.; Huang, C. -H.; Sadayappan, P.; Johnson, R. W.

    1995-01-01

    In this article, we present a program generation strategy of Strassen's matrix multiplication algorithm using a programming methodology based on tensor product formulas. In this methodology, block recursive programs such as the fast Fourier Transforms and Strassen's matrix multiplication algorithm are expressed as algebraic formulas involving tensor products and other matrix operations. Such formulas can be systematically translated to high-performance parallel/vector codes for various architectures. In this article, we present a nonrecursive implementation of Strassen's algorithm for shared memory vector processors such as the Cray Y-MP. A previous implementation of Strassen's algorithm synthesized from tensor product formulas required working storage of size O(7^n) for multiplying 2^n × 2^n matrices. We present a modified formulation in which the working storage requirement is reduced to O(4^n). The modified formulation exhibits sufficient parallelism for efficient implementation on a shared memory multiprocessor. Performance results on a Cray Y-MP8/64 are presented.
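
    To see what the storage reduction means in practice, note that a 2^n × 2^n matrix holds exactly 4^n elements, so O(4^n) working storage is proportional to the size of the operands, while O(7^n) grows faster than the matrices themselves. The short sketch below tabulates the ratio; it ignores the constant factors hidden in the O() notation, and the function name is illustrative.

```python
def matrix_elements(n: int) -> int:
    """Number of elements in a 2^n x 2^n matrix."""
    return (2 ** n) ** 2      # equals 4^n


for n in range(1, 11):
    ratio = 7 ** n / 4 ** n   # growth of O(7^n) storage relative to O(4^n)
    print(f"n={n:2d}  matrix elements={matrix_elements(n):>9d}  7^n/4^n={ratio:10.1f}")
```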

  10. Evict on write, a management strategy for a prefetch unit and/or first level cache in a multiprocessor system with speculative execution

    DOEpatents

    Gara, Alan; Ohmacht, Martin

    2014-09-16

    In a multiprocessor system with at least two levels of cache, a speculative thread may run on a core processor in parallel with other threads. When the thread seeks to do a write to main memory, this access is to be written through the first level cache to the second level cache. After the write-through, the corresponding line is deleted from the first level cache and/or prefetch unit, so that any further accesses to the same location in main memory have to be retrieved from the second level cache. The second level cache keeps track of multiple versions of data, where more than one speculative thread is running in parallel, while the first level cache does not have any of the versions during speculation. A switch allows choosing between modes of operation of a speculation blind first level cache.

  11. Restructuring symbolic programs for concurrent execution on multiprocessors

    SciTech Connect

    Larus, J.R.

    1989-01-01

    CURARE, the program restructurer described in this dissertation, automatically transforms a sequential Lisp program into an equivalent concurrent program that executes on a multiprocessor. CURARE first analyzes a program to find its control and data dependences. This analysis is most difficult for references to structures connected by pointers. CURARE uses a new data-dependence algorithm, which finds and classifies these dependences. The analysis is conservative and may detect conflicts that do not arise in practice. A programmer can temper and refine its results with declarations. Dependences constrain the program's concurrent execution because, in general, two conflicting statements cannot execute in a different order without affecting the program's result. A restructurer must know all dependences in order to preserve them. However, not all dependences are essential to produce the program's result. CURARE attempts to transform the program so it computes its result with fewer conflicts. An optimized program will execute with less synchronization and more concurrency. CURARE then examines loops in a program to find those that are unconstrained or lightly constrained by dependences. By necessity, CURARE treats recursive functions as loops and does not limit itself to explicit program loops. Recursive functions offer several advantages over explicit loops since they provide a convenient framework for inserting locks and handling the dynamic behavior of symbolic programs. Loops that are suitable for concurrent execution are changed to execute on a set of concurrent server processes. These servers execute single loop iterations and therefore need to be extremely inexpensive to invoke.

  12. Efficient diagnosis of multiprocessor systems under probabilistic models

    NASA Technical Reports Server (NTRS)

    Blough, Douglas M.; Sullivan, Gregory F.; Masson, Gerald M.

    1989-01-01

    The problem of fault diagnosis in multiprocessor systems is considered under a probabilistic fault model. The focus is on minimizing the number of tests that must be conducted in order to correctly diagnose the state of every processor in the system with high probability. A diagnosis algorithm that can correctly diagnose the state of every processor with probability approaching one in a class of systems performing slightly greater than a linear number of tests is presented. A nearly matching lower bound on the number of tests required to achieve correct diagnosis in arbitrary systems is also proven. Lower and upper bounds on the number of tests required for regular systems are also presented. A class of regular systems which includes hypercubes is shown to be correctly diagnosable with high probability. In all cases, the number of tests required under this probabilistic model is shown to be significantly less than under a bounded-size fault set model. Because the number of tests that must be conducted is a measure of the diagnosis overhead, these results represent a dramatic improvement in the performance of system-level diagnosis techniques.

  13. Performance and economy of a fault-tolerant multiprocessor

    NASA Technical Reports Server (NTRS)

    Lala, J. H.; Smith, C. J.

    1979-01-01

    The FTMP (Fault-Tolerant Multiprocessor) is one of two central aircraft fault-tolerant architectures now in the prototype phase under NASA sponsorship. The intended application of the computer includes such critical real-time tasks as 'fly-by-wire' active control and completely automatic Category III landings of commercial aircraft. The FTMP architecture is briefly described and it is shown that it is a viable solution to the multi-faceted problems of safety, speed, and cost. Three job dispatch strategies are described, and their results with respect to job-starting delay are presented. The first strategy is a simple First-Come-First-Serve (FCFS) job dispatch executive. The other two schedulers are an adaptive FCFS and an interrupt driven scheduler. Three failure modes are discussed, and the FTMP survival probability in the face of random hard failures is evaluated. It is noted that the hourly cost of operating two FTMPs in a transport aircraft can be as little as one-to-two percent of the total flight-hour cost of the aircraft.

  14. MULTIPROCESSOR AND DISTRIBUTED PROCESSING BIBLIOGRAPHIC DATA BASE SOFTWARE SYSTEM

    NASA Technical Reports Server (NTRS)

    Miya, E. N.

    1994-01-01

    Multiprocessors and distributed processing are undergoing increased scientific scrutiny for many reasons. It is more and more difficult to keep track of the existing research in these fields. This package consists of a large machine-readable bibliographic data base which, in addition to the usual keyword searches, can be used for producing citations, indexes, and cross-references. The data base is compiled from smaller existing multiprocessing bibliographies, and tables of contents from journals and significant conferences. There are approximately 4,000 entries covering topics such as parallel and vector processing, networks, supercomputers, fault-tolerant computers, and cellular automata. Each entry is represented by 21 fields including keywords, author, referencing book or journal title, volume and page number, and date and city of publication. The data base contains UNIX 'refer' formatted ASCII data and can be implemented on any computer running under the UNIX operating system. The data base requires approximately one megabyte of secondary storage. The documentation for this program is included with the distribution tape, although it can be purchased for the price below. This bibliography was compiled in 1985 and updated in 1988.

  15. Memory interface simulator: A computer design aid

    NASA Technical Reports Server (NTRS)

    Taylor, D. S.; Williams, T.; Weatherbee, J. E.

    1972-01-01

    Results are presented of a study conducted with a digital simulation model being used in the design of the Automatically Reconfigurable Modular Multiprocessor System (ARMMS), a candidate computer system for future manned and unmanned space missions. The model simulates the activity involved as instructions are fetched from random access memory for execution in one of the system central processing units. A series of model runs measured instruction execution time under various assumptions pertaining to the CPU's and the interface between the CPU's and RAM. Design tradeoffs are presented in the following areas: Bus widths, CPU microprogram read only memory cycle time, multiple instruction fetch, and instruction mix.

  16. Memory management and compiler support for rapid recovery from failures in computer systems

    NASA Technical Reports Server (NTRS)

    Fuchs, W. K.

    1991-01-01

    This paper describes recent developments in the use of memory management and compiler technology to support rapid recovery from failures in computer systems. The techniques described include cache coherence protocols for user transparent checkpointing in multiprocessor systems, compiler-based checkpoint placement, compiler-based code modification for multiple instruction retry, and forward recovery in distributed systems utilizing optimistic execution.

  17. Meeting the memory challenges of brain-scale network simulation.

    PubMed

    Kunkel, Susanne; Potjans, Tobias C; Eppler, Jochen M; Plesser, Hans Ekkehard; Morrison, Abigail; Diesmann, Markus

    2011-01-01

    The development of high-performance simulation software is crucial for studying the brain connectome. Using connectome data to generate neurocomputational models requires software capable of coping with models on a variety of scales: from the microscale, investigating plasticity, and dynamics of circuits in local networks, to the macroscale, investigating the interactions between distinct brain regions. Prior to any serious dynamical investigation, the first task of network simulations is to check the consistency of data integrated in the connectome and constrain ranges for yet unknown parameters. Thanks to distributed computing techniques, it is possible today to routinely simulate local cortical networks of around 10^5 neurons with up to 10^9 synapses on clusters and multi-processor shared-memory machines. However, brain-scale networks are orders of magnitude larger than such local networks, in terms of numbers of neurons and synapses as well as in terms of computational load. Such networks have been investigated in individual studies, but the underlying simulation technologies have neither been described in sufficient detail to be reproducible nor made publicly available. Here, we discover that as the network model sizes approach the regime of meso- and macroscale simulations, memory consumption on individual compute nodes becomes a critical bottleneck. This is especially relevant on modern supercomputers such as the Blue Gene/P architecture where the available working memory per CPU core is rather limited. We develop a simple linear model to analyze the memory consumption of the constituent components of neuronal simulators as a function of network size and the number of cores used. This approach has multiple benefits. The model enables identification of key contributing components to memory saturation and prediction of the effects of potential improvements to code before any implementation takes place. As a consequence, development cycles can be shorter and

  18. Reliability Inherent in Heterogeneous Multiprocessor Systems and Task Scheduling for Ameliorating Their Reliability

    NASA Astrophysics Data System (ADS)

    Sugihara, Makoto

    Utilizing a heterogeneous multiprocessor system has become a popular design paradigm for building an embedded system at low cost within a short development time. A reliability issue for embedded systems, namely vulnerability to single event upsets (SEUs), has become a matter of concern as technology advances. This paper discusses the reliability inherent in heterogeneous multiprocessors and proposes task scheduling that minimizes their SEU vulnerability. This paper experimentally shows that increasing the performance of a CPU core degrades its reliability. Based on this experimental observation, we propose task scheduling for reducing the SEU vulnerability of a heterogeneous multiprocessor system. The experimental results demonstrate that our task scheduling technique can remove much of the SEU vulnerability under real-time constraints.

  19. Modeling techniques in a parallelizing compiler for the B-HIVE multiprocessor system

    NASA Technical Reports Server (NTRS)

    Kim, Sukil; Agrawal, Dharma P.; Mauney, Jon; Leu, Ja-Song

    1989-01-01

    The parallelizing compiler for the B-HIVE loosely-coupled multiprocessor system uses a medium grain model to minimize the communication overhead. A medium grain model is shown to be an optimum way of merging fine grain operations into parallel tasks such that the parallelism obtained at the grain level is retained and communication overhead is decreased. A new communication model is introduced in this paper, allowing additional overlap between computation and communication. Simulation results indicate that the medium grain communication model shows promise for automatic parallelization for a loosely-coupled multiprocessor system.

  20. Programmable Optoelectronic Multiprocessors: Design, Performance and CAD Development

    NASA Astrophysics Data System (ADS)

    Kiamilev, Fouad Eskender

    1992-01-01

    This thesis describes the development of Programmable Optoelectronic Multiprocessor (POEM) architectures and systems. POEM systems combine simple electronic processing elements with free-space optical interconnects to implement high-performance, massively-parallel computers. POEM architectures are fundamentally different from architectures used in conventional VLSI systems. Novel system partitioning and processing element design methods have been developed to ensure efficient implementation of POEM architectures with optoelectronic technology. The main contributions of this thesis are: architecture and software design for the POEM prototype built at UCSD; detailed technology design-tradeoff and comparison studies for POEM interconnection networks; and application of the VHSIC Hardware Description Language (VHDL) to the design, simulation, and synthesis of POEM computers. A general-purpose POEM SIMD parallel computer architecture has been designed for symbolic computing applications. A VHDL simulation of this architecture was written to test the POEM hardware running parallel programs prior to prototype fabrication. Detailed performance comparison of this architecture with all-optical computing, based on symbolic substitution, has also been carried out to show that POEMs offer higher computational efficiency. A detailed technological design of a packet-switched POEM multistage interconnection network system has been performed. This design uses optically interconnected stages of K x K electronic switching elements, where K is a variable parameter, called grain-size, that determines the ratio of optics to electronics in the system. A thorough cost and performance comparison between this design and existing VLSI implementations was undertaken to show that the POEM approach offers better scalability and higher performance. The grain-size was optimized, showing that switch sizes of 16 x 16 to 256 x 256 provide maximum performance/cost. The effects of varying

  1. Image-Data-Driven Dynamically-Reconfigurable Multiprocessor System In Automated Histopathology

    NASA Astrophysics Data System (ADS)

    Shoemaker, R. L.; Bartels, P. H.; Bartels, H.; Griswold, W. G.; Hillman, D.; Maenner, R.

    1986-04-01

    The diagnostic evaluation of biomedical imagery by computer presents a massive data processing problem that may be effectively handled by multiprocessor computer systems such as the Heidelberg Polyp. A hardware and software configuration of the Polyp is described that can run as a data-driven system directed by a knowledge data base for efficient image analysis.

  2. Evaluation of the impact chip multiprocessors have on SNL application performance.

    SciTech Connect

    Doerfler, Douglas W.

    2009-10-01

    This report describes trans-organizational efforts to investigate the impact of chip multiprocessors (CMPs) on the performance of important Sandia application codes. The impact of CMPs on the performance and applicability of Sandia's system software was also investigated. The goal of the investigation was to make algorithmic and architectural recommendations for next generation platform acquisitions.

  3. SIFT - Multiprocessor architecture for Software Implemented Fault Tolerance flight control and avionics computers

    NASA Technical Reports Server (NTRS)

    Forman, P.; Moses, K.

    1979-01-01

    A brief description of a SIFT (Software Implemented Fault Tolerance) Flight Control Computer with emphasis on implementation is presented. A multiprocessor system that relies on software-implemented fault detection and reconfiguration algorithms is described. A high level of reliability and fault tolerance is achieved by the replication of computing tasks among processing units.

  4. Implementation of multigrid methods for solving Navier-Stokes equations on a multiprocessor system

    NASA Technical Reports Server (NTRS)

    Naik, Vijay K.; Taasan, Shlomo

    1987-01-01

    Presented are schemes for implementing multigrid algorithms on message based MIMD multiprocessor systems. To address the various issues involved, a nontrivial problem of solving the 2-D incompressible Navier-Stokes equations is considered as the model problem. Three different multigrid algorithms are considered. Results from implementing these algorithms on an Intel iPSC are presented.

  5. Hybrid Simulation of the Interaction of Europa's Atmosphere with the Jovian Plasma: Multiprocessor Simulations

    NASA Astrophysics Data System (ADS)

    Dols, V. J.; Delamere, P. A.; Bagenal, F.; Cassidy, T. A.; Crary, F. J.

    2014-12-01

    We model the interaction of Europa's tenuous atmosphere with the plasma of Jupiter's torus with an improved version of our hybrid plasma code. In a hybrid plasma code, the ions are treated as kinetic macro-particles moving under the Lorentz force and the electrons as a fluid leading to a generalized formulation of Ohm's law. In this version, the spatial simulation domain is decomposed in 2 directions and is non-uniform in the plasma convection direction. The code is run on a multi-processor supercomputer that offers 16416 cores and 2 GB of RAM per core. This new version allows us to tap into the large memory of the supercomputer and simulate the full interaction volume (R_Europa = 1561 km) with a high spatial resolution (50 km). Compared to Io, Europa's atmosphere is about 100 times more tenuous, the ambient magnetic field is weaker and the density of incident plasma is lower. Consequently, the electrodynamic interaction is also weaker and substantial fluxes of thermal torus ions might reach and sputter the icy surface. Molecular O2 is the dominant atmospheric product of this surface sputtering. Observations of oxygen UV emissions (specifically the ratio of OI 1356 Å / 1304 Å emissions) are roughly consistent with an atmosphere that is composed predominantly of O2 with a small amount of atomic O. Galileo observations along flybys close to Europa have revealed the existence of induced currents in a conducting ocean under the icy crust. They also showed that, from flyby to flyby, the plasma interaction is very variable. Asymmetries of the plasma density and temperature in the wake of Europa were also observed and still elude a clear explanation. Galileo magnetometer data also revealed ion cyclotron waves, which are an indication of heavy ion pickup close to the moon. We prescribe an O2 atmosphere with a vertical density column consistent with UV observations and model the plasma properties along several Galileo flybys of the moon. We compare our results with the magnetometer

  6. Apparatus for multiprocessor-based control of a multiagent robot

    NASA Technical Reports Server (NTRS)

    Peters, II, Richard Alan (Inventor)

    2009-01-01

    An architecture for robot intelligence enables a robot to learn new behaviors and create new behavior sequences autonomously and interact with a dynamically changing environment. Sensory information is mapped onto a Sensory Ego-Sphere (SES) that rapidly identifies important changes in the environment and functions much like short term memory. Behaviors are stored in a DBAM that creates an active map from the robot's current state to a goal state and functions much like long term memory. A dream state converts recent activities stored in the SES and creates or modifies behaviors in the DBAM.

  7. Controlling fine-grain non-numeric parallelism on a combinator-based multiprocessor system

    SciTech Connect

    Chu, Pong Ping.

    1989-01-01

    The author has developed a scheme to extend the SASL programming language and its run-time system for fine grain parallel processing. The proposed scheme provides a mechanism that can override the original lazy semantics by augmenting proper eager information. This information is first annotated in SASL programs and then translated to the combinator control tags by a new set of optimization rules. The effectiveness of this scheme has been evaluated through the simulation of a set of symbolic-oriented programs on an idealized shared-memory system. The results show that a considerable amount of parallelism can be extracted from a wide variety of application programs.

  8. Validation of fault-free behavior of a reliable multiprocessor system - FTMP: A case study. [Fault-Tolerant Multi-Processor avionics]

    NASA Technical Reports Server (NTRS)

    Clune, E.; Segall, Z.; Siewiorek, D.

    1984-01-01

    A program of experiments has been conducted at NASA-Langley to test the fault-free performance of a Fault-Tolerant Multiprocessor (FTMP) avionics system for next-generation aircraft. Baseline measurements of an operating FTMP system were obtained with respect to the following parameters: instruction execution time, frame size, and the variation of clock ticks. The mechanisms of frame stretching were also investigated. The experimental results are summarized in a table. Areas of interest for future tests are identified, with emphasis given to the implementation of a synthetic workload generation mechanism on FTMP.

  9. Performance of multiprocessors and parallel algorithms: Quick-sort, a case study

    SciTech Connect

    Patil, I.M.

    1989-01-01

    Performance of parallel algorithms on multiprocessors has been traditionally analyzed by looking at either the algorithm or the architecture of the multiprocessor system. However, it is important to study the combined effect of both these factors in order to evaluate and predict performance. A different methodology based on approximate trace-driven simulation is adopted in this thesis to study the performance of a class of non-numerical algorithms. Performance of parallel quick-sort and parallel quick-merge sort is investigated in order to demonstrate the methodology as well as develop an understanding of the limitations imposed by a cache-based single bus environment on achievable speedup. A wide range of issues including the effect of cache parameters, coherency protocol, scheduling mechanisms and technology effects are discussed in the context of performance of the two versions of parallel quick-sort.

  10. Method for wiring allocation and switch configuration in a multiprocessor environment

    DOEpatents

    Aridor, Yariv; Domany, Tamar; Frachtenberg, Eitan; Gal, Yoav; Shmueli, Edi; Stockmeyer, legal representative, Robert E.; Stockmeyer, Larry Joseph

    2008-07-15

    A method for wiring allocation and switch configuration in a multiprocessor computer, the method including employing depth-first tree traversal to determine a plurality of paths among a plurality of processing elements allocated to a job along a plurality of switches and wires in a plurality of D-lines, and selecting one of the paths in accordance with at least one selection criterion.
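
The general idea of enumerating candidate paths by depth-first traversal and then choosing one by a selection criterion can be sketched as follows. This is not the patented method: the adjacency-list topology, the node names, and the fewest-hops criterion are all assumptions made for illustration.

```python
def all_paths(graph, source, target):
    """Enumerate simple paths from source to target by depth-first traversal."""
    paths = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            paths.append(path)
            continue
        for nxt in graph.get(node, ()):
            if nxt not in path:  # never revisit a node already on this path
                stack.append((nxt, path + [nxt]))
    return paths


def select_path(paths, criterion=len):
    """Pick the path that minimizes the criterion (assumed: fewest hops)."""
    return min(paths, key=criterion) if paths else None


# Hypothetical topology: processing elements P0 and P1 joined via switches S1..S3.
topology = {
    "P0": ["S1", "S2"],
    "S1": ["S3"],
    "S2": ["P1"],
    "S3": ["P1"],
}
candidates = all_paths(topology, "P0", "P1")
chosen = select_path(candidates)
```

Any other criterion, such as least-loaded wires, could be swapped in by passing a different `criterion` function.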

  11. Multiprocessor data acquisition system for high event rates at the Heidelberg/Darmstadt crystal ball

    SciTech Connect

    Ender, C.; Manner, R.; Bauer, P. (Physikalisches Inst.)

    1989-10-01

    The Heidelberg/Darmstadt crystal ball detector uses a distributed data acquisition system consisting of a Fastbus/CAMAC front-end, the Heidelberg Polyp multiprocessor system with 30 processor modules, and an online VAX. For this heterogeneous multicomputer system a distributed real-time operating system was developed. The distributed computer system allows for event rates up to 2 x 10^4 events/s. It is managed in a user-transparent and fault-tolerant manner.

  12. RTMPL: A structured programming and documentation utility for real-time multiprocessor simulations

    NASA Technical Reports Server (NTRS)

    Arpasi, D. J.

    1984-01-01

    The NASA Lewis Research Center is developing and evaluating experimental hardware and software systems to help meet future needs for real time simulations of air-breathing propulsion systems. The Real Time Multiprocessor Simulator (RTMPS) project is aimed at developing a prototype simulator system that uses multiple microprocessors to achieve the desired computing speed and accuracy at relatively low cost. Software utilities are being developed to provide engineering-level programming and interactive operation of the simulator. Two major software development efforts were undertaken in the RTMPS project. A real time multiprocessor operating system was developed to provide for interactive operation of the simulator. The second effort was aimed at developing a structured, high-level, engineering-oriented programming language and translator that would facilitate the programming of the simulator. The Real Time Multiprocessor Programming Language (RTMPL) allows the user to describe simulation tasks for each processor in a straight-forward, structured manner. The RTMPL utility acts as an assembly language programmer, translating the high-level simulation description into time-efficient assembly language code for the processors. The utility sets up all of the interfaces between the simulator hardware, firmware, and operating system.

  13. Maximizing run time performance of deployed data flow graphs on a multiprocessor architecture

    NASA Astrophysics Data System (ADS)

    Tobias, Richard J.; Hunt, Peter D.

    1993-10-01

    This paper discusses a practical solution for supporting the deployment of data flow graphs onto the Loral/Rolm Computer Systems, Inc. vector processing multi-processor architecture. It outlines the support software (both workstation hosted and target system hosted) that is required to design, debug, and maximize deployed data flow graph performance on the multiprocessor architecture. The deployment process guarantees real-time deadlines, minimizes run time scheduling overhead, and minimizes designer partitioning input. It is known that determining effective run time data flow graph node schedules for multi-processor architectures is an NP-complete class of problem not well suited to real-time systems. Loral/Rolm Computer Systems, Inc.'s vector processing toolset recognizes this problem and this paper discusses a prescheduling and pre-assignment approach for partitioning data flow graphs to available hardware resources. In particular the toolset components (which are based upon an enhanced data flow graph language) of workstation pre-assignment, prescheduling, run time gross allocation and local compute element dispatching are discussed in detail.

  14. Myrmics Memory Allocator

    SciTech Connect

    Lymperis, S.

    2011-09-23

    MMA is a stand-alone memory management system for MPI clusters. It implements a shared Partitioned Global Address Space, where multiple MPI processes request objects from the allocator and the latter provides them with system-wide unique memory addresses for each object. It provides applications with an intuitive way of managing the memory system in a unified way, thus enabling easier writing of irregular application code.

  15. Sharing code.

    PubMed

    Kubilius, Jonas

    2014-01-01

    Sharing code is becoming increasingly important in the wake of Open Science. In this review I describe and compare two popular code-sharing utilities, GitHub and Open Science Framework (OSF). GitHub is a mature, industry-standard tool but lacks focus towards researchers. In comparison, OSF offers a one-stop solution for researchers but a lot of functionality is still under development. I conclude by listing alternative lesser-known tools for code and materials sharing.

  16. Animal models of source memory.

    PubMed

    Crystal, Jonathon D

    2016-01-01

    Source memory is the aspect of episodic memory that encodes the origin (i.e., source) of information acquired in the past. Episodic memory (i.e., our memories for unique personal past events) typically involves source memory because those memories focus on the origin of previous events. Source memory is at work when, for example, someone tells a favorite joke to a person while avoiding retelling the joke to the friend who originally shared the joke. Importantly, source memory permits differentiation of one episodic memory from another because source memory includes features that were present when the different memories were formed. This article reviews recent efforts to develop an animal model of source memory using rats. Experiments are reviewed which suggest that source memory is dissociated from other forms of memory. The review highlights strengths and weaknesses of a number of animal models of episodic memory. Animal models of source memory may be used to probe the biological bases of memory. Moreover, these models can be combined with genetic models of Alzheimer's disease to evaluate pharmacotherapies that ultimately have the potential to improve memory.

  17. Shared Intentionality

    ERIC Educational Resources Information Center

    Tomasello, Michael; Carpenter, Malinda

    2007-01-01

    We argue for the importance of processes of shared intentionality in children's early cognitive development. We look briefly at four important social-cognitive skills and how they are transformed by shared intentionality. In each case, we look first at a kind of individualistic version of the skill--as exemplified most clearly in the behavior of…

  18. Distributed input/output processing in data-driven multiprocessors

    SciTech Connect

    Evripidou, P.; Gaudiot, J.L.

    1992-12-31

    Data-flow principles of execution provide an elegant way to ensure at runtime that instructions can be executed asynchronously in a parallel environment. However, while the conventional von Neumann model of interpretation has a very rigid ordering of instructions, it is the very asynchronous character of the dataflow model of execution that introduces conflicts when "state" tasks (such as I/O operations) must share common data objects. In order to execute I/O operations safely and in parallel, an algorithm to detect and classify cases of potential conflicts (hazards) has been developed; it is described in this paper. It is based upon localizing the effect of I/O operations by splitting the data-flow graph into two subgraphs: (a) the computation subgraph, and (b) the I/O subgraph. The scheme presented in this paper thus enables the creation and interaction of both subgraphs, which in turn yields a deterministic execution. Furthermore, the proposed scheme enables the distributed execution of I/O operations as permitted by data dependencies.

  19. Energy-efficient fault tolerance in multiprocessor real-time systems

    NASA Astrophysics Data System (ADS)

    Guo, Yifeng

    Recent progress in multiprocessor/multicore systems has important implications for real-time system design and operation. From vehicle navigation to space applications as well as industrial control systems, the trend is to deploy multiple processors in real-time systems: systems with 4 to 8 processors are common, and it is expected that many-core systems with dozens of processing cores will be available in the near future. For such systems, in addition to the general temporal requirements common to all real-time systems, two additional operational objectives are seen as critical: energy efficiency and fault tolerance. An intriguing dimension of the problem is that energy efficiency and fault tolerance are typically conflicting objectives, due to the fact that tolerating faults (e.g., permanent/transient) often requires extra resources with high energy consumption potential. In this dissertation, various techniques for energy-efficient fault tolerance in multiprocessor real-time systems have been investigated. First, the Reliability-Aware Power Management (RAPM) framework, which can preserve the system reliability with respect to transient faults when Dynamic Voltage Scaling (DVS) is applied for energy savings, is extended to support parallel real-time applications with precedence constraints. Next, the traditional Standby-Sparing (SS) technique for dual processor systems, which takes both transient and permanent faults into consideration while saving energy, is generalized to support multiprocessor systems with an arbitrary number of identical processors. Observing the inefficient usage of slack time in the SS technique, a Preference-Oriented Scheduling Framework is designed to address the problem where tasks are given preferences for being executed as soon as possible (ASAP) or as late as possible (ALAP). A preference-oriented earliest deadline (POED) scheduler is proposed and its application in multiprocessor systems for energy-efficient fault tolerance is

  20. ScalaBLAST 2.0: Rapid and robust BLAST calculations on multiprocessor systems

    SciTech Connect

    Oehmen, Christopher S.; Baxter, Douglas J.

    2013-03-15

    BLAST remains one of the most widely used tools in computational biology. The rate at which new sequence data is available continues to grow exponentially, driving the emergence of new fields of biological research. At the same time multicore systems and conventional clusters are more accessible. ScalaBLAST has been designed to run on conventional multiprocessor systems with an eye to extreme parallelism, enabling parallel BLAST calculations using over 16,000 processing cores with a portable, robust, fault-resilient design. ScalaBLAST 2.0 source code can be freely downloaded from http://omics.pnl.gov/software/ScalaBLAST.php.

  1. Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 4: FTMP executive summary

    NASA Technical Reports Server (NTRS)

    Smith, T. B., III; Lala, J. H.

    1984-01-01

    The FTMP architecture is a high reliability computer concept modeled after a homogeneous multiprocessor architecture. Elements of the FTMP are operated in tight synchronism with one another and hardware fault-detection and fault-masking is provided which is transparent to the software. Operating system design and user software design is thus greatly simplified. Performance of the FTMP is also comparable to that of a simplex equivalent due to the efficiency of fault handling hardware. The FTMP project constructed an engineering module of the FTMP, programmed the machine and extensively tested the architecture through fault injection and other stress testing. This testing confirmed the soundness of the FTMP concepts.

  2. A high speed multi-tasking, multi-processor telemetry system

    SciTech Connect

    Wu, Kung Chris

    1996-12-31

    This paper describes a small size, light weight, multitasking, multiprocessor telemetry system capable of collecting 32 channels of differential signals at a sampling rate of 6.25 kHz per channel. The system is designed to collect data from remote wind turbine research sites and transfer the data via wireless communication. A description of operational theory, hardware components, and itemized cost is provided. Synchronization with other data acquisition systems and test data on data transmission rates is also given. 11 refs., 7 figs., 4 tabs.

  3. Mapping virtual addresses to different physical addresses for value disambiguation for thread memory access requests

    DOEpatents

    Gala, Alan; Ohmacht, Martin

    2014-09-02

    A multiprocessor system includes nodes. Each node includes a data path that includes a core, a TLB, and a first level cache implementing disambiguation. The system also includes at least one second level cache and a main memory. For thread memory access requests, the core uses an address associated with an instruction format of the core. The first level cache uses an address format related to the size of the main memory plus an offset corresponding to hardware thread meta data. The second level cache uses a physical main memory address plus software thread meta data to store the memory access request. The second level cache accesses the main memory using the physical address with neither the offset nor the thread meta data after resolving speculation. In short, this system maps a virtual address to different physical addresses for value disambiguation among different threads.

  4. Advanced large-scale and high-speed multiprocessor system for scientific applications CRAY X-MP-4 series

    SciTech Connect

    Chen, S.S.

    1986-06-01

    This paper offers an overview of the CRAY X-MP-4 system, a general purpose multiprocessor system for multitasking applications. Diagrams illustrate the overall system organization, data flow, vector computations and vector loop families benchmark timings. The CRAY X-MP-4 has scalar and vector applications.

  5. Photonic crystal optical memory

    NASA Astrophysics Data System (ADS)

    Lima, A. Wirth; Sombra, A. S. B.

    2011-06-01

    After several decades of driving technological development worldwide, electronics is giving way to technologies that use light. We propose and analyze an optical memory embedded in a nonlinear photonic crystal (PhC), whose system of writing and reading data is controlled by an external command signal. This optical memory is based on optical directional couplers connected to a shared optical ring. Such a device can work over the C-Band of ITU (International Telecommunication Union).

  6. Computational principles of memory.

    PubMed

    Chaudhuri, Rishidev; Fiete, Ila

    2016-03-01

    The ability to store and later use information is essential for a variety of adaptive behaviors, including integration, learning, generalization, prediction and inference. In this Review, we survey theoretical principles that can allow the brain to construct persistent states for memory. We identify requirements that a memory system must satisfy and analyze existing models and hypothesized biological substrates in light of these requirements. We also highlight open questions, theoretical puzzles and problems shared with computer science and information theory. PMID:26906506

  7. Mechanical memory

    DOEpatents

    Gilkey, Jeffrey C.; Duesterhaus, Michelle A.; Peter, Frank J.; Renn, Rosemarie A.; Baker, Michael S.

    2006-08-15

    A first-in-first-out (FIFO) microelectromechanical memory apparatus (also termed a mechanical memory) is disclosed. The mechanical memory utilizes a plurality of memory cells, with each memory cell having a beam which can be bowed in either of two directions of curvature to indicate two different logic states for that memory cell. The memory cells can be arranged around a wheel which operates as a clocking actuator to serially shift data from one memory cell to the next. The mechanical memory can be formed using conventional surface micromachining, and can be formed as either a nonvolatile memory or as a volatile memory.
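
The serial-shift behaviour described above can be mimicked in software by a fixed-length shift register, where each clock step moves every stored bit one cell along and emits the oldest one. This is only a behavioural sketch: the class name `ShiftFifo`, the cell count, and the use of a `deque` are illustrative assumptions and say nothing about the micromachined device itself.

```python
from collections import deque


class ShiftFifo:
    """Fixed-length serial shift register: one clock moves each bit one cell on."""

    def __init__(self, num_cells):
        # Each cell holds one of two logic states (0 or 1), mirroring the
        # two directions of beam curvature in the description.
        self.cells = deque([0] * num_cells, maxlen=num_cells)

    def clock(self, bit_in):
        """Shift in one bit and return the bit pushed out of the last cell."""
        bit_out = self.cells[-1]
        # With maxlen set, appendleft on a full deque drops the rightmost cell.
        self.cells.appendleft(bit_in)
        return bit_out


# Hypothetical 3-cell memory: the bits written first come out first.
fifo = ShiftFifo(3)
outputs = [fifo.clock(b) for b in [1, 0, 1, 0, 0, 0]]
```

With three cells, the bits 1, 0, 1 written on the first three clocks reappear in the same order on the next three, which is the first-in-first-out property.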

  8. Mechanical memory

    DOEpatents

    Gilkey, Jeffrey C.; Duesterhaus, Michelle A.; Peter, Frank J.; Renn, Rosemarie A.; Baker, Michael S.

    2006-05-16

    A first-in-first-out (FIFO) microelectromechanical memory apparatus (also termed a mechanical memory) is disclosed. The mechanical memory utilizes a plurality of memory cells, with each memory cell having a beam which can be bowed in either of two directions of curvature to indicate two different logic states for that memory cell. The memory cells can be arranged around a wheel which operates as a clocking actuator to serially shift data from one memory cell to the next. The mechanical memory can be formed using conventional surface micromachining, and can be formed as either a nonvolatile memory or as a volatile memory.

  9. Towards Scalable 1024 Processor Shared Memory Systems

    NASA Technical Reports Server (NTRS)

    Ciotti, Robert B.; Thigpen, William W. (Technical Monitor)

    2001-01-01

    Over the past 3 years, NASA Ames has been involved in a cooperative effort with SGI to develop the largest single system image systems available. Currently a 1024-processor Origin3000 is under development, with first boot expected later in the summer of 2001. This paper discusses some early results with a 512p Origin3000 system and some arcane IRIX system calls that can dramatically improve scaling performance.

  10. Debugging Fortran on a shared memory machine

    SciTech Connect

    Allen, T.R.; Padua, D.A.

    1987-01-01

    Debugging on a parallel processor is more difficult than debugging on a serial machine because errors in a parallel program may introduce nondeterminism. The approach to parallel debugging presented here attempts to reduce the problem of debugging on a parallel machine to that of debugging on a serial machine by automatically detecting nondeterminism. 20 refs., 6 figs.

  11. System and method for memory allocation in a multiclass memory system

    DOEpatents

    Loh, Gabriel; Meswani, Mitesh; Ignatowski, Michael; Nutter, Mark

    2016-06-28

    A system for memory allocation in a multiclass memory system includes a processor coupleable to a plurality of memories sharing a unified memory address space, and a library store to store a library of software functions. The processor identifies a type of a data structure in response to a memory allocation function call to the library for allocating memory to the data structure. Using the library, the processor allocates portions of the data structure among multiple memories of the multiclass memory system based on the type of the data structure.

  12. Design and construction of the high-speed optoelectronic memory system demonstrator.

    PubMed

    Barbieri, Roberto; Benabes, Philippe; Bierhoff, Thomas; Caswell, Josh J; Gauthier, Alain; Jahns, Jürgen; Jarczynski, Manfred; Lukowicz, Paul; Oksman, Jacques; Russell, Gordon A; Schrage, Jürgen; Snowdon, John F; Stübbe, Oliver; Troster, Gerhard; Wirz, Marco

    2008-07-01

    The high-speed optoelectronic memory system project is concerned with the reduction of latency within multiprocessor computer systems (a key problem) by the use of optoelectronics and associated packaging technologies. System demonstrators have been constructed to enable the evaluation of the technologies in terms of manufacturability. The system combines fiber, free space, and planar integrated optical waveguide technologies to augment the electronic memory and the processor components. Modeling and simulation techniques were developed toward the analysis and design of board-integrated waveguide transmission characteristics and optical interfacing. We describe the fabrication, assembly, and simulation of the major components within the system.

  13. Dynamic modelling and estimation of the error due to asynchronism in a redundant asynchronous multiprocessor system

    NASA Technical Reports Server (NTRS)

    Huynh, Loc C.; Duval, R. W.

    1986-01-01

    The use of a Redundant Asynchronous Multiprocessor System to achieve ultrareliable Fault Tolerant Control Systems shows great promise. The development has been hampered by the inability to determine whether differences in the outputs of redundant CPUs are due to failures or to accrued error built up by slight differences in CPU clock intervals. This study derives an analytical dynamic model of the difference between redundant CPUs due to differences in their clock intervals and uses this model with on-line parameter identification to identify the differences in the clock intervals. Because the methodology accurately tracks errors due to asynchronism, it can generate an error signal with the effect of asynchronism removed, and this signal may be used to detect and isolate actual system failures.

  14. Analysis of Photonic Networks for a Chip Multiprocessor Using Scientific Applications

    SciTech Connect

    Kamil, Shoaib A; Hendry, Gilbert; Biberman, Aleksandr; Chan, Johnnie; Lee, Benjamin G.; Mohiyuddin, Marghoob; Jain, Ankit; Bergman, Keren; Carloni, Luca; Kubiatowicz, John; Oliker, Leonid; Shalf, John

    2009-01-31

    As multiprocessors scale to unprecedented numbers of cores in order to sustain performance growth, it is vital that these gains are not nullified by high energy consumption from inter-core communication. With recent advances in 3D Integration CMOS technology, the possibility for realizing hybrid photonic-electronic networks-on-chip warrants investigating real application traces on functionally comparable photonic and electronic network designs. We present a comparative analysis using both synthetic benchmarks as well as real applications, run through detailed cycle accurate models implemented under the OMNeT++ discrete event simulation environment. Results show that when utilizing standard process-to-processor mapping methods, this hybrid network can achieve 75X improvement in energy efficiency for synthetic benchmarks and up to 37X improvement for real scientific applications, defined as network performance per energy spent, over an electronic mesh for large messages across a variety of communication patterns.

  15. Simulating a small turboshaft engine in real-time multiprocessor simulator (RTMPS) environment

    NASA Technical Reports Server (NTRS)

    Milner, E. J.; Arpasi, D. J.

    1986-01-01

    A Real-Time Multiprocessor Simulator (RTMPS) has been developed at NASA Lewis Research Center. The RTMPS uses parallel microprocessors to achieve computing speeds needed for real-time engine simulation. This report describes the use of the RTMPS system to simulate a small turboshaft engine. The process of programming the engine equations and distributing them over one, two, and four processors is discussed. Steady-state and transient results from the RTMPS simulation are compared with results from a main-frame-based simulation. Processor execution times and the associated execution time savings for the two and four processor cases are presented using actual data obtained from the RTMPS system. Included is a discussion of why the minimum achievable calculation time for the turboshaft engine model was attained using four processors. Finally, future enhancements to the RTMPS system are discussed including the development of a generalized partitioning algorithm to automatically distribute the system equations among the processors in optimum fashion.

  16. Characterizing parallel file-access patterns on a large-scale multiprocessor

    NASA Technical Reports Server (NTRS)

    Purakayastha, A.; Ellis, Carla; Kotz, David; Nieuwejaar, Nils; Best, Michael L.

    1995-01-01

    High-performance parallel file systems are needed to satisfy tremendous I/O requirements of parallel scientific applications. The design of such high-performance parallel file systems depends on a comprehensive understanding of the expected workload, but so far there have been very few usage studies of multiprocessor file systems. This paper is part of the CHARISMA project, which intends to fill this void by measuring real file-system workloads on various production parallel machines. In particular, we present results from the CM-5 at the National Center for Supercomputing Applications. Our results are unique because we collect information about nearly every individual I/O request from the mix of jobs running on the machine. Analysis of the traces leads to various recommendations for parallel file-system design.

  17. Closed-form solutions of performability. [modeling of a degradable buffer/multiprocessor system]

    NASA Technical Reports Server (NTRS)

    Meyer, J. F.

    1981-01-01

    Methods which yield closed form performability solutions for continuous valued variables are developed. The models are similar to those employed in performance modeling (i.e., Markovian queueing models) but are extended so as to account for variations in structure due to faults. In particular, the modeling of a degradable buffer/multiprocessor system is considered whose performance Y is the (normalized) average throughput rate realized during a bounded interval of time. To avoid known difficulties associated with exact transient solutions, an approximate decomposition of the model is employed permitting certain submodels to be solved in equilibrium. These solutions are then incorporated in a model with fewer transient states and by solving the latter, a closed form solution of the system's performability is obtained. In conclusion, some applications of this solution are discussed and illustrated, including an example of design optimization.

  18. Partitioning strategy for efficient nonlinear finite element dynamic analysis on multiprocessor computers

    NASA Technical Reports Server (NTRS)

    Noor, Ahmed K.; Peters, Jeanne M.

    1989-01-01

    A computational procedure is presented for the nonlinear dynamic analysis of unsymmetric structures on vector multiprocessor systems. The procedure is based on a novel hierarchical partitioning strategy in which the response of the unsymmetric structure is approximated by a combination of symmetric and antisymmetric response vectors (modes), each obtained by using only a fraction of the degrees of freedom of the original finite element model. The three key elements of the procedure which result in a high degree of concurrency throughout the solution process are: (1) mixed (or primitive variable) formulation with independent shape functions for the different fields; (2) operator splitting or restructuring of the discrete equations at each time step to delineate the symmetric and antisymmetric vectors constituting the response; and (3) a two-level iterative process for generating the response of the structure. An assessment is made of the effectiveness of the procedure on the CRAY X-MP/4 computers.

  19. Multiprocessor data acquisition system for a 256x256-pixel infrared camera

    NASA Astrophysics Data System (ADS)

    Rodriguez-Ramos, Luis F.; Rodriguez-Mora, A.; Sosa, Nicolas A.; Diaz, Jose J.; Joven-Alvarez, Enrique

    1993-04-01

    The Department of Detectors of the Instituto de Astrofisica de Canarias, Spain, is developing a data acquisition system (DAS) for an infrared camera based on a 256 x 256 InSb detector. The camera will operate at wavelengths from 1 to 5 microns, with a scale on the sky of 0.5 arcsec per pixel, and will be installed as a common user instrument at the Carlos Sanchez Telescope in the Observatorio del Teide (Canary Islands, Spain). A multiprocessor architecture has been chosen for the DAS because of the very tight requirements on real-time processing and high-speed storage capability (20 images per second readout rate, 2 images per second storage rate). The complete system is split into two main parts, the front-end electronics and the user workstation. They are interconnected through an Ethernet link.

  20. Event parallelism: Distributed memory parallel computing for high energy physics experiments

    SciTech Connect

    Nash, T.

    1989-05-01

    This paper describes the present and expected future development of distributed memory parallel computers for high energy physics experiments. It covers the use of event parallel microprocessor farms, particularly at Fermilab, including both ACP multiprocessors and farms of MicroVAXES. These systems have proven very cost effective in the past. A case is made for moving to the more open environment of UNIX and RISC processors. The 2nd Generation ACP Multiprocessor System, which is based on powerful RISC systems, is described. Given the promise of still more extraordinary increases in processor performance, a new emphasis on point to point, rather than bussed, communication will be required. Developments in this direction are described. 6 figs.

  1. Asynchronous and corrected-asynchronous numerical solutions of parabolic PDEs on MIMD multiprocessors

    NASA Technical Reports Server (NTRS)

    Amitai, Dganit; Averbuch, Amir; Itzikowitz, Samuel; Turkel, Eli

    1991-01-01

    A major problem in achieving significant speed-up on parallel machines is the overhead involved in synchronizing the concurrent processes. Removing the synchronization constraint has the potential of speeding up the computation. The authors present asynchronous (AS) and corrected-asynchronous (CA) finite difference schemes for the multi-dimensional heat equation. Although the discussion concentrates on the Euler scheme for the solution of the heat equation, it has the potential for being extended to other schemes and other parabolic partial differential equations (PDEs). These schemes are analyzed and implemented on the shared memory multi-user Sequent Balance machine. Numerical results for one- and two-dimensional problems are presented. It is shown experimentally that the synchronization penalty can be about 50 percent of run time: in most cases, the asynchronous scheme runs twice as fast as the parallel synchronous scheme. In general, the efficiency of the parallel schemes increases with processor load, with the time level, and with the problem dimension. The efficiency of the AS scheme may reach 90 percent and over, but it provides accurate results only for steady-state values. The CA scheme, on the other hand, is less efficient, but provides more accurate results for intermediate (non-steady-state) values.
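
    As a rough illustration of the Euler scheme the abstract refers to, the sketch below applies the standard explicit (forward) Euler finite difference update to the one-dimensional heat equation with fixed end values. It is the ordinary synchronous update only; the AS and CA variants are not reproduced here, and the function names and the parameter r = alpha * dt / dx**2 are assumptions made for this example rather than names from the paper.

```python
def euler_heat_step(u, r):
    """One explicit Euler step for u_t = alpha * u_xx on a 1-D grid.

    u is a list of grid values whose two end points are held fixed.
    r = alpha * dt / dx**2 and should not exceed 0.5 for stability.
    """
    new = list(u)
    for i in range(1, len(u) - 1):
        # Standard three-point stencil for the second derivative.
        new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def solve_heat(u0, r, steps):
    """Apply the Euler step repeatedly, one full time level at a time."""
    u = list(u0)
    for _ in range(steps):
        u = euler_heat_step(u, r)
    return u


# Fixed end values 0 and 1: the steady state is a straight line between them.
steady = solve_heat([0.0, 0.0, 0.0, 0.0, 1.0], r=0.25, steps=500)
```

    In a synchronous parallel version, every processor would wait at each time level until its neighbours had finished the previous one; that wait is the synchronization overhead the abstract measures.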

  2. Abnormal fault-recovery characteristics of the fault-tolerant multiprocessor uncovered using a new fault-injection methodology

    NASA Technical Reports Server (NTRS)

    Padilla, Peter A.

    1991-01-01

    An investigation was made in AIRLAB of the fault handling performance of the Fault Tolerant MultiProcessor (FTMP). Fault handling errors detected during fault injection experiments were characterized. In these fault injection experiments, the FTMP disabled a working unit instead of the faulted unit once in every 500 faults, on the average. System design weaknesses allow active faults to exercise a part of the fault management software that handles Byzantine or lying faults. Byzantine faults behave such that the faulted unit points to a working unit as the source of errors. The design's problems involve: (1) the design and interface between the simplex error detection hardware and the error processing software, (2) the functional capabilities of the FTMP system bus, and (3) the communication requirements of a multiprocessor architecture. These weak areas in the FTMP's design increase the probability that, for any hardware fault, a good line replacement unit (LRU) is mistakenly disabled by the fault management software.

  3. Abnormal fault-recovery characteristics of the fault-tolerant multiprocessor uncovered using a new fault-injection methodology

    NASA Astrophysics Data System (ADS)

    Padilla, Peter A.

    1991-03-01

    An investigation was made in AIRLAB of the fault handling performance of the Fault Tolerant MultiProcessor (FTMP). Fault handling errors detected during fault injection experiments were characterized. In these fault injection experiments, the FTMP disabled a working unit instead of the faulted unit once in every 500 faults, on the average. System design weaknesses allow active faults to exercise a part of the fault management software that handles Byzantine or lying faults. Byzantine faults behave such that the faulted unit points to a working unit as the source of errors. The design's problems involve: (1) the design and interface between the simplex error detection hardware and the error processing software, (2) the functional capabilities of the FTMP system bus, and (3) the communication requirements of a multiprocessor architecture. These weak areas in the FTMP's design increase the probability that, for any hardware fault, a good line replacement unit (LRU) is mistakenly disabled by the fault management software.

  4. Memory Matters

    MedlinePlus

    ... different parts. Some of them are important for memory. The hippocampus (say: hih-puh-KAM-pus) is one of the more important parts of the brain that processes memories. Old information and new information, or memories, are ...

  5. Use of a genetic algorithm to solve two-fluid flow problems on an NCUBE multiprocessor computer

    SciTech Connect

    Pryor, R.J.; Cline, D.D.

    1992-01-01

    A method of solving the two-phase fluid flow equations using a genetic algorithm on an NCUBE multiprocessor computer is presented. The topics discussed are the two-phase flow equations, the genetic representation of the unknowns, the fitness function, the genetic operators, and the implementation of the algorithm on the NCUBE computer. The efficiency of the implementation is investigated using a pipe blowdown problem. Effects of varying the genetic parameters and the number of processors are presented.

  6. Use of a genetic algorithm to solve two-fluid flow problems on an NCUBE multiprocessor computer

    SciTech Connect

    Pryor, R.J.; Cline, D.D.

    1992-12-31

    A method of solving the two-phase fluid flow equations using a genetic algorithm on an NCUBE multiprocessor computer is presented. The topics discussed are the two-phase flow equations, the genetic representation of the unknowns, the fitness function, the genetic operators, and the implementation of the algorithm on the NCUBE computer. The efficiency of the implementation is investigated using a pipe blowdown problem. Effects of varying the genetic parameters and the number of processors are presented.

  7. A Multiprocessor SoC Architecture with Efficient Communication Infrastructure and Advanced Compiler Support for Easy Application Development

    NASA Astrophysics Data System (ADS)

    Urfianto, Mohammad Zalfany; Isshiki, Tsuyoshi; Khan, Arif Ullah; Li, Dongju; Kunieda, Hiroaki

    This paper presents a Multiprocessor System-on-Chip (MPSoC) architecture used as an execution platform for the new C-language based MPSoC design framework we are currently developing. The MPSoC architecture is based on an existing SoC platform with a commercial RISC core acting as the host CPU. We extend the existing SoC with a multiprocessor-array block that is used as the main engine to run parallel applications modeled in our design framework. Utilizing several optimizations provided by our compiler, efficient communication between processing elements with minimum overhead is implemented. A host interface is designed to integrate the existing RISC core with the multiprocessor array. The experimental results show that an effective integration is achieved, demonstrating that the designed communication module can be used to efficiently incorporate off-the-shelf processors as processing elements for MPSoC architectures designed using our framework.

  8. Sharing values, sharing a vision

    SciTech Connect

    Not Available

    1993-12-31

    Teamwork, partnership and shared values emerged as recurring themes at the Third Technology Transfer/Communications Conference. The program drew about 100 participants who sat through a packed two days to find ways for their laboratories and facilities to better help American business and the economy. Co-hosts were the Lawrence Livermore National Laboratory and the Lawrence Berkeley Laboratory, where most meetings took place. The conference followed traditions established at the First Technology Transfer/Communications Conference, conceived of and hosted by the Pacific Northwest Laboratory in May 1992 in Richland, Washington, and the second conference, hosted by the National Renewable Energy Laboratory in January 1993 in Golden, Colorado. As at the other conferences, participants at the third session represented the fields of technology transfer, public affairs and communications. They came from Department of Energy headquarters and DOE offices, laboratories and production facilities. Contained in this report are the keynote address, panel discussion, workshops, and presentations in technology transfer.

  9. Blanket Gate Would Address Blocks Of Memory

    NASA Technical Reports Server (NTRS)

    Lambe, John; Moopenn, Alexander; Thakoor, Anilkumar P.

    1988-01-01

    Circuit-chip area used more efficiently. Proposed gate structure selectively allows and restricts access to blocks of memory in electronic neural-type network. By breaking memory into independent blocks, gate greatly simplifies problem of reading from and writing to memory. Since blocks not used simultaneously, share operational amplifiers that prompt and read information stored in memory cells. Fewer operational amplifiers needed, and chip area occupied reduced correspondingly. Cost per bit drops as result.

  10. 3-dimensional magnetotelluric inversion including topography using deformed hexahedral edge finite elements and direct solvers parallelized on symmetric multiprocessor computers - Part II: direct data-space inverse solution

    NASA Astrophysics Data System (ADS)

    Kordy, M.; Wannamaker, P.; Maris, V.; Cherkaev, E.; Hill, G.

    2016-01-01

    Following the creation described in Part I of a deformable edge finite-element simulator for 3-D magnetotelluric (MT) responses using direct solvers, in Part II we develop an algorithm named HexMT for 3-D regularized inversion of MT data including topography. Direct solvers parallelized on large-RAM, symmetric multiprocessor (SMP) workstations are used also for the Gauss-Newton model update. By exploiting the data-space approach, the computational cost of the model update becomes much less in both time and computer memory than the cost of the forward simulation. In order to regularize using the second norm of the gradient, we factor the matrix related to the regularization term and apply its inverse to the Jacobian, which is done using the MKL PARDISO library. For dense matrix multiplication and factorization related to the model update, we use the PLASMA library which shows very good scalability across processor cores. A synthetic test inversion using a simple hill model shows that including topography can be important; in this case depression of the electric field by the hill can cause false conductors at depth or mask the presence of resistive structure. With a simple model of two buried bricks, a uniform spatial weighting for the norm of model smoothing recovered more accurate locations for the tomographic images compared to weightings which were a function of parameter Jacobians. We implement joint inversion for static distortion matrices tested using the Dublin secret model 2, for which we are able to reduce nRMS to ~1.1 while avoiding oscillatory convergence. Finally we test the code on field data by inverting full impedance and tipper MT responses collected around Mount St Helens in the Cascade volcanic chain. Among several prominent structures, the north-south trending, eruption-controlling shear zone is clearly imaged in the inversion.

  11. Bipartite memory network architectures for parallel processing

    SciTech Connect

    Smith, W.; Kale, L.V. (Dept. of Computer Science)

    1990-01-01

    Parallel architectures are broadly classified as either shared memory or distributed memory architectures. In this paper, the authors propose a third family of architectures, called bipartite memory network architectures. In this architecture, processors and memory modules constitute a bipartite graph, where each processor is allowed to access a small subset of the memory modules, and each memory module allows access from a small set of processors. The architecture is particularly suitable for computations requiring dynamic load balancing. The authors explore the properties of this architecture by examining the Perfect Difference set based topology for the graph. Extensions of this topology are also suggested.
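
    To make the idea of a perfect difference set based bipartite graph concrete, here is a minimal sketch. It assumes one plausible connection rule, in which processor p is wired to memory modules (p + d) mod n for each d in a perfect difference set; that rule, the function names, and the small example set {0, 1, 3} modulo 7 are illustrative assumptions, not details taken from the paper.

```python
from itertools import combinations


def is_perfect_difference_set(d_set, n):
    """True if every nonzero residue mod n occurs exactly once as a - b."""
    diffs = [(a - b) % n for a in d_set for b in d_set if a != b]
    return sorted(diffs) == list(range(1, n))


def build_bipartite(d_set, n):
    """Connect processor p to memory modules (p + d) mod n for d in d_set."""
    return {p: sorted((p + d) % n for d in d_set) for p in range(n)}


def pairwise_shared_counts(graph):
    """Set of sizes of the memory-module overlap for every processor pair."""
    return {
        len(set(graph[p]) & set(graph[q]))
        for p, q in combinations(graph, 2)
    }


D, N = [0, 1, 3], 7  # {0, 1, 3} is a perfect difference set modulo 7
graph = build_bipartite(D, N)
shared = pairwise_shared_counts(graph)
```

    Under this assumed rule, each of the 7 processors reaches only 3 of the 7 memory modules, yet every pair of processors has exactly one module in common, so any two processors can exchange data through a single memory module.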

  12. Memory Palaces

    ERIC Educational Resources Information Center

    Wood, Marianne

    2007-01-01

    This article presents a lesson called Memory Palaces. A memory palace is a memory tool used to remember information, usually as visual images, in a sequence that is logical to the person remembering it. In his book, "In the Palaces of Memory", George Johnson calls them "...structure(s) for arranging knowledge. Lots of connections to language arts,…

  13. Episodic memory in nonhuman animals

    PubMed Central

    Templer, Victoria L.

    2013-01-01

    Summary Episodic memories differ from other types of memory because they represent aspects of the past not present in other memories, such as the time, place, or social context in which the memories were formed. Focus on phenomenal experience in human memory, such as the sense of “having been there” has resulted in conceptualizations of episodic memory that are difficult or impossible to apply to nonhumans. It is therefore a significant challenge for investigators to agree on objective behavioral criteria that can be applied in nonhumans and still capture features of memory thought to be critical in humans. Some investigators have attempted to use neurobiological parallels to bridge this gap. However, defining memory types on the basis of the brain structures involved rather than on identified cognitive mechanisms risks missing the most crucial functional aspects of episodic memory, which are ultimately behavioral. The most productive way forward is likely a combination of neurobiology and sophisticated cognitive testing that identifies the mental representations present in episodic memory. Investigators that have refined their approach from asking the naïve question “do nonhuman animals have episodic memory” to instead asking “what aspects of episodic memory are shared by humans and nonhumans” are making progress. PMID:24028963

  14. Fault-free behavior of reliable multiprocessor systems: FTMP experiments in AIRLAB

    NASA Technical Reports Server (NTRS)

    Clune, E.; Segall, Z.; Siewiorek, D.

    1985-01-01

    This report describes a set of experiments which were implemented on the Fault tolerant Multi-Processor (FTMP) at NASA/Langley's AIRLAB facility. These experiments are part of an effort to formulate and evaluate validation methodologies for fault-tolerant computers. This report deals with the measurement of single parameters (baselines) of a fault-free system. The initial set of baseline experiments led to the following conclusions: (1) the system clock is constant and independent of workload in the tested cases; (2) the instruction execution times are constant; (3) the R4 frame size is 40 ms with some variation; (4) the frame stretching mechanism has some flaws in its implementation that allow the possibility of an infinite stretching of frame duration. Future experiments are planned. Some will broaden the results of these initial experiments. Others will measure the system more dynamically. The implementation of a synthetic workload generation mechanism for FTMP is planned to enhance the experimental environment of the system.

  15. MASC: Multiprocessor Architecture for Symbolic Processing. Final report, March 1986-September 1988

    SciTech Connect

    Hopkins, W.C.; Blenko, T.M.; Cassell, K.; Clark, J.P.; Coltoff, J.B.

    1989-08-01

    The MASC program addresses the support of symbolic computation for Artificial Intelligence (AI), as part of the DARPA Strategic Computing Program. The work has been aimed at the fundamental questions of choosing an appropriate programming model for AI and finding efficient implementation techniques to support it on multiprocessor systems. The Strategic Computing Program is motivated by the need to provide real-time response by AI subsystems in planned or contemplated DoD systems. The authors' efforts have directly supported this goal, contributing fundamental results in language design, language implementation, and application parallelism. The work has been driven by the needs of real AI applications, drawn from their substantial experience in natural language processing, knowledge representation, and expert systems. The technologies they have developed are broadly applicable, and several specific recommendations for their further development and application are presented. The most profound result is the design and demonstration of a powerful technique for the compilation of logic programs to applicative form. This brings the power of compilation and optimization techniques for functional programming to bear on logic programming languages. It is recommended that these technologies be extended and applied to current research efforts in language design for very-high-level programming and program prototyping. Other results and recommendations are contained in the body of the report.

  16. Design of a resource allocation and management mechanism for a multiprocessor

    SciTech Connect

    Lin, W.

    1985-01-01

    This work presents the design of a resource allocation and management mechanism for a multiprocessor, Star. The mechanism is designed for supporting highly parallel computation. Three major functions performed by the mechanism are: (1) resource allocation - for each task a suitable execution environment is created, which contains an adequate number of processors, and these processors are reconfigured to achieve the desired topology to match the task's communication graph; (2) multitasking - several execution environments can be simultaneously created to execute different tasks on the system; and (3) distributed and structured management - the entire system can be dynamically partitioned into independent management entities, in each of which the resources are structured and managed in the form of objects, rather than individual system components. In addition, to demonstrate the effectiveness of the proposed mechanism, a simulation is performed to gather various statistics. With the simulation, the kind of space-dependent scheduling disciplines that can best match the proposed mechanism to achieve effective resource utilization can be examined.

  17. Use of a genetic algorithm to solve fluid flow problems on an NCUBE/2 multiprocessor computer

    SciTech Connect

    Pryor, R.J.; Cline, D.D.

    1992-04-01

    This paper presents a method to solve partial differential equations governing two-phase fluid flow by using a genetic algorithm on the NCUBE/2 multiprocessor computer. Genetic algorithms represent a significant departure from traditional approaches of solving fluid flow problems. The inherent parallelism of genetic algorithms offers the prospect of obtaining solutions faster than ever possible. The paper discusses the two-phase flow equations, the genetic representation of the unknowns, the fitness function, the genetic operators, and the implementation of the genetic algorithm on the NCUBE/2 computer. The paper investigates the implementation efficiency using a pipe blowdown test and presents the effects of varying both the genetic parameters and the number of processors. The results show that genetic algorithms provide a major advancement in methods for solving two-phase flow problems. A desired goal of solving these equations for a specific simulation problem in real time or faster requires computers with an order of magnitude more processors or faster than the NCUBE/2's 1024.

  19. The Performance of Parallel Disk Write Methods for Linux Multiprocessor Nodes

    SciTech Connect

    Benson, G D; Long, K; Pacheco, P

    2003-05-07

    Despite increasing attention paid to parallel I/O and the introduction of MPI-IO, there is limited practical data to help guide a programmer in the choice of a good parallel write strategy in the absence of a parallel file system. In this study we experimentally evaluate several methods for implementing parallel computations that interleave a significant number of contiguous or strided writes to a local disk on Linux-based multiprocessor nodes. Using synthetic benchmark programs written with MPI and Pthreads, we have acquired detailed performance data for different application characteristics of programs running on dual processor nodes. In general, our results show that programs that perform a significant amount of I/O relative to pure computation benefit greatly from the use of threads, while programs that perform relatively little I/O obtain excellent results using only MPI. For a pure MPI approach, we have found that it is best to use two writing processes with mmap(). For Pthreads it is usually best to use write() for contiguous data and writev() for strided data. Codes that use mmap() tend to benefit from periodic syncs of the data to the disk, while codes that use write() or writev() tend to have better performance with few syncs. A straightforward use of ROMIO usually does not perform as well as these direct approaches for writing to the local disk.
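
    As a rough illustration of the writev() recommendation, the sketch below gathers strided blocks from one buffer and hands them to a single writev() call. It assumes "strided" means blocks that are noncontiguous in memory, uses Python's os.writev (available on Unix-like systems) rather than the C call the study measured, and the helper name write_strided is hypothetical.

```python
import os
import tempfile

def write_strided(fd, buf, block, stride):
    """Gather every `stride`-th block of `buf` and write them with one writev()."""
    view = memoryview(buf)
    pieces = [view[off:off + block] for off in range(0, len(buf), stride)]
    return os.writev(fd, pieces)

data = bytes(range(16))
fd, path = tempfile.mkstemp()
try:
    written = write_strided(fd, data, block=2, stride=4)
    os.fsync(fd)   # one explicit sync at the end, not one per write
finally:
    os.close(fd)

with open(path, "rb") as f:
    result = f.read()
os.remove(path)
```

    A production version would also handle short writes, since writev() may write fewer bytes than requested.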

  20. MIMD (multiple instruction multiple data) multiprocessor system for real-time image processing

    NASA Astrophysics Data System (ADS)

    Pirsch, Peter; Jeschke, Hartwig

    1991-06-01

    A novel MIMD (Multiple Instruction Multiple Data) based architecture consisting of multiple processing elements (PE) has been developed. This architecture is adapted to real-time processing of sequences of different tasks for local image segments. Each PE contains an arithmetic processing unit (APU), adapted to parallel processing of low level operations, and a high level and control processor (HLCP) for medium and high level operations and control of the PE. This HLCP can be a standard signal processor or a RISC processor. Because of the local control of each PE by the HLCP and a SIMD structure of the APU, the overall system architecture is characterized as MIMD based with a local SIMD structure for low level processing. Due to overlapped computation and communication, the multiprocessor system achieves a linear speedup compared to a single processing element. Main parts of the PE have been realized as two ASICs in a 1.5 µm CMOS process. With a system clock rate of 25 MHz, each PE provides a peak performance of 400 Mega operations per second (MOPS).

  1. Scheduling for energy and reliability management on multiprocessor real-time systems

    NASA Astrophysics Data System (ADS)

    Qi, Xuan

    Scheduling algorithms for multiprocessor real-time systems have been studied for years with many well-recognized algorithms proposed. However, it is still an evolving research area and many problems remain open due to their intrinsic complexities. With the emergence of multicore processors, it is necessary to re-investigate the scheduling problems and design/develop efficient algorithms for better system utilization, low scheduling overhead, high energy efficiency, and better system reliability. Focusing on cluster scheduling with optimal global schedulers, we study the utilization bound and scheduling overhead for a class of cluster-optimal schedulers. Then, taking energy/power consumption into consideration, we developed energy-efficient scheduling algorithms for real-time systems, especially for the proliferating embedded systems with limited energy budget. As commonly deployed energy-saving techniques (e.g., dynamic voltage and frequency scaling (DVFS)) will significantly affect system reliability, we study schedulers that have intelligent mechanisms to recuperate system reliability to satisfy the quality assurance requirements. Extensive simulation is conducted to evaluate the performance of the proposed algorithms on reduction of scheduling overhead, energy saving, and reliability improvement. The simulation results show that the proposed reliability-aware power management schemes could preserve the system reliability while still achieving substantial energy saving.

  2. Reusing existing resources for testing a multi-processor system-on-chip

    NASA Astrophysics Data System (ADS)

    Lee, Seung Eun

    2013-03-01

    In this article, we propose a test strategy for a multi-processor system-on-chip and model the test time for distributed Intellectual Property (IP) cores. The proposed test methodology uses the existing on-chip resources, IP cores and network elements in network-on-chip. The use of embedded IP cores as a built-in self-test (BIST) module completes the test much faster than an external test and provides flexibility in the test program. Moreover, the reuse of the existing network resources as a test medium eliminates additional test access mechanism (TAM) wires in the design and increases test parallelism, reducing the area and test time. Based on the proposed test methodology, we evaluate the test time for distributed IP cores. First, we define the model for a distributed IP core with four parameters in the context of test purposes. Next, the required test time is derived. Finally, we show the characteristics of IP cores for parallel testing, which provide useful information for test scheduling.

  3. Computer vector multiprocessing control with multiple access memory and priority conflict resolution method

    SciTech Connect

    Chen, S.S.; Schiffleger, A.J.

    1990-02-13

    This patent describes a multiprocessor memory system. It comprises: a central memory comprised of a plurality of independently addressable memory banks organized into a plurality of sections each accessible through a plurality of access paths; a plurality of processing machines, each of the processing machines including a plurality of ports for generating memory references to any one of the central memory sections; and conflict resolution means interfacing each of the ports to each of the central memory sections through the central memory access paths. The resolution means receives references from the ports and coordinates and controls the progression of the references along the access paths. The conflict resolution means comprises a plurality of conflict resolution circuits corresponding in number to the memory sections, each of the circuits receiving the references to its corresponding section from any one of the ports and selectively conveying the references to the access paths for the corresponding section. The circuits each include: means for checking the readiness of the memory banks to be referenced and holding a reference to a busy one of the banks until the bank is ready to be referenced; means for detecting when more than one of the references is pending to the same bank simultaneously and holding all but one of the simultaneously pending references; and means communicating with the ports and the other conflict resolution circuits to cause one of the ports referencing the memory to suspend generation of further references when a reference from the referencing port is being held.

  4. The FORCE - A highly portable parallel programming language

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    This paper explains why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.

  5. The evolution of episodic memory

    PubMed Central

    Allen, Timothy A.; Fortin, Norbert J.

    2013-01-01

    One prominent view holds that episodic memory emerged recently in humans and lacks a “(neo)Darwinian evolution” [Tulving E (2002) Annu Rev Psychol 53:1–25]. Here, we review evidence supporting the alternative perspective that episodic memory has a long evolutionary history. We show that fundamental features of episodic memory capacity are present in mammals and birds and that the major brain regions responsible for episodic memory in humans have anatomical and functional homologs in other species. We propose that episodic memory capacity depends on a fundamental neural circuit that is similar across mammalian and avian species, suggesting that protoepisodic memory systems exist across amniotes and, possibly, all vertebrates. The implication is that episodic memory in diverse species may primarily be due to a shared underlying neural ancestry, rather than the result of evolutionary convergence. We also discuss potential advantages that episodic memory may offer, as well as species-specific divergences that have developed on top of the fundamental episodic memory architecture. We conclude by identifying possible time points for the emergence of episodic memory in evolution, to help guide further research in this area. PMID:23754432

  6. Computational design of RNA parts, devices, and transcripts with kinetic folding algorithms implemented on multiprocessor clusters.

    PubMed

    Thimmaiah, Tim; Voje, William E; Carothers, James M

    2015-01-01

    With progress toward inexpensive, large-scale DNA assembly, the demand for simulation tools that allow the rapid construction of synthetic biological devices with predictable behaviors continues to increase. By combining engineered transcript components, such as ribosome binding sites, transcriptional terminators, ligand-binding aptamers, catalytic ribozymes, and aptamer-controlled ribozymes (aptazymes), gene expression in bacteria can be fine-tuned, with many corollaries and applications in yeast and mammalian cells. The successful design of genetic constructs that implement these kinds of RNA-based control mechanisms requires modeling and analyzing kinetically determined co-transcriptional folding pathways. Transcript design methods using stochastic kinetic folding simulations to search spacer sequence libraries for motifs enabling the assembly of RNA component parts into static ribozyme- and dynamic aptazyme-regulated expression devices with quantitatively predictable functions (rREDs and aREDs, respectively) have been described (Carothers et al., Science 334:1716-1719, 2011). Here, we provide a detailed practical procedure for computational transcript design by illustrating a high throughput, multiprocessor approach for evaluating spacer sequences and generating functional rREDs. This chapter is written as a tutorial, complete with pseudo-code and step-by-step instructions for setting up a computational cluster with an Amazon, Inc. web server and performing the large numbers of kinefold-based stochastic kinetic co-transcriptional folding simulations needed to design functional rREDs and aREDs. The method described here should be broadly applicable for designing and analyzing a variety of synthetic RNA parts, devices and transcripts.

  7. Efficient mapping algorithms for scheduling robot inverse dynamics computation on a multiprocessor system

    NASA Technical Reports Server (NTRS)

    Lee, C. S. G.; Chen, C. L.

    1989-01-01

    Two efficient mapping algorithms for scheduling the robot inverse dynamics computation consisting of m computational modules with precedence relationship to be executed on a multiprocessor system consisting of p identical homogeneous processors with processor and communication costs to achieve minimum computation time are presented. An objective function is defined in terms of the sum of the processor finishing time and the interprocessor communication time. The minimax optimization is performed on the objective function to obtain the best mapping. This mapping problem can be formulated as a combination of the graph partitioning and the scheduling problems; both have been known to be NP-complete. Thus, to speed up the searching for a solution, two heuristic algorithms were proposed to obtain fast but suboptimal mapping solutions. The first algorithm utilizes the level and the communication intensity of the task modules to construct an ordered priority list of ready modules and the module assignment is performed by a weighted bipartite matching algorithm. For a near-optimal mapping solution, the problem can be solved by the heuristic algorithm with simulated annealing. These proposed optimization algorithms can solve various large-scale problems within a reasonable time. Computer simulations were performed to evaluate and verify the performance and the validity of the proposed mapping algorithms. Finally, experiments for computing the inverse dynamics of a six-jointed PUMA-like manipulator based on the Newton-Euler dynamic equations were implemented on an NCUBE/ten hypercube computer to verify the proposed mapping algorithms. Computer simulation and experimental results are compared and discussed.
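
    The mapping problem described here can be sketched with a much simpler heuristic than the ones in the paper. The sketch below orders modules by level (longest remaining path) and places each one greedily on the processor where it finishes earliest, charging a communication delay when a predecessor sits on a different processor. It is a plain greedy list scheduler, not the paper's weighted bipartite matching or simulated annealing; the function names, the example task graph, and the costs are all hypothetical, and execution costs are assumed positive.

```python
def levels(cost, succ):
    """Level of a module: longest path (in execution cost) from it to an exit module."""
    memo = {}
    def level(m):
        if m not in memo:
            memo[m] = cost[m] + max((level(s) for s in succ.get(m, [])), default=0)
        return memo[m]
    return {m: level(m) for m in cost}

def greedy_map(cost, succ, comm, p):
    """Assign modules to p identical processors in order of decreasing level."""
    pred = {m: [] for m in cost}
    for m, ss in succ.items():
        for s in ss:
            pred[s].append(m)
    lvl = levels(cost, succ)
    # With positive costs a predecessor always has a higher level than its successors.
    order = sorted(cost, key=lambda m: -lvl[m])
    free = [0.0] * p              # time at which each processor becomes idle
    place, finish = {}, {}
    for m in order:
        best = None
        for q in range(p):
            # Data from a predecessor on another processor arrives comm[(u, m)] later.
            ready = max((finish[u] + (0 if place[u] == q else comm[(u, m)])
                         for u in pred[m]), default=0.0)
            end = max(ready, free[q]) + cost[m]
            if best is None or end < best[0]:
                best = (end, q)
        finish[m], place[m] = best
        free[best[1]] = best[0]
    return place, max(finish.values())

# Hypothetical four-module task graph: a feeds b and c, which both feed d.
cost = {"a": 2, "b": 3, "c": 3, "d": 1}
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
comm = {("a", "b"): 1, ("a", "c"): 1, ("b", "d"): 1, ("c", "d"): 1}

place, makespan = greedy_map(cost, succ, comm, p=2)
_, serial_time = greedy_map(cost, succ, comm, p=1)
```

    On this toy graph two processors reduce the completion time from 9 to 7 time units; the communication delays are what keep the speedup well below 2.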

  8. Vicarious memories.

    PubMed

    Pillemer, David B; Steiner, Kristina L; Kuwabara, Kie J; Thomsen, Dorthe Kirkegaard; Svob, Connie

    2015-11-01

    People not only have vivid memories of their own personal experiences, but also vicarious memories of events that happened to other people. To compare the phenomenological and functional qualities of personal and vicarious memories, college students described a specific past event that they had recounted to a parent or friend, and also an event that a friend or parent had recounted to them. Although ratings of memory vividness, emotional intensity, visualization, and physical reactions were higher for personal than for vicarious memories, the overall pattern of ratings was similar. Participants' ratings also indicated that vicarious memories serve many of the same life functions as personal memories, although at lower levels of intensity. The findings suggest that current conceptions of autobiographical memory, which focus on past events that happened directly to the self, should be expanded to include detailed mental representations of specific past events that happened to other people.

  9. Memory Dysfunction

    PubMed Central

    Matthews, Brandy R.

    2015-01-01

    Purpose of Review: This article highlights the dissociable human memory systems of episodic, semantic, and procedural memory in the context of neurologic illnesses known to adversely affect specific neuroanatomic structures relevant to each memory system. Recent Findings: Advances in functional neuroimaging and refinement of neuropsychological and bedside assessment tools continue to support a model of multiple memory systems that are distinct yet complementary and to support the potential for one system to be engaged as a compensatory strategy when a counterpart system fails. Summary: Episodic memory, the ability to recall personal episodes, is the subtype of memory most often perceived as dysfunctional by patients and informants. Medial temporal lobe structures, especially the hippocampal formation and associated cortical and subcortical structures, are most often associated with episodic memory loss. Episodic memory dysfunction may present acutely, as in concussion; transiently, as in transient global amnesia (TGA); subacutely, as in thiamine deficiency; or chronically, as in Alzheimer disease. Semantic memory refers to acquired knowledge about the world. Anterior and inferior temporal lobe structures are most often associated with semantic memory loss. The semantic variant of primary progressive aphasia (svPPA) is the paradigmatic disorder resulting in predominant semantic memory dysfunction. Working memory, associated with frontal lobe function, is the active maintenance of information in the mind that can be potentially manipulated to complete goal-directed tasks. Procedural memory, the ability to learn skills that become automatic, involves the basal ganglia, cerebellum, and supplementary motor cortex. Parkinson disease and related disorders result in procedural memory deficits. Most memory concerns warrant bedside cognitive or neuropsychological evaluation and neuroimaging to assess for specific neuropathologies and guide treatment. PMID:26039844

  10. Memory protection

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.

    1988-01-01

    Accidental overwriting of files or of memory regions belonging to other programs, browsing of personal files by superusers, Trojan horses, and viruses are examples of breakdowns in workstations and personal computers that would be significantly reduced by memory protection. Memory protection is the capability of an operating system and supporting hardware to delimit segments of memory, to control whether segments can be read from or written into, and to confine accesses of a program to its segments alone. The absence of memory protection in many operating systems today is the result of a bias toward a narrow definition of performance as maximum instruction-execution rate. A broader definition, including the time to get the job done, makes clear that cost of recovery from memory interference errors reduces expected performance. The mechanisms of memory protection are well understood, powerful, efficient, and elegant. They add to performance in the broad sense without reducing instruction execution rate.
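
    The definition of memory protection given here (delimiting segments, controlling read and write access, and confining a program to its own segments) can be sketched as a software check. Real systems perform this check in hardware on every access; the class and function names below are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    base: int
    length: int
    readable: bool = True
    writable: bool = False

class ProtectionFault(Exception):
    pass

def check_access(segments, addr, write):
    """Return the segment containing addr, or raise if the access is not permitted."""
    for seg in segments:
        if seg.base <= addr < seg.base + seg.length:
            if write and not seg.writable:
                raise ProtectionFault(f"write to read-only segment at {addr:#x}")
            if not write and not seg.readable:
                raise ProtectionFault(f"read from unreadable segment at {addr:#x}")
            return seg
    raise ProtectionFault(f"address {addr:#x} is outside the program's segments")

program = [Segment(0x1000, 0x100, writable=False),   # code: read-only
           Segment(0x2000, 0x200, writable=True)]    # data: read-write

def faults(addr, write):
    """Helper for the usage example: True if the access would be rejected."""
    try:
        check_access(program, addr, write)
        return False
    except ProtectionFault:
        return True
```

    With these two segments, a write to the data segment succeeds, while a write into the code segment or any access outside both segments raises ProtectionFault.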

  11. Parallel processing of real-time dynamic systems simulation on OSCAR (Optimally SCheduled Advanced multiprocessoR)

    NASA Technical Reports Server (NTRS)

    Kasahara, Hironori; Honda, Hiroki; Narita, Seinosuke

    1989-01-01

    Parallel processing of real-time dynamic systems simulation on a multiprocessor system named OSCAR is presented. In the simulation of dynamic systems, generally, the same calculations are repeated every time step. However, we cannot apply the Do-all or Do-across techniques for parallel processing of the simulation, since there exist data dependencies from the end of an iteration to the beginning of the next iteration and, furthermore, data input and data output are required every sampling time period. Therefore, parallelism inside the calculation required for a single time step, or a large basic block which consists of arithmetic assignment statements, must be used. In the proposed method, near fine grain tasks, each of which consists of one or more floating point operations, are generated to extract the parallelism from the calculation and assigned to processors by using optimal static scheduling at compile time in order to reduce the large run time overhead caused by the use of near fine grain tasks. The practicality of the scheme is demonstrated on OSCAR (Optimally SCheduled Advanced multiprocessoR), which has been developed to extract advantageous features of static scheduling algorithms to the maximum extent.

  12. Process Management and Exception Handling in Multiprocessor Operating Systems Using Object-Oriented Design Techniques. Revised Sep. 1988

    NASA Technical Reports Server (NTRS)

    Russo, Vincent; Johnston, Gary; Campbell, Roy

    1988-01-01

    The programming of the interrupt handling mechanisms, process switching primitives, scheduling mechanism, and synchronization primitives of an operating system for a multiprocessor requires both efficient code, in order to support the needs of high-performance or real-time applications, and careful organization to facilitate maintenance. Although many advantages have been claimed for object-oriented class hierarchical languages and their corresponding design methodologies, the application of these techniques to the design of the primitives within an operating system has not been widely demonstrated. To investigate the role of class hierarchical design in systems programming, the authors have constructed the Choices multiprocessor operating system architecture using the C++ programming language. During the implementation, it was found that many operating system design concerns can be represented advantageously using a class hierarchical approach, including: the separation of mechanism and policy; the organization of an operating system into layers, each of which represents an abstract machine; and the notions of process and exception management. In this paper, we discuss an implementation of the low-level primitives of this system and outline the strategy by which we developed our solution.

  13. Quantum memory

    NASA Astrophysics Data System (ADS)

    Le Gouët, Jean-Louis; Moiseev, Sergey

    2012-06-01

    Interaction of quantum radiation with multi-particle ensembles has sparked off intense research efforts during the past decade. Emblematic of this field is the quantum memory scheme, where a quantum state of light is mapped onto an ensemble of atoms and then recovered in its original shape. While opening new access to the basics of light-atom interaction, quantum memory also appears as a key element for information processing applications, such as linear optics quantum computation and long-distance quantum communication via quantum repeaters. Not surprisingly, it is far from trivial to practically recover a stored quantum state of light and, although impressive progress has already been accomplished, researchers are still struggling to reach this ambitious objective. This special issue provides an account of the state-of-the-art in a fast-moving research area that makes physicists, engineers and chemists work together at the forefront of their discipline, involving quantum fields and atoms in different media, magnetic resonance techniques and material science. Various strategies have been considered to store and retrieve quantum light. The explored designs belong to three main—while still overlapping—classes. In architectures derived from photon echo, information is mapped over the spectral components of inhomogeneously broadened absorption bands, such as those encountered in rare earth ion doped crystals and atomic gases in external gradient magnetic field. Protocols based on electromagnetic induced transparency also rely on resonant excitation and are ideally suited to the homogeneous absorption lines offered by laser cooled atomic clouds or ion Coulomb crystals. Finally off-resonance approaches are illustrated by Faraday and Raman processes. Coupling with an optical cavity may enhance the storage process, even for negligibly small atom number. Multiple scattering is also proposed as a way to enlarge the quantum interaction distance of light with matter. The

  14. Declarative memory.

    PubMed

    Riedel, Wim J; Blokland, Arjan

    2015-01-01

    Declarative memory consists of memory for events (episodic memory) and facts (semantic memory). Methods to test declarative memory are key in investigating effects of potential cognition-enhancing substances--medicinal drugs or nutrients. A number of cognitive performance tests assessing declarative episodic memory tapping verbal learning, logical memory, pattern recognition memory, and paired associates learning are described. These tests have been used as outcome variables in 34 studies in humans that have been described in the literature in the past 10 years. The use of episodic tests in animal research is also discussed in relation to the drug effects in these tasks. The results show that nutritional supplementation of polyunsaturated fatty acids has been investigated most abundantly and, in a number of cases, but not all, shows indications of positive effects on declarative memory, more so in elderly than in young subjects. Studies investigating effects of registered anti-Alzheimer drugs, cholinesterase inhibitors in mild cognitive impairment, show positive and negative effects on declarative memory. Studies mainly carried out in healthy volunteers investigating the effects of acute dopamine stimulation indicate enhanced memory consolidation as manifested specifically by better delayed recall, especially at time points long after learning and more so when the drug is administered after learning and if word lists are longer. The animal studies reveal a different picture with respect to the effects of different drugs on memory performance. This suggests that at least for episodic memory tasks, the translational value is rather poor. For the human studies, detailed parameters of the compositions of word lists for declarative memory tests are discussed and it is concluded that tailored adaptations of tests to fit the hypothesis under study, rather than "off-the-shelf" use of existing tests, are recommended. PMID:25977084

  15. Discrete Resource Allocation in Visual Working Memory

    ERIC Educational Resources Information Center

    Barton, Brian; Ester, Edward F.; Awh, Edward

    2009-01-01

    Are resources in visual working memory allocated in a continuous or a discrete fashion? On one hand, flexible resource models suggest that capacity is determined by a central resource pool that can be flexibly divided such that items of greater complexity receive a larger share of resources. On the other hand, if capacity in working memory is…

  16. SHARING EDUCATIONAL SERVICES.

    ERIC Educational Resources Information Center

    Catskill Area Project in Small School Design, Oneonta, NY.

    SHARED SERVICES, A COOPERATIVE SCHOOL RESOURCE PROGRAM, IS DEFINED IN DETAIL. INCLUDED IS A DISCUSSION OF THEIR NEED, ADVANTAGES, GROWTH, DESIGN, AND OPERATION. SPECIFIC PROCEDURES FOR OBTAINING STATE AID IN SHARED SERVICES, EFFECTS OF SHARED SERVICES ON THE SCHOOL, AND HINTS CONCERNING SHARED SERVICES ARE DESCRIBED. CHARACTERISTICS OF THE SMALL…

  17. Children's Working Memory: Investigating Performance Limitations in Complex Span Tasks

    ERIC Educational Resources Information Center

    Conlin, J.A.; Gathercole, S.E.; Adams, J.W.

    2005-01-01

    Three experiments investigated the roles of resource-sharing and intrinsic memory demands in complex working memory span performance in 7- and 9-year-olds. In Experiment 1, the processing complexity of arithmetic operations was varied under conditions in which processing times were equivalent. Memory span did not differ as a function of processing…

  18. The Structure of Memory: Fixed or Flexible? Structural Learning Series.

    ERIC Educational Resources Information Center

    Scandura, Joseph M.

    Most current information processing theories of cognition and memory share one common feature: the structure (state-space) of memory is fixed and retrieval from memory involves searching through that structure. Learning, where it is treated at all, involves transforming one such structure into another. This form of representation is questioned and…

  19. State recovery and lockstep execution restart in a system with multiprocessor pairing

    DOEpatents

    Gara, Alan; Gschwind, Michael K; Salapura, Valentina

    2014-01-21

    System, method and computer program product for a multiprocessing system to offer selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessor or processor cores to provide one highly reliable thread (or thread group). Each pair of microprocessor or processor cores that provides one highly reliable thread connects with system components such as a memory "nest" (or memory hierarchy), an optional system controller, an optional interrupt controller, optional I/O or peripheral devices, etc. The memory nest is attached to the selective pairing facility via a switch or a bus. Each selectively paired processor core includes a transactional execution facility, wherein the system is configured to enable processor rollback to a previous state and reinitialize lockstep execution in order to recover from an incorrect execution when an incorrect execution has been detected by the selective pairing facility.
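
    The recovery idea in this abstract (detect a disagreement between paired cores, roll back to a previous state, and restart lockstep execution) can be sketched in software. This is only a toy model under stated assumptions: cores are modeled as pure functions of a state and an input, the checkpoint is the last state both cores agreed on, and all names are hypothetical rather than taken from the patent.

```python
def run_lockstep(core_a, core_b, state, inputs, max_retries=3):
    """Run two cores in lockstep; on a mismatch, roll back to the last agreed state and retry."""
    rollbacks = 0
    for x in inputs:
        checkpoint = state                 # last state both cores agreed on
        for _ in range(max_retries + 1):
            a, b = core_a(checkpoint, x), core_b(checkpoint, x)
            if a == b:
                state = a                  # results agree: commit the step
                break
            rollbacks += 1                 # mismatch detected: discard both results
        else:
            raise RuntimeError("cores still disagree after retries")
    return state, rollbacks

def good_core(s, x):
    return s + x

def make_flaky_core(fail_on_call):
    """Build a core that returns one wrong result, modeling a transient error."""
    calls = {"n": 0}
    def core(s, x):
        calls["n"] += 1
        result = s + x
        return result + 1 if calls["n"] == fail_on_call else result
    return core

final, rollbacks = run_lockstep(good_core, make_flaky_core(2), 0, [1, 2, 3])
```

    In this run the second core produces one wrong result on its second call, so the pair rolls back once and still reaches the correct final sum.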

  20. Runtime and Programming Support for Memory Adaptation in Scientific Applications via Local Disk and Remote Memory

    SciTech Connect

    Mills, Richard T; Yue, Chuan; Stathopoulos, Andreas; Nikolopoulos, Dimitrios S

    2007-01-01

    The ever increasing memory demands of many scientific applications and the complexity of today's shared computational resources still require the occasional use of virtual memory, network memory, or even out-of-core implementations, with well known drawbacks in performance and usability. In Mills et al. (Adapting to memory pressure from within scientific applications on multiprogrammed COWS. In: International Parallel and Distributed Processing Symposium, IPDPS, Santa Fe, NM, 2004), we introduced a basic framework for a runtime, user-level library, MMlib, in which DRAM is treated as a dynamic size cache for large memory objects residing on local disk. Application developers can specify and access these objects through MMlib, enabling their application to execute optimally under variable memory availability, using as much DRAM as fluctuating memory levels will allow. In this paper, we first extend our earlier MMlib prototype from a proof of concept to a usable, robust, and flexible library. We present a general framework that enables fully customizable memory malleability in a wide variety of scientific applications. We provide several necessary enhancements to the environment sensing capabilities of MMlib, and introduce a remote memory capability, based on MPI communication of cached memory blocks between 'compute nodes' and designated memory servers. The increasing speed of interconnection networks makes a remote memory approach attractive, especially at the large granularity present in large scientific applications. We show experimental results from three important scientific applications that require the general MMlib framework. The memory-adaptive versions perform nearly optimally under constant memory pressure and execute harmoniously with other applications competing for memory, without thrashing the memory system. Under constant memory pressure, we observe execution time improvements of factors between three and

  1. Memories Are Made of This

    ERIC Educational Resources Information Center

    Chang, Christine

    2010-01-01

    In this article, the author shares her memories of Sally Smith, the founder of The Lab School of Washington, where she works as the director of Occupational Therapy. When the author first met Smith, Smith asked her what brought her to The Lab School at that point in her career. She told Smith that her background was rather eclectic, since she…

  2. Virtual memory

    NASA Technical Reports Server (NTRS)

    Denning, P. J.

    1986-01-01

    Virtual memory was conceived as a way to automate overlaying of program segments. Modern computers have very large main memories, but need automatic solutions to the relocation and protection problems. Virtual memory serves this need as well and is thus useful in computers of all sizes. The history of the idea is traced, showing how it has become a widespread, little noticed feature of computers today.

  3. CCD Memory

    NASA Technical Reports Server (NTRS)

    Janesick, James R.; Elliot, Tom; Norris, Dave; Vescelus, Fred

    1987-01-01

    CCD memory device yields over 6.4 x 10^8 levels of information on single chip. Charge-coupled device (CCD) demonstrated to operate as either read-only-memory (ROM) or photon-programmable memory with capacity of 640,000 bits, with each bit capable of being weighted to more than 1,000 discrete analog levels. Larger memory capacities now possible using proposed approach in conjunction with CCDs now being fabricated, which yield over 4 x 10^9 discrete levels of information on single chip.
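
    As a quick check of the first figure, assuming it is simply the product of the bit count and the levels per bit quoted in the abstract (the abstract does not state the calculation explicitly):

```latex
\[
640{,}000\ \text{bits} \times 1{,}000\ \text{levels per bit} = 6.4 \times 10^{8}\ \text{levels}
\]
```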

  4. Memory in health and in schizophrenia.

    PubMed

    Gur, Ruben C; Gur, Raquel E

    2013-12-01

    Memory is an important capacity needed for survival in a changing environment, and its principles are shared across species. These principles have been studied since the inception of behavioral science, and more recently neuroscience has helped understand brain systems and mechanisms responsible for enabling aspects of memory. Here we outline the history of work on memory and its neural underpinning, and describe the major dimensions of memory processing that have been evaluated by cognitive neuroscience, focusing on episodic memory. We present evidence in healthy populations for sex differences (females outperforming in verbal and face memory) and age effects (slowed memory processes with age). We then describe deficits associated with schizophrenia. Impairment in schizophrenia is more severe in patients with negative symptoms, especially flat affect, who also show deficits in measures of social cognition. This evidence implicates medial temporal and frontal regions in schizophrenia.

  5. Memory in health and in schizophrenia

    PubMed Central

    Gur, Ruben C.; Gur, Raquel E.

    2013-01-01

    Memory is an important capacity needed for survival in a changing environment, and its principles are shared across species. These principles have been studied since the inception of behavioral science, and more recently neuroscience has helped understand brain systems and mechanisms responsible for enabling aspects of memory. Here we outline the history of work on memory and its neural underpinning, and describe the major dimensions of memory processing that have been evaluated by cognitive neuroscience, focusing on episodic memory. We present evidence in healthy populations for sex differences—females outperforming in verbal and face memory, and age effects—slowed memory processes with age. We then describe deficits associated with schizophrenia. Impairment in schizophrenia is more severe in patients with negative symptoms—especially flat affect—who also show deficits in measures of social cognition. This evidence implicates medial temporal and frontal regions in schizophrenia. PMID:24459407

  6. Memory systems.

    PubMed

    Eichenbaum, Howard

    2010-07-01

    The idea that there are multiple memory systems can be traced to early philosophical considerations and introspection. However, the early experimental work considered memory a unitary phenomenon and focused on finding the mechanism upon which memory is based. A full reconciliation of debates about that mechanism, and a coincidental rediscovery of the idea of multiple memory systems, emerged from studies in the cognitive neuroscience of memory. This research has identified three major forms of memory that have distinct operating principles and are supported by different brain systems. These include: (1) a cortical-hippocampal circuit that mediates declarative memory, our capacity to recollect facts and events; (2) procedural memory subsystems involving a cortical-striatal circuit that mediates habit formation and a brainstem-cerebellar circuit that mediates sensorimotor adaptations; and (3) a circuit involving subcortical and cortical pathways through the amygdala that mediates the attachment of affective status and emotional responses to previously neutral stimuli. Copyright © 2010 John Wiley & Sons, Ltd.

  7. Collaging Memories

    ERIC Educational Resources Information Center

    Wallach, Michele

    2011-01-01

    Even middle school students can have memories of their childhoods, of an earlier time. The art of Romare Bearden and the writings of Paul Auster can be used to introduce ideas about time and memory to students and inspire works of their own. Bearden is an exceptional role model for young artists, not only because of his astounding art, but also…

  8. Episodic Memories

    ERIC Educational Resources Information Center

    Conway, Martin A.

    2009-01-01

    An account of episodic memories is developed that focuses on the types of knowledge they represent, their properties, and the functions they might serve. It is proposed that episodic memories consist of "episodic elements," summary records of experience often in the form of visual images, associated to a "conceptual frame" that provides a…

  9. Job Sharing in Education.

    ERIC Educational Resources Information Center

    Davidson, Wilma; Kline, Susan

    1979-01-01

    The author presents the advantages of job sharing for all school personnel, saying that education is particularly adaptable to this new form of employment. Current job sharing programs in Massachusetts, California, and New Jersey schools are briefly discussed. (SJL)

  10. Share Your Values

    MedlinePlus

    ... Today, teenagers are bombarded ... mid-twenties. The Most Effective Way to Instill Values? By Example. Your words will carry more weight ...

  11. Robert Hooke's model of memory.

    PubMed

    Hintzman, Douglas L

    2003-03-01

    In 1682 the scientist and inventor Robert Hooke read a lecture to the Royal Society of London, in which he described a mechanistic model of human memory. Yet few psychologists today seem to have heard of Hooke's memory model. The lecture addressed questions of encoding, memory capacity, repetition, retrieval, and forgetting--some of these in a surprisingly modern way. Hooke's model shares several characteristics with the theory of Richard Semon, which came more than 200 years later, but it is more complete. Among the model's interesting properties are that (1) it allows for attention and other top-down influences on encoding; (2) it uses resonance to implement parallel, cue-dependent retrieval; (3) it explains memory for recency; (4) it offers a single-system account of repetition priming; and (5) the power law of forgetting can be derived from the model's assumptions in a straightforward way. PMID:12747488

  12. Sensory memory for ambiguous vision.

    PubMed

    Pearson, Joel; Brascamp, Jan

    2008-09-01

    In recent years the overlap between visual perception and memory has shed light on our understanding of both. When ambiguous images that normally cause perception to waver unpredictably are presented briefly with intervening blank periods, perception tends to freeze, locking into one interpretation. This indicates that there is a form of memory storage across the blank interval. This memory trace codes low-level characteristics of the stored stimulus. Although a trace is evident after a single perceptual instance, the trace builds over many separate stimulus presentations, indicating a flexible, variable-length time-course. This memory shares important characteristics with priming by non-ambiguous stimuli. Computational models now provide a framework to interpret many empirical observations.

  13. Robert Hooke's model of memory.

    PubMed

    Hintzman, Douglas L

    2003-03-01

    In 1682 the scientist and inventor Robert Hooke read a lecture to the Royal Society of London, in which he described a mechanistic model of human memory. Yet few psychologists today seem to have heard of Hooke's memory model. The lecture addressed questions of encoding, memory capacity, repetition, retrieval, and forgetting--some of these in a surprisingly modern way. Hooke's model shares several characteristics with the theory of Richard Semon, which came more than 200 years later, but it is more complete. Among the model's interesting properties are that (1) it allows for attention and other top-down influences on encoding; (2) it uses resonance to implement parallel, cue-dependent retrieval; (3) it explains memory for recency; (4) it offers a single-system account of repetition priming; and (5) the power law of forgetting can be derived from the model's assumptions in a straightforward way.

  14. Encoding of Memory in Sheared Amorphous Solids

    NASA Astrophysics Data System (ADS)

    Fiocco, Davide; Foffi, Giuseppe; Sastry, Srikanth

    2014-01-01

    We show that memory can be encoded in a model amorphous solid subjected to athermal oscillatory shear deformations, and in an analogous spin model with disordered interactions, sharing the feature of a deformable energy landscape. When these systems are subjected to oscillatory shear deformation, they retain memory of the deformation amplitude imposed in the training phase, when the amplitude is below a "localization" threshold. Remarkably, multiple persistent memories can be stored using such an athermal, noise-free, protocol. The possibility of such memory is shown to be linked to the presence of plastic deformations and associated limit cycles traversed by the system, which exhibit avalanche statistics also seen in related contexts.

  15. Memory conformity affects inaccurate memories more than accurate memories.

    PubMed

    Wright, Daniel B; Villalba, Daniella K

    2012-01-01

    After controlling for initial confidence, inaccurate memories were shown to be more easily distorted than accurate memories. In two experiments groups of participants viewed 50 stimuli and were then presented with these stimuli plus 50 fillers. During this test phase participants reported their confidence that each stimulus was originally shown. This was followed by computer-generated responses from a bogus participant. After being exposed to this response participants again rated the confidence of their memory. The computer-generated responses systematically distorted participants' responses. Memory distortion depended on initial memory confidence, with uncertain memories being more malleable than confident memories. This effect was moderated by whether the participant's memory was initially accurate or inaccurate. Inaccurate memories were more malleable than accurate memories. The data were consistent with a model describing two types of memory (i.e., recollective and non-recollective memories), which differ in how susceptible these memories are to memory distortion.

  16. Job Sharing in Geography.

    ERIC Educational Resources Information Center

    Kay, Jeanne

    1982-01-01

    Job sharing is an employment alternative in which two qualified individuals manage the responsibilities of a single position. Discusses the barriers to and the potential, advantages, disadvantages, pitfalls, and challenges of job sharing. Focuses on job sharing in the geography profession. (Author/JN)

  17. LACC Shared Governance Model.

    ERIC Educational Resources Information Center

    Spangler, Mary

    This document discusses Los Angeles City College's (LACC) (California) Shared Governance Model. In response to California Assembly Bill 1725, LACC set forth a plan to implement the statutory requirements of shared governance. Shared governance is a concept grounded in the idea that decision-making is a process that affects the entire campus…

  18. DMA shared byte counters in a parallel computer

    SciTech Connect

    Chen, Dong; Gara, Alan G.; Heidelberger, Philip; Vranas, Pavlos

    2010-04-06

    A parallel computer system is constructed as a network of interconnected compute nodes. Each of the compute nodes includes at least one processor, a memory and a DMA engine. The DMA engine includes a processor interface for interfacing with the at least one processor, DMA logic, a memory interface for interfacing with the memory, a DMA network interface for interfacing with the network, injection and reception byte counters, injection and reception FIFO metadata, and status registers and control registers. The injection FIFOs maintain the injection FIFO metadata memory locations, including the current head and tail, and the reception FIFOs maintain the reception FIFO metadata memory locations, including the current head and tail. The injection byte counters and reception byte counters may be shared between messages.
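
    The idea of one byte counter shared between several messages can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented design: the class name `SharedByteCounter`, its methods, and the add-then-decrement protocol are all hypothetical, chosen only to show how a single counter reaching zero could signal that every message using it has completed.

```python
class SharedByteCounter:
    """Hypothetical model of one byte counter shared by several messages."""

    def __init__(self):
        self.remaining = 0  # bytes still to be transferred, summed over all messages

    def add_message(self, nbytes):
        # Assumed protocol: the message length is added before the transfer starts.
        self.remaining += nbytes

    def packet_done(self, nbytes):
        # Assumed protocol: each transferred packet subtracts its payload size.
        self.remaining -= nbytes

    def all_done(self):
        # Zero means every message that used this counter has fully completed.
        return self.remaining == 0


counter = SharedByteCounter()
for length in (1024, 4096, 512):          # three messages share the counter
    counter.add_message(length)

# Packets from different messages may complete in any interleaving.
for payload in (512, 512, 2048, 512, 2048):
    counter.packet_done(payload)

finished = counter.all_done()
```

    Sharing trades precision for economy: the counter reports that all of its messages are done, but not which individual message finished first.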

  19. Optimal foraging in semantic memory.

    PubMed

    Hills, Thomas T; Jones, Michael N; Todd, Peter M

    2012-04-01

    Do humans search in memory using dynamic local-to-global search strategies similar to those that animals use to forage between patches in space? If so, do their dynamic memory search policies correspond to optimal foraging strategies seen for spatial foraging? Results from a number of fields suggest these possibilities, including the shared structure of the search problems (searching in patchy environments) and recent evidence supporting a domain-general cognitive search process. To investigate these questions directly, we asked participants to recover from memory as many animal names as they could in 3 min. Memory search was modeled over a representation of the semantic search space generated from the BEAGLE memory model of Jones and Mewhort (2007), via a search process similar to models of associative memory search (e.g., Raaijmakers & Shiffrin, 1981). We found evidence for local structure (i.e., patches) in memory search and patch depletion preceding dynamic local-to-global transitions between patches. Dynamic models also significantly outperformed nondynamic models. The timing of dynamic local-to-global transitions was consistent with optimal search policies in space, specifically the marginal value theorem (Charnov, 1976), and participants who were more consistent with this policy recalled more items.
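
    The marginal value theorem says a forager should leave a patch once the local rate of return drops below the long-run average rate. A crude online version of that rule, applied to retrieval times, might look like the sketch below. The function name `leave_points`, the running-average estimate, and the sample times are illustrative assumptions and are not taken from the study.

```python
def leave_points(item_times):
    """Return indices where a simple marginal-value rule says to leave the patch.

    item_times[i] is the (hypothetical) time taken to retrieve item i.
    The rule fires when the latest item took longer than the running mean
    time per item, i.e. the local rate fell below the overall rate so far.
    """
    leaves = []
    total = 0.0
    for i, t in enumerate(item_times):
        total += t
        mean_time = total / (i + 1)
        if i > 0 and t > mean_time:
            leaves.append(i)
    return leaves


# Made-up retrieval times: short gaps within a patch, long gaps at switches.
times = [2.0, 1.0, 1.0, 4.0, 1.0, 1.0, 4.0]
switches = leave_points(times)
```

    With these made-up times the rule fires at the two long gaps, which is the pattern the abstract describes as patch depletion preceding a local-to-global transition.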

  20. A multilevel nonvolatile magnetoelectric memory

    PubMed Central

    Shen, Jianxin; Cong, Junzhuang; Shang, Dashan; Chai, Yisheng; Shen, Shipeng; Zhai, Kun; Sun, Young

    2016-01-01

    The coexistence and coupling between magnetization and electric polarization in multiferroic materials provide extra degrees of freedom for creating next-generation memory devices. A variety of concepts of multiferroic or magnetoelectric memories have been proposed and explored in the past decade. Here we propose a new principle to realize a multilevel nonvolatile memory based on the multiple states of the magnetoelectric coefficient (α) of multiferroics. Because the states of α depend on the relative orientation between magnetization and polarization, one can reach different levels of α by controlling the ratio of up and down ferroelectric domains with external electric fields. Our experiments in a device made of the PMN-PT/Terfenol-D multiferroic heterostructure confirm that the states of α can be well controlled between positive and negative by applying selective electric fields. Consequently, two-level, four-level, and eight-level nonvolatile memory devices are demonstrated at room temperature. This kind of multilevel magnetoelectric memory retains all the advantages of ferroelectric random access memory but overcomes the drawback of destructive reading of polarization. In contrast, the reading of α is nondestructive and highly efficient in a parallel way, with an independent reading coil shared by all the memory cells. PMID:27681812

  1. A comparison of multiprocessor scheduling methods for iterative data flow architectures

    NASA Astrophysics Data System (ADS)

    Storch, Matthew

    1993-02-01

    A comparative study is made between the Algorithm to Architecture Mapping Model (ATAMM) and three other related multiprocessing models from the published literature. The primary focus of all four models is the non-preemptive scheduling of large-grain iterative data flow graphs as required in real-time systems, control applications, signal processing, and pipelined computations. Important characteristics of the models such as injection control, dynamic assignment, multiple node instantiations, static optimum unfolding, range-chart guided scheduling, and mathematical optimization are identified. The models from the literature are compared with the ATAMM for performance, scheduling methods, memory requirements, and complexity of scheduling and design procedures.

  2. A comparison of multiprocessor scheduling methods for iterative data flow architectures

    NASA Technical Reports Server (NTRS)

    Storch, Matthew

    1993-01-01

    A comparative study is made between the Algorithm to Architecture Mapping Model (ATAMM) and three other related multiprocessing models from the published literature. The primary focus of all four models is the non-preemptive scheduling of large-grain iterative data flow graphs as required in real-time systems, control applications, signal processing, and pipelined computations. Important characteristics of the models such as injection control, dynamic assignment, multiple node instantiations, static optimum unfolding, range-chart guided scheduling, and mathematical optimization are identified. The models from the literature are compared with the ATAMM for performance, scheduling methods, memory requirements, and complexity of scheduling and design procedures.

  3. Rearview Memories

    ERIC Educational Resources Information Center

    Gross, Gwen E.

    2008-01-01

    In this article, the author shares her experience when she was still a student until she became a superintendent. In her 17th year in the superintendency, the author finds the joys of her work all around her, grateful to be bestowed with the gift of leadership. She shares with colleagues a few especially meaningful moments from her professional…

  4. Support for non-locking parallel reception of packets belonging to a single memory reception FIFO

    DOEpatents

    Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M.; Steinmacher-Burow, Burkhard; Sugawara, Yutaka

    2011-01-27

    A method and apparatus for distributed parallel messaging in a parallel computing system. A plurality of DMA engine units are configured in a multiprocessor system to operate in parallel, one DMA engine unit for transferring a current packet received at a network reception queue to a memory location in a memory FIFO (rmFIFO) region of a memory. A control unit implements logic to determine whether any prior received packet destined for that rmFIFO is still in a process of being stored in the associated memory by another DMA engine unit of the plurality, and prevent the one DMA engine unit from indicating completion of storing the current received packet in the reception memory FIFO (rmFIFO) until all prior received packets destined for that rmFIFO are completely stored by the other DMA engine units. Thus, there is provided non-locking support so that multiple packets destined for a single rmFIFO are transferred and stored in parallel to predetermined locations in a memory.
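
    The ordering constraint described above (packets may be stored in parallel, but completion is not indicated until all earlier packets are stored) can be modeled in a few lines. This is a software sketch under stated assumptions, not the hardware design: the class `ReceptionFifo`, its slot numbering, and the `committed_tail` field are hypothetical names introduced for illustration.

```python
class ReceptionFifo:
    """Hypothetical model: slots are claimed in arrival order,
    stores finish in any order, completion is published in order."""

    def __init__(self):
        self.next_slot = 0        # next slot handed to an arriving packet
        self.stored = set()       # slots fully written but not yet published
        self.committed_tail = 0   # all slots below this are visible as complete

    def claim(self):
        # Each arriving packet gets a predetermined location, in arrival order.
        slot = self.next_slot
        self.next_slot += 1
        return slot

    def finish_store(self, slot):
        # Called when one engine has finished writing its packet.
        self.stored.add(slot)
        # Publish only a contiguous run of finished packets, so a packet is
        # never reported complete while an earlier one is still in flight.
        while self.committed_tail in self.stored:
            self.stored.remove(self.committed_tail)
            self.committed_tail += 1
        return self.committed_tail


fifo = ReceptionFifo()
a, b, c = fifo.claim(), fifo.claim(), fifo.claim()
# The third packet finishes first, then the first, then the second.
tails = [fifo.finish_store(c), fifo.finish_store(a), fifo.finish_store(b)]
```

    In this model no engine waits on a lock to write its data; only the published tail is held back until the earlier stores catch up.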

  5. Musical and verbal memory in Alzheimer's disease: a study of long-term and short-term memory.

    PubMed

    Ménard, Marie-Claude; Belleville, Sylvie

    2009-10-01

    Musical memory was tested in Alzheimer patients and in healthy older adults using long-term and short-term memory tasks. Long-term memory (LTM) was tested with a recognition procedure using unfamiliar melodies. Short-term memory (STM) was evaluated with same/different judgment tasks on short series of notes. Musical memory was compared to verbal memory using a task that used pseudowords (LTM) or syllables (STM). Results indicated impaired musical memory in AD patients relative to healthy controls. The deficit was found for both long-term and short-term memory. Furthermore, it was of the same magnitude for both musical and verbal domains whether tested with short-term or long-term memory tasks. No correlation was found between musical and verbal LTM. However, there was a significant correlation between verbal and musical STM in AD participants and healthy older adults, which suggests that the two domains may share common mechanisms. PMID:19398148

  6. Parallel variable-band Choleski solvers for computational structural analysis applications on vector multiprocessor supercomputers

    NASA Technical Reports Server (NTRS)

    Poole, E. L.; Overman, A. L.

    1991-01-01

    A Choleski method used to solve linear systems of equations that arise in large scale structural analyses is described. The method uses a novel variable-band storage scheme and is structured to exploit fast local memory caches while minimizing data access delays between main memory and vector registers. Several parallel implementations of this method are described for the CRAY-2 and CRAY Y-MP computers demonstrating the use of microtasking and autotasking directives. A portable parallel language, FORCE, is also used for two different parallel implementations, demonstrating the use of CRAY macrotasking. Results are presented comparing the matrix factorization times for three representative structural analysis problems from runs made in both dedicated and multi-user modes on both the CRAY-2 and CRAY Y-MP computers. CPU and wall clock timings are given for the various parallel methods and are compared to single processor timings of the same algorithm. Computation rates over 1 GIGAFLOP (1 billion floating point operations per second) on a four processor CRAY-2 and over 2 GIGAFLOPS on an eight processor CRAY Y-MP are demonstrated as measured by wall clock time in a dedicated environment. Reduced wall clock times for the parallel methods relative to the single processor implementation of the same Choleski algorithm are also demonstrated for runs made in multi-user mode.

  7. System and method for programmable bank selection for banked memory subsystems

    DOEpatents

    Blumrich, Matthias A.; Chen, Dong; Gara, Alan G.; Giampapa, Mark E.; Hoenicke, Dirk; Ohmacht, Martin; Salapura, Valentina; Sugavanam, Krishnan

    2010-09-07

    A programmable memory system and method for enabling one or more processor devices access to shared memory in a computing environment, the shared memory including one or more memory storage structures having addressable locations for storing data. The system comprises: one or more first logic devices associated with a respective one or more processor devices, each first logic device for receiving physical memory address signals and programmable for generating a respective memory storage structure select signal upon receipt of pre-determined address bit values at selected physical memory address bit locations; and, a second logic device responsive to each of the respective select signal for generating an address signal used for selecting a memory storage structure for processor access. The system thus enables each processor device of a computing environment memory storage access distributed across the one or more memory storage structures.
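
    One way to picture the two logic stages is a programmable mask-and-match test per storage structure, followed by a stage that turns the asserted select signal into a bank index. The sketch below assumes that reading of the abstract. The helper names `make_selector` and `bank_for`, and the choice of address bits 7 and 12, are illustrative only.

```python
def make_selector(mask, match):
    """First-stage logic: assert the select signal when the address bits
    picked out by `mask` equal the programmed value `match`."""
    def select(addr):
        return (addr & mask) == match
    return select


def bank_for(addr, selectors):
    """Second-stage logic: return the index of the bank whose select
    signal is asserted for this physical address."""
    for bank, select in enumerate(selectors):
        if select(addr):
            return bank
    raise ValueError("no bank selected for address 0x%x" % addr)


# Four banks chosen by address bits 7 and 12 (an arbitrary example choice).
MASK = (1 << 7) | (1 << 12)
selectors = [
    make_selector(MASK, 0),
    make_selector(MASK, 1 << 7),
    make_selector(MASK, 1 << 12),
    make_selector(MASK, (1 << 7) | (1 << 12)),
]

banks = [bank_for(a, selectors) for a in (0x0000, 0x0080, 0x1000, 0x1080, 0x10FF)]
```

    Reprogramming `MASK` and the match values changes how addresses spread across banks without touching the rest of the memory path.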

  8. Fear Memory.

    PubMed

    Izquierdo, Ivan; Furini, Cristiane R G; Myskiw, Jociane C

    2016-04-01

    Fear memory is the best-studied form of memory. It was thoroughly investigated in the past 60 years mostly using two classical conditioning procedures (contextual fear conditioning and fear conditioning to a tone) and one instrumental procedure (one-trial inhibitory avoidance). Fear memory is formed in the hippocampus (contextual conditioning and inhibitory avoidance), in the basolateral amygdala (inhibitory avoidance), and in the lateral amygdala (conditioning to a tone). The circuitry involves, in addition, the pre- and infralimbic ventromedial prefrontal cortex, the central amygdala subnuclei, and the dentate gyrus. Fear learning models, notably inhibitory avoidance, have also been very useful for the analysis of the biochemical mechanisms of memory consolidation as a whole. These studies have capitalized on in vitro observations on long-term potentiation and other kinds of plasticity. The effect of a very large number of drugs on fear learning has been intensively studied, often as a prelude to the investigation of effects on anxiety. The extinction of fear learning involves to an extent a reversal of the flow of information in the mentioned structures and is used in the therapy of posttraumatic stress disorder and fear memories in general. PMID:26983799

  9. Fear Memory.

    PubMed

    Izquierdo, Ivan; Furini, Cristiane R G; Myskiw, Jociane C

    2016-04-01

    Fear memory is the best-studied form of memory. It was thoroughly investigated in the past 60 years mostly using two classical conditioning procedures (contextual fear conditioning and fear conditioning to a tone) and one instrumental procedure (one-trial inhibitory avoidance). Fear memory is formed in the hippocampus (contextual conditioning and inhibitory avoidance), in the basolateral amygdala (inhibitory avoidance), and in the lateral amygdala (conditioning to a tone). The circuitry involves, in addition, the pre- and infralimbic ventromedial prefrontal cortex, the central amygdala subnuclei, and the dentate gyrus. Fear learning models, notably inhibitory avoidance, have also been very useful for the analysis of the biochemical mechanisms of memory consolidation as a whole. These studies have capitalized on in vitro observations on long-term potentiation and other kinds of plasticity. The effect of a very large number of drugs on fear learning has been intensively studied, often as a prelude to the investigation of effects on anxiety. The extinction of fear learning involves to an extent a reversal of the flow of information in the mentioned structures and is used in the therapy of posttraumatic stress disorder and fear memories in general.

  10. Is external memory memory? Biological memory and extended mind.

    PubMed

    Michaelian, Kourken

    2012-09-01

    Clark and Chalmers (1998) claim that an external resource satisfying the following criteria counts as a memory: (1) the agent has constant access to the resource; (2) the information in the resource is directly available; (3) retrieved information is automatically endorsed; (4) information is stored as a consequence of past endorsement. Research on forgetting and metamemory shows that most of these criteria are not satisfied by biological memory, so they are inadequate. More psychologically realistic criteria generate a similar classification of standard putative external memories, but the criteria still do not capture the function of memory. An adequate account of memory function, compatible with its evolution and its roles in prospection and imagination, suggests that external memory performs a function not performed by biological memory systems. External memory is thus not memory. This has implications for: extended mind theorizing, ecological validity of memory research, the causal theory of memory.

  11. Memory Network For Distributed Data Processors

    NASA Technical Reports Server (NTRS)

    Bolen, David; Jensen, Dean; Millard, ED; Robinson, Dave; Scanlon, George

    1992-01-01

    Universal Memory Network (UMN) is modular, digital data-communication system enabling computers with differing bus architectures to share 32-bit-wide data between locations up to 3 km apart with less than one millisecond of latency. Makes it possible to design sophisticated real-time and near-real-time data-processing systems without data-transfer "bottlenecks". This enterprise network permits transmission of volume of data equivalent to an encyclopedia each second. Facilities benefiting from Universal Memory Network include telemetry stations, simulation facilities, power-plants, and large laboratories or any facility sharing very large volumes of data. Main hub of UMN is reflection center including smaller hubs called Shared Memory Interfaces.

  12. Work Sharing Case Studies.

    ERIC Educational Resources Information Center

    McCarthy, Maureen E.; And Others

    Designed to provide private sector employers with the practical information necessary to select and then to design and implement work sharing arrangements, this book presents case studies of some 36 work sharing programs. Topics covered in the case studies include the circumstances leading to adoption of the program, details of compensation and…

  13. Shared Parenting Dysfunction.

    ERIC Educational Resources Information Center

    Turkat, Ira Daniel

    2002-01-01

    Joint custody of children is the most prevalent court ordered arrangement for families of divorce. A growing body of literature indicates that many parents engage in behaviors that are incompatible with shared parenting. This article provides specific criteria for a definition of the Shared Parenting Dysfunction. Clinical aspects of the phenomenon…

  14. Models, Norms and Sharing.

    ERIC Educational Resources Information Center

    Harris, Mary B.

    To investigate the effect of modeling on altruism, 156 third and fifth grade children were exposed to a model who either shared with them, gave to a charity, or refused to share. The test apparatus, identified as a game, consisted of a box with signal lights and a chute through which marbles were dispensed. Subjects and the model played the game…

  15. Chimpanzees share forbidden fruit.

    PubMed

    Hockings, Kimberley J; Humle, Tatyana; Anderson, James R; Biro, Dora; Sousa, Claudia; Ohashi, Gaku; Matsuzawa, Tetsuro

    2007-01-01

    The sharing of wild plant foods is infrequent in chimpanzees, but in chimpanzee communities that engage in hunting, meat is frequently used as a 'social tool' for nurturing alliances and social bonds. Here we report the only recorded example of regular sharing of plant foods by unrelated, non-provisioned wild chimpanzees, and the contexts in which these sharing behaviours occur. From direct observations, adult chimpanzees at Bossou (Republic of Guinea, West Africa) very rarely transferred wild plant foods. In contrast, they shared cultivated plant foods much more frequently (58 out of 59 food sharing events). Sharing primarily consists of adult males allowing reproductively cycling females to take food that they possess. We propose that hypotheses focussing on 'food-for-sex and -grooming' and 'showing-off' strategies plausibly account for observed sharing behaviours. A changing human-dominated landscape presents chimpanzees with fresh challenges, and our observations suggest that crop-raiding provides adult male chimpanzees at Bossou with highly desirable food commodities that may be traded for other currencies.

  16. Nonpreemptive run-time scheduling issues on a multitasked, multiprogrammed multiprocessor with dependencies, bidimensional tasks, folding and dynamic graphs

    SciTech Connect

    Miller, Allan Ray

    1987-05-01

    Increases in high speed hardware have mandated studies in software techniques to exploit the parallel capabilities. This thesis examines the effects a run-time scheduler has on a multiprocessor. The model consists of directed, acyclic graphs, generated from serial FORTRAN benchmark programs by the parallel compiler Parafrase. A multitasked, multiprogrammed environment is created. Dependencies are generated by the compiler. Tasks are bidimensional, i.e., they may specify both time and processor requests. Processor requests may be folded into execution time by the scheduler. The graphs may arrive at arbitrary time intervals. The general case is NP-hard, thus, a variety of heuristics are examined by a simulator. Multiprogramming demonstrates a greater need for a run-time scheduler than does monoprogramming for a variety of reasons, e.g., greater stress on the processors, a larger number of independent control paths, more variety in the task parameters, etc. The dynamic critical path series of algorithms perform well. Dynamic critical volume did not add much. Unfortunately, dynamic critical path maximizes turnaround time as well as throughput. Two schedulers are presented which balance throughput and turnaround time. The first requires classification of jobs by type; the second requires selection of a ratio value which is dependent upon system parameters. 45 refs., 19 figs., 20 tabs.
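
    The critical-path idea mentioned in the abstract can be illustrated with a minimal sketch. This is a plain, static critical-path list scheduler and not the thesis's dynamic critical path algorithms: it assumes one graph, one processor per task (no bidimensional tasks or folding), and known durations. The function name, the task names, and the toy graph are hypothetical.

```python
from functools import lru_cache


def critical_path_schedule(durations, successors, num_processors):
    """Nonpreemptive list scheduling of a DAG.

    Priority of a task = length of the longest path from that task to a sink
    (its "level"). Returns {task: (processor, start, finish)}.
    """
    @lru_cache(maxsize=None)
    def level(task):
        tails = (level(s) for s in successors.get(task, ()))
        return durations[task] + max(tails, default=0)

    # Count unfinished predecessors for every task.
    preds_left = {t: 0 for t in durations}
    for succs in successors.values():
        for s in succs:
            preds_left[s] += 1

    earliest = {t: 0 for t in durations}      # earliest start allowed by dependencies
    ready = [t for t in durations if preds_left[t] == 0]
    proc_free = [0] * num_processors          # time at which each processor is idle
    schedule = {}

    while ready:
        # Highest level first; task name breaks ties deterministically.
        ready.sort(key=lambda t: (-level(t), t))
        task = ready.pop(0)
        p = min(range(num_processors), key=lambda i: proc_free[i])
        start = max(proc_free[p], earliest[task])
        finish = start + durations[task]
        schedule[task] = (p, start, finish)
        proc_free[p] = finish
        for s in successors.get(task, ()):
            earliest[s] = max(earliest[s], finish)
            preds_left[s] -= 1
            if preds_left[s] == 0:
                ready.append(s)
    return schedule


# Hypothetical toy graph: a -> c -> d and b -> d.
toy_durations = {"a": 2, "b": 3, "c": 1, "d": 2}
toy_successors = {"a": ["c"], "b": ["d"], "c": ["d"]}
toy_schedule = critical_path_schedule(toy_durations, toy_successors, 2)
```

    Ranking ready tasks by their remaining path length favours the chain that bounds completion time, which is the intuition behind critical-path heuristics in general.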

  17. A parallel row-based algorithm with error control for standard-cell replacement on a hypercube multiprocessor

    NASA Technical Reports Server (NTRS)

    Sargent, Jeff Scott

    1988-01-01

    A new row-based parallel algorithm for standard-cell placement targeted for execution on a hypercube multiprocessor is presented. Key features of this implementation include a dynamic simulated-annealing schedule, row-partitioning of the VLSI chip image, and two novel approaches to controlling error in parallel cell-placement algorithms: Heuristic Cell-Coloring and Adaptive (Parallel Move) Sequence Control. Heuristic Cell-Coloring identifies sets of noninteracting cells that can be moved repeatedly, and in parallel, with no buildup of error in the placement cost. Adaptive Sequence Control allows multiple parallel cell moves to take place between global cell-position updates. This feedback mechanism is based on an error bound derived analytically from the traditional annealing move-acceptance profile. Placement results are presented for real industry circuits, and the performance of an implementation on the Intel iPSC/2 Hypercube is summarized. The algorithm runs 5 to 16 times faster than a previous program developed for the Hypercube, while producing placements of equivalent quality. An integrated place and route program for the Intel iPSC/2 Hypercube is currently being developed.
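
    For readers unfamiliar with the "traditional annealing move-acceptance profile", the sketch below shows the standard Metropolis acceptance rule used in simulated annealing. It is a generic illustration and not the paper's error-control scheme; the function name and the parameters are assumptions made for the example.

```python
import math
import random


def accept_move(delta_cost, temperature, rng=random):
    """Standard Metropolis rule for simulated annealing.

    Moves that do not increase the cost are always accepted; uphill moves are
    accepted with probability exp(-delta_cost / temperature).
    """
    if delta_cost <= 0:
        return True
    return rng.random() < math.exp(-delta_cost / temperature)


def acceptance_rate(delta_cost, temperature, trials=10_000, seed=0):
    """Empirical fraction of accepted moves for a fixed cost increase."""
    rng = random.Random(seed)
    accepted = sum(accept_move(delta_cost, temperature, rng) for _ in range(trials))
    return accepted / trials
```

    As the temperature falls, uphill moves are accepted less often, so an error in the computed cost change matters more late in the schedule; that is one reason a parallel placer would want to bound such error.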

  18. Retracing Memories

    ERIC Educational Resources Information Center

    Harrison, David L.

    2005-01-01

    There are plenty of paths to poetry but few are as accessible as retracing one's own memories. When students are asked to write about something they remember, they are given the gift of choosing from events that are important enough to recall. They remember because what happened was funny or scary or embarrassing or heartbreaking or silly.…

  19. Fueling Memories

    PubMed Central

    Powell, Jonathan D.; Pollizzi, Kristen

    2012-01-01

    A hallmark of the adaptive immune response is rapid and robust activation upon rechallenge. In the current issue of Immunity van der Windt et al. (2012) provide an important link between mitochondrial respiratory capacity and the development of CD8+ T cell memory. PMID:22284413

  20. Memory Loss

    ERIC Educational Resources Information Center

    Cassebaum, Anne

    2011-01-01

    In four decades of teaching college English, the author has watched many good teaching jobs morph into second-class ones. Worse, she has seen the memory and then the expectation of teaching jobs with decent status, security, and salary depart along with principles and collegiality. To help reverse this downward spiral, she contends that what is…

  1. Programming distributed memory architectures using Kali

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush; Vanrosendale, John

    1990-01-01

    Programming nonshared memory systems is more difficult than programming shared memory systems, in part because of the relatively low level of current programming environments for such machines. A new programming environment is presented, Kali, which provides a global name space and allows direct access to remote data values. In order to retain efficiency, Kali provides a system of annotations, allowing the user to control those aspects of the program critical to performance, such as data distribution and load balancing. The primitives and constructs provided by the language are described, and some of the issues raised in translating a Kali program for execution on distributed memory systems are also discussed.

  2. A contention-based bus-control scheme for multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Juang, Jie-Yong; Wah, Benjamin W.

    1991-01-01

    The authors study contention-based bus-control schemes for scheduling processors in using a bit-parallel shared bus. The protocol is designed under the requirements that each processor exhibit a random access behavior, that there be no centralized bus control in the system, and that access be granted in real time. The proposed scheme is based on splitting algorithms used in conventional contention-resolution schemes, and utilizes two-state information obtained from collision detection. Two versions of the bus-control scheme are studied. The static one resolves contentions of N requesting processors in an average of O(log_{W/2} N) iterations, where W is the number of bits in the bit-parallel bus. An adaptive version resolves contentions in an average time that is independent of N.
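
    To give a feel for the stated bound, the sketch below evaluates its growth term, read here as the logarithm of N to base W/2. This reading of the flattened formula is an assumption, and the function is hypothetical: it ignores the constant factors hidden by the O() notation and is not the paper's analysis.

```python
import math


def growth_term(n_processors, bus_width_bits):
    """Return log base (W/2) of N, the growth term of the stated bound.

    Constant factors hidden by O() are ignored, so this is not a prediction
    of the actual average number of iterations.
    """
    base = bus_width_bits / 2
    if base <= 1:
        raise ValueError("bus width must exceed 2 bits for the base to be > 1")
    if n_processors < 1:
        raise ValueError("need at least one requesting processor")
    return math.log(n_processors) / math.log(base)


# Hypothetical example: a 16-bit bus gives base 8.
for n in (8, 64, 512):
    print(n, round(growth_term(n, 16), 3))
```

    Because the base grows with the bus width, a wider bus flattens the curve: doubling W shortens contention resolution more than halving N does for large N.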

  3. A compact PE memory for vision chips

    NASA Astrophysics Data System (ADS)

    Cong, Shi; Zhe, Chen; Jie, Yang; Nanjian, Wu; Zhihua, Wang

    2014-09-01

    This paper presents a novel compact memory in the processing element (PE) for single-instruction multiple-data (SIMD) vision chips. The PE memory is constructed with 8 × 8 register cells, where one latch in the slave stage is shared by eight latches in the master stage. The memory supports simultaneous read and write on the same address in one clock cycle. Its compact area of 14.33 μm²/bit promises a higher integration level of the processor. A prototype chip with a 64 × 64 PE array is fabricated in a UMC 0.18 μm CMOS technology. Five types of the PE memory cell structure are designed and compared. The testing results demonstrate that the proposed PE memory architecture well satisfies the requirement of the vision chip in high-speed real-time vision applications, such as 1000 fps edge extraction.

  4. Fluid intelligence, working memory and executive functioning.

    PubMed

    Colom, Roberto; Rubio, Víctor J; Shih, Pei Chun; Santacreu, José

    2006-11-01

    The causes underlying the correlation between working memory and fluid intelligence remain unknown. Some researchers argue that the answer can be found in the presumed executive component of working memory. However, the available empirical evidence is far from conclusive. The present study tested a sample of 229 participants. Intelligence, working memory, and executive functioning were measured by one analytic reasoning test (TRASI), a dual task combining a primary task of deductive reasoning with a secondary task of counting, and the Tower of Hanoi task, respectively. All 3 measures were computer administered. The results indicate that the shared variance between executive functioning and working memory does not account for the relationship between intelligence and working memory. Some theoretical implications are discussed. PMID:17296123

  5. Sharing a Faculty Position.

    ERIC Educational Resources Information Center

    O'Kane, Patricia K.; Meyer, Mary

    1982-01-01

    Describes the experience of two nursing faculty members who shared an assistant professor of nursing position. Discusses positive and negative aspects of the experience and notes that a unified and creative approach must be taken for it to succeed. (JOW)

  6. Accelerating Spectrum Sharing Technologies

    SciTech Connect

    Juan D. Deaton; Lynda L. Brighton; Rangam Subramanian; Hussein Moradi; Jose Loera

    2013-09-01

    Spectrum sharing potentially holds the promise of solving the emerging spectrum crisis. However, technology innovators face the conundrum of developing spectrum sharing technologies without the ability to experiment and test with real incumbent systems. Interference with operational incumbents can prevent critical services, and the cost of deploying and operating an incumbent system can be prohibitive. Thus, the lack of incumbent systems and frequency authorization for technology incubation and demonstration has stymied spectrum sharing research. To this end, industry, academia, and regulators all require a test facility for validating hypotheses and demonstrating functionality without affecting operational incumbent systems. This article proposes a four-phase program supported by our spectrum accountability architecture. We propose that our comprehensive experimentation and testing approach for technology incubation and demonstration will accelerate the development of spectrum sharing technologies.

  7. Shared decision making

    MedlinePlus

    ... Shared decision making to improve care and reduce costs. N Engl J Med . 2013 Jan 3;368(1):6-8. ... UW Medicine, School of Medicine, University of Washington, Seattle, WA. Also reviewed by David ...

  8. A Sharing Proposition.

    ERIC Educational Resources Information Center

    Sturgeon, Julie

    2002-01-01

    Describes how the University of Vermont and St. Michael's College in Burlington, Vermont cooperated to share a single card access system. Discusses the planning, financial, and marketplace advantages of the cooperation. (EV)

  9. Secure Information Sharing

    2005-09-09

    We are developing a peer-to-peer system to support secure, location-independent information sharing in the scientific community. Once complete, this system will allow seamless and secure sharing of information between multiple collaborators. The owners of information will be able to control how the information is stored, managed, and shared. In addition, users will have faster access to information updates within a collaboration. Groups collaborating on scientific experiments have a need to share information and data. This information and data is often represented in the form of files and database entries. In a typical scientific collaboration, there are many different locations where data would naturally be stored. This makes it difficult for collaborators to find and access the information they need. Our goal is to create a lightweight file-sharing system that makes it easy for collaborators to find and use the data they need. This system must be easy to use, easy to administer, and secure. Our information-sharing tool uses group communication, in particular the InterGroup protocols, to reliably deliver each query to all of the current participants in a scalable manner, without having to discover all of their identities. We will use the Secure Group Layer (SGL) and Akenti to provide security to the participants of our environment. SGL will provide confidentiality, integrity, authenticity, and authorization enforcement for the InterGroup protocols, and Akenti will provide access control to other resources.

  10. Information partnerships--shared data, shared scale.

    PubMed

    Konsynski, B R; McFarlan, F W

    1990-01-01

    How can one company gain access to another's resources or customers without merging ownership, management, or plotting a takeover? The answer is found in new information partnerships, enabling diverse companies to develop strategic coalitions through the sharing of data. The key to cooperation is a quantum improvement in the hardware and software supporting relational databases: new computer speeds, cheaper mass-storage devices, the proliferation of fiber-optic networks, and networking architectures. Information partnerships mean that companies can distribute the technological and financial exposure that comes with huge investments. For the customer's part, partnerships inevitably lead to greater simplification on the desktop and more common standards around which vendors have to compete. The most common types of partnership are: joint marketing partnerships, such as American Airlines' award of frequent flyer miles to customers who use Citibank's credit card; intraindustry partnerships, such as the insurance value-added network service (which links insurance and casualty companies to independent agents); customer-supplier partnerships, such as Baxter Healthcare's electronic channel to hospitals for medical and other equipment; and IT vendor-driven partnerships, exemplified by ESAB (a European welding supplies and equipment company), whose expansion strategy was premised on a technology platform offered by an IT vendor. Partnerships that succeed have shared vision at the top, reciprocal skills in information technology, concrete plans for an early success, persistence in the development of usable information for all partners, coordination on business policy, and a new and imaginative business architecture.

  11. Semantic graphs and associative memories.

    PubMed

    Pomi, Andrés; Mizraji, Eduardo

    2004-12-01

    Graphs have been increasingly utilized in the characterization of complex networks from diverse origins, including different kinds of semantic networks. Human memories are associative and are known to support complex semantic nets; these nets are represented by graphs. However, it is not known how the brain can sustain these semantic graphs. The vision of cognitive brain activities, shown by modern functional imaging techniques, assigns renewed value to classical distributed associative memory models. Here we show that these neural network models, also known as correlation matrix memories, naturally support a graph representation of the stored semantic structure. We demonstrate that the adjacency matrix of this graph of associations is just the memory coded with the standard basis of the concept vector space, and that the spectrum of the graph is a code invariant of the memory. As long as the assumptions of the model remain valid this result provides a practical method to predict and modify the evolution of the cognitive dynamics. Also, it could provide us with a way to comprehend how individual brains that map the external reality, almost surely with different particular vector representations, are nevertheless able to communicate and share a common knowledge of the world. We finish presenting adaptive association graphs, an extension of the model that makes use of the tensor product, which provides a solution to the known problem of branching in semantic nets.
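
    The claim that the adjacency matrix is "just the memory coded with the standard basis" can be checked on a toy case. The sketch below assumes the common correlation-matrix form M = sum of g f^T over stored associations f -> g; the concept names, the edge list, and the helper functions are hypothetical, and the row/column convention (rows index targets, columns index sources) is a choice made for this example.

```python
def basis(i, n):
    """Standard basis vector e_i of length n."""
    return [1 if k == i else 0 for k in range(n)]


def build_memory(associations, n):
    """Correlation matrix memory M = sum of outer(g, f) over pairs f -> g.

    Concepts are coded as standard basis vectors, so each stored association
    src -> dst adds exactly one 1 at row dst, column src.
    """
    memory = [[0] * n for _ in range(n)]
    for src, dst in associations:
        f, g = basis(src, n), basis(dst, n)
        for r in range(n):
            for c in range(n):
                memory[r][c] += g[r] * f[c]
    return memory


def recall(memory, f):
    """Matrix-vector product M f: the output associated with input f."""
    return [sum(row[c] * f[c] for c in range(len(f))) for row in memory]


# Hypothetical semantic net: apple -> fruit, pear -> fruit.
concepts = ["apple", "pear", "fruit"]
edges = [(0, 2), (1, 2)]
M = build_memory(edges, len(concepts))

# Adjacency matrix of the same graph with rows as targets, columns as sources.
adjacency = [[1 if (c, r) in edges else 0 for c in range(3)] for r in range(3)]
```

    With any other orthonormal coding of the concepts the matrix entries change, but the matrix is related to this one by a change of basis, which is consistent with the abstract's statement that the graph spectrum is a code invariant.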

  12. Semantic graphs and associative memories

    NASA Astrophysics Data System (ADS)

    Pomi, Andrés; Mizraji, Eduardo

    2004-12-01

    Graphs have been increasingly utilized in the characterization of complex networks from diverse origins, including different kinds of semantic networks. Human memories are associative and are known to support complex semantic nets; these nets are represented by graphs. However, it is not known how the brain can sustain these semantic graphs. The vision of cognitive brain activities, shown by modern functional imaging techniques, assigns renewed value to classical distributed associative memory models. Here we show that these neural network models, also known as correlation matrix memories, naturally support a graph representation of the stored semantic structure. We demonstrate that the adjacency matrix of this graph of associations is just the memory coded with the standard basis of the concept vector space, and that the spectrum of the graph is a code invariant of the memory. As long as the assumptions of the model remain valid this result provides a practical method to predict and modify the evolution of the cognitive dynamics. Also, it could provide us with a way to comprehend how individual brains that map the external reality, almost surely with different particular vector representations, are nevertheless able to communicate and share a common knowledge of the world. We finish presenting adaptive association graphs, an extension of the model that makes use of the tensor product, which provides a solution to the known problem of branching in semantic nets.

  13. Reciprocal food sharing in the vampire bat

    NASA Astrophysics Data System (ADS)

    Wilkinson, Gerald S.

    1984-03-01

    Behavioural reciprocity can be evolutionarily stable [1-3]. Initial increase in frequency depends, however, on reciprocal altruists interacting predominantly with other reciprocal altruists either by associating within kin groups or by having sufficient memory to recognize and not aid nonreciprocators. Theory thus suggests that reciprocity should evolve more easily among animals which live in kin groups. Data are available separating reciprocity from nepotism only for unrelated nonhuman animals [4]. Here, I show that food sharing by regurgitation of blood among wild vampire bats (Desmodus rotundus) depends equally and independently on degree of relatedness and an index of opportunity for reciprocation. That reciprocity operates within groups containing both kin and nonkin is supported further with data on the availability of blood-sharing occasions, estimates of the economics of sharing blood, and experiments which show that unrelated bats will reciprocally exchange blood in captivity.

  14. Creativity and psychopathology: a shared vulnerability model.

    PubMed

    Carson, Shelley H

    2011-03-01

    Creativity is considered a positive personal trait. However, highly creative people have demonstrated elevated risk for certain forms of psychopathology, including mood disorders, schizophrenia spectrum disorders, and alcoholism. A model of shared vulnerability explains the relation between creativity and psychopathology. This model, supported by recent findings from neuroscience and molecular genetics, suggests that the biological determinants conferring risk for psychopathology interact with protective cognitive factors to enhance creative ideation. Elements of shared vulnerability include cognitive disinhibition (which allows more stimuli into conscious awareness), an attentional style driven by novelty salience, and neural hyperconnectivity that may increase associations among disparate stimuli. These vulnerabilities interact with superior meta-cognitive protective factors, such as high IQ, increased working memory capacity, and enhanced cognitive flexibility, to enlarge the range and depth of stimuli available in conscious awareness to be manipulated and combined to form novel and original ideas.

  15. Programming parallel architectures - The BLAZE family of languages

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush

    1989-01-01

    This paper gives an overview of the various approaches to programming multiprocessor architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive, since they remove much of the burden of exploiting parallel architectures from the user. This paper also describes recent work in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described.

  16. The biochemistry of memory.

    PubMed

    Stock, Jeffry B; Zhang, Sherry

    2013-09-01

    Almost fifty years ago, Julius Adler initiated a program of research to gain insights into the basic biochemistry of intelligent behavior by studying the molecular mechanisms that underlie the chemotactic responses of Escherichia coli. All living organisms share elements of a common biochemistry for metabolism, growth and heredity - why not intelligence? Neurobiologists have demonstrated that this is the case for nervous systems in animals ranging from worms to man. Motile unicellular organisms such as E. coli exhibit rudimentary behaviors that can be loosely described in terms of cognitive phenomena such as memory and learning. Adler's initiative at least raised the prospect that, because of the numerous experimental advantages provided by E. coli, it would be the first organism whose behavior could be understood at molecular resolution.

  17. Memories of the holocaust.

    PubMed

    Unger, Samuel

    2006-03-01

    As Alpha Omegans, we are united not only by our profession but also by a mission to educate ourselves, and others, about preserving our Jewish heritage. It was with this mission in mind that the Alpha Omegan invited me to share with my fraters a very personal, and painful, account of my boyhood in Poland, where I survived the Holocaust. Among the many gruesome episodes I encountered during the war, two remain vivid in my memories. Although this is not an easy story for me to tell, it is one that ultimately gives me great strength, especially as I prepare to disclose it among my dear friends and colleagues of Alpha Omega. May we never forget what some of us lost, what we regained and why we have chosen to build our personal and professional lives in ways that honor our history.

  18. Coordinating Shared Activities

    NASA Technical Reports Server (NTRS)

    Clement, Bradley

    2004-01-01

    Shared Activity Coordination (ShAC) is a computer program for planning and scheduling the activities of an autonomous team of interacting spacecraft and exploratory robots. ShAC could also be adapted to such terrestrial uses as helping multiple factory managers work toward competing goals while sharing such common resources as floor space, raw materials, and transports. ShAC iteratively invokes the Continuous Activity Scheduling Planning Execution and Replanning (CASPER) program to replan and propagate changes to other planning programs in an effort to resolve conflicts. A domain-expert specifies which activities and parameters thereof are shared and reports the expected conditions and effects of these activities on the environment. By specifying these conditions and effects differently for each planning program, the domain-expert subprogram defines roles that each spacecraft plays in a coordinated activity. The domain-expert subprogram also specifies which planning program has scheduling control over each shared activity. ShAC enables sharing of information, consensus over the scheduling of collaborative activities, and distributed conflict resolution. As the other planning programs incorporate new goals and alter their schedules in the changing environment, ShAC continually coordinates to respond to unexpected events.

  19. Shared direct memory access on the Explorer 2-LX

    NASA Technical Reports Server (NTRS)

    Musgrave, Jeffrey L.

    1990-01-01

    Advances in Expert System technology and Artificial Intelligence have provided a framework for applying automated intelligence to the solution of problems which were generally perceived as intractable using more classical approaches. As a result, hybrid architectures and parallel processing capability have become more common in computing environments. The Texas Instruments Explorer II-LX is an example of a machine which combines a symbolic processing environment and a computationally oriented environment in a single chassis for integrated problem solutions. This user's manual is an attempt to make these capabilities more accessible to a wider range of engineers and programmers with problems well suited to solution in such an environment.

  20. Processing Demand and Short-Term Memory: The Response-Prefix Effect

    ERIC Educational Resources Information Center

    Jahnke, John C.; Nowaczyk, Ronald H.

    1977-01-01

    Seven-digit strings were presented for immediate recall. Before recall, subjects either read or retrieved from memory a single item (response prefix). Results were seen in terms of the sharing of the limited capacity of an active memory system by the memory series, the response prefix, and the operations to retrieve and emit the items. (Editor/RK)

  1. Shared care (comanagement).

    PubMed

    Montero Ruiz, E

    2016-01-01

    Surgical departments have increasing difficulties in caring for their hospitalised patients due to the patients' advanced age and comorbidity, the growing specialisation in medical training and the strong political-healthcare pressure that a healthcare organisation places on them, where surgical acts take precedence over other activities. The pressure exerted by these departments on the medical area and the deficient response by the interconsultation system have led to the development of a different healthcare organisation model: Shared care, which includes perioperative medicine. In this model, 2 different specialists share the responsibility and authority in caring for hospitalised surgical patients. Internal Medicine is the most appropriate specialty for shared care. Internists who exercise this responsibility should have certain characteristics and must overcome a number of concerns from the surgeon and anaesthesiologist. PMID:26163733

  2. Shared care (comanagement).

    PubMed

    Montero Ruiz, E

    2016-01-01

    Surgical departments have increasing difficulties in caring for their hospitalised patients due to the patients' advanced age and comorbidity, the growing specialisation in medical training and the strong political-healthcare pressure that a healthcare organisation places on them, where surgical acts take precedence over other activities. The pressure exerted by these departments on the medical area and the deficient response by the interconsultation system have led to the development of a different healthcare organisation model: Shared care, which includes perioperative medicine. In this model, 2 different specialists share the responsibility and authority in caring for hospitalised surgical patients. Internal Medicine is the most appropriate specialty for shared care. Internists who exercise this responsibility should have certain characteristics and must overcome a number of concerns from the surgeon and anaesthesiologist.

  3. Test Sequence Priming in Recognition Memory

    ERIC Educational Resources Information Center

    Johns, Elizabeth E.; Mewhort, D. J. K.

    2009-01-01

    The authors examined priming within the test sequence in 3 recognition memory experiments. A probe primed its successor whenever both probes shared a feature with the same studied item ("interjacent priming"), indicating that the study item like the probe is central to the decision. Interjacent priming occurred even when the 2 probes did not…

  4. Close Associations and Memory in Brainwriting Groups

    ERIC Educational Resources Information Center

    Coskun, Hamit

    2011-01-01

    The present experiment examined whether or not the type of associations (close (e.g. apple-pear) and distant (e.g. apple-fish) word associations) and memory instruction (paying attention to the ideas of others) had effects on the idea generation performances in the brainwriting paradigm in which all participants shared their ideas by using paper…

  5. Sharing the atom bomb

    SciTech Connect

    Chace, J.

    1996-01-01

    Shaken by the devastation of Hiroshima and Nagasaki and fearful that the American atomic monopoly would spark an arms race, Dean Acheson led a push in 1946 to place the bomb, and indeed all atomic energy, under international control. But as the memories of wartime collaboration faded, relations between the superpowers grew increasingly tense, and the confrontational atmosphere undid his proposal. Had Acheson succeeded, the Cold War might not have been. 2 figs.

  6. The developmental influence of primary memory capacity on working memory and academic achievement.

    PubMed

    Hall, Debbora; Jarrold, Christopher; Towse, John N; Zarandi, Amy L

    2015-08-01

    In this study, we investigate the development of primary memory capacity among children. Children between the ages of 5 and 8 completed 3 novel tasks (split span, interleaved lists, and a modified free-recall task) that measured primary memory by estimating the number of items in the focus of attention that could be spontaneously recalled in serial order. These tasks were calibrated against traditional measures of simple and complex span. Clear age-related changes in these primary memory estimates were observed. There were marked individual differences in primary memory capacity, but each novel measure was predictive of simple span performance. Among older children, each measure shared variance with reading and mathematics performance, whereas for younger children, the interleaved lists task was the strongest single predictor of academic ability. We argue that these novel tasks have considerable potential for the measurement of primary memory capacity and provide new, complementary ways of measuring the transient memory processes that predict academic performance. The interleaved lists task also shared features with interference control tasks, and our findings suggest that young children have a particular difficulty in resisting distraction and that variance in the ability to resist distraction is also shared with measures of educational attainment. PMID:26075630

  7. The Sharing Tree: Preschool Children Learn to Share.

    ERIC Educational Resources Information Center

    Wolf, Arlene; Fine, Elaine

    1996-01-01

    This article describes a learning activity in which preschool children learn cooperative skills and metacognitive strategies as they master sharing strategies guided by leaves on a "sharing tree." Leaf colors (red, yellow, green) cue the child to stop, slow down and think about sharing and playing with others, and go ahead with a sharing activity.…

  8. Personal semantics: at the crossroads of semantic and episodic memory.

    PubMed

    Renoult, Louis; Davidson, Patrick S R; Palombo, Daniela J; Moscovitch, Morris; Levine, Brian

    2012-11-01

    Declarative memory is usually described as consisting of two systems: semantic and episodic memory. Between these two poles, however, may lie a third entity: personal semantics (PS). PS concerns knowledge of one's past. Although typically assumed to be an aspect of semantic memory, it is essentially absent from existing models of knowledge. Furthermore, like episodic memory (EM), PS is idiosyncratically personal (i.e., not culturally-shared). We show that, depending on how it is operationalized, the neural correlates of PS can look more similar to semantic memory, more similar to EM, or dissimilar to both. We consider three different perspectives to better integrate PS into existing models of declarative memory and suggest experimental strategies for disentangling PS from semantic and episodic memory.

  9. A Shared Praxis Approach.

    ERIC Educational Resources Information Center

    Street, James L.

    1988-01-01

    Develops an educational philosophy for Christian religious education as it touches the AIDS crisis. Grounded in Thomas Groome's "shared Christian praxis" model and directed toward religious educators, the philosophy contains five components: present action, critical reflection, dialog, story, and vision. Examines each component, stating that…

  10. Sharing Expertise: Consulting

    ERIC Educational Resources Information Center

    Graves, Bill

    2011-01-01

    A special breed of superintendents who have developed expertise in a particular area find ways of sharing it in other venues as outside consultants. They pull extra duty to put their special skills into practice, to give back to their communities, to stay current and grounded in the field, or to enhance their professional reputations. They teach…

  11. Learning to Share

    ERIC Educational Resources Information Center

    Raths, David

    2010-01-01

    In the tug-of-war between researchers and IT for supercomputing resources, a centralized approach can help both sides get more bang for their buck. As 2010 began, the University of Washington was preparing to launch its first shared high-performance computing cluster, a 1,500-node system called Hyak, dedicated to research activities. Like other…

  12. Bidirectional Quantum States Sharing

    NASA Astrophysics Data System (ADS)

    Peng, Jia-Yin; Bai, Ming-qiang; Mo, Zhi-Wen

    2016-05-01

    With the help of the shared entanglement and LOCC, multidirectional quantum states sharing is considered. We first put forward a protocol for implementing four-party bidirectional states sharing (BQSS) by using eight-qubit cluster state as quantum channel. In order to extend BQSS, we generalize this protocol from four sharers to multi-sharers utilizing two multi-qubit GHZ-type states as channel, and propose two multi-party BQSS schemes. On the other hand, we generalize the three schemes from two senders to multi-senders with multi GHZ-type states of multi-qubit as quantum channel, and give a multidirectional quantum states sharing protocol. In our schemes, all receivers can reconstruct the original unknown single-qubit state if and only if all sharers can cooperate. Only Pauli operations, Bell-state measurement and single-qubit measurement are used in our schemes, so these schemes are easily realized in physical experiment and their successful probabilities are all one.

  13. Hints on Sharing Books.

    ERIC Educational Resources Information Center

    Dorsey, Mary E., Comp.; Horne, Ulysses G., Comp.

    Based on the realization that each child must be given the opportunity to develop as a unique individual and that exposure to books expands a child's world, stimulating his creative thinking and his desire for new experiences, this booklet presents in outline form a variety of suggestions for encouraging children to share the books they have read.…

  14. Illegal File Sharing 101

    ERIC Educational Resources Information Center

    Wada, Kent

    2008-01-01

    Much of higher education's unease arises from the cost of dealing with illegal file sharing. Illinois State University, for example, calculated a cost of $76 to process a first claim of copyright infringement and $146 for a second. Responses range from simply passing along claims to elaborate programs architected with specific goals in mind.…

  15. Sharing Research Results

    ERIC Educational Resources Information Center

    Ashbrook, Peggy

    2011-01-01

    There are many ways to share a collection of data and students' thinking about that data. Explaining the results of science inquiry is important--working scientists and amateurs both contribute information to the body of scientific knowledge. Students can collect data about an activity that is already happening in a classroom (e.g., the qualities…

  16. Think before You Share

    ERIC Educational Resources Information Center

    Read, Brock

    2006-01-01

    Students in the US are increasingly discovering that online socializing is far from private and that sharing personal details on social-networking Web sites, such as Facebook, can have unintended consequences. A growing number of colleges are moving to disabuse students of the notion that the Internet is their private playground and what they type…

  17. Dynamic Load Balancing for Adaptive Computations on Distributed-Memory Machines

    NASA Technical Reports Server (NTRS)

    1999-01-01

    Dynamic load balancing is central to adaptive mesh-based computations on large-scale parallel computers. The principal investigator has investigated various issues on the dynamic load balancing problem under NASA JOVE and JAG grants. The major accomplishments of the project are two graph partitioning algorithms and a load balancing framework. The S-HARP dynamic graph partitioner is known to be the fastest among the known dynamic graph partitioners to date. It can partition a graph of over 100,000 vertices in 0.25 seconds on a 64-processor Cray T3E distributed-memory multiprocessor while maintaining the scalability of over 16-fold speedup. Other known and widely used dynamic graph partitioners take over a second or two while giving low scalability of a few fold speedup on 64 processors. These results have been published in journals and peer-reviewed flagship conferences.

  18. Synapsin Determines Memory Strength after Punishment- and Relief-Learning

    PubMed Central

    Niewalda, Thomas; Michels, Birgit; Jungnickel, Roswitha; Diegelmann, Sören; Kleber, Jörg; Kähne, Thilo

    2015-01-01

    Adverse life events can induce two kinds of memory with opposite valence, dependent on timing: “negative” memories for stimuli preceding them and “positive” memories for stimuli experienced at the moment of “relief.” Such punishment memory and relief memory are found in insects, rats, and man. For example, fruit flies (Drosophila melanogaster) avoid an odor after odor-shock training (“forward conditioning” of the odor), whereas after shock-odor training (“backward conditioning” of the odor) they approach it. Do these timing-dependent associative processes share molecular determinants? We focus on the role of Synapsin, a conserved presynaptic phosphoprotein regulating the balance between the reserve pool and the readily releasable pool of synaptic vesicles. We find that a lack of Synapsin leaves task-relevant sensory and motor faculties unaffected. In contrast, both punishment memory and relief memory scores are reduced. These defects reflect a true lessening of associative memory strength, as distortions in nonassociative processing (e.g., susceptibility to handling, adaptation, habituation, sensitization), discrimination ability, and changes in the time course of coincidence detection can be ruled out as alternative explanations. Reductions in punishment- and relief-memory strength are also observed upon an RNAi-mediated knock-down of Synapsin, and are rescued both by acutely restoring Synapsin and by locally restoring it in the mushroom bodies of mutant flies. Thus, both punishment memory and relief memory require the Synapsin protein and in this sense share genetic and molecular determinants. We note that corresponding molecular commonalities between punishment memory and relief memory in humans would constrain pharmacological attempts to selectively interfere with excessive associative punishment memories, e.g., after traumatic experiences. PMID:25972175

  19. Synapsin determines memory strength after punishment- and relief-learning.

    PubMed

    Niewalda, Thomas; Michels, Birgit; Jungnickel, Roswitha; Diegelmann, Sören; Kleber, Jörg; Kähne, Thilo; Gerber, Bertram

    2015-05-13

    Adverse life events can induce two kinds of memory with opposite valence, dependent on timing: "negative" memories for stimuli preceding them and "positive" memories for stimuli experienced at the moment of "relief." Such punishment memory and relief memory are found in insects, rats, and man. For example, fruit flies (Drosophila melanogaster) avoid an odor after odor-shock training ("forward conditioning" of the odor), whereas after shock-odor training ("backward conditioning" of the odor) they approach it. Do these timing-dependent associative processes share molecular determinants? We focus on the role of Synapsin, a conserved presynaptic phosphoprotein regulating the balance between the reserve pool and the readily releasable pool of synaptic vesicles. We find that a lack of Synapsin leaves task-relevant sensory and motor faculties unaffected. In contrast, both punishment memory and relief memory scores are reduced. These defects reflect a true lessening of associative memory strength, as distortions in nonassociative processing (e.g., susceptibility to handling, adaptation, habituation, sensitization), discrimination ability, and changes in the time course of coincidence detection can be ruled out as alternative explanations. Reductions in punishment- and relief-memory strength are also observed upon an RNAi-mediated knock-down of Synapsin, and are rescued both by acutely restoring Synapsin and by locally restoring it in the mushroom bodies of mutant flies. Thus, both punishment memory and relief memory require the Synapsin protein and in this sense share genetic and molecular determinants. We note that corresponding molecular commonalities between punishment memory and relief memory in humans would constrain pharmacological attempts to selectively interfere with excessive associative punishment memories, e.g., after traumatic experiences.

  20. [Neuroscience and collective memory: memory schemas linking brain, societies and cultures].

    PubMed

    Legrand, Nicolas; Gagnepain, Pierre; Peschanski, Denis; Eustache, Francis

    2015-01-01

    During the last two decades, the effect of intersubjective relationships on cognition has been an emerging topic in cognitive neurosciences leading through a so-called "social turn" to the formation of new domains integrating society and cultures to this research area. Such inquiry has been recently extended to collective memory studies. Collective memory refers to shared representations that are constitutive of the identity of a group and distributed among all its members connected by a common history. After briefly describing those evolutions in the study of human brain and behaviors, we review recent researches that have brought together cognitive psychology, neuroscience and social sciences into collective memory studies. Using the reemerging concept of memory schema, we propose a theoretical framework allowing to account for collective memories formation with a specific focus on the encoding process of historical events. We suggest that (1) if the concept of schema has been mainly used to describe rather passive framework of knowledge, such structure may also be implied in more active fashions in the understanding of significant collective events. And, (2) if some schema researches have restricted themselves to the individual level of inquiry, we describe a strong coherence between memory and cultural frameworks. Integrating the neural basis and properties of memory schema to collective memory studies may pave the way toward a better understanding of the reciprocal interaction between individual memories and cultural resources such as media or education. PMID:26820833

  1. Policy enabled information sharing system

    DOEpatents

    Jorgensen, Craig R.; Nelson, Brian D.; Ratheal, Steve W.

    2014-09-02

    A technique for dynamically sharing information includes executing a sharing policy indicating when to share a data object responsive to the occurrence of an event. The data object is created by formatting a data file to be shared with a receiving entity. The data object includes a file data portion and a sharing metadata portion. The data object is encrypted and then automatically transmitted to the receiving entity upon occurrence of the event. The sharing metadata portion includes metadata characterizing the data file and referenced in connection with the sharing policy to determine when to automatically transmit the data object to the receiving entity.

  2. Transactive memory systems scale for couples: development and validation

    PubMed Central

    Hewitt, Lauren Y.; Roberts, Lynne D.

    2015-01-01

    People in romantic relationships can develop shared memory systems by pooling their cognitive resources, allowing each person access to more information but with less cognitive effort. Research examining such memory systems in romantic couples largely focuses on remembering word lists or performing lab-based tasks, but these types of activities do not capture the processes underlying couples’ transactive memory systems, and may not be representative of the ways in which romantic couples use their shared memory systems in everyday life. We adapted an existing measure of transactive memory systems for use with romantic couples (TMSS-C), and conducted an initial validation study. In total, 397 participants who each identified as being a member of a romantic relationship of at least 3 months duration completed the study. The data provided a good fit to the anticipated three-factor structure of the components of couples’ transactive memory systems (specialization, credibility and coordination), and there was reasonable evidence of both convergent and divergent validity, as well as strong evidence of test–retest reliability across a 2-week period. The TMSS-C provides a valuable tool that can quickly and easily capture the underlying components of romantic couples’ transactive memory systems. It has potential to help us better understand this intriguing feature of romantic relationships, and how shared memory systems might be associated with other important features of romantic relationships. PMID:25999873

  3. Transactive memory systems scale for couples: development and validation.

    PubMed

    Hewitt, Lauren Y; Roberts, Lynne D

    2015-01-01

    People in romantic relationships can develop shared memory systems by pooling their cognitive resources, allowing each person access to more information but with less cognitive effort. Research examining such memory systems in romantic couples largely focuses on remembering word lists or performing lab-based tasks, but these types of activities do not capture the processes underlying couples' transactive memory systems, and may not be representative of the ways in which romantic couples use their shared memory systems in everyday life. We adapted an existing measure of transactive memory systems for use with romantic couples (TMSS-C), and conducted an initial validation study. In total, 397 participants who each identified as being a member of a romantic relationship of at least 3 months duration completed the study. The data provided a good fit to the anticipated three-factor structure of the components of couples' transactive memory systems (specialization, credibility and coordination), and there was reasonable evidence of both convergent and divergent validity, as well as strong evidence of test-retest reliability across a 2-week period. The TMSS-C provides a valuable tool that can quickly and easily capture the underlying components of romantic couples' transactive memory systems. It has potential to help us better understand this intriguing feature of romantic relationships, and how shared memory systems might be associated with other important features of romantic relationships.

  4. Processor-Group Aware Runtime Support for Shared- and Global-Address Space Models

    SciTech Connect

    Krishnan, Manoj Kumar; Tipparaju, Vinod; Palmer, Bruce; Nieplocha, Jarek

    2004-12-07

    Exploiting multilevel parallelism using processor groups is becoming increasingly important for programming on high-end systems. This paper describes a group-aware run-time support for shared-/global-address space programming models. The current effort has been undertaken in the context of the Aggregate Remote Memory Copy Interface (ARMCI) [5], a portable runtime system used as a communication layer for Global Arrays [6], Co-Array Fortran (CAF) [9], GPSHMEM [10], Co-Array Python [11], and also end-user applications. The paper describes the management of shared memory, integration of shared memory communication and RDMA on clusters with SMP nodes, and registration. These are all required for efficient multi-method and multi-protocol communication on modern systems. Focus is placed on techniques for supporting process groups while maximizing communication performance and efficiently managing global memory system-wide.

  5. Collective memory: conceptual foundations and theoretical approaches.

    PubMed

    Wertsch, James V; Roediger, Henry L

    2008-04-01

    In order to outline the conceptual landscape that frames discussions of collective memory, three oppositions are proposed: collective memory versus collective remembering; history versus collective memory; and individual memory versus collective remembering. From this perspective collective remembering is viewed as an active process that often involves contention and contestation among people rather than a static body of knowledge that they possess. Collective remembering is also viewed as privileging identity formation and contestation over the sort of objective representation of the past that is the aspiration of formal historical analysis. And finally, while collective remembering involves individual minds, it also suggests something more in the form of socially situated individuals, a claim that can usefully be formulated in terms of how members of a group share a common set of cultural tools (e.g., narrative forms) and similar content.

  6. Shared Activity Coordination

    NASA Technical Reports Server (NTRS)

    Clement, Bradley J.; Barrett, Anthony C.

    2003-01-01

    Interacting agents that interleave planning and execution must reach consensus on their commitments to each other. In domains where agents have varying degrees of interaction and different constraints on communication and computation, agents will require different coordination protocols in order to efficiently reach consensus in real time. We briefly describe a largely unexplored class of real-time, distributed planning problems (inspired by interacting spacecraft missions), new challenges they pose, and a general approach to solving the problems. These problems involve self-interested agents that have infrequent communication but collaborate on joint activities. We describe a Shared Activity Coordination (SHAC) framework that provides a decentralized algorithm for negotiating the scheduling of shared activities in a dynamic environment, a soft, real-time approach to reaching consensus during execution with limited communication, and a foundation for customizing protocols for negotiating planner interactions. We apply SHAC to a realistic simulation of interacting Mars missions and illustrate the simplicity of protocol development.

  7. Efficient quantum secret sharing

    NASA Astrophysics Data System (ADS)

    Qin, Huawang; Dai, Yuewei

    2016-05-01

    An efficient quantum secret sharing scheme is proposed, in which the dealer generates some single particles and then uses the operations of quantum-controlled-not and Hadamard gate to encode a determinate secret into these particles. The participants get their shadows by performing the single-particle measurements on their particles, and even the dealer cannot know their shadows. Compared to the existing schemes, our scheme is more practical within the present technologies.

  8. Shared health governance.

    PubMed

    Ruger, Jennifer Prah

    2011-07-01

    Health and Social Justice (Ruger 2009a) developed the "health capability paradigm," a conception of justice and health in domestic societies. This idea undergirds an alternative framework of social cooperation called "shared health governance" (SHG). SHG puts forth a set of moral responsibilities, motivational aspirations, and institutional arrangements, and apportions roles for implementation in striving for health justice. This article develops further the SHG framework and explains its importance and implications for governing health domestically. PMID:21745082

  9. Shared Health Governance

    PubMed Central

    Ruger, Jennifer Prah

    2014-01-01

    Health and Social Justice (Ruger 2009a) developed the “health capability paradigm,” a conception of justice and health in domestic societies. This idea undergirds an alternative framework of social cooperation called “shared health governance” (SHG). SHG puts forth a set of moral responsibilities, motivational aspirations, and institutional arrangements, and apportions roles for implementation in striving for health justice. This article develops further the SHG framework and explains its importance and implications for governing health domestically. PMID:21745082

  10. Shared health governance.

    PubMed

    Ruger, Jennifer Prah

    2011-07-01

    Health and Social Justice (Ruger 2009a) developed the "health capability paradigm," a conception of justice and health in domestic societies. This idea undergirds an alternative framework of social cooperation called "shared health governance" (SHG). SHG puts forth a set of moral responsibilities, motivational aspirations, and institutional arrangements, and apportions roles for implementation in striving for health justice. This article develops further the SHG framework and explains its importance and implications for governing health domestically.

  11. Memory Retrieval and Interference: Working Memory Issues

    ERIC Educational Resources Information Center

    Radvansky, Gabriel A.; Copeland, David E.

    2006-01-01

    Working memory capacity has been suggested as a factor that is involved in long-term memory retrieval, particularly when that retrieval involves a need to overcome some sort of interference (Bunting, Conway, & Heitz, 2004; Cantor & Engle, 1993). Previous work has suggested that working memory is related to the acquisition of information during…

  12. Episodic memory, semantic memory, and amnesia.

    PubMed

    Squire, L R; Zola, S M

    1998-01-01

    Episodic memory and semantic memory are two types of declarative memory. There have been two principal views about how this distinction might be reflected in the organization of memory functions in the brain. One view, that episodic memory and semantic memory are both dependent on the integrity of medial temporal lobe and midline diencephalic structures, predicts that amnesic patients with medial temporal lobe/diencephalic damage should be proportionately impaired in both episodic and semantic memory. An alternative view is that the capacity for semantic memory is spared, or partially spared, in amnesia relative to episodic memory ability. This article reviews two kinds of relevant data: 1) case studies where amnesia has occurred early in childhood, before much of an individual's semantic knowledge has been acquired, and 2) experimental studies with amnesic patients of fact and event learning, remembering and knowing, and remote memory. The data provide no compelling support for the view that episodic and semantic memory are affected differently in medial temporal lobe/diencephalic amnesia. However, episodic and semantic memory may be dissociable in those amnesic patients who additionally have severe frontal lobe damage.

  13. Bonobos Share with Strangers

    PubMed Central

    Tan, Jingzhi; Hare, Brian

    2013-01-01

    Humans are thought to possess a unique proclivity to share with others – including strangers. This puzzling phenomenon has led many to suggest that sharing with strangers originates from human-unique language, social norms, warfare and/or cooperative breeding. However, bonobos, our closest living relative, are highly tolerant and, in the wild, are capable of having affiliative interactions with strangers. In four experiments, we therefore examined whether bonobos will voluntarily donate food to strangers. We show that bonobos will forego their own food for the benefit of interacting with a stranger. Their prosociality is in part driven by unselfish motivation, because bonobos will even help strangers acquire out-of-reach food when no desirable social interaction is possible. However, this prosociality has its limitations because bonobos will not donate food in their possession when a social interaction is not possible. These results indicate that other-regarding preferences toward strangers are not uniquely human. Moreover, language, social norms, warfare and cooperative breeding are unnecessary for the evolution of xenophilic sharing. Instead, we propose that prosociality toward strangers initially evolves due to selection for social tolerance, allowing the expansion of individual social networks. Human social norms and language may subsequently extend this ape-like social preference to the most costly contexts. PMID:23300956

  14. The shared reward dilemma.

    PubMed

    Cuesta, J A; Jiménez, R; Lugo, H; Sánchez, A

    2008-03-21

    One of the most direct human mechanisms of promoting cooperation is rewarding it. We study the effect of sharing a reward among cooperators in the most stringent form of social dilemma, namely the prisoner's dilemma (PD). Specifically, for a group of players that collect payoffs by playing a pairwise PD game with their partners, we consider an external entity that distributes a fixed reward equally among all cooperators. Thus, individuals confront a new dilemma: on the one hand, they may be inclined to choose the shared reward despite the possibility of being exploited by defectors; on the other hand, if too many players do that, cooperators will obtain a poor reward and defectors will outperform them. By appropriately tuning the amount to be shared a vast variety of scenarios arises, including the traditional ones in the study of cooperation as well as more complex situations where unexpected behavior can occur. We provide a complete classification of the equilibria of the n-player game as well as of its evolutionary dynamics.

  15. Shared mission operations concept

    NASA Technical Reports Server (NTRS)

    Spradlin, Gary L.; Rudd, Richard P.; Linick, Susan H.

    1994-01-01

    Historically, new JPL flight projects have developed a Mission Operations System (MOS) as unique as their spacecraft, and have utilized a mission-dedicated staff to monitor and control the spacecraft through the MOS. NASA budgetary pressures to reduce mission operations costs have led to the development and reliance on multimission ground system capabilities. The use of these multimission capabilities has not eliminated an ongoing requirement for a nucleus of personnel familiar with a given spacecraft and its mission to perform mission-dedicated operations. The high cost of skilled personnel required to support projects with diverse mission objectives has the potential for significant reduction through shared mission operations among mission-compatible projects. Shared mission operations are feasible if: (1) the missions do not conflict with one another in terms of peak activity periods, (2) a unique MOS is not required, and (3) there is sufficient similarity in the mission profiles so that greatly different skills would not be required to support each mission. This paper will further develop this shared mission operations concept. We will illustrate how a Discovery-class mission would enter a 'partner' relationship with the Voyager Project, and can minimize MOS development and operations costs by early and careful consideration of mission operations requirements.

  16. Optical memory

    DOEpatents

    Mao, Samuel S; Zhang, Yanfeng

    2013-07-02

    Optical memory comprising: a semiconductor wire, a first electrode, a second electrode, a light source, a means for producing a first voltage at the first electrode, a means for producing a second voltage at the second electrode, and a means for determining the presence of an electrical voltage across the first electrode and the second electrode exceeding a predefined voltage. The first voltage, preferably less than 0 volts, is different from said second voltage. The semiconductor wire is optically transparent and has a bandgap less than the energy produced by the light source. The light source is optically connected to the semiconductor wire. The first electrode and the second electrode are electrically insulated from each other and said semiconductor wire.

  17. MEMORY FOR POETRY: MORE THAN MEANING?

    PubMed Central

    Atchley, Rachel M.; Hare, Mary L.

    2015-01-01

    The assumption has become that memory for words’ sound patterns, or form, is rapidly lost in comparison to content. Memory for form is also assumed to be verbatim rather than schematic. Oral story-telling traditions suggest otherwise. The present experiment investigated if form can be remembered schematically in spoken poetry, a context in which form is important. We also explored if sleep could help preserve memory for form. We tested whether alliterative sound patterns could cue memory for poetry lines both immediately and after a delay of 12 hours that did or did not include sleep. Twelve alliterative poetry lines were modified into same alliteration, different alliteration, and no alliteration paraphrases. We predicted that memory for original poetry lines would be less accurate after 12 hours, same alliteration paraphrases would be falsely recognized as originals more often after 12 hours, and that the no-sleep group would make more errors. Different alliteration and no alliteration paraphrases were not expected to share this effect due to schematically different sound patterns. Our data support these hypotheses and provide evidence that memory for form is schematic in nature, retained in contexts in which form matters, and that sleep may help preserve memory for sound patterns. PMID:26401226

  18. Infant Visual Recognition Memory

    ERIC Educational Resources Information Center

    Rose, Susan A.; Feldman, Judith F.; Jankowski, Jeffery J.

    2004-01-01

    Visual recognition memory is a robust form of memory that is evident from early infancy, shows pronounced developmental change, and is influenced by many of the same factors that affect adult memory; it is surprisingly resistant to decay and interference. Infant visual recognition memory shows (a) modest reliability, (b) good discriminant…

  19. Memory and the brain.

    PubMed

    Robertson, Lee T

    2002-01-01

    This review summarizes some of the recent advances in the neurobiology of memory. Current research helps us to understand how memories are created and, conversely, how our memories can be influenced by stress, drugs, and aging. An understanding of how memories are encoded by the brain may also lead to new ideas about how to maximize the long-term retention of important information. There are multiple memory systems with different functions and, in this review, we focus on the conscious recollection of one's experience of events and facts and on memories tied to emotional responses. Memories are also classified according to time: from short-term memory, lasting only seconds or minutes, to long-term memory, lasting months or years. The advent of new functional neuroimaging methods provides an opportunity to gain insight into how the human brain supports memory formation. Each memory system has a distinct anatomical organization, where different parts of the brain are recruited during phases of memory storage. Within the brain, memory is a dynamic property of populations of neurons and their interconnections. Memories are laid down in our brains via chemical changes at the neuron level. An understanding of the neurobiology of memory may stimulate health educators to consider how various teaching methods conform to the process of memory formation. PMID:12358099

  20. Fixed Access Network Sharing

    NASA Astrophysics Data System (ADS)

    Cornaglia, Bruno; Young, Gavin; Marchetta, Antonio

    2015-12-01

    Fixed broadband network deployments are moving inexorably to the use of Next Generation Access (NGA) technologies and architectures. These NGA deployments involve building fiber infrastructure increasingly closer to the customer in order to increase the proportion of fiber on the customer's access connection (Fibre-To-The-Home/Building/Door/Cabinet… i.e. FTTx). This increases the speed of services that can be sold and will be increasingly required to meet the demands of new generations of video services as we evolve from HDTV to "Ultra-HD TV" with 4k and 8k lines of video resolution. However, building fiber access networks is a costly endeavor. It requires significant capital in order to cover any significant geographic coverage. Hence many companies are forming partnerships and joint-ventures in order to share the NGA network construction costs. One form of such a partnership involves two companies agreeing to each build to cover a certain geographic area and then "cross-selling" NGA products to each other in order to access customers within their partner's footprint (NGA coverage area). This is tantamount to a bi-lateral wholesale partnership. The concept of Fixed Access Network Sharing (FANS) is to address the possibility of sharing infrastructure with a high degree of flexibility for all network operators involved. By providing greater configuration control over the NGA network infrastructure, the service provider has a greater ability to define the network and hence to define their product capabilities at the active layer. This gives the service provider partners greater product development autonomy plus the ability to differentiate from each other at the active network layer.

  1. Shared clinical decision making

    PubMed Central

    AlHaqwi, Ali I.; AlDrees, Turki M.; AlRumayyan, Ahmad; AlFarhan, Ali I.; Alotaibi, Sultan S.; AlKhashan, Hesham I.; Badri, Motasim

    2015-01-01

    Objectives: To determine preferences of patients regarding their involvement in the clinical decision making process and the related factors in Saudi Arabia. Methods: This cross-sectional study was conducted in a major family practice center in King Abdulaziz Medical City, Riyadh, Saudi Arabia, between March and May 2012. Multivariate multinomial regression models were fitted to identify factors associated with patients' preferences. Results: The study included 236 participants. The most preferred decision-making style was shared decision-making (57%), followed by paternalistic (28%), and informed consumerism (14%). The preference for shared clinical decision making was significantly higher among male patients and those with a higher level of education, whereas paternalism was significantly higher among older patients and those with chronic health conditions, and consumerism was significantly higher in younger age groups. In multivariate multinomial regression analysis, compared with the shared group, the consumerism group were more likely to be female (adjusted odds ratio [AOR]=2.87, 95% confidence interval [CI]: 1.31-6.27, p=0.008) and non-dyslipidemic (AOR=2.90, 95% CI: 1.03-8.09, p=0.04), and the paternalism group were more likely to be older (AOR=1.03, 95% CI: 1.01-1.05, p=0.04), and female (AOR=2.47, 95% CI: 1.32-4.06, p=0.008). Conclusion: Preferences of patients for involvement in clinical decision-making varied considerably. In our setting, underlying factors that influence these preferences identified in this study should be considered and tailored individually to achieve optimal treatment outcomes. PMID:26620990

  2. Feature-Based Memory-Driven Attentional Capture: Visual Working Memory Content Affects Visual Attention

    ERIC Educational Resources Information Center

    Olivers, Christian N. L.; Meijer, Frank; Theeuwes, Jan

    2006-01-01

    In 7 experiments, the authors explored whether visual attention (the ability to select relevant visual information) and visual working memory (the ability to retain relevant visual information) share the same content representations. The presence of singleton distractors interfered more strongly with a visual search task when it was accompanied by…

  4. Can power be shared?

    PubMed

    Ten Pas, William S

    2013-01-01

    Dental insurance began with a partnership between dental service organizations and state dental associations with a view toward expanding the number of Americans receiving oral health care and as a means for permitting firms and other organizations to offer employee benefits. The goals have been achieved, but the alliance between dentistry and insurance has become strained. A lack of dialogue has fostered mutual misconceptions, some of which are reviewed in this paper. It is possible that the public, the profession, and the dental insurance industry can all be strengthened, but only through power-sharing around the original common objective. PMID:24761578

  5. Interference from mere thinking: mental rehearsal temporarily disrupts recall of motor memory.

    PubMed

    Yin, Cong; Wei, Kunlin

    2014-08-01

    Interference between successively learned tasks is widely investigated to study motor memory. However, how simultaneously learned motor memories interact with each other has rarely been studied despite its prevalence in daily life. Assuming that motor memory shares common neural mechanisms with the declarative memory system, we made the unintuitive prediction that mental rehearsal, as opposed to further practice, of one motor memory will temporarily impair the recall of another simultaneously learned memory. Subjects simultaneously learned two sensorimotor tasks, i.e., visuomotor rotation and gain. They retrieved one memory by either practice or mental rehearsal and then had their memory evaluated. We found that mental rehearsal, instead of execution, impaired the recall of unretrieved memory. This impairment was content-independent, i.e., retrieving either gain or rotation impaired the other memory. Hence, conscious recollection of one motor memory interferes with the recall of another memory. This is analogous to retrieval-induced forgetting in declarative memory, suggesting a common neural process across memory systems. Our findings indicate that motor imagery is sufficient to induce interference between motor memories. Mental rehearsal, currently widely regarded as beneficial for motor performance, negatively affects memory recall when it is exercised for a subset of memorized items.

  6. Collective memory, group minds, and the extended mind thesis.

    PubMed

    Wilson, Robert A

    2005-12-01

    While memory is conceptualized predominantly as an individual capacity in the cognitive and biological sciences, the social sciences have most commonly construed memory as a collective phenomenon. Collective memory has been put to diverse uses, ranging from accounts of nationalism in history and political science to views of ritualization and commemoration in anthropology and sociology. These appeals to collective memory share the idea that memory "goes beyond the individual" but often run together quite different claims in spelling out that idea. This paper reviews a sampling of recent work on collective memory in the light of emerging externalist views within the cognitive sciences, and through some reflection on broader traditions of thought in the biological and social sciences that have appealed to the idea that groups have minds. The paper concludes with some thoughts about the relationship between these kinds of cognitive metaphors in the social sciences and our notion of agency. PMID:18239951

  7. 76 FR 55065 - Change in Bank Control Notices; Acquisitions of Shares of a Bank or Bank Holding Company

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-09-06

    ... Denney, Assistant Vice President) 1 Memorial Drive, Kansas City, Missouri 64198-0001: 1. Gregory J. Weed, Cheyenne Wells, Colorado; to acquire voting shares of Weed Investment Group, Inc., and thereby...

  8. 77 FR 64993 - Change in Bank Control Notices; Acquisitions of Shares of a Bank or Bank Holding Company

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-24

    ... Bancorp, Mount Airy, and thereby indirectly acquire voting shares of Surrey Bank & Trust, both in Mount...) 1 Memorial Drive, Kansas City, Missouri 64198-0001: 1. The E.L. Burch Irrevocable Trust of...

  9. Vaccines, our shared responsibility.

    PubMed

    Pagliusi, Sonia; Jain, Rishabh; Suri, Rajinder Kumar

    2015-05-01

    The Developing Countries Vaccine Manufacturers' Network (DCVMN) held its fifteenth annual meeting on October 27-29, 2014, in New Delhi, India. The DCVMN, together with the co-organizing institution Panacea Biotec, welcomed over 240 delegates representing high-profile governmental and nongovernmental global health organizations from 36 countries. Over the three-day meeting, attendees exchanged information about their efforts to achieve their shared goal of preventing death and disability from known and emerging infectious diseases. Special praise was extended to all stakeholders involved in the success of polio eradication in South East Asia, and challenges in vaccine supply for measles-rubella immunization over the coming decades were highlighted. Innovative vaccines and vaccine delivery technologies indicated creative solutions for achieving global immunization goals. Discussions focused on three major themes: regulatory challenges for developing countries that may be overcome with better communication; global collaborations and partnerships for leveraging investments and enabling uninterrupted supply of affordable and suitable vaccines; and leading innovation in vaccines that are difficult to develop, such as dengue, Chikungunya, typhoid-conjugated and EV71, and in needle-free technologies that may speed up vaccine delivery. Moving further into the Decade of Vaccines, participants renewed their commitment to shared responsibility toward a world free of vaccine-preventable diseases. PMID:25749248

  11. Jim Thomas: A Collection of Memories

    SciTech Connect

    Wong, Pak C.

    2010-12-01

    Jim Thomas, a guest editor and a long-time associate editor of Information Visualization (IVS), died in Richland, WA, on August 6, 2010 due to complications from a brain tumor. His friends and colleagues from around the world have since expressed their sadness and paid tribute to a visionary scientist in multiple public forums. For those who didn't get the chance to know Jim, I share a collection of my own memories of Jim Thomas and memories from some of his colleagues.

  12. Brief history of ETOX NOR flash memory.

    PubMed

    Lai, Stefan K

    2012-10-01

    NOR flash memory grew from a simple concept in the 1980s to worldwide revenue of US$4.8B in 2011. Stacked-gate NOR (ETOX NOR at Intel) has the highest revenue share among the different NOR flash types. Cost reduction was made possible by continuous innovation along many fronts. The key enabler is Moore's Law scaling, augmented by multiple self-aligned techniques. Another is multilevel-cell technology, which stores 2 bits of information in a single cell. With the emergence of NAND at much lower cost, the NOR flash market is projected not to grow, but NOR is still the dominant memory for BIOS and program store in many electronic devices.

  13. Handling debugger breakpoints in a shared instruction system

    DOEpatents

    Gooding, Thomas Michael; Shok, Richard Michael

    2014-01-21

    A debugger debugs processes that execute shared instructions so that a breakpoint set for one process will not cause a breakpoint to occur in the other processes. A breakpoint is set by recording the original instruction at the desired location and writing a trap instruction to the shared instructions at that location. When a process encounters the breakpoint, the process passes control to the debugger for breakpoint processing if the breakpoint was set at that location for that process. If the trap was not set at that location for that process, the cacheline containing the trap is copied to a small scratchpad memory, and the virtual memory mappings are changed to translate the virtual address of the cacheline to the scratchpad. The original instruction is then written to replace the trap instruction in the scratchpad, so that the process can execute the instructions in the scratchpad, thereby avoiding the trap instruction.
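
    The scheme described in this record can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the class names (SharedText, Process), the TRAP marker, the four-instruction cacheline, and the use of a Python dict in place of real virtual memory mappings are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field

TRAP = "TRAP"      # hypothetical stand-in for a trap instruction
LINE_SIZE = 4      # instructions per cacheline (assumed for the sketch)


@dataclass
class SharedText:
    """Instruction memory shared by every process."""
    instructions: list
    originals: dict = field(default_factory=dict)  # addr -> saved instruction
    owners: dict = field(default_factory=dict)     # addr -> pids that set it

    def set_breakpoint(self, pid, addr):
        # Record the original instruction once, then write the trap
        # into the shared instructions at that location.
        if addr not in self.originals:
            self.originals[addr] = self.instructions[addr]
            self.instructions[addr] = TRAP
        self.owners.setdefault(addr, set()).add(pid)


@dataclass
class Process:
    pid: int
    text: SharedText
    # cacheline index -> private scratchpad copy; models the remapping
    # of a virtual address range onto scratchpad memory
    remap: dict = field(default_factory=dict)

    def fetch(self, addr):
        line, offset = divmod(addr, LINE_SIZE)
        if line in self.remap:
            return self.remap[line][offset]
        return self.text.instructions[addr]

    def step(self, addr):
        instr = self.fetch(addr)
        if instr != TRAP:
            return ("executed", instr)
        if self.pid in self.text.owners.get(addr, set()):
            # The breakpoint was set for this process: hand over control.
            return ("debugger", addr)
        # The trap belongs to another process: copy the cacheline to the
        # scratchpad (or reuse an existing copy), restore the original
        # instruction there, and execute from the scratchpad.
        line, offset = divmod(addr, LINE_SIZE)
        start = line * LINE_SIZE
        scratch = self.remap.get(line)
        if scratch is None:
            scratch = list(self.text.instructions[start:start + LINE_SIZE])
            self.remap[line] = scratch
        scratch[offset] = self.text.originals[addr]
        return ("executed", self.fetch(addr))
```

    In this sketch, a process that did not set the breakpoint never reports it to the debugger, while the shared instructions still hold the trap for the process that did. The sketch does not propagate breakpoints set after a cacheline has already been remapped; a real system would need to handle that case.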

  14. Verbal memory and menopause.

    PubMed

    Maki, Pauline M

    2015-11-01

    Midlife women frequently report memory problems during the menopausal transition. Recent studies validate those complaints by showing significant correlations between memory complaints and performance on validated memory tasks. Longitudinal studies demonstrate modest declines in verbal memory during the menopausal transition and a likely rebound during the postmenopausal stage. Clinical studies that examine changes in memory following hormonal withdrawal and add-back hormone therapy (HT) demonstrate that estradiol plays a critical role in memory. Although memory changes are frequently attributed to menopausal symptoms, studies show that the memory problems occur during the transition even after controlling for menopausal symptoms. It is well established that self-reported vasomotor symptoms (VMS) are unrelated to objective memory performance. However, emerging evidence suggests that objectively measured VMS significantly correlate with memory performance, brain activity during rest, and white matter hyperintensities. This evidence raises important questions about whether VMS and VMS treatments might affect memory during the menopausal transition. Unfortunately, there are no clinical trials to inform our understanding of how HT affects both memory and objectively measured VMS in women in whom HT is indicated for treatment of moderate to severe VMS. In clinical practice, it is helpful to normalize memory complaints, to note that evidence suggests that memory problems are temporary, and to counsel women with significant VMS that memory might improve with treatment.

  15. Memory Metals

    NASA Technical Reports Server (NTRS)

    1995-01-01

    Under contract to NASA during preparations for the space station, Memry Technologies Inc. investigated shape memory effect (SME). SME is a characteristic of certain metal alloys that can change shape in response to temperature variations. In the late 1980s and early 1990s, Memry used its NASA-acquired expertise to produce a line of home and industrial safety products, and refined the technology in the mid-1990s. Among the new products they developed are three MemrySafe units which prevent scalding from faucets. Each system contains a small valve that reacts to temperature, not pressure. When the water reaches dangerous temperatures, the unit reduces the flow to a trickle; when the scalding temperature subsides, the unit restores normal flow. Other products are the FIRECHEK 2 and 4, heat-activated shutoff valves for industrial process lines, which sense excessive heat and cut off pneumatic pressure. The newest of these products is Memry's Demand Management Water Heater which shifts the electricity requirement from peak to off-peak demands, conserving energy and money.

  16. The Relationship Between Working Memory Capacity and Executive Functioning: Evidence for a Common Executive Attention Construct

    PubMed Central

    McCabe, David P.; Roediger, Henry L.; McDaniel, Mark A.; Balota, David A.; Hambrick, David Z.

    2010-01-01

    Attentional control has been conceptualized as executive functioning by neuropsychologists and as working memory capacity by experimental psychologists. We examined the relationship between these constructs using a factor analytic approach in an adult lifespan sample. Several tests of working memory capacity and executive function were administered to over 200 subjects between the ages of 18-90 years old, along with tests of processing speed and episodic memory. The correlation between working memory capacity and executive functioning constructs was very strong (r = .97), but correlations between these constructs and processing speed were considerably weaker (r's ≈ .79). Controlling for working memory capacity or executive function eliminated age effects on episodic memory, and working memory capacity or executive function accounted for variance in episodic memory beyond that accounted for by processing speed. We conclude that tests of working memory capacity and executive function share a common underlying executive attention component that is strongly predictive of higher-level cognition. PMID:20230116

  17. Model Sharing and Collaboration using HydroShare

    NASA Astrophysics Data System (ADS)

    Goodall, J. L.; Morsy, M. M.; Castronova, A. M.; Miles, B.; Merwade, V.; Tarboton, D. G.

    2015-12-01

    HydroShare is a web-based system funded by the National Science Foundation (NSF) for sharing hydrologic data and models as resources. Resources in HydroShare can either be assigned a generic type, meaning the resource only has Dublin Core metadata properties, or one of a growing number of specific resource types with enhanced metadata profiles defined by the HydroShare development team. Examples of specific resource types in the current release of HydroShare (http://www.hydroshare.org) include time series, geographic raster, multidimensional (NetCDF), model program, and model instance. Here we describe research and development efforts in the HydroShare project for model-related resource types. This work has included efforts to define metadata profiles for common modeling resources, execute models directly through the HydroShare user interface using Docker containers, and interoperate with the third-party application SWATShare for model execution and visualization. These examples demonstrate the benefit of HydroShare in supporting model sharing and addressing collaborative problems involving modeling. The presentation will conclude with plans for future modeling-related development in HydroShare, including supporting the publication of workflow resources, enhanced metadata for additional hydrologic models, and linking model resources with other resources in HydroShare to capture model provenance.

  18. Job-Sharing the Principalship.

    ERIC Educational Resources Information Center

    Brown, Shelley; Feltham, Wendy

    1997-01-01

    The coprincipals of a California elementary school share their ideas for building a successful job-sharing partnership. They suggest it is important to find the right partner, develop and present a job-sharing proposal, establish systems of communication with each other, evaluate one's progress, focus on the principalship, and provide leadership…

  19. Sharing Educational Services. PREP-13.

    ERIC Educational Resources Information Center

    Jongeward, Ray; Heesacker, Frank

    The focus of this report is on shared services in the rural setting. The kit contains three documents of useful information for any school planning a shared service activity to improve rural education. 13-A identifies 215 shared services in 50 states along with an indexing of each service by subject area and by state. 13-B is a series of 10…

  20. Fractions: How to Fair Share

    ERIC Educational Resources Information Center

    Wilson, P. Holt; Edgington, Cynthia P.; Nguyen, Kenny H.; Pescosolido, Ryan S.; Confrey, Jere

    2011-01-01

    Children learn from a very early age what it means to get their "fair share." Whether it is candy or birthday cake, many children successfully create equal-size groups or parts of a collection or whole but later struggle to create fair shares of multiple wholes, such as fairly sharing four pies among a family of seven. Recent research suggests…

  1. Multiprocessor computing for images

    SciTech Connect

    Cantoni, V.; Levialdi, S.

    1988-08-01

    A review of image processing systems developed until now is given, highlighting the weak points of such systems and the trends that have dictated their evolution through the years producing different generations of machines. Each generation may be characterized by the hardware architecture, the programmability features and the relative application areas. The need for multiprocessing hierarchical systems is discussed focusing on pyramidal architectures. Their computational paradigms, their virtual and physical implementation, their programming and software requirements, and capabilities by means of suitable languages, are discussed.

  2. SHARED TECHNOLOGY TRANSFER PROGRAM

    SciTech Connect

    GRIFFIN, JOHN M.; HAUT, RICHARD C.

    2008-03-07

    The program established a collaborative process with domestic industries for the purpose of sharing Navy-developed technology. Private sector businesses were educated so as to increase their awareness of the vast amount of technologies that are available, with an initial focus on technology applications that are related to the Hydrogen, Fuel Cells and Infrastructure Technologies (Hydrogen) Program of the U.S. Department of Energy. Specifically, the project worked to increase industry awareness of the vast technology resources available to them that have been developed with taxpayer funding. NAVSEA-Carderock and the Houston Advanced Research Center teamed with Nicholls State University to catalog NAVSEA-Carderock unclassified technologies, rated the level of readiness of the technologies and established a web based catalog of the technologies. In particular, the catalog contains technology descriptions, including testing summaries and overviews of related presentations.

  3. University Reactor Sharing Program

    SciTech Connect

    W.D. Reese

    2004-02-24

    Research projects supported by the program include items such as dating geological material and producing high-current superconducting magnets. The funding continues to give small colleges and universities the valuable opportunity to use the NSC for teaching courses in nuclear processes, specifically neutron activation analysis and gamma spectroscopy. The Reactor Sharing Program has supported the construction of a Fast Neutron Flux Irradiator for users at New Mexico Institute of Mining and Technology and the University of Houston. This device has been characterized and has been found to have near-optimum neutron fluxes for Ar-39/Ar-40 dating. Institution final reports and publications resulting from the use of these funds are on file at the Nuclear Science Center.

  4. Sharing a disparate landscape

    NASA Astrophysics Data System (ADS)

    Ali-Khan, Carolyne

    2010-06-01

    Working across boundaries of power, identity, and political geography is fraught with difficulties and contradictions. In Tali Tal and Iris Alkaher's "Collaborative environmental projects in a multicultural society: Working from within separate or mutual landscapes?" the authors describe their efforts to do this in the highly charged atmosphere of Israel. This forum article offers a response to their efforts. Writing from a framework of critical pedagogy, I use the concepts of space and time to anchor my analysis, as I examine the issue of power in this Jew/Arab collaborative environmental project. This response problematizes "sharing" in a landscape fraught with disparities. It also looks to further Tal and Alkaher's work by geographically and politically grounding it in the broader current conflict and by juxtaposing sustainability with equity.

  5. Memory bias for negative emotional words in recognition memory is driven by effects of category membership.

    PubMed

    White, Corey N; Kapucu, Aycan; Bruno, Davide; Rotello, Caren M; Ratcliff, Roger

    2014-01-01

    Recognition memory studies often find that emotional items are more likely than neutral items to be labelled as studied. Previous work suggests this bias is driven by increased memory strength/familiarity for emotional items. We explored strength and bias interpretations of this effect with the conjecture that emotional stimuli might seem more familiar because they share features with studied items from the same category. Categorical effects were manipulated in a recognition task by presenting lists with a small, medium or large proportion of emotional words. The liberal memory bias for emotional words was only observed when a medium or large proportion of categorised words were presented in the lists. Similar, though weaker, effects were observed with categorised words that were not emotional (animal names). These results suggest that liberal memory bias for emotional items may be largely driven by effects of category membership.

  6. Memory bias for negative emotional words in recognition memory is driven by effects of category membership

    PubMed Central

    White, Corey N.; Kapucu, Aycan; Bruno, Davide; Rotello, Caren M.; Ratcliff, Roger

    2014-01-01

    Recognition memory studies often find that emotional items are more likely than neutral items to be labeled as studied. Previous work suggests this bias is driven by increased memory strength/familiarity for emotional items. We explored strength and bias interpretations of this effect with the conjecture that emotional stimuli might seem more familiar because they share features with studied items from the same category. Categorical effects were manipulated in a recognition task by presenting lists with a small, medium, or large proportion of emotional words. The liberal memory bias for emotional words was only observed when a medium or large proportion of categorized words were presented in the lists. Similar, though weaker, effects were observed with categorized words that were not emotional (animal names). These results suggest that liberal memory bias for emotional items may be largely driven by effects of category membership. PMID:24303902

  7. Memory beyond expression.

    PubMed

    Delorenzi, A; Maza, F J; Suárez, L D; Barreiro, K; Molina, V A; Stehberg, J

    2014-01-01

    The idea that memories are not invariable after the consolidation process has led to new perspectives about several mnemonic processes. In this framework, we review our studies on the modulation of memory expression during reconsolidation. We propose that during both memory consolidation and reconsolidation, neuromodulators can determine the probability of the memory trace to guide behavior, i.e. they can either increase or decrease its behavioral expressibility without affecting the potential of persistent memories to be activated and become labile. Our hypothesis is based on the findings that positive modulation of memory expression during reconsolidation occurs even if memories are behaviorally unexpressed. This review discusses the original approach taken in the studies of the crab Neohelice (Chasmagnathus) granulata, which was then successfully applied to test the hypothesis in rodent fear memory. The data presented offer a new way of thinking about both weak trainings and experimental amnesia: memory retrieval can be dissociated from memory expression. Furthermore, the strategy presented here allowed us to show in human declarative memory that the periods in which long-term memory can be activated and become labile during reconsolidation exceed the periods in which that memory is expressed, providing direct evidence that conscious access to memory is not needed for reconsolidation. Specific controls based on the constraints of reminders to trigger reconsolidation allow us to distinguish between obliterated and unexpressed but activated long-term memories after amnesic treatments, weak trainings and forgetting. In the hypothesis discussed, memory expressibility--the outcome of experience-dependent changes in the potential to behave--is considered as a flexible and modulable attribute of long-term memories. Expression seems to be just one of the possible fates of re-activated memories.

  8. Formal Specification of the OpenMP Memory Model

    SciTech Connect

    Bronevetsky, G; de Supinski, B R

    2006-05-17

    OpenMP [1] is an important API for shared memory programming, combining shared memory's potential for performance with a simple programming interface. Unfortunately, OpenMP lacks a critical tool for demonstrating whether programs are correct: a formal memory model. Instead, the current official definition of the OpenMP memory model (the OpenMP 2.5 specification [1]) is in terms of informal prose. As a result, it is impossible to verify OpenMP applications formally since the prose does not provide a formal consistency model that precisely describes how reads and writes on different threads interact. This paper focuses on the formal verification of OpenMP programs through a proposed formal memory model that is derived from the existing prose model [1]. Our formalization provides a two-step process to verify whether an observed OpenMP execution is conformant. In addition to this formalization, our contributions include a discussion of ambiguities in the current prose-based memory model description. Although our formal model may not capture the current informal memory model perfectly, in part due to these ambiguities, our model reflects our understanding of the informal model's intent. We conclude with several examples that may indicate areas of the OpenMP memory model that need further refinement however it is specified. Our goal is to motivate the OpenMP community to adopt those refinements eventually, ideally through a formal model, in later OpenMP specifications.

  9. An implicit spatial memory alignment effect.

    PubMed

    Cerles, Mélanie; Gomez, Alice; Rousset, Stéphane

    2015-09-01

    The memory alignment effect is the advantage of reasoning from a perspective which is aligned with the frame of reference used to encode an environment in memory. It usually occurs when participants have to consciously take a perspective to perform a spatial memory task. The present experiment assesses whether the memory alignment effect can occur without requiring participants to consciously take a given perspective, when the misaligned perspective is only perceptually provided. In other words, does the memory alignment effect still arise when it is only implicitly prompted? Thirty participants learned a sequence of four objects' positions in a room from a north-as-up survey perspective. During the testing phase, they had to point to the direction of a target object from another object ('the reference') with a fixed north-up orientation. The background behind the reference object displayed either a uniform color (control condition) or a misaligned ground-level perspective. The latter displayed a reference object's position information which was either congruent with the studied environment (congruent misaligned condition) or incongruent (incongruent misaligned condition). Mean pointing errors were higher in the congruent misaligned condition than in the control condition, whereas the incongruent misaligned condition did not differ from the control one. The present study shows that the memory alignment effect can arise without requiring conscious misaligned perspective taking. Moreover, the perceived misaligned perspective must share the same spatial content as the memorized spatial representation in order to induce an alignment effect. PMID:26233526

  10. Detailed sensory memory, sloppy working memory.

    PubMed

    Sligte, Ilja G; Vandenbroucke, Annelinde R E; Scholte, H Steven; Lamme, Victor A F

    2010-01-01

    Visual short-term memory (VSTM) enables us to actively maintain information in mind for a brief period of time after stimulus disappearance. According to recent studies, VSTM consists of three stages - iconic memory, fragile VSTM, and visual working memory - with increasingly strict capacity limits and progressively longer lifetimes. Still, the resolution (or amount of visual detail) of each VSTM stage has remained unexplored and we test this in the present study. We presented people with a change detection task that measures the capacity of all three forms of VSTM, and we added an identification display after each change trial that required people to identify the "pre-change" object. Accurate change detection plus pre-change identification requires subjects to have a high-resolution representation of the "pre-change" object, whereas change detection or identification only can be based on the hunch that something has changed, without exactly knowing what was presented before. We observed that people maintained 6.1 objects in iconic memory, 4.6 objects in fragile VSTM, and 2.1 objects in visual working memory. Moreover, when people detected the change, they could also identify the pre-change object on 88% of the iconic memory trials, on 71% of the fragile VSTM trials and merely on 53% of the visual working memory trials. This suggests that people maintain many high-resolution representations in iconic memory and fragile VSTM, but only one high-resolution object representation in visual working memory. PMID:21897823

  11. Relating Hippocampus to Relational Memory Processing across Domains and Delays

    PubMed Central

    Monti, Jim M.; Cooke, Gillian E.; Watson, Patrick D.; Voss, Michelle W.; Kramer, Arthur F.; Cohen, Neal J.

    2015-01-01

    The hippocampus has been implicated in a diverse set of cognitive domains and paradigms, including cognitive mapping, long-term memory, and relational memory, at long or short study–test intervals. Despite the diversity of these areas, their association with the hippocampus may rely on an underlying commonality of relational memory processing shared among them. Most studies assess hippocampal memory within just one of these domains, making it difficult to know whether these paradigms all assess a similar underlying cognitive construct tied to the hippocampus. Here we directly tested the commonality among disparate tasks linked to the hippocampus by using PCA on performance from a battery of 12 cognitive tasks that included two traditional, long-delay neuropsychological tests of memory and two laboratory tests of relational memory (one of spatial and one of visual object associations) that imposed only short delays between study and test. Also included were different tests of memory, executive function, and processing speed. Structural MRI scans from a subset of participants were used to quantify the volume of the hippocampus and other subcortical regions. Results revealed that the 12 tasks clustered into four components; critically, the two neuropsychological tasks of long-term verbal memory and the two laboratory tests of relational memory loaded onto one component. Moreover, bilateral hippocampal volume was strongly tied to performance on this component. Taken together, these data emphasize the important contribution the hippocampus makes to relational memory processing across a broad range of tasks that span multiple domains. PMID:25203273

  12. Working memory's workload capacity.

    PubMed

    Heathcote, Andrew; Coleman, James R; Eidels, Ami; Watson, Jason M; Houpt, Joseph; Strayer, David L

    2015-10-01

    We examined the role of dual-task interference in working memory using a novel dual two-back task that requires a redundant-target response (i.e., a response that neither the auditory nor the visual stimulus occurred two back versus a response that one or both occurred two back) on every trial. Comparisons with performance on single two-back trials (i.e., with only auditory or only visual stimuli) showed that dual-task demands reduced both speed and accuracy. Our task design enabled a novel application of Townsend and Nozawa's (Journal of Mathematical Psychology 39: 321-359, 1995) workload capacity measure, which revealed that the decrement in dual two-back performance was mediated by the sharing of a limited amount of processing capacity. Relative to most other single and dual n-back tasks, performance measures for our task were more reliable, due to the use of a small stimulus set that induced a high and constant level of proactive interference. For a version of our dual two-back task that minimized response bias, accuracy was also more strongly correlated with complex span than has been found for most other single and dual n-back tasks.
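    For orientation, the workload capacity measure cited above is commonly written as a ratio of cumulative hazard functions. The form below is the standard first-terminating (OR) version of the capacity coefficient; the subscripts A (auditory only), V (visual only) and AV (dual task) are labels chosen here for illustration, and the abstract does not say which variant of the coefficient the authors applied.

```latex
% Capacity coefficient, first-terminating (OR) form.
% S(t) is the survivor function of response times in a given condition.
\[
  C(t) \;=\; \frac{H_{AV}(t)}{H_{A}(t) + H_{V}(t)},
  \qquad
  H(t) \;=\; -\ln S(t)
\]
% C(t) = 1 : unlimited capacity (dual-task processing matches the
%            prediction from the two single tasks)
% C(t) < 1 : limited capacity
% C(t) > 1 : super capacity
```

    Under this reading, a limited pool of processing capacity shared between the two streams corresponds to values of C(t) below 1.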

  13. Memories of AB

    NASA Astrophysics Data System (ADS)

    Vaks, V. G.

    2013-06-01

    I had the good fortune to be a student of A. B. Migdal - AB, as we called him in person or in his absence - and to work in the sector he headed at the Kurchatov Institute, along with his other students and my friends, including Vitya Galitsky, Spartak Belyayev and Tolya Larkin. I was especially close with AB in the second half of the 1950s, the years most important for my formation, and AB's contribution to this formation was very great. To this day, I've often quoted AB on various occasions, as it's hard to put things better or more precisely than he did; I tell friends stories heard from AB, because these stories enhance life as AB himself enhanced it; my daughter is named Tanya after AB's wife Tatyana Lvovna, and so on. In what follows, I'll recount a few episodes in my life in which AB played an important or decisive role, and then will share some other memories of AB...

  14. 74. AERIAL VIEW OF MEMORIAL BRIDGE AND MEMORIAL AVENUE LOOKING ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    74. AERIAL VIEW OF MEMORIAL BRIDGE AND MEMORIAL AVENUE LOOKING EAST AT LINCOLN MEMORIAL. - George Washington Memorial Parkway, Along Potomac River from McLean to Mount Vernon, VA, Mount Vernon, Fairfax County, VA

  15. Scaling Linear Algebra Kernels using Remote Memory Access

    SciTech Connect

    Krishnan, Manoj Kumar; Lewis, Robert R.; Vishnu, Abhinav

    2010-09-13

    This paper describes the scalability of linear algebra kernels based on a remote memory access approach. The current approach differs from the other linear algebra algorithms by the explicit use of shared memory and remote memory access (RMA) communication rather than message passing. It is suitable for clusters and scalable shared memory systems. The experimental results on large scale systems (Linux-Infiniband cluster, Cray XT) demonstrate consistent performance advantages over the ScaLAPACK suite, the leading implementation of parallel linear algebra algorithms used today. For example, on a Cray XT4 for a matrix size of 102400, our RMA-based matrix multiplication achieved over 55 teraflops while ScaLAPACK’s pdgemm measured close to 42 teraflops on 10000 processes.
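    To put the quoted rates in perspective, the short sketch below converts them into approximate run times and a relative speedup. It assumes the usual operation count of about 2n^3 floating-point operations for a dense n x n matrix multiplication, which the abstract does not state, and it treats "over 55" and "close to 42" teraflops as exactly 55 and 42, so the results are rough estimates only. The function names are illustrative.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
# Assumption (not from the abstract): a dense n x n matrix multiply
# costs about 2 * n**3 floating-point operations.


def gemm_flops(n: int) -> float:
    """Approximate operation count for an n x n dense matrix multiply."""
    return 2.0 * n ** 3


def seconds_at(flops: float, teraflops: float) -> float:
    """Time needed to perform `flops` operations at a sustained rate."""
    return flops / (teraflops * 1e12)


n = 102_400               # matrix size quoted in the abstract
rma_tflops = 55.0         # "over 55 teraflops" (rounded down)
scalapack_tflops = 42.0   # "close to 42 teraflops" (rounded)

work = gemm_flops(n)
rma_seconds = seconds_at(work, rma_tflops)
scalapack_seconds = seconds_at(work, scalapack_tflops)
speedup = rma_tflops / scalapack_tflops

print(f"operations:        {work:.3e}")
print(f"RMA-based time:    {rma_seconds:.1f} s")
print(f"ScaLAPACK time:    {scalapack_seconds:.1f} s")
print(f"relative speedup:  {speedup:.2f}x")
```

    Under these assumptions the RMA-based kernel is roughly 1.3 times faster, which corresponds to about 39 seconds versus about 51 seconds for a single multiplication of this size.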

  16. The Effects of Task Structure on Time-sharing Efficiency and Resource Allocation Optimality

    NASA Technical Reports Server (NTRS)

    Tsang, P. S.; Wickens, C. D.

    1984-01-01

    A distinction was made between two aspects of time sharing performance: time sharing efficiency and attention allocation optimality. A secondary task technique was employed to evaluate the effects of the task structures of the component time shared tasks on both aspects of the time sharing performance. Five pairs of dual tasks differing in their structural configurations were investigated. The primary task was a visual/manual tracking task which required spatial processing. The secondary task was either another tracking task or a verbal memory task with one of four different input/output configurations. Consistent with a common finding, time-sharing efficiency was observed to decrease with an increasing overlap of resources utilized by the time shared tasks. The results also tend to support the hypothesis that resource allocation is more optimal when the time shared tasks place heavy demands on common processing resources than when they utilize separate resources.

  17. The science of sharing and the sharing of science

    PubMed Central

    Milkman, Katherine L.; Berger, Jonah

    2014-01-01

    Why do members of the public share some scientific findings and not others? What can scientists do to increase the chances that their findings will be shared widely among nonscientists? To address these questions, we integrate past research on the psychological drivers of interpersonal communication with a study examining the sharing of hundreds of recent scientific discoveries. Our findings offer insights into (i) how attributes of a discovery and the way it is described impact sharing, (ii) who generates discoveries that are likely to be shared, and (iii) which types of people are most likely to share scientific discoveries. The results described here, combined with a review of recent research on interpersonal communication, suggest how scientists can frame their work to increase its dissemination. They also provide insights about which audiences may be the best targets for the diffusion of scientific content. PMID:25225360

  18. The science of sharing and the sharing of science.

    PubMed

    Milkman, Katherine L; Berger, Jonah

    2014-09-16

    Why do members of the public share some scientific findings and not others? What can scientists do to increase the chances that their findings will be shared widely among nonscientists? To address these questions, we integrate past research on the psychological drivers of interpersonal communication with a study examining the sharing of hundreds of recent scientific discoveries. Our findings offer insights into (i) how attributes of a discovery and the way it is described impact sharing, (ii) who generates discoveries that are likely to be shared, and (iii) which types of people are most likely to share scientific discoveries. The results described here, combined with a review of recent research on interpersonal communication, suggest how scientists can frame their work to increase its dissemination. They also provide insights about which audiences may be the best targets for the diffusion of scientific content.

  19. Knowledge of memory functions in European and Asian American adults and children: the relation to autobiographical memory.

    PubMed

    Wang, Qi; Koh, Jessie Bee Kim; Song, Qingfang; Hou, Yubo

    2015-01-01

    This study investigated explicit knowledge of autobiographical memory functions using a newly developed questionnaire. European and Asian American adults (N = 57) and school-aged children (N = 68) indicated their agreement with 13 statements about why people think about and share memories pertaining to four broad functions: self, social, directive and emotion regulation. Children were interviewed for personal memories concurrently with the memory function knowledge assessment and again 3 months later. It was found that adults agreed with the self, social and directive purposes of memory to a greater extent than did children, whereas European American children agreed with the emotion regulation purposes of memory to a greater extent than did European American adults. Furthermore, European American children endorsed more self and emotion regulation functions than did Asian American children, whereas Asian American adults endorsed more directive functions than did European American adults. Children's endorsement of memory functions, particularly social functions, was associated with more detailed and personally meaningful memories. These findings are informative for the understanding of developmental and cultural influences on memory function knowledge and of the relation of such knowledge to autobiographical memory development.

  20. Sharing the Preservation Burden

    SciTech Connect

    Giaretta, D.

    2008-07-01

    Preserving digitally encoded information which is not just to be rendered, as a document, but which must be processed, like data, is even harder than one might think, because understandability of the information which is encoded in the digital object(s) is what is required. Information about Nuclear Waste will include both documents and data. Moreover one must be able to understand the relationship between the many individual pieces of information. Furthermore the volume of information involved will require us to allow automated processing of such information. Preserving the ability to understand and process digitally encoded information over long periods of time is especially hard when so many things will change, including hardware, software, environment and the tacit and implicit knowledge that people have. Since we cannot predict these changes this cannot be just a one-off action; continued effort is required. However it seems reasonable to say that no organization, project or person can ever say for certain that their ability to provide this effort is going to last forever. What can be done? Can anything be guaranteed? Probably not guaranteed - but at least one can try to reduce the risk of losing the information. We argue that if no single organization, project or person can guarantee funding or effort (or even interest), then somehow we must share the 'preservation load', and this is more than a simple chain of preservation consisting of handing on the collection of bits from one holder to the next. Clearly the bits must be passed on (but may be transformed along the way), however something more is required - because of the need to maintain understandability, not just access. This paper describes the tools, techniques and infrastructure components which the CASPAR project is producing to help in sharing the preservation burden. In summary: CASPAR is attempting to use OAIS concepts rigorously and to the fullest extent possible, supplementing these where