Science.gov

Sample records for shared memory multiprocessors

  1. Shared versus distributed memory multiprocessors

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.

    1991-01-01

    The question of whether multiprocessors should have shared or distributed memory has attracted a great deal of attention. Some researchers argue strongly for building distributed memory machines, while others argue just as strongly for programming shared memory multiprocessors. A great deal of research is underway on both types of parallel systems. Special emphasis is placed on systems with a very large number of processors for computation-intensive tasks, and research and implementation trends are considered. It appears that the two types of systems will likely converge to a common form for large scale multiprocessors.

  2. Efficient ICCG on a shared memory multiprocessor

    NASA Technical Reports Server (NTRS)

    Hammond, Steven W.; Schreiber, Robert

    1989-01-01

    Different approaches are discussed for exploiting parallelism in the ICCG (Incomplete Cholesky Conjugate Gradient) method for solving large sparse symmetric positive definite systems of equations on a shared memory parallel computer. Techniques for efficiently solving triangular systems and computing sparse matrix-vector products are explored. Three methods for scheduling the tasks in solving triangular systems are implemented on the Sequent Balance 21000. Sample problems that are representative of a large class of problems solved using iterative methods are used. We show that a static analysis to determine data dependences in the triangular solve can greatly improve its parallel efficiency. We also show that ignoring symmetry and storing the whole matrix can reduce solution time substantially.
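
    The static dependence analysis mentioned in this abstract can be illustrated with level scheduling, a standard way to expose parallelism in a sparse triangular solve. The sketch below is not taken from the paper: the row-dictionary storage, the function names `build_levels` and `solve_by_levels`, and the 4x4 example are assumptions made for illustration. Rows within one level do not depend on each other and could be handed to different processors; here they are processed serially.

```python
# Level scheduling for a sparse lower-triangular solve L x = b (illustrative sketch).
# rows[i] is a dict {column j: L[i][j]} with j <= i, diagonal included (assumed storage).

def build_levels(rows):
    """Group rows into levels; a row's level is one more than its deepest dependence."""
    n = len(rows)
    level = [0] * n
    for i in range(n):
        deps = [j for j in rows[i] if j < i]
        level[i] = 1 + max((level[j] for j in deps), default=-1)
    schedule = [[] for _ in range(max(level, default=-1) + 1)]
    for i, lv in enumerate(level):
        schedule[lv].append(i)
    return schedule

def solve_by_levels(rows, b):
    """Forward substitution, visiting rows level by level."""
    x = [0.0] * len(rows)
    for rows_in_level in build_levels(rows):
        # Every row in this level depends only on rows from earlier levels,
        # so this inner loop is the part that could run in parallel.
        for i in rows_in_level:
            s = sum(v * x[j] for j, v in rows[i].items() if j < i)
            x[i] = (b[i] - s) / rows[i][i]
    return x

# Hypothetical example: rows 0 and 1 are independent, row 2 needs row 0,
# and row 3 needs rows 1 and 2.
L_rows = [
    {0: 2.0},
    {1: 4.0},
    {0: 1.0, 2: 1.0},
    {1: 2.0, 2: 1.0, 3: 2.0},
]
b = [2.0, 4.0, 3.0, 8.0]
schedule = build_levels(L_rows)
x = solve_by_levels(L_rows, b)
```

    The schedule is computed once from the sparsity pattern and can be reused for every triangular solve in the iteration, which is why a static analysis is attractive for this kind of solver.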

  3. Performance and scalability aspects of directory-based cache coherence in shared-memory multiprocessors

    SciTech Connect

    Picano, S.; Meyer, D.G.; Brooks, E.D. III; Hoag, J.E.

    1993-05-01

    We present a study that accentuates the performance and scalability aspects of directory-based cache coherence in multiprocessor systems. On a multiprocessor with a software-based coherence scheme, efficient implementations rely heavily on the programmer's ability to explicitly manage the memory system, which is typically handled by hardware support on other bus-based, shared memory multiprocessors. We describe a scalable, shared memory, cache coherent multiprocessor and present simulation results obtained on three parallel programs. This multiprocessor configuration exhibits high performance at no additional parallel programming cost.

  4. MPF: A portable message passing facility for shared memory multiprocessors

    NASA Technical Reports Server (NTRS)

    Malony, Allen D.; Reed, Daniel A.; Mcguire, Patrick J.

    1987-01-01

    The design, implementation, and performance evaluation of a message passing facility (MPF) for shared memory multiprocessors are presented. The MPF is based on a message passing model conceptually similar to conversations. Participants (parallel processors) can enter or leave a conversation at any time. The message passing primitives for this model are implemented as a portable library of C function calls. The MPF is currently operational on a Sequent Balance 21000, and several parallel applications were developed and tested. Several simple benchmark programs are presented to establish interprocess communication performance for common patterns of interprocess communication. Finally, performance figures are presented for two parallel applications, linear systems solution, and iterative solution of partial differential equations.

  5. Dynamic programming on a shared-memory multiprocessor

    NASA Technical Reports Server (NTRS)

    Edmonds, Phil; Chu, Eleanor; George, Alan

    1993-01-01

    Three new algorithms for solving dynamic programming problems on a shared-memory parallel computer are described. All three algorithms attempt to balance work load, while keeping synchronization cost low. In particular, for a multiprocessor having p processors, an analysis of the best algorithm shows that the arithmetic cost is O(n^3/(6p)) and that the synchronization cost is O(|log_C n|) if p << n, where C = (2p-1)/(2p+1) and n is the size of the problem. The low synchronization cost is important for machines where synchronization is expensive. Analysis and experiments show that the best algorithm is effective in balancing the work load and producing high efficiency.
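
    To get a feel for the two cost terms quoted in this abstract, the following sketch simply evaluates them for given n and p. The function names are hypothetical, and the hidden constants of the O-notation are ignored, so the numbers indicate growth only, not measured cost.

```python
import math

def arithmetic_term(n, p):
    """The n-cubed over 6p term from the abstract (constant factors ignored)."""
    return n ** 3 / (6 * p)

def sync_term(n, p):
    """|log_C n| with C = (2p - 1) / (2p + 1), computed as |ln n / ln C|."""
    C = (2 * p - 1) / (2 * p + 1)
    return abs(math.log(n) / math.log(C))

# Illustrative values only: a problem of size 1000 on 4 and on 8 processors.
for p in (4, 8):
    print(p, arithmetic_term(1000, p), sync_term(1000, p))
```

    Because C approaches 1 as p grows, ln C approaches 0 and the synchronization term increases with p for fixed n, while the arithmetic term shrinks in proportion to 1/p.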

  6. A robot arm simulation with a shared memory multiprocessor machine

    NASA Technical Reports Server (NTRS)

    Kim, Sung-Soo; Chuang, Li-Ping

    1989-01-01

    A parallel processing scheme for a single chain robot arm is presented for high speed computation on a shared memory multiprocessor. A recursive formulation that is derived from a virtual work form of the d'Alembert equations of motion is utilized for robot arm dynamics. A joint drive system that consists of a motor rotor and gears is included in the arm dynamics model, in order to take into account gyroscopic effects due to the spinning of the rotor. The fine grain parallelism of mechanical and control subsystem models is exploited, based on independent computation associated with bodies, joint drive systems, and controllers. Efficiency and effectiveness of the parallel scheme are demonstrated through simulations of a telerobotic manipulator arm. Two different mechanical subsystem models, i.e., with and without gyroscopic effects, are compared, to show the trade-off between efficiency and accuracy.

  7. Simulation Analysis of Data Sharing in Shared Memory Multiprocessors

    DTIC Science & Technology

    2016-06-14

    Submitted in partial satisfaction of the requirements for the degree of DOCTOR OF PHILOSOPHY in COMPUTER SCIENCE in the GRADUATE DIVISION of the...California at Berkeley, Department of Electrical Engineering and Computer Sciences, Berkeley, CA, 94720 8. PERFORMING ORGANIZATION REPORT NUMBER 9...Susan J. Eggers Computer Science Division University of California Berkeley CA 94720 ABSTRACT This dissertation examines shared memory reference

  8. Parallelization of NAS Benchmarks for Shared Memory Multiprocessors

    NASA Technical Reports Server (NTRS)

    Waheed, Abdul; Yan, Jerry C.; Saini, Subhash (Technical Monitor)

    1998-01-01

    This paper presents our experiences of parallelizing the sequential implementation of NAS benchmarks using compiler directives on the SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application, leaving the task of porting to new generations of high performance computing systems to parallelization tools and compilers. Due to the simplicity of programming shared-memory multiprocessors, compiler developers have provided various facilities to allow the users to exploit parallelism. Native compilers on the SGI Origin2000 support multiprocessing directives to allow users to exploit loop-level parallelism in their programs. Additionally, supporting tools can accomplish this process automatically and present the results of parallelization to the users. We experimented with these compiler directives and supporting tools by parallelizing the sequential implementation of NAS benchmarks. Results reported in this paper indicate that with minimal effort, the performance gain is comparable with the hand-parallelized, carefully optimized, message-passing implementations of the same benchmarks.

  9. Parallel-vector algorithms for particle simulations on shared-memory multiprocessors

    SciTech Connect

    Nishiura, Daisuke; Sakaguchi, Hide

    2011-03-01

    Over the last few decades, the computational demands of massive particle-based simulations for both scientific and industrial purposes have been continuously increasing. Hence, considerable efforts are being made to develop parallel computing techniques on various platforms. In such simulations, particles freely move within a given space, and so on a distributed-memory system, load balancing, i.e., assigning an equal number of particles to each processor, is not guaranteed. In contrast, shared-memory systems achieve better load balancing for particle models, but suffer from the intrinsic drawback of memory access competition, particularly during (1) pairing of contact candidates from among neighboring particles and (2) force summation for each particle. Here, novel algorithms are proposed to overcome these two problems. For the first problem, the key is a pre-conditioning process during which particle labels are sorted by a cell label in the domain to which the particles belong. Then, a list of contact candidates is constructed by pairing the sorted particle labels. For the latter problem, a table comprising the list indexes of the contact candidate pairs is created and used to sum the contact forces acting on each particle for all contacts according to Newton's third law. With just these methods, memory access competition is avoided without additional redundant procedures. The parallel efficiency and compatibility of these two algorithms were evaluated in discrete element method (DEM) simulations on four types of shared-memory parallel computers: a multicore multiprocessor computer, scalar supercomputer, vector supercomputer, and graphics processing unit. The computational efficiency of a DEM code was found to be drastically improved with our algorithms on all but the scalar supercomputer. Thus, the developed parallel algorithms are useful on shared-memory parallel computers with sufficient memory bandwidth.
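
    The two steps described in this abstract can be sketched in a few lines. This is a simplified serial illustration and not the authors' implementation: the 2D cell grid, scalar pair forces, and the names `cell_of`, `contact_candidates`, and `sum_forces` are assumptions. The point is that each particle gathers its own contributions through an index table, so no two particles ever write to the same accumulator.

```python
from collections import defaultdict

def cell_of(pos, cell_size):
    """Cell label (integer grid coordinates) of a 2D position."""
    return (int(pos[0] // cell_size), int(pos[1] // cell_size))

def contact_candidates(positions, cell_size):
    """Sort particle labels by cell label, then pair particles in the same or adjacent cells."""
    order = sorted(range(len(positions)),
                   key=lambda i: cell_of(positions[i], cell_size))
    members = defaultdict(list)
    for i in order:
        members[cell_of(positions[i], cell_size)].append(i)
    pairs = []
    for (cx, cy), plist in members.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in members.get((cx + dx, cy + dy), ()):
                    for i in plist:
                        if i < j:  # keep each unordered pair exactly once
                            pairs.append((i, j))
    return sorted(pairs)

def sum_forces(n, pairs, pair_forces):
    """Gather per-particle forces through a table of pair-list indexes."""
    table = [[] for _ in range(n)]
    for k, (i, j) in enumerate(pairs):
        table[i].append((k, +1.0))
        table[j].append((k, -1.0))  # Newton's third law: equal and opposite
    # Each particle reads only its own table row and writes only its own total.
    return [sum(sign * pair_forces[k] for k, sign in table[p]) for p in range(n)]

# Hypothetical data: four particles, unit cells, one scalar force per candidate pair.
positions = [(0.1, 0.1), (0.2, 0.2), (2.5, 2.5), (1.1, 0.1)]
pairs = contact_candidates(positions, 1.0)
totals = sum_forces(len(positions), pairs, [1.0, 2.0, 4.0])
```

    In a parallel version, the final list comprehension is the loop that would be split across threads, since its iterations are independent.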

  10. Avoiding and tolerating latency in large-scale next-generation shared-memory multiprocessors

    NASA Technical Reports Server (NTRS)

    Probst, David K.

    1993-01-01

    A scalable solution to the memory-latency problem is necessary to prevent the large latencies of synchronization and memory operations inherent in large-scale shared-memory multiprocessors from reducing high performance. We distinguish latency avoidance and latency tolerance. Latency is avoided when data is brought to nearby locales for future reference. Latency is tolerated when references are overlapped with other computation. Latency-avoiding locales include: processor registers, data caches used temporally, and nearby memory modules. Tolerating communication latency requires parallelism, allowing the overlap of communication and computation. Latency-tolerating techniques include: vector pipelining, data caches used spatially, prefetching in various forms, and multithreading in various forms. Relaxing the consistency model permits increased use of avoidance and tolerance techniques. Each model is a mapping from the program text to sets of partial orders on program operations; it is a convention about which temporal precedences among program operations are necessary. Information about temporal locality and parallelism constrains the use of avoidance and tolerance techniques. Suitable architectural primitives and compiler technology are required to exploit the increased freedom to reorder and overlap operations in relaxed models.

  11. Performance Modeling and Measurement of Parallelized Code for Distributed Shared Memory Multiprocessors

    NASA Technical Reports Server (NTRS)

    Waheed, Abdul; Yan, Jerry

    1998-01-01

    This paper presents a model to evaluate the performance and overhead of parallelizing sequential code using compiler directives for multiprocessing on distributed shared memory (DSM) systems. With increasing popularity of shared address space architectures, it is essential to understand their performance impact on programs that benefit from shared memory multiprocessing. We present a simple model to characterize the performance of programs that are parallelized using compiler directives for shared memory multiprocessing. We parallelized the sequential implementation of NAS benchmarks using native Fortran77 compiler directives for an Origin2000, which is a DSM system based on a cache-coherent Non Uniform Memory Access (ccNUMA) architecture. We report measurement based performance of these parallelized benchmarks from four perspectives: efficacy of parallelization process; scalability; parallelization overhead; and comparison with hand-parallelized and -optimized version of the same benchmarks. Our results indicate that sequential programs can conveniently be parallelized for DSM systems using compiler directives but realizing performance gains as predicted by the performance model depends primarily on minimizing architecture-specific data locality overhead.

  12. Reader set encoding for directory of shared cache memory in multiprocessor system

    DOEpatents

    Ahn, Daniel; Ceze, Luis H.; Gara, Alan; Ohmacht, Martin; Xiaotong, Zhuang

    2014-06-10

    In a parallel processing system with speculative execution, conflict checking occurs in a directory lookup of a cache memory that is shared by all processors. In each case, the same physical memory address will map to the same set of that cache, no matter which processor originated that access. The directory includes a dynamic reader set encoding, indicating what speculative threads have read a particular line. This reader set encoding is used in conflict checking. A bitset encoding is used to specify particular threads that have read the line.
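
    A bitset reader set of the kind mentioned in this abstract can be modeled with one integer per directory entry. The sketch below is a software illustration only: the class name, the method names, and the conflict rule (a write conflicts with every other thread that has read the line) are assumptions, not details taken from the patent.

```python
class DirectoryEntry:
    """Directory state for one cache line (illustrative model)."""

    def __init__(self):
        # Bit t is set when speculative thread t has read the line.
        self.readers = 0

    def record_read(self, thread_id):
        self.readers |= 1 << thread_id

    def conflicting_readers(self, writer_id):
        """Threads other than the writer that have read the line (assumed conflict rule)."""
        others = self.readers & ~(1 << writer_id)
        return [t for t in range(others.bit_length()) if (others >> t) & 1]

# Hypothetical usage: threads 1 and 3 read the line, then thread 3 writes it.
entry = DirectoryEntry()
entry.record_read(1)
entry.record_read(3)
conflicts = entry.conflicting_readers(3)
```

    A bitset costs one bit per thread per line but lets membership tests and updates run as single bitwise operations.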

  13. Shared performance monitor in a multiprocessor system

    DOEpatents

    Chiu, George; Gara, Alan G; Salapura, Valentina

    2014-12-02

    A performance monitoring unit (PMU) and method for monitoring performance of events occurring in a multiprocessor system. The multiprocessor system comprises a plurality of processor devices units, each processor device for generating signals representing occurrences of events in the processor device, and, a single shared counter resource for performance monitoring. The performance monitor unit is shared by all processor cores in the multiprocessor system. The PMU is further programmed to monitor event signals issued from non-processor devices.

  14. Layer-by-layer ordering in parallel finite element composition on shared-memory multiprocessors

    NASA Astrophysics Data System (ADS)

    Novikov, A. K.; Piminova, N. K.; Kopysov, S. P.; Sagdeeva, YA

    2016-11-01

    In this paper, we present new partitioning algorithms for unstructured meshes that prevent conflicts during parallel assembling of FEM matrices and vectors in shared memory. These algorithms use a ratio which we introduce to determine if any two mesh cells are adjacent. This adjacency ratio defines mesh layers, which are combined into domains and assigned to different parallel processes/threads. The proposed partitioning algorithms are compared with the existing algorithms on quasi-structured and unstructured meshes by the number of potential conflicts and by the load imbalance.

  15. Shared performance monitor in a multiprocessor system

    DOEpatents

    Chiu, George; Gara, Alan G.; Salapura, Valentina

    2012-07-24

    A performance monitoring unit (PMU) and method for monitoring performance of events occurring in a multiprocessor system. The multiprocessor system comprises a plurality of processor devices units, each processor device for generating signals representing occurrences of events in the processor device, and, a single shared counter resource for performance monitoring. The performance monitor unit is shared by all processor cores in the multiprocessor system. The PMU comprises: a plurality of performance counters each for counting signals representing occurrences of events from one or more the plurality of processor units in the multiprocessor system; and, a plurality of input devices for receiving the event signals from one or more processor devices of the plurality of processor units, the plurality of input devices programmable to select event signals for receipt by one or more of the plurality of performance counters for counting, wherein the PMU is shared between multiple processing units, or within a group of processors in the multiprocessing system. The PMU is further programmed to monitor event signals issued from non-processor devices.

  16. A general model for memory interference in a multiprocessor system with memory hierarchy

    NASA Technical Reports Server (NTRS)

    Taha, Badie A.; Standley, Hilda M.

    1989-01-01

    The problem of memory interference in a multiprocessor system with a hierarchy of shared buses and memories is addressed. The behavior of the processors is represented by a sequence of memory requests with each followed by a determined amount of processing time. A statistical queuing network model for determining the extent of memory interference in multiprocessor systems with clusters of memory hierarchies is presented. The performance of the system is measured by the expected number of busy memory clusters. The results of the analytic model are compared with simulation results, and the correlation between them is found to be very high.

  17. Preliminary basic performance analysis of the Cedar multiprocessor memory system

    NASA Technical Reports Server (NTRS)

    Gallivan, K.; Jalby, W.; Turner, S.; Veidenbaum, A.; Wijshoff, H.

    1991-01-01

    Some preliminary basic results on the performance of the Cedar multiprocessor memory system are presented. Empirical results are presented and used to calibrate a memory system simulator which is then used to discuss the scalability of the system.

  18. Scalable Triadic Analysis of Large-Scale Graphs: Multi-Core vs. Multi-Processor vs. Multi-Threaded Shared Memory Architectures

    SciTech Connect

    Chin, George; Marquez, Andres; Choudhury, Sutanay; Feo, John T.

    2012-09-01

    Triadic analysis encompasses a useful set of graph mining methods that is centered on the concept of a triad, which is a subgraph of three nodes and the configuration of directed edges across the nodes. Such methods are often applied in the social sciences as well as many other diverse fields. Triadic methods commonly operate on a triad census that counts the number of triads of every possible edge configuration in a graph. Like other graph algorithms, triadic census algorithms do not scale well when graphs reach tens of millions to billions of nodes. To enable the triadic analysis of large-scale graphs, we developed and optimized a triad census algorithm to efficiently execute on shared memory architectures. We will retrace the development and evolution of a parallel triad census algorithm. Over the course of several versions, we continually adapted the code’s data structures and program logic to expose more opportunities to exploit parallelism on shared memory that would translate into improved computational performance. We will recall the critical steps and modifications that occurred during code development and optimization. Furthermore, we will compare the performances of triad census algorithm versions on three specific systems: Cray XMT, HP Superdome, and AMD multi-core NUMA machine. These three systems have shared memory architectures but with markedly different hardware capabilities to manage parallelism.

  19. Optimal eigenvalue computation on distributed-memory MIMD multiprocessors

    SciTech Connect

    Crivelli, S.; Jessup, E. R.

    1992-10-01

    Simon proves that bisection is not the optimal method for computing an eigenvalue on a single vector processor. In this paper, we show that his analysis does not extend in a straightforward way to the computation of an eigenvalue on a distributed-memory MIMD multiprocessor. In particular, we show how the optimal number of sections (and processors) to use for multisection depends on variables such as the matrix size and certain parameters inherent to the machine. We also show that parallel multisection outperforms the variant of parallel bisection proposed by Swarztrauber for this problem on a distributed-memory MIMD multiprocessor. We present the results of experiments on the 64-processor Intel iPSC/2 hypercube and the 512-processor Intel Touchstone Delta mesh multiprocessor.
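
    For readers unfamiliar with multisection, the sketch below shows the idea for a symmetric tridiagonal matrix, using the standard Sturm-sequence count of eigenvalues below a point. It is a serial illustration under stated assumptions, not the authors' code: the function names, the `points` parameter, and the example matrix are hypothetical. With `points=1` the loop reduces to ordinary bisection; with more points, the counts at the interior points are independent and could each be assigned to a different processor.

```python
def count_below(d, e, x):
    """Number of eigenvalues less than x for the symmetric tridiagonal matrix
    with diagonal d and off-diagonal e (Sturm sequence count)."""
    count, q = 0, 1.0
    for i in range(len(d)):
        off = e[i - 1] ** 2 / q if i > 0 else 0.0
        q = d[i] - x - off
        if q == 0.0:
            q = 1e-300  # nudge an exact zero pivot so the next division is defined
        if q < 0.0:
            count += 1
    return count

def multisection(d, e, m, lo, hi, points=4, tol=1e-10):
    """Locate eigenvalue number m (0-based, ascending), assuming
    count_below(lo) <= m < count_below(hi)."""
    while hi - lo > tol:
        step = (hi - lo) / (points + 1)
        xs = [lo + step * (k + 1) for k in range(points)]
        counts = [count_below(d, e, x) for x in xs]  # independent evaluations
        new_lo, new_hi = lo, hi
        for x, c in zip(xs, counts):
            if c > m:        # first point that lies above the target eigenvalue
                new_hi = x
                break
            new_lo = x
        lo, hi = new_lo, new_hi
    return 0.5 * (lo + hi)

# Hypothetical example: tridiag(-1, 2, -1) of order 3 has eigenvalues
# 2 - sqrt(2), 2, and 2 + sqrt(2), all inside the interval [0, 4].
d = [2.0, 2.0, 2.0]
e = [-1.0, -1.0]
middle = multisection(d, e, 1, 0.0, 4.0)
smallest = multisection(d, e, 0, 0.0, 4.0)
```

    Each pass shrinks the interval by a factor of points + 1 at the price of points count evaluations, which is the trade-off behind the question of how many sections to use.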

  20. Using Pin as a Memory Reference Generator for Multiprocessor Simulation

    SciTech Connect

    McCurdy, C

    2005-10-22

    In this paper we describe how we have used Pin to generate a multithreaded reference stream for simulation of a multiprocessor on a uniprocessor. We have taken special care to model as accurately as possible the effects of cache coherence protocol state, and lock and barrier synchronization on the performance of multithreaded applications running on multiprocessor hardware. We first describe a simplified version of the algorithm, which uses semaphores to synchronize instrumented application threads and the simulator on every memory reference. We then describe modifications to that algorithm to model the microarchitectural features of the Itanium2 that affect the timing of memory reference issue. An experimental evaluation determines that while cycle-accurate multithreaded simulation is possible using our approach, the use of semaphores has a negative impact on the performance of the simulator.

  1. Low Latency Messages on Distributed Memory Multiprocessors

    DOE PAGES

    Rosing, Matt; Saltz, Joel

    1995-01-01

    This article describes many of the issues in developing an efficient interface for communication on distributed memory machines. Although the hardware component of message latency is less than 1 μs on many distributed memory machines, the software latency associated with sending and receiving typed messages is on the order of 50 μs. The reason for this imbalance is that the software interface does not match the hardware. By changing the interface to match the hardware more closely, applications with fine grained communication can be put on these machines. This article describes several tests performed and many of the issues involved in supporting low latency messages on distributed memory machines.

  2. Low latency messages on distributed memory multiprocessors

    NASA Technical Reports Server (NTRS)

    Rosing, Matthew; Saltz, Joel

    1993-01-01

    Many of the issues in developing an efficient interface for communication on distributed memory machines are described and a portable interface is proposed. Although the hardware component of message latency is less than one microsecond on many distributed memory machines, the software latency associated with sending and receiving typed messages is on the order of 50 microseconds. The reason for this imbalance is that the software interface does not match the hardware. By changing the interface to match the hardware more closely, applications with fine grained communication can be put on these machines. Based on several tests that were run on the iPSC/860, an interface that will better match current distributed memory machines is proposed. The model used in the proposed interface consists of a computation processor and a communication processor on each node. Communication between these processors and other nodes in the system is done through a buffered network. Information that is transmitted is either data or procedures to be executed on the remote processor. The dual processor system is better suited for efficiently handling asynchronous communications compared to a single processor system. The ability to send data or procedures is very flexible for minimizing message latency, based on the type of communication being performed. The tests performed and the proposed interface are described.

  3. Software Coherence in Multiprocessor Memory Systems. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Bolosky, William Joseph

    1993-01-01

    Processors are becoming faster and multiprocessor memory interconnection systems are not keeping up. Therefore, it is necessary to have threads and the memory they access as near one another as possible. Typically, this involves putting memory or caches with the processors, which gives rise to the problem of coherence: if one processor writes an address, any other processor reading that address must see the new value. This coherence can be maintained by the hardware or with software intervention. Systems of both types have been built in the past; the hardware-based systems tended to outperform the software ones. However, the ratio of processor to interconnect speed is now so high that the extra overhead of the software systems may no longer be significant. This issue is explored both by implementing a software maintained system and by introducing and using the technique of offline optimal analysis of memory reference traces. It finds that in properly built systems, software maintained coherence can perform comparably to or even better than hardware maintained coherence. The architectural features necessary for efficient software coherence to be profitable include a small page size, a fast trap mechanism, and the ability to execute instructions while remote memory references are outstanding.

  4. Generation-based memory synchronization in a multiprocessor system with weakly consistent memory accesses

    DOEpatents

    Ohmacht, Martin

    2014-09-09

    In a multiprocessor system, a central memory synchronization module coordinates memory synchronization requests responsive to memory access requests in flight, a generation counter, and a reclaim pointer. The central module communicates via point-to-point communication. The module includes a global OR reduce tree for each memory access requesting device, for detecting memory access requests in flight. An interface unit is implemented associated with each processor requesting synchronization. The interface unit includes multiple generation completion detectors. The generation count and reclaim pointer do not pass one another.

  5. Multi-ring performance of the Kendall square multiprocessor

    SciTech Connect

    Dunigan, T.H.

    1994-03-01

    Performance of the hierarchical shared-memory system of the Kendall Square Research multiprocessor is measured and characterized. The performance of prefetch is measured. Latency, bandwidth, and contention are analyzed on a 4-ring, 128 processor system. Scalability comparisons are made with other shared-memory and distributed-memory multiprocessors.

  6. Optical RAM-enabled cache memory and optical routing for chip multiprocessors: technologies and architectures

    NASA Astrophysics Data System (ADS)

    Pleros, Nikos; Maniotis, Pavlos; Alexoudi, Theonitsa; Fitsios, Dimitris; Vagionas, Christos; Papaioannou, Sotiris; Vyrsokinos, K.; Kanellos, George T.

    2014-03-01

    The processor-memory performance gap, commonly referred to as the "Memory Wall" problem, is due to the speed mismatch between processor and electronic RAM clock frequencies, forcing current Chip Multiprocessor (CMP) configurations to consume more than 50% of the chip real-estate for caching purposes. In this article, we present our recent work spanning from Si-based integrated optical RAM cell architectures up to complete optical cache memory architectures for Chip Multiprocessor configurations. Moreover, we discuss e/o router subsystems with up to Tb/s routing capacity for cache interconnection purposes within CMP configurations, currently pursued within the FP7 PhoxTrot project.

  7. Kendall Square multiprocessor: Early experiences and performance

    SciTech Connect

    Dunigan, T.H.

    1992-04-01

    Initial performance results and early experiences are reported for the Kendall Square Research multiprocessor. The basic architecture of the shared-memory multiprocessor is described, and computational and I/O performance is measured for both serial and parallel programs. Experiences in porting various applications are described.

  8. Vienna FORTRAN: A FORTRAN language extension for distributed memory multiprocessors

    NASA Technical Reports Server (NTRS)

    Chapman, Barbara; Mehrotra, Piyush; Zima, Hans

    1991-01-01

    Exploiting the performance potential of distributed memory machines requires a careful distribution of data across the processors. Vienna FORTRAN is a language extension of FORTRAN which provides the user with a wide range of facilities for such mapping of data structures. However, programs in Vienna FORTRAN are written using global data references. Thus, the user has the advantage of a shared memory programming paradigm while explicitly controlling the placement of data. The basic features of Vienna FORTRAN are presented along with a set of examples illustrating the use of these features.

  9. Direct Deposit -- When Message Passing Meets Shared Memory

    DTIC Science & Technology

    2000-05-19

    by H. Karl in [64]. The paper implements the pure DSM code, the pure message passing code and a few intermediate forms on the Charlotte DSM system [8...ASPLOS VI), pages 51–60, San Jose, October 1994. ACM. [64] H. Karl . Bridging the gap between distributed shared memory and message passing. Concurrency...pages 94 – 101, 1988. [73] P.N. Loewenstein and D.L. Dill. Verification of a multiprocessor cache protocol using simulation relations and higher-order

  10. Communications Patterns in a Symbolic Multiprocessor.

    DTIC Science & Technology

    1987-06-01

    tied to memory through a 'butterfly' switching network. All system memory is shared among all the processors. There are no fundamental engineering...Palo Alto, CA, 1.0 edition, November 2 1982. [5] Development of a Butterfly Multiprocessor Test Bed. Quarterly Technical Report 5872, Bolt... Monarch multiprocessor. MIT VLSI Seminar, April 14 1987. Author is employed by B.B.N. Inc. [26] Allan Gottlieb, Ralph Grishman, Clyde Kruskal, Kevin

  11. Conditional load and store in a shared memory

    DOEpatents

    Blumrich, Matthias A; Ohmacht, Martin

    2015-02-03

    A method, system and computer program product for implementing load-reserve and store-conditional instructions in a multi-processor computing system. The computing system includes a multitude of processor units and a shared memory cache, and each of the processor units has access to the memory cache. In one embodiment, the method comprises providing the memory cache with a series of reservation registers, and storing in these registers addresses reserved in the memory cache for the processor units as a result of issuing load-reserve requests. In this embodiment, when one of the processor units makes a request to store data in the memory cache using a store-conditional request, the reservation registers are checked to determine if an address in the memory cache is reserved for that processor unit. If an address in the memory cache is reserved for that processor, the data are stored at this address.
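
    The reservation-register mechanism described in this abstract can be modeled in software as follows. This is an illustrative sketch, not the patented design: the class and method names, the use of one register per processor, and the rule that a successful store clears all reservations on the same address are assumptions (the last one follows the usual load-reserve/store-conditional semantics).

```python
class SharedCache:
    """Toy model of a shared cache with per-processor reservation registers."""

    def __init__(self, num_procs):
        self.data = {}
        # Assumed layout: one reservation register per processor unit.
        self.reservation = [None] * num_procs

    def load_reserve(self, proc, addr):
        """Read addr and record a reservation on it for this processor."""
        self.reservation[proc] = addr
        return self.data.get(addr, 0)

    def store_conditional(self, proc, addr, value):
        """Store only if this processor still holds a reservation on addr."""
        if self.reservation[proc] != addr:
            return False
        self.data[addr] = value
        # Assumption: a successful store invalidates every reservation on addr.
        for p, reserved in enumerate(self.reservation):
            if reserved == addr:
                self.reservation[p] = None
        return True

# Hypothetical usage: two processors race to increment the same location.
cache = SharedCache(num_procs=2)
v0 = cache.load_reserve(0, 0x10)
v1 = cache.load_reserve(1, 0x10)
first = cache.store_conditional(0, 0x10, v0 + 1)
second = cache.store_conditional(1, 0x10, v1 + 1)
```

    The processor whose store fails would normally retry the load-reserve and store-conditional pair, which is how atomic read-modify-write sequences are built from these two instructions.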

  12. Multiprocessor execution of functional programs

    SciTech Connect

    Goldberg, B.

    1988-10-01

    Functional languages have recently gained attention as vehicles for programming in a concise and elegant manner. In addition, it has been suggested that functional programming provides a natural methodology for programming multiprocessor computers. This paper describes research that was performed to demonstrate that multiprocessor execution of functional programs on current multiprocessors is feasible, and results in a significant reduction in their execution times. Two implementations of the functional language ALFL were built on commercially available multiprocessors. Alfalfa is an implementation on the Intel iPSC hypercube multiprocessor, and Buckwheat is an implementation on the Encore Multimax shared-memory multiprocessor. Each implementation includes a compiler that performs automatic decomposition of ALFL programs and a run-time system that supports their execution. The compiler is responsible for detecting the inherent parallelism in a program, and decomposing the program into a collection of tasks, called serial combinators, that can be executed in parallel. The abstract machine model supported by Alfalfa and Buckwheat is called heterogeneous graph reduction, which is a hybrid of graph reduction and conventional stack-oriented execution. This model supports parallelism, lazy evaluation, and higher order functions while at the same time making efficient use of the processors in the system. The Alfalfa and Buckwheat runtime systems support dynamic load balancing, interprocessor communication (if required), and storage management. A large number of experiments were performed on Alfalfa and Buckwheat for a variety of programs. The results of these experiments, as well as the conclusions drawn from them, are presented.

  13. Dynamically reconfigurable multiprocessor system for high-order-bidirectional-associative-memory-based image recognition

    NASA Astrophysics Data System (ADS)

    Wu, Chwan-Hwa; Roland, David A.

    1991-08-01

    In this paper a high-order bidirectional associative memory (HOBAM) based image recognition system and a dynamically reconfigurable multiprocessor system that achieves real-time response are reported. The HOBAM has been utilized to recognize corrupted images of human faces (with hats, glasses, masks, and slight translation and scaling effects). In addition, the HOBAM, incorporated with edge detection techniques, has been used to recognize isolated objects within multiple-object images. Successful recognition rates have been achieved. A dynamically reconfigurable multiprocessor system and parallel software have been developed to achieve real-time response for image recognition. The system consists of Inmos transputers and crossbar switches (IMS C004). The communication links can be dynamically connected by circuit switching. This is the first time that transputers and crossbar switches are reported to form a low-cost multiprocessor system connected by a switching network. Moreover, the switching network simplifies the design of the communication in parallel software without handling the message routing. Although the HOBAM is a fully connected network, the algorithm minimizes the amount of information that needs to be exchanged between processors using a data compression technique. The detailed design of both hardware and software is discussed in the paper. Significant speedup through parallel processing is accomplished. The architecture of the experimental system is a cost-effective design for an embedded system for neural network applications on computer vision.

  14. Memory access in shared virtual memory

    SciTech Connect

    Berrendorf, R.

    1992-01-01

    Shared virtual memory (SVM) is a virtual memory layer with a single address space on top of a distributed real memory on parallel computers. We examine the behavior and performance of SVM running a parallel program with medium-grained, loop-level parallelism on top of it. A simulator for the underlying parallel architecture can be used to examine the behavior of SVM more deeply. The influence of several parameters, such as the number of processors, page size, cold or warm start, and restricted page replication, is studied.

  15. Memory access in shared virtual memory

    SciTech Connect

    Berrendorf, R.

    1992-09-01

    Shared virtual memory (SVM) is a virtual memory layer with a single address space on top of a distributed real memory on parallel computers. We examine the behavior and performance of SVM running a parallel program with medium-grained, loop-level parallelism on top of it. A simulator for the underlying parallel architecture can be used to examine the behavior of SVM more deeply. The influence of several parameters, such as the number of processors, page size, cold or warm start, and restricted page replication, is studied.

  16. Solution of large nonlinear quasistatic structural mechanics problems on distributed-memory multiprocessor computers

    SciTech Connect

    Blanford, M.

    1997-12-31

    Most commercially-available quasistatic finite element programs assemble element stiffnesses into a global stiffness matrix, then use a direct linear equation solver to obtain nodal displacements. However, for large problems (greater than a few hundred thousand degrees of freedom), the memory size and computation time required for this approach become prohibitive. Moreover, direct solution does not lend itself to the parallel processing needed for today's multiprocessor systems. This talk gives an overview of the iterative solution strategy of JAS3D, the nonlinear large-deformation quasistatic finite element program. Because its architecture is derived from an explicit transient-dynamics code, it does not ever assemble a global stiffness matrix. The author describes the approach he used to implement the solver on multiprocessor computers, and shows examples of problems run on hundreds of processors and more than a million degrees of freedom. Finally, he describes some of the work he is presently doing to address the challenges of iterative convergence for ill-conditioned problems.

  17. Shared Memory Parallelization of an Implicit ADI-type CFD Code

    NASA Technical Reports Server (NTRS)

    Hauser, Th.; Huang, P. G.

    1999-01-01

    A parallelization study designed for ADI-type algorithms is presented using the OpenMP specification for shared-memory multiprocessor programming. Details of optimizations specifically addressed to cache-based computer architectures are described and performance measurements for the single and multiprocessor implementation are summarized. The paper demonstrates that optimization of memory access on a cache-based computer architecture controls the performance of the computational algorithm. A hybrid MPI/OpenMP approach is proposed for clusters of shared memory machines to further enhance the parallel performance. The method is applied to develop a new LES/DNS code, named LESTool. A preliminary DNS calculation of a fully developed channel flow at a Reynolds number of $Re_\tau = 180$ has shown good agreement with existing data.

  18. A multiprocessor computer simulation model employing a feedback scheduler/allocator for memory space and bandwidth matching and TMR processing

    NASA Technical Reports Server (NTRS)

    Bradley, D. B.; Irwin, J. D.

    1974-01-01

    A computer simulation model for a multiprocessor computer is developed that is useful for studying the problem of matching a multiprocessor's memory space, memory bandwidth, and numbers and speeds of processors with aggregate job set characteristics. The model assumes an input work load of a set of recurrent jobs. The model includes a feedback scheduler/allocator which attempts to improve system performance through higher memory bandwidth utilization by matching individual job requirements for space and bandwidth with space availability and estimates of bandwidth availability at the times of memory allocation. The simulation model includes provisions for specifying precedence relations among the jobs in a job set, and provisions for specifying precedence execution of TMR (Triple Modular Redundant) and SIMPLEX (non-redundant) jobs.

  19. Compiler-directed cache management in multiprocessors

    NASA Technical Reports Server (NTRS)

    Cheong, Hoichi; Veidenbaum, Alexander V.

    1990-01-01

    The necessity of finding alternatives to hardware-based cache coherence strategies for large-scale multiprocessor systems is discussed. Three different software-based strategies sharing the same goals and general approach are presented. They consist of a simple invalidation approach, a fast selective invalidation scheme, and a version control scheme. The strategies are suitable for shared-memory multiprocessor systems with interconnection networks and a large number of processors. Results of trace-driven simulations conducted on numerical benchmark routines to compare the performance of the three schemes are presented.

  20. The performance of disk arrays in shared-memory database machines

    NASA Technical Reports Server (NTRS)

    Katz, Randy H.; Hong, Wei

    1993-01-01

    In this paper, we examine how disk arrays and shared memory multiprocessors lead to an effective method for constructing database machines for general-purpose complex query processing. We show that disk arrays can lead to cost-effective storage systems if they are configured from suitably small form-factor disk drives. We introduce the storage system metric data temperature as a way to evaluate how well a disk configuration can sustain its workload, and we show that disk arrays can sustain the same data temperature as a more expensive mirrored-disk configuration. We use the metric to evaluate the performance of disk arrays in XPRS, an operational shared-memory multiprocessor database system being developed at the University of California, Berkeley.

  1. A simple modern correctness condition for a space-based high-performance multiprocessor

    NASA Technical Reports Server (NTRS)

    Probst, David K.; Li, Hon F.

    1992-01-01

    A number of U.S. national programs, including space-based detection of ballistic missile launches, envisage putting significant computing power into space. Given sufficient progress in low-power VLSI, multichip-module packaging and liquid-cooling technologies, we will see design of high-performance multiprocessors for individual satellites. In very high speed implementations, performance depends critically on tolerating large latencies in interprocessor communication; without latency tolerance, performance is limited by the vastly differing time scales in processor and data-memory modules, including interconnect times. The modern approach to tolerating remote-communication cost in scalable, shared-memory multiprocessors is to use a multithreaded architecture, and alter the semantics of shared memory slightly, at the price of forcing the programmer either to reason about program correctness in a relaxed consistency model or to agree to program in a constrained style. The literature on multiprocessor correctness conditions has become increasingly complex, and sometimes confusing, which may hinder its practical application. We propose a simple modern correctness condition for a high-performance, shared-memory multiprocessor; the correctness condition is based on a simple interface between the multiprocessor architecture and the parallel programming system.

  2. Multiprocessor execution of functional programs

    SciTech Connect

    Goldberg, B.F.

    1988-01-01

    Functional languages have recently gained attention as vehicles for programming in a concise and elegant manner. In addition, it has been suggested that functional programming provides a natural methodology for programming multiprocessor computers. This dissertation demonstrates that multiprocessor execution of functional programs is feasible, and results in a significant reduction in their execution times. Two implementations of the functional language ALFL were built on commercially available multiprocessors. Alfalfa is an implementation on the Intel iPSC hypercube multiprocessor, and Buckwheat is an implementation on the Encore Multimax shared-memory multiprocessor. Each implementation includes a compiler that performs automatic decomposition of ALFL programs. The compiler is responsible for detecting the inherent parallelism in a program, and decomposing the program into a collection of tasks, called serial combinators, that can be executed in parallel. One of the primary goals of the compiler is to generate serial combinators exhibiting the coarsest granularity possible without sacrificing useful parallelism. This dissertation describes the algorithms used by the compiler to analyze, decompose, and optimize functional programs. The abstract machine model supported by Alfalfa and Buckwheat is called heterogeneous graph reduction, which is a hybrid of graph reduction and conventional stack-oriented execution. This model supports parallelism, lazy evaluation, and higher order functions while at the same time making efficient use of the processors in the system. The Alfalfa and Buckwheat run-time systems support dynamic load balancing, interprocessor communication (if required) and storage management. A large number of experiments were performed on Alfalfa and Buckwheat for a variety of programs. The results of these experiments, as well as the conclusions drawn from them, are presented.

  3. Multiprocessor architectural study

    NASA Technical Reports Server (NTRS)

    Kosmala, A. L.; Stanten, S. F.; Vandever, W. H.

    1972-01-01

    An architectural design study was made of a multiprocessor computing system intended to meet functional and performance specifications appropriate to a manned space station application. Intermetrics' previous experience and accumulated knowledge of the multiprocessor field are used to generate a baseline philosophy for the design of a future SUMC* multiprocessor. Interrupts are defined and the crucial questions of interrupt structure, such as processor selection and response time, are discussed. Memory hierarchy and performance are discussed extensively with particular attention to the design approach which utilizes a cache memory associated with each processor. The ability of an individual processor to approach its theoretical maximum performance is then analyzed in terms of a hit ratio. Memory management is envisioned as a virtual memory system implemented either through segmentation or paging. Addressing is discussed in terms of various register designs adopted by current computers and those of advanced design.

  4. Debugging in a multi-processor environment

    SciTech Connect

    Spann, J.M.

    1981-09-29

    The Supervisory Control and Diagnostic System (SCDS) for the Mirror Fusion Test Facility (MFTF) consists of nine 32-bit minicomputers arranged in a tightly coupled distributed computer system utilizing a shared memory as the data exchange medium. Debugging of more than one program in the multi-processor environment is a difficult process. This paper describes what new tools were developed and how the testing of software is performed in the SCDS for the MFTF project.

  5. Shared-memory parallel programming in C++

    SciTech Connect

    Beck, B.

    1990-07-01

    This paper discusses how researchers have produced a set of portable parallel-programming constructs for C, implemented in M4 macros. These parallel-programming macros are available under the name Parmacs. The Parmacs macros let one write parallel C programs for shared-memory, distributed-memory, and mixed-memory (shared and distributed) systems. They have been implemented on several machines. Because Parmacs offers useful parallel-programming features, the author has considered how these problems might be overcome or avoided. The author thought that using C++, rather than C, would address these problems adequately, and describes the C++ features exploited. The work described addresses shared-memory constructs.

  6. Performing an allreduce operation using shared memory

    DOEpatents

    Archer, Charles J; Dozsa, Gabor; Ratterman, Joseph D; Smith, Brian E

    2014-06-10

    Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
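
    The work-unit scheme above can be illustrated with threads standing in for processing cores. This is a hedged sketch only: the names `JobStatus`, `next_unit`, and `allreduce_sum`, the use of element-wise summation as the reduction, and the fixed chunk size are assumptions made for illustration and are not taken from the patent.

```python
import threading


class JobStatus:
    """Hypothetical job status object that hands out allreduce work units."""

    def __init__(self, num_units):
        self.num_units = num_units
        self._next = 0
        self._lock = threading.Lock()

    def next_unit(self):
        # Return the next unclaimed work unit index, or None when none remain.
        with self._lock:
            if self._next >= self.num_units:
                return None
            unit = self._next
            self._next += 1
            return unit


def allreduce_sum(contributions, num_cores=4, chunk=2):
    """Element-wise sum of equal-length buffers, split into chunked work units."""
    length = len(contributions[0])
    result = [0] * length                      # shared buffer visible to all cores
    job = JobStatus((length + chunk - 1) // chunk)

    def core():
        # Each available core keeps claiming the next work unit until done.
        while True:
            unit = job.next_unit()
            if unit is None:
                return
            lo, hi = unit * chunk, min((unit + 1) * chunk, length)
            for i in range(lo, hi):
                result[i] = sum(buf[i] for buf in contributions)

    threads = [threading.Thread(target=core) for _ in range(num_cores)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

    Because each work unit covers a disjoint index range of the shared result buffer, the only synchronization in this sketch is the lock inside the job status object.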

  7. Performing an allreduce operation using shared memory

    DOEpatents

    Archer, Charles J [Rochester, MN]; Dozsa, Gabor [Ardsley, NY]; Ratterman, Joseph D [Rochester, MN]; Smith, Brian E [Rochester, MN]

    2012-04-17

    Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.

  8. Rollback-recovery techniques and architectural support for multiprocessor systems

    SciTech Connect

    Chiang Chungyang.

    1991-01-01

    The author proposes efficient and robust fault diagnosis and rollback-recovery techniques to enhance system availability as well as performance in both distributed-memory and shared-bus shared-memory multiprocessor systems. Architectural support for the proposed rollback-recovery technique in a bus-based shared-memory multiprocessor system is also investigated to adaptively fine tune the proposed rollback-recovery technique in this type of system. A comparison of the performance of the proposed techniques with other existing techniques is made, a topic on which little quantitative information is available in the literature. New diagnosis concepts are introduced to show that the author's diagnosis technique yields higher diagnosis coverage and facilitates the performance evaluation of various fault-diagnosis techniques.

  9. Direct access inter-process shared memory

    DOEpatents

    Brightwell, Ronald B; Pedretti, Kevin; Hudson, Trammell B

    2013-10-22

    A technique for directly sharing physical memory between processes executing on processor cores is described. The technique includes loading a plurality of processes into the physical memory for execution on a corresponding plurality of processor cores sharing the physical memory. An address space is mapped to each of the processes by populating a first entry in a top level virtual address table for each of the processes. The address space of each of the processes is cross-mapped into each of the processes by populating one or more subsequent entries of the top level virtual address table with the first entry in the top level virtual address table from other processes.

  10. Multiprocessor architecture: Synthesis and evaluation

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1990-01-01

    Multiprocessor computer architecture evaluation for structural computations is the focus of the research effort described. Results obtained are expected to lead to more efficient use of existing architectures and to suggest designs for new, application specific, architectures. The brief descriptions given outline a number of related efforts directed toward this purpose. The difficulty in analyzing an existing architecture or in designing a new computer architecture lies in the fact that the performance of a particular architecture, within the context of a given application, is determined by a number of factors. These include, but are not limited to, the efficiency of the computation algorithm, the programming language and support environment, the quality of the program written in the programming language, the multiplicity of the processing elements, the characteristics of the individual processing elements, the interconnection network connecting processors and non-local memories, and the shared memory organization covering the spectrum from no shared memory (all local memory) to one global access memory. These performance determiners may be loosely classified as being software or hardware related. This distinction is not clear or even appropriate in many cases. The effect of the choice of algorithm is ignored by assuming that the algorithm is specified as given. Effort directed toward the removal of the effect of the programming language and program resulted in the design of a high-level parallel programming language. Two characteristics of the fundamental structure of the architecture (memory organization and interconnection network) are examined.

  11. Comparative Study of Message Passing and Shared Memory Parallel Programming Models in Neural Network Training

    SciTech Connect

    Vitela, J.; Gordillo, J.; Cortina, L; Hanebutte, U.

    1999-12-14

    A comparative performance study is presented of a coarse-grained parallel neural network training code, implemented in both OpenMP and MPI, standards for shared memory and message passing parallel programming environments, respectively. In addition, these versions of the parallel training code are compared to an implementation utilizing SHMEM, the native SGI/CRAY environment for shared memory programming. The multiprocessor platform used is an SGI/Cray Origin 2000 with up to 32 processors. It is shown that in this study, the native CRAY environment outperforms MPI for the entire range of processors used, while OpenMP shows better performance than the other two environments when using more than 19 processors. In this study, the efficiency is always greater than 60% regardless of the parallel programming environment used as well as of the number of processors.

  12. Distributed job scheduling in SCI Local Area MultiProcessors

    SciTech Connect

    Agasaveeran, S.; Li, Qiang

    1996-12-31

    Local Area MultiProcessors (LAMP) is a network of personal workstations with distributed shared physical memory provided by high performance technologies such as SCI. LAMP is more tightly coupled than the traditional local area networks (LAN) but is more loosely coupled than the bus based multiprocessors. This paper presents a distributed scheduling algorithm which exploits the distributed shared memory in SCI-LAMP to schedule the idle remote processors among the requesting workstations. It considers fairness by allocating remote processing capacity to the requesting workstations based on their priorities according to the decay-usage scheduling approach. The performance of the algorithm in scheduling both sequential and parallel jobs is evaluated by simulation. It is found that the higher priority nodes achieve faster job response times and higher speedups than the lower priority nodes. Lower scheduling overhead allows finer granularity of remote processor sharing than in LAN.

  13. C-MOS array design techniques: SUMC multiprocessor system study

    NASA Technical Reports Server (NTRS)

    Clapp, W. A.; Helbig, W. A.; Merriam, A. S.

    1972-01-01

    The current capabilities of LSI techniques for speed and reliability, plus the possibilities of assembling large configurations of LSI logic and storage elements, have demanded the study of multiprocessors and multiprocessing techniques, problems, and potentialities. Evaluated are three previous systems studies for a space ultrareliable modular computer multiprocessing system, and a new multiprocessing system is proposed that is flexibly configured with up to four central processors, four I/O processors, and 16 main memory units, plus auxiliary memory and peripheral devices. This multiprocessor system features a multilevel interrupt, qualified S/360 compatibility for ground-based generation of programs, virtual memory management of a storage hierarchy through I/O processors, and multiport access to multiple and shared memory units.

  14. Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 1: FTMP principles of operation

    NASA Technical Reports Server (NTRS)

    Smith, T. B., Jr.; Lala, J. H.

    1983-01-01

    The basic organization of the fault-tolerant multiprocessor (FTMP) is that of a general purpose homogeneous multiprocessor. Three processors operate on a shared system (memory and I/O) bus. Replication and tight synchronization of all elements and hardware voting are employed to detect and correct any single fault. Reconfiguration is then employed to repair a fault. Multiple faults may be tolerated as a sequence of single faults with repair between fault occurrences.
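
    Voting over replicated outputs can be illustrated in software. The sketch below assumes a simple two-out-of-three majority vote over three replica outputs; the function name `vote` and the returned list of disagreeing replicas are illustrative choices, and the real FTMP voter is hardware whose details are not given in this abstract.

```python
from collections import Counter


def vote(a, b, c):
    """Majority-vote three replica outputs.

    Returns (voted_value, faulty_indices). A single faulty replica is
    masked and identified, so it could later be reconfigured out.
    """
    outputs = (a, b, c)
    value, count = Counter(outputs).most_common(1)[0]
    if count < 2:
        # More than one replica disagrees: a single-fault voter cannot decide.
        raise ValueError("no majority among replica outputs")
    faulty = [i for i, v in enumerate(outputs) if v != value]
    return value, faulty
```

    With one replica in error the vote masks the fault and names the replica, which matches the abstract's sequence of detection, correction, and then repair by reconfiguration.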

  15. An optical simulation of shared memory

    SciTech Connect

    Goldberg, L.A.; Matias, Y.; Rao, S.

    1994-06-01

    We present a work-optimal randomized algorithm for simulating a shared memory machine (PRAM) on an optical communication parallel computer (OCPC). The OCPC model is motivated by the potential of optical communication for parallel computation. The memory of an OCPC is divided into modules, one module per processor. Each memory module only services a request on a timestep if it receives exactly one memory request. Our algorithm simulates each step of an $n \lg\lg n$-processor EREW PRAM on an $n$-processor OCPC in $O(\lg\lg n)$ expected delay. (The probability that the delay is longer than this is at most $n^{-\alpha}$ for any constant $\alpha$.) The best previous simulation, due to Valiant, required $\Theta(\lg n)$ expected delay.
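
    The OCPC service rule stated above (a module answers only when it receives exactly one request in a timestep) can be sketched as a single-step function. The name `ocpc_step` and the dictionary representation of requests are assumptions for illustration; the randomized simulation algorithm itself is not reproduced here.

```python
from collections import Counter


def ocpc_step(requests):
    """Apply the OCPC rule for one timestep.

    requests maps each requesting processor to the memory module it targets.
    Returns the set of processors whose requests are serviced.
    """
    load = Counter(requests.values())          # requests arriving per module
    return {p for p, module in requests.items() if load[module] == 1}
```

    For example, if processors 0 and 1 both target module 2 while processors 2 and 3 target distinct modules, only processors 2 and 3 are served and the colliding requests must be retried in a later step.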

  16. Parallel Navier-Stokes computations on shared and distributed memory architectures

    NASA Technical Reports Server (NTRS)

    Hayder, M. Ehtesham; Jayasimha, D. N.; Pillay, Sasi Kumar

    1995-01-01

    We study a high order finite difference scheme to solve the time accurate flow field of a jet using the compressible Navier-Stokes equations. As part of our ongoing efforts, we have implemented our numerical model on three parallel computing platforms to study the computational, communication, and scalability characteristics. The platforms chosen for this study are a cluster of workstations connected through fast networks (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and a distributed memory multiprocessor (the IBM SP1). Our focus in this study is on the LACE testbed. We present some results for the Cray YMP and the IBM SP1 mainly for comparison purposes. On the LACE testbed, we study: (1) the communication characteristics of Ethernet, FDDI, and the ALLNODE networks and (2) the overheads induced by the PVM message passing library used for parallelizing the application. We demonstrate that clustering of workstations is effective and has the potential to be computationally competitive with supercomputers at a fraction of the cost.

  17. Checkpointing Shared Memory Programs at the Application-level

    SciTech Connect

    Bronevetsky, G; Schulz, M; Szwed, P; Marques, D; Pingali, K

    2004-09-08

    Trends in high-performance computing are making it necessary for long-running applications to tolerate hardware faults. The most commonly used approach is checkpoint and restart (CPR): the state of the computation is saved periodically on disk, and when a failure occurs, the computation is restarted from the last saved state. At present, it is the responsibility of the programmer to instrument applications for CPR. Our group is investigating the use of compiler technology to instrument codes to make them self-checkpointing and self-restarting, thereby providing an automatic solution to the problem of making long-running scientific applications resilient to hardware faults. Our previous work focused on message-passing programs. In this paper, we describe such a system for shared-memory programs running on symmetric multiprocessors. The system has two components: (i) a pre-compiler for source-to-source modification of applications, and (ii) a runtime system that implements a protocol for coordinating CPR among the threads of the parallel application. For the sake of concreteness, we focus on a non-trivial subset of OpenMP that includes barriers and locks. One of the advantages of this approach is that the ability to tolerate faults becomes embedded within the application itself, so applications become self-checkpointing and self-restarting on any platform. We demonstrate this by showing that our transformed benchmarks can checkpoint and restart on three different platforms (Windows/x86, Linux/x86, and Tru64/Alpha). Our experiments show that the overhead introduced by this approach is usually quite small; they also suggest ways in which the current implementation can be tuned to reduce overheads further.

  18. Exploring Shared Memory Protocols in FLASH

    SciTech Connect

    Horowitz, Mark; Kunz, Robert; Hall, Mary; Lucas, Robert; Chame, Jacqueline

    2007-04-01

    The goal of this project was to improve the performance of large scientific and engineering applications through collaborative hardware and software mechanisms to manage the memory hierarchy of non-uniform memory access time (NUMA) shared-memory machines, as well as their component individual processors. In spite of the programming advantages of shared-memory platforms, obtaining good performance for large scientific and engineering applications on such machines can be challenging. Because communication between processors is managed implicitly by the hardware, rather than expressed by the programmer, application performance may suffer from unintended communication – communication that the programmer did not consider when developing his/her application. In this project, we developed and evaluated a collection of hardware, compiler, languages and performance monitoring tools to obtain high performance on scientific and engineering applications on NUMA platforms by managing communication through alternative coherence mechanisms. Alternative coherence mechanisms have often been discussed as a means for reducing unintended communication, although architecture implementations of such mechanisms are quite rare. This report describes an actual implementation of a set of coherence protocols that support coherent, non-coherent and write-update accesses for a CC-NUMA shared-memory architecture, the Stanford FLASH machine. Such an approach has the advantages of using alternative coherence only where it is beneficial, and also provides an evolutionary migration path for improving application performance. We present data on two computations, RandomAccess from the HPC Challenge benchmarks and a forward solver derived from LS-DYNA, showing the performance advantages of the alternative coherence mechanisms. For RandomAccess, the non-coherent and write-update versions can outperform the coherent version by factors of 5 and 2.5, respectively. In LS-DYNA, we obtain

  19. Reducing communication costs in the conjugate gradient algorithm on distributed memory multiprocessors

    SciTech Connect

    D'Azevedo, E.F.; Romine, C.H.

    1992-09-01

    The standard formulation of the conjugate gradient algorithm involves two inner product computations. The results of these two inner products are needed to update the search direction and the computed solution. In a distributed memory parallel environment, the computation and subsequent distribution of these two values requires two separate communication and synchronization phases. In this paper, we present a mathematically equivalent rearrangement of the standard algorithm that reduces the number of communication phases. We give a second derivation of the modified conjugate gradient algorithm in terms of the natural relationship with the underlying Lanczos process. We also present empirical evidence of the stability of this modified algorithm.
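
    For reference, the two inner products of the standard formulation can be seen in a plain serial sketch. This is the textbook conjugate gradient iteration, not the rearranged algorithm proposed in the paper; the helper names `dot`, `matvec`, and `conjugate_gradient` and the dense list-of-lists matrix are illustrative assumptions. In a distributed-memory setting each marked inner product would require its own global reduction.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-12, max_iter=100):
    """Standard CG for a symmetric positive definite matrix A (dense lists)."""
    x = [0.0] * len(b)
    r = list(b)                # residual b - A x for the initial guess x = 0
    p = list(r)                # initial search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs < tol:
            break
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)        # inner product 1: first reduction
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)             # inner product 2: second reduction
        beta = rs_new / rs
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x
```

    The second inner product cannot start until the residual update that depends on the first has finished, which is why the standard form needs two separate synchronization phases per iteration.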

  20. Shared memory for a fault-tolerant computer

    NASA Technical Reports Server (NTRS)

    Gilley, G. C. (Inventor)

    1976-01-01

    A system is described for sharing a memory in a fault-tolerant computer. The memory is under the direct control and monitoring of error detecting and error diagnostic units in the fault-tolerant computer. This computer verifies that data to and from the memory is legally encoded and verifies that words read from the memory at a desired address are, in fact, actually delivered from that desired address. Means are provided for a second processor, which is independent of the direct control and monitoring of the error checking and diagnostic units of the fault-tolerant computer, to share the memory of the fault-tolerant computer. Circuitry is included to verify that: (1) the processor has properly accessed a desired memory location in the memory; (2) a data word read-out from the memory is properly coded; and (3) no inactive memory was erroneously outputting data onto the shared memory bus.

  1. Impact of Load Balancing on Unstructured Adaptive Grid Computations for Distributed-Memory Multiprocessors

    NASA Technical Reports Server (NTRS)

    Sohn, Andrew; Biswas, Rupak; Simon, Horst D.

    1996-01-01

    The computational requirements for an adaptive solution of unsteady problems change as the simulation progresses. This causes workload imbalance among processors on a parallel machine which, in turn, requires significant data movement at runtime. We present a new dynamic load-balancing framework, called JOVE, that balances the workload across all processors with a global view. Whenever the computational mesh is adapted, JOVE is activated to eliminate the load imbalance. JOVE has been implemented on an IBM SP2 distributed-memory machine in MPI for portability. Experimental results for two model meshes demonstrate that mesh adaption with load balancing gives more than a sixfold improvement over one without load balancing. We also show that JOVE gives a 24-fold speedup on 64 processors compared to sequential execution.

  2. Shared virtual memory and generalized speedup

    NASA Technical Reports Server (NTRS)

    Sun, Xian-He; Zhu, Jianping

    1994-01-01

    Generalized speedup is defined as parallel speed over sequential speed. The generalized speedup and its relation with other existing performance metrics, such as traditional speedup, efficiency, scalability, etc., are carefully studied. In terms of the introduced asymptotic speed, it was shown that the difference between the generalized speedup and the traditional speedup lies in the definition of the efficiency of uniprocessor processing, which is a very important issue in shared virtual memory machines. A scientific application was implemented on a KSR-1 parallel computer. Experimental and theoretical results show that the generalized speedup is distinct from the traditional speedup and provides a more reasonable measurement. In the study of different speedups, various causes of superlinear speedup are also presented.
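
    As a small worked illustration of the definition above, the sketch below computes generalized speedup as a ratio of speeds and compares it with the traditional ratio of execution times. It assumes that speed means work divided by elapsed time (for example, floating-point operations per second); the function names and all numbers are hypothetical and are not taken from the KSR-1 experiments.

```python
def speed(work, seconds):
    # Assumed definition: amount of work completed per unit time.
    return work / seconds


def generalized_speedup(par_work, par_seconds, seq_work, seq_seconds):
    # Parallel speed over sequential speed, as in the abstract's definition.
    return speed(par_work, par_seconds) / speed(seq_work, seq_seconds)


def traditional_speedup(seq_seconds, par_seconds):
    # Ratio of execution times for the same fixed problem.
    return seq_seconds / par_seconds


# Same problem on both runs: the two metrics coincide.
same_size = generalized_speedup(100.0, 2.0, 100.0, 10.0)
fixed_size = traditional_speedup(10.0, 2.0)

# Parallel run solves a problem eight times larger (hypothetical numbers):
# only the speed-based metric can compare the two runs.
scaled = generalized_speedup(800.0, 4.0, 100.0, 10.0)
```

    Under these assumptions the two metrics differ only when the sequential speed is measured on a different problem size than the parallel run, which is where the efficiency of uniprocessor processing enters.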

  3. Study of performance on SMP and distributed memory architectures using a shared memory programming model

    SciTech Connect

    Brooks, E.D.; Warren, K.H.

    1997-08-08

    In this paper we examine the use of a shared memory programming model to address the problem of portability of application codes between distributed memory and shared memory architectures. We do this with an extension of the Parallel C Preprocessor. The extension, borrowed from Split-C and AC, uses type qualifiers instead of storage class modifiers to declare variables that are shared among processors. The type qualifier declaration supports an abstract shared memory facility on distributed memory machines while making direct use of hardware support on shared memory architectures. Our benchmarking study spans a wide range of shared memory and distributed memory platforms. Benchmarks include Gaussian elimination with back substitution, a two-dimensional fast Fourier transform, and a matrix-matrix multiply. We find that the type-qualifier-based shared memory programming model is capable of efficiently spanning both distributed memory and shared memory architectures. Although the resulting shared memory programming model is portable, it does not remove the need to arrange for overlapped or blocked remote memory references on platforms that require these tuning measures in order to obtain good performance.

  4. A parallel numerical simulation for supersonic flows using zonal overlapped grids and local time steps for common and distributed memory multiprocessors

    SciTech Connect

    Patel, N.R.; Sturek, W.B.; Hiromoto, R.

    1989-01-01

    Parallel Navier-Stokes codes are developed to solve both two-dimensional and three-dimensional flow fields in and around ramjet and nose tip configurations. A multi-zone overlapped grid technique is used to extend an explicit finite-difference method to more complicated geometries. Parallel implementations are developed for execution on both distributed and common-memory multiprocessor architectures. For the steady-state solutions, the use of the local time-step method has the inherent advantage of reducing the communications overhead commonly incurred by parallel implementations. Computational results of the codes are given for a series of test problems. The parallel partitioning of computational zones is also discussed. 5 refs., 18 figs.

  5. Validation of multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Siewiorek, D. P.; Segall, Z.; Kong, T.

    1982-01-01

    Experiments that can be used to validate fault-free performance of multiprocessor systems in aerospace systems integrating flight controls and avionics are discussed. Engineering prototypes for two fault-tolerant multiprocessors are tested.

  6. The hierarchical spatial decomposition of three-dimensional particle-in-cell plasma simulations on MIMD distributed memory multiprocessors

    SciTech Connect

    Walker, D.W.

    1992-07-01

    The hierarchical spatial decomposition method is a promising approach to decomposing the particles and computational grid in parallel particle-in-cell application codes, since it is able to maintain approximate dynamic load balance while keeping communication costs low. In this paper we investigate issues in implementing a hierarchical spatial decomposition on a hypercube multiprocessor. Particular attention is focused on the communication needed to update guard ring data, and on the load balancing method. The hierarchical approach is compared with other dynamic load balancing schemes.

  7. Parallel implementation and evaluation of motion estimation system algorithms on a distributed memory multiprocessor using knowledge based mappings

    NASA Technical Reports Server (NTRS)

    Choudhary, Alok Nidhi; Leung, Mun K.; Huang, Thomas S.; Patel, Janak H.

    1989-01-01

    Several techniques for static and dynamic load balancing in vision systems are presented. These techniques are novel in the sense that they capture the computational requirements of a task by examining the data when it is produced. Furthermore, they can be applied to many vision systems because many algorithms in different systems are either the same, or have similar computational characteristics. These techniques are evaluated by applying them on a parallel implementation of the algorithms in a motion estimation system on a hypercube multiprocessor system. The motion estimation system consists of the following steps: (1) extraction of features; (2) stereo match of images in one time instant; (3) time match of images from different time instants; (4) stereo match to compute final unambiguous points; and (5) computation of motion parameters. It is shown that the performance gains when these data decomposition and load balancing techniques are used are significant and the overhead of using these techniques is minimal.

  8. Supporting shared data structures on distributed memory architectures

    NASA Technical Reports Server (NTRS)

    Koelbel, Charles; Mehrotra, Piyush; Vanrosendale, John

    1990-01-01

    Programming nonshared memory systems is more difficult than programming shared memory systems, since there is no support for shared data structures. Current programming languages for distributed memory architectures force the user to decompose all data structures into separate pieces, with each piece owned by one of the processors in the machine, and with all communication explicitly specified by low-level message-passing primitives. A new programming environment is presented for distributed memory architectures, providing a global name space and allowing direct access to remote parts of data values. The analysis and program transformations required to implement this environment are described, and the efficiency of the resulting code on the NCUBE/7 and iPSC/2 hypercubes is described.

  9. PANDA: A distributed multiprocessor operating system

    SciTech Connect

    Chubb, P.

    1989-01-01

    PANDA is a design for a distributed multiprocessor and an operating system. PANDA is designed to allow easy expansion of both hardware and software. As such, the PANDA kernel provides only message passing and memory and process management. The other features needed for the system (device drivers, secondary storage management, etc.) are provided as replaceable user tasks. The thesis presents PANDA's design and implementation, both hardware and software. PANDA uses multiple 68010 processors sharing memory on a VME bus, each such node potentially connected to others via a high speed network. The machine is completely homogeneous: there are no differences between processors that are detectable by programs running on the machine. A single two-processor node has been constructed. Each processor contains memory management circuits designed to allow processors to share page tables safely. PANDA presents a programmers' model similar to the hardware model: a job is divided into multiple tasks, each having its own address space. Within each task, multiple processes share code and data. Tasks can send messages to each other, and set up virtual circuits between themselves. Peripheral devices such as disc drives are represented within PANDA by tasks. PANDA divides secondary storage into volumes, each volume being accessed by a volume access task, or VAT. All knowledge about the way that data is stored on a disc is kept in its volume's VAT. The design is such that PANDA should provide a useful testbed for file systems and device drivers, as these can be installed without recompiling PANDA itself, and without rebooting the machine.

  10. Graphical Visualization on Computational Simulation Using Shared Memory

    NASA Astrophysics Data System (ADS)

    Lima, A. B.; Correa, Eberth

    2014-03-01

    The Shared Memory technique is a powerful tool for parallelizing computer codes. In particular, it can be used to visualize the results "on the fly" without stopping the simulation. In this presentation we discuss and show how to use the technique in conjunction with a visualization code using OpenGL.

  11. Sharing Memory Robustly in Message-Passing Systems

    DTIC Science & Technology

    1990-02-16

    ust: a very restricted form of communication. Chor and Moscovici ([20]) present a hierarchy of resiliency for problems in shared-memory systems and...1985. [20] B. Chor and L. Moscovici, Solvability in Asynchronous Environments, Proc. 30th Symp. on Foundations of Comp. Science, pp. 422-427, 1989

  12. Distributed simulation using a real-time shared memory network

    NASA Technical Reports Server (NTRS)

    Simon, Donald L.; Mattern, Duane L.; Wong, Edmond; Musgrave, Jeffrey L.

    1993-01-01

    The Advanced Control Technology Branch of the NASA Lewis Research Center performs research in the area of advanced digital controls for aeronautic and space propulsion systems. This work requires the real-time implementation of both control software and complex dynamical models of the propulsion system. We are implementing these systems in a distributed, multi-vendor computer environment. Therefore, a need exists for real-time communication and synchronization between the distributed multi-vendor computers. A shared memory network is a potential solution which offers several advantages over other real-time communication approaches. A candidate shared memory network was tested for basic performance. The shared memory network was then used to implement a distributed simulation of a ramjet engine. The accuracy and execution time of the distributed simulation were measured and compared to the performance of the non-partitioned simulation. The ease of partitioning the simulation, the minimal time required to develop the communication between the processors, and the resulting execution time all indicate that the shared memory network is a real-time communication technique worthy of serious consideration.

  13. Neural networks and MIMD-multiprocessors

    NASA Technical Reports Server (NTRS)

    Vanhala, Jukka; Kaski, Kimmo

    1990-01-01

    Two artificial neural network models are compared. They are the Hopfield Neural Network Model and the Sparse Distributed Memory model. Distributed algorithms for both of them are designed and implemented. The run time characteristics of the algorithms are analyzed theoretically and tested in practice. The storage capacities of the networks are compared. Implementations are done using a distributed multiprocessor system.

  14. Rollback Hardware For Time Warp Multiprocessor Systems

    NASA Technical Reports Server (NTRS)

    Robb, Michael J.; Buzzell, Calvin A.

    1996-01-01

    Rollback Chip (RBC) module is computer circuit board containing special-purpose memory circuits for use in multiprocessor computer system. Designed to help realize speedup potential of parallel processing for simulation of discrete events by use of Time Warp operating system.

  15. Multi-processor including data flow accelerator module

    DOEpatents

    Davidson, George S.; Pierce, Paul E.

    1990-01-01

    An accelerator module for a data flow computer includes an intelligent memory. The module is added to a multiprocessor arrangement and uses a shared tagged memory architecture in the data flow computer. The intelligent memory module assigns locations for holding data values in correspondence with arcs leading to a node in a data dependency graph. Each primitive computation is associated with a corresponding memory cell, including a number of slots for operands needed to execute a primitive computation, a primitive identifying pointer, and linking slots for distributing the result of the cell computation to other cells requiring that result as an operand. Circuitry is provided for utilizing tag bits to determine automatically when all operands required by a processor are available and for scheduling the primitive for execution in a queue. Each memory cell of the module may be associated with any of the primitives, and the particular primitive to be executed by the processor associated with the cell is identified by providing an index, such as the cell number for the primitive, to the primitive lookup table of starting addresses. The module thus serves to perform functions previously performed by a number of sections of data flow architectures and coexists with conventional shared memory therein. A multiprocessing system including the module operates in a hybrid mode, wherein the same processing modules are used to perform some processing in a sequential mode, under immediate control of an operating system, while performing other processing in a data flow mode.

  16. Performance Evaluation of Remote Memory Access (RMA) Programming on Shared Memory Parallel Computers

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Jost, Gabriele; Biegel, Bryan A. (Technical Monitor)

    2002-01-01

    The purpose of this study is to evaluate the feasibility of remote memory access (RMA) programming on shared memory parallel computers. We discuss different RMA based implementations of selected CFD application benchmark kernels and compare them to corresponding message passing based codes. For the message-passing implementation we use MPI point-to-point and global communication routines. For the RMA based approach we consider two different libraries supporting this programming model. One is a shared memory parallelization library (SMPlib) developed at NASA Ames; the other is the MPI-2 extensions to the MPI Standard. We give timing comparisons for the different implementation strategies and discuss the performance.

  17. High Performance, Dependable Multiprocessor

    NASA Technical Reports Server (NTRS)

    Ramos, Jeremy; Samson, John R.; Troxel, Ian; Subramaniyan, Rajagopal; Jacobs, Adam; Greco, James; Cieslewski, Grzegorz; Curreri, John; Fischer, Michael; Grobelny, Eric; George, Alan; Aggarwal, Vikas; Patel, Minesh; Some, Raphael

    2006-01-01

    With the ever increasing demand for higher bandwidth and processing capacity of today's space exploration, space science, and defense missions, the ability to efficiently apply commercial-off-the-shelf (COTS) processors for on-board computing is now a critical need. In response to this need, NASA's New Millennium Program office has commissioned the development of Dependable Multiprocessor (DM) technology for use in payload and robotic missions. The Dependable Multiprocessor technology is a COTS-based, power efficient, high performance, highly dependable, fault tolerant cluster computer. To date, Honeywell has successfully demonstrated a TRL4 prototype of the Dependable Multiprocessor [1], and is now working on the development of a TRL5 prototype. For the present effort Honeywell has teamed up with the University of Florida's High-performance Computing and Simulation (HCS) Lab, and together the team has demonstrated major elements of the Dependable Multiprocessor TRL5 system.

  18. Multiprocessors and runtime compilation

    NASA Technical Reports Server (NTRS)

    Saltz, Joel; Berryman, Harry; Wu, Janet

    1990-01-01

    Runtime preprocessing plays a major role in many efficient algorithms in computer science, as well as playing an important role in exploiting multiprocessor architectures. Examples are given that elucidate the importance of runtime preprocessing and show how these optimizations can be integrated into compilers. To support the arguments, transformations implemented in prototype multiprocessor compilers are described and benchmarks from the iPSC2/860, the CM-2, and the Encore Multimax/320 are presented.

  19. Collaboration changes both the content and the structure of memory: Building the architecture of shared representations.

    PubMed

    Congleton, Adam R; Rajaram, Suparna

    2014-08-01

    Memory research has primarily focused on how individuals form and maintain memories across time. However, less is known about how groups of people working together can create and maintain shared memories of the past. Recent studies have focused on understanding the processes behind the formation of such shared memories, but none has investigated the structure of shared memory. This study investigated the circumstances under which collaboration would influence the likelihood that participants come to share both a similar content and a similar organization of the past by aligning their individual representations into a shared rendering. We tested how the frequency and the timing of collaboration affect participants' retrieval organization, and how this in turn influences the formation of shared memory and its persistence over time. Across numerous foundational and novel analyses, we observed that as the size of the collaborative inhibition effect (a counterintuitive finding that collaboration reduces group recall) increased, so did the amount of shared memory and the shared organization of memories. These findings reveal the interconnected relationship between collaborative inhibition, retrieval disruption, shared memory, and shared organization. Together, these relationships have intriguing implications for research across a wide variety of domains, including the formation of collective memory, beliefs and attitudes, parent-child narratives and the development of autobiographical memory, and the emergence of shared representations in educational settings.

  20. Efficient partitioning and assignment on programs for multiprocessor execution

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1993-01-01

    The general problem studied is that of segmenting or partitioning programs for distribution across a multiprocessor system. Efficient partitioning and the assignment of program elements are of great importance since the time consumed in this overhead activity may easily dominate the computation, effectively eliminating any gains made by the use of the parallelism. In this study, the partitioning of sequentially structured programs (written in FORTRAN) is evaluated. Heuristics developed for similar applications are examined. Finally, a model for queueing networks with finite queues is developed which may be used to analyze multiprocessor system architectures with a shared memory approach to the problem of partitioning. The properties of sequentially written programs form obstacles to large scale (at the procedure or subroutine level) parallelization. Data dependencies of even the minutest nature, reflecting the sequential development of the program, severely limit parallelism. The design of heuristic algorithms is tied to the experience gained in the parallel splitting. Parallelism obtained through the physical separation of data has seen some success, especially at the data element level. Data parallelism on a grander scale requires models that accurately reflect the effects of blocking caused by finite queues. A model for the approximation of the performance of finite queueing networks is developed. This model makes use of the decomposition approach combined with the efficiency of product form solutions.

  1. Performance Analysis of Multilevel Parallel Applications on Shared Memory Architectures

    NASA Technical Reports Server (NTRS)

    Jost, Gabriele; Jin, Haoqiang; Labarta, Jesus; Gimenez, Judit; Caubet, Jordi; Biegel, Bryan A. (Technical Monitor)

    2002-01-01

    In this paper we describe how to apply powerful performance analysis techniques to understand the behavior of multilevel parallel applications. We use the Paraver/OMPItrace performance analysis system for our study. This system consists of two major components: the OMPItrace dynamic instrumentation mechanism, which allows the tracing of processes and threads, and the Paraver graphical user interface for inspection and analysis of the generated traces. We describe how to use the system to conduct a detailed comparative study of a benchmark code implemented in five different programming paradigms applicable for shared memory

  2. Multiprocessor programming environment

    SciTech Connect

    Smith, M.B.; Fornaro, R.

    1988-12-01

    Programming tools and techniques have been well developed for traditional uniprocessor computer systems. The focus of this research project is on the development of a programming environment for a high speed real time heterogeneous multiprocessor system, with special emphasis on languages and compilers. The new tools and techniques will allow a smooth transition for programmers with experience only on single processor systems.

  3. Ensuring correct rollback recovery in distributed shared memory systems

    NASA Technical Reports Server (NTRS)

    Janssens, Bob; Fuchs, W. Kent

    1995-01-01

    Distributed shared memory (DSM) implemented on a cluster of workstations is an increasingly attractive platform for executing parallel scientific applications. Checkpointing and rollback techniques can be used in such a system to allow the computation to progress in spite of the temporary failure of one or more processing nodes. This paper presents the design of an independent checkpointing method for DSM that takes advantage of DSM's specific properties to reduce error-free and rollback overhead. The scheme reduces the dependencies that need to be considered for correct rollback to those resulting from transfers of pages. Furthermore, in-transit messages can be recovered without the use of logging. We extend the scheme to a DSM implementation using lazy release consistency, where the frequency of dependencies is further reduced.

  4. Reducing Interprocessor Dependence in Recoverable Distributed Shared Memory

    NASA Technical Reports Server (NTRS)

    Janssens, Bob; Fuchs, W. Kent

    1994-01-01

    Checkpointing techniques in parallel systems use dependency tracking and/or message logging to ensure that a system rolls back to a consistent state. Traditional dependency tracking in distributed shared memory (DSM) systems is expensive because of high communication frequency. In this paper we show that, if designed correctly, a DSM system only needs to consider dependencies due to the transfer of blocks of data, resulting in reduced dependency tracking overhead and reduced potential for rollback propagation. We develop an ownership timestamp scheme to tolerate the loss of block state information and develop a passive server model of execution where interactions between processors are considered atomic. With our scheme, dependencies are significantly reduced compared to the traditional message-passing model.

  5. Parallel k-means++ for Multiple Shared-Memory Architectures

    SciTech Connect

    Mackey, Patrick S.; Lewis, Robert R.

    2016-09-22

    In recent years k-means++ has become a popular initialization technique for improved k-means clustering. To date, most of the work done to improve its performance has involved parallelizing algorithms that are only approximations of k-means++. In this paper we present a parallelization of the exact k-means++ algorithm, with a proof of its correctness. We develop implementations for three distinct shared-memory architectures: multicore CPU, high performance GPU, and the massively multithreaded Cray XMT platform. We demonstrate the scalability of the algorithm on each platform. In addition we present a visual approach for showing which platform performed k-means++ the fastest for varying data sizes.
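
    For readers unfamiliar with the initialization step being parallelized, the following is a minimal sequential sketch of exact k-means++ seeding as it is commonly described: the first center is chosen uniformly at random, and each later center is sampled with probability proportional to its squared distance from the nearest center already chosen. This is not the authors' parallel implementation; the function names and the fixed default seed are assumptions made for illustration.

```python
import random


def sq_dist(p, q):
    # Squared Euclidean distance between two points given as tuples.
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seeds(points, k, rng=None):
    """Pick k initial centers from `points` using k-means++ seeding."""
    rng = rng or random.Random(0)      # fixed seed is an illustrative choice
    centers = [rng.choice(points)]     # first center: uniform at random
    # d2[i] holds the squared distance from points[i] to its nearest center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            nxt = rng.choice(points)
        else:
            # Sample proportionally to squared distance (D^2 weighting).
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Only the newest center can shrink a point's nearest-center distance.
        d2 = [min(old, sq_dist(p, nxt)) for p, old in zip(points, d2)]
    return centers
```

    The distance update and the weighted sample are the two steps that touch every data point on each round, so they are the natural candidates for parallel execution on any of the three platforms.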

  6. Parallel discrete event simulation: A shared memory approach

    NASA Technical Reports Server (NTRS)

    Reed, Daniel A.; Malony, Allen D.; Mccredie, Bradley D.

    1987-01-01

    With traditional event list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. Parallel simulation mimics the interacting servers and queues of a real system by assigning each simulated entity to a processor. By eliminating the event list and maintaining only sufficient synchronization to ensure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared memory experiments is presented using the Chandy-Misra distributed simulation algorithm to simulate networks of queues. Parameters include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models.

  7. A Parallel Saturation Algorithm on Shared Memory Architectures

    NASA Technical Reports Server (NTRS)

    Ezekiel, Jonathan; Siminiceanu

    2007-01-01

    Symbolic state-space generators are notoriously hard to parallelize. However, the Saturation algorithm implemented in the SMART verification tool differs from other sequential symbolic state-space generators in that it exploits the locality of firing events in asynchronous system models. This paper explores whether event locality can be utilized to efficiently parallelize Saturation on shared-memory architectures. Conceptually, we propose to parallelize the firing of events within a decision diagram node, which is technically realized via a thread pool. We discuss the challenges involved in our parallel design and conduct experimental studies on its prototypical implementation. On a dual-processor dual-core PC, our studies show speed-ups for several example models, e.g., of up to 50% for a Kanban model, when compared to running our algorithm only on a single core.

  8. Translation techniques for distributed-shared memory programming models

    SciTech Connect

    Fuller, Douglas James

    2005-01-01

    The high performance computing community has experienced an explosive improvement in distributed-shared memory hardware. Driven by increasing real-world problem complexity, this explosion has ushered in vast numbers of new systems. Each new system presents new challenges to programmers and application developers. Part of the challenge is adapting to new architectures with new performance characteristics. Different vendors release systems with widely varying architectures that perform differently in different situations. Furthermore, since vendors need only provide a single performance number (total MFLOPS, typically for a single benchmark), they only have strong incentive initially to optimize the API of their choice. Consequently, only a fraction of the available APIs are well optimized on most systems. This causes issues porting and writing maintainable software, let alone issues for programmers burdened with mastering each new API as it is released. Also, programmers wishing to use a certain machine must choose their API based on the underlying hardware instead of the application. This thesis argues that a flexible, extensible translator for distributed-shared memory APIs can help address some of these issues. For example, a translator might take as input code in one API and output an equivalent program in another. Such a translator could provide instant porting for applications to new systems that do not support the application's library or language natively. While open-source APIs are abundant, they do not perform optimally everywhere. A translator would also allow performance testing using a single base code translated to a number of different APIs. Most significantly, this type of translator frees programmers to select the most appropriate API for a given application based on the application (and developer) itself instead of the underlying hardware.

  9. Distributed parallel messaging for multiprocessor systems

    SciTech Connect

    Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M; Steinmacher-Burow, Burkhard; Sugawara, Yutaka

    2013-06-04

    A method and apparatus for distributed parallel messaging in a parallel computing system. The apparatus includes, at each node of a multiprocessor network, multiple injection messaging engine units and reception messaging engine units, each implementing a DMA engine and each supporting both multiple packet injection into and multiple reception from a network, in parallel. The reception side of the messaging unit (MU) includes a switch interface enabling writing of data of a packet received from the network to the memory system. The transmission side of the messaging unit includes a switch interface for reading from the memory system when injecting packets into the network.

  10. Coupled cluster algorithms for networks of shared memory parallel processors

    NASA Astrophysics Data System (ADS)

    Bentz, Jonathan L.; Olson, Ryan M.; Gordon, Mark S.; Schmidt, Michael W.; Kendall, Ricky A.

    2007-05-01

    As the popularity of using SMP systems as the building blocks for high performance supercomputers increases, so too increases the need for applications that can utilize the multiple levels of parallelism available in clusters of SMPs. This paper presents a dual-layer distributed algorithm, using both shared-memory and distributed-memory techniques to parallelize a very important algorithm (often called the "gold standard") used in computational chemistry, the single and double excitation coupled cluster method with perturbative triples, i.e. CCSD(T). The algorithm is presented within the framework of the GAMESS (General Atomic and Molecular Electronic Structure System) program suite [M.W. Schmidt, K.K. Baldridge, J.A. Boatz, S.T. Elbert, M.S. Gordon, J.J. Jensen, S. Koseki, N. Matsunaga, K.A. Nguyen, S. Su, T.L. Windus, M. Dupuis, J.A. Montgomery, General atomic and molecular electronic structure system, J. Comput. Chem. 14 (1993) 1347-1363] and the Distributed Data Interface (DDI) [M.W. Schmidt, G.D. Fletcher, B.M. Bode, M.S. Gordon, The distributed data interface in GAMESS, Comput. Phys. Comm. 128 (2000) 190]; however, the essential features of the algorithm (data distribution, load-balancing and communication overhead) can be applied to more general computational problems. Timing and performance data for our dual-level algorithm are presented on several large-scale clusters of SMPs.

  11. Low latency memory access and synchronization

    DOEpatents

    Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Heidelberger, Philip; Hoenicke, Dirk; Ohmacht, Martin; Steinmacher-Burow, Burkhard D.; Takken, Todd E.; Vranas, Pavlos M.

    2007-02-06

    A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
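
    The single-load lock acquisition described above can be sketched as a toy software model. This is an illustration only, not the patented hardware: the class LockDevice, its methods, and the store-based release are hypothetical, and the internal mutex merely stands in for the atomicity that the hardware locking device would provide.

```python
import threading
import time


class LockDevice:
    """Toy model of a hardware locking device (hypothetical API)."""

    def __init__(self, num_locks):
        self._owner = [None] * num_locks
        # Models the atomicity of the device itself; not part of the CPU side.
        self._guard = threading.Lock()

    def load(self, lock_id, cpu):
        """One 'load' by a CPU: True means the CPU now owns the lock."""
        with self._guard:
            if self._owner[lock_id] is None:
                # The device performs the write, not the processor.
                self._owner[lock_id] = cpu
                return True
            return False

    def release(self, lock_id, cpu):
        """Assumed release path; the abstract does not describe it."""
        with self._guard:
            if self._owner[lock_id] == cpu:
                self._owner[lock_id] = None


def run_demo(num_cpus=3, increments=200):
    """Several 'CPUs' increment a shared counter under lock 0."""
    device = LockDevice(num_locks=1)
    counter = {"value": 0}

    def worker(cpu):
        for _ in range(increments):
            while not device.load(0, cpu):  # spin using reads only
                time.sleep(0)               # yield to other threads
            counter["value"] += 1           # protected resource access
            device.release(0, cpu)

    threads = [threading.Thread(target=worker, args=(c,)) for c in range(num_cpus)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

    In this model a processor never issues an atomic read-modify-write; it only reads, and ownership changes as a side effect inside the device.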

  12. Low latency memory access and synchronization

    DOEpatents

    Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Heidelberger, Philip; Hoenicke, Dirk; Ohmacht, Martin; Steinmacher-Burow, Burkhard D.; Takken, Todd E.; Vranas, Pavlos M.

    2010-10-19

    A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
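
    The pointer-directed prefetching in this abstract can be illustrated with a small simulation. It is a sketch under stated assumptions: the abstract does not say how the per-line pointers are set, so the rule used here (each line's pointer is updated to the line accessed right after it) is an assumption, and the class name and one-entry prefetch buffer are hypothetical.

```python
class PointerPrefetchMemory:
    """Toy memory where every line carries a pointer to the line to prefetch."""

    def __init__(self, num_lines):
        self.hint = [None] * num_lines  # per-line pointer (None = not set yet)
        self.prefetched = set()         # contents of a one-entry prefetch buffer
        self.last = None                # previously accessed line
        self.hits = 0
        self.misses = 0

    def access(self, line):
        if line in self.prefetched:
            self.hits += 1
        else:
            self.misses += 1
        # Assumed update rule: record that `line` followed the previous line.
        if self.last is not None:
            self.hint[self.last] = line
        self.last = line
        # Follow this line's pointer to decide what to prefetch next.
        nxt = self.hint[line]
        self.prefetched = {nxt} if nxt is not None else set()


def replay(pattern, passes, num_lines=32):
    """Replay a repetitive access pattern and return (hits, misses)."""
    mem = PointerPrefetchMemory(num_lines)
    for _ in range(passes):
        for line in pattern:
            mem.access(line)
    return mem.hits, mem.misses


# A non-contiguous but repetitive pattern, replayed three times.
hits, misses = replay([7, 2, 19, 4, 11], passes=3)
print(hits, misses)
```

    The first pass misses on every line while the pointers are being set; later passes hit because each line's pointer names its successor, which a stride-based predictor would not find for this pattern.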

  13. Cache as point of coherence in multiprocessor system

    DOEpatents

    Blumrich, Matthias A.; Ceze, Luis H.; Chen, Dong; Gara, Alan; Heidelberger, Philip; Ohmacht, Martin; Steinmacher-Burow, Burkhard; Zhuang, Xiaotong

    2016-11-29

    In a multiprocessor system, a conflict checking mechanism is implemented in the L2 cache memory. Different versions of speculative writes are maintained in different ways of the cache. A record of speculative writes is maintained in the cache directory. Conflict checking occurs as part of directory lookup. Speculative versions that do not conflict are aggregated into an aggregated version in a different way of the cache. Speculative memory access requests do not go to main memory.

  14. Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments

    SciTech Connect

    Jin, Shuangshuang; Chen, Yousu; Wu, Di; Diao, Ruisheng; Huang, Zhenyu

    2015-12-09

    Power system dynamic simulation computes the system response to a sequence of large disturbances, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operations. It consists of a large set of differential and algebraic equations, which is computationally intensive and challenging to solve using a single-processor based dynamic simulation solution. High-performance computing (HPC) based parallel computing is a very promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation using Open Multi-processing (OpenMP) on a shared-memory platform and Message Passing Interface (MPI) on distributed-memory clusters, respectively. The differences between the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performances for running parallel dynamic simulation are compared and demonstrated.

  15. Parallelization of the NAS Conjugate Gradient Benchmark Using the Global Arrays Shared Memory Programming Model

    SciTech Connect

    Zhang, Yeliang; Tipparaju, Vinod; Nieplocha, Jarek; Hariri, Salim

    2005-04-08

    The NAS Conjugate Gradient (CG) benchmark is an important scientific kernel used to evaluate machine performance and compare characteristics of different programming models. The Global Arrays (GA) toolkit supports a shared memory programming paradigm, even on distributed memory systems, and offers the programmer control over the distribution and locality that are important for optimizing performance on scalable architectures. In this paper, we describe and compare two different parallelization strategies of the CG benchmark using GA and report performance results on a shared-memory system as well as on a cluster. Performance benefits of using shared memory for irregular/sparse computations have been demonstrated before in the context of the CG benchmark using OpenMP. Similarly, the GA implementation outperforms the standard MPI implementation on a shared memory system, in our case the SGI Altix. However, with GA these benefits are extended to distributed memory systems and demonstrated on a Linux cluster with Myrinet.

  16. Large-grain pipelining on hypercube multiprocessors

    SciTech Connect

    King, Chung-Ta; Ni, Lionel M.

    1988-01-01

    A new paradigm, called large-grain pipelining, for developing efficient parallel algorithms on distributed-memory multiprocessors, e.g., hypercube machines, is introduced. Large-grain pipelining attempts to maximize the degree of overlapping and minimize the effect of communication overhead in a multiprocessor system through macro-pipelining between the nodes. Algorithms developed through large-grain pipelining to perform matrix multiplication are presented. To model the pipelined computations, an analytic model is introduced, which takes into account both underlying architecture and algorithm behavior. Through the analytical model, important design parameters, such as data partition sizes, can be determined. Experiments were conducted on a 64-node NCUBE multiprocessor. The measured results match closely with the analyzed results, which establishes the analytic model as an integral part of algorithm design. Comparison with an algorithm which does not use large-grain pipelining also shows that large-grain pipelining is an efficient scheme for achieving a greater parallelism. 14 refs., 12 figs.

  17. Back propagation parameter analysis on multiprocessors

    SciTech Connect

    Cerf, G.; Mokry, R.; Weintraub, J.

    1988-09-01

    In order to develop systems of artificial neural networks which can be scaled up to perform practical tasks such as pattern recognition or speech processing, the use of powerful computing tools is essential. Multiprocessors are becoming increasingly popular in the simulation and study of large networks, as the inherent parallelism of many neural architectures and learning algorithms lends itself quite naturally to implementation on concurrent processors. In this study, a multiprocessor system based on the Inmos transputer has been used to examine the stability and convergence rates of the back propagation algorithm as a function of changes in parameters such as activation values, number of hidden units, learning rate, momentum, and initial weight and bias configurations. The Victor V32 is a prototype low-cost message-passing multiprocessor system, designed and implemented by the Victor project in the Microsystems and VLSI group at the IBM T.J. Watson Research Center. A sample topology for the system is 32 nodes in a fixed 4 x 8 mesh. A host processor interfaced to a PC AT and connected to one of the corners of the mesh provides screen and disc I/O. Each of the 32 nodes consists of an INMOS T414 transputer and 4 Megabytes of local memory. Four high-speed (20 Mbits/sec) serial links provide communication among the nodes.

  18. Shared Memory Parallelism for 3D Cartesian Discrete Ordinates Solver

    NASA Astrophysics Data System (ADS)

    Moustafa, Salli; Dutka-Malen, Ivan; Plagne, Laurent; Ponçot, Angélique; Ramet, Pierre

    2014-06-01

    This paper describes the design and the performance of DOMINO, a 3D Cartesian SN solver that implements two nested levels of parallelism (multicore+SIMD) on shared memory computation nodes. DOMINO is written in C++, a multi-paradigm programming language that enables the use of powerful and generic parallel programming tools such as Intel TBB and Eigen. These two libraries allow us to combine multi-thread parallelism with vector operations in an efficient and yet portable way. As a result, DOMINO can exploit the full power of modern multi-core processors and is able to tackle very large simulations, that usually require large HPC clusters, using a single computing node. For example, DOMINO solves a 3D full core PWR eigenvalue problem involving 26 energy groups, 288 angular directions (S16), 46 × 10^6 spatial cells and 1 × 10^12 DoFs within 11 hours on a single 32-core SMP node. This represents a sustained performance of 235 GFlops and 40.74% of the SMP node peak performance for the DOMINO sweep implementation. The very high Flops/Watt ratio of DOMINO makes it a very interesting building block for a future many-nodes nuclear simulation tool.
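
    As a quick arithmetic check of the figures quoted above, the sketch below derives the node peak implied by a sustained 235 GFlops at 40.74% of peak on 32 cores (reading the percentage as 40.74). The implied peak values are computed here and are not reported in the abstract; the variable names are illustrative.

```python
sustained_gflops = 235.0   # sustained sweep performance from the abstract
fraction_of_peak = 0.4074  # 40.74% of node peak
cores = 32                 # cores in the SMP node

# Derived quantities (not stated in the abstract).
implied_node_peak = sustained_gflops / fraction_of_peak
implied_core_peak = implied_node_peak / cores

print(f"implied node peak: {implied_node_peak:.1f} GFlops")
print(f"implied per-core peak: {implied_core_peak:.2f} GFlops")
```

    Under these assumptions the node peak comes out at roughly 577 GFlops, or about 18 GFlops per core.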

  19. Sharing specific "We" autobiographical memories in close relationships: the role of contact frequency.

    PubMed

    Beike, Denise R; Cole, Holly E; Merrick, Carmen R

    2017-04-10

    Sharing memories in conversations with close others is posited to be part of the social function of autobiographical memory. The present research focused on the sharing of a particular type of memory: Specific memories about one-time co-experienced events, which we termed Specific We memories. Two studies with 595 total participants examined the factors that lead to and/or are influenced by the sharing of Specific We memories. In Study 1, participants reported on their most recent conversation. Specific We memories were reportedly discussed most often in conversations with others who were close and with whom the participant had frequent communication. In Study 2, participants were randomly assigned either to increase or to simply record the frequency of communication with a close other (parent). Increases in the frequency of reported sharing of Specific We memories as well as closeness to the parent resulted. Mediation analyses of both studies revealed causal relationships among reported sharing of Specific We memories and closeness. We discuss the relevance of these results for understanding the social function of autobiographical memory.

  20. The C.mmp Multiprocessor

    DTIC Science & Technology

    1978-10-27

    minicomputers connected to a large shared memory through a central crosspoint switch. The system was constructed beginning in 1971, and for several years has...i 2.1. The Processor-Memory Switch 4 2.2. Memory Mapping and the Relocation Unit 6 2.3. Caches 8 2.4. Processor Extensions 8 2.4.1. Address...The Present C.mmp Configuration 12 3.1. Processors 12 3.2. Memory 15 3.3. Switch and IP Bus 15 3.4. Peripheral Devices 15 3.5. Links to Other

  1. Design, Implementation, and Evaluation of a Shared-Memory Parallel Processing System (SMPPS)

    DTIC Science & Technology

    1999-07-26

    AGENCY NAME(S) AND ADDRESS(ES) THE DEPARTMENT OF THE AIR FORCE AFIT/CIA, BLDG 125 2950 P STREET WPAFB OH 45433 10 . SPONSORING/MONITORING AGENCY...Existing Machines 4 1.2.1 Message-Passing 4 1.2.2 Shared-Memory 6 2 IMPLEMENTING A SHARED-MEMORY PARALLEL PROCESSING SYSTEM (SMPPS) 10 2.1...Objectives 10 2.2 A Dual-Processor Shared-Memory Parallel Processing System 10 2.2.1 Meeting Design Objectives 10 2.2.2 The Design 11 2.2.3 Timer

  2. Programmable partitioning for high-performance coherence domains in a multiprocessor system

    DOEpatents

    Blumrich, Matthias A. [Ridgefield, CT]; Salapura, Valentina [Chappaqua, NY]

    2011-01-25

    A multiprocessor computing system and a method of logically partitioning a multiprocessor computing system are disclosed. The multiprocessor computing system comprises a multitude of processing units, and a multitude of snoop units. Each of the processing units includes a local cache, and the snoop units are provided for supporting cache coherency in the multiprocessor system. Each of the snoop units is connected to a respective one of the processing units and to all of the other snoop units. The multiprocessor computing system further includes a partitioning system for using the snoop units to partition the multitude of processing units into a plurality of independent, memory-consistent, adjustable-size processing groups. Preferably, when the processor units are partitioned into these processing groups, the partitioning system also configures the snoop units to maintain cache coherency within each of said groups.

  3. Spaceborne autonomous multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Fernquist, Alan

    1990-01-01

    The goal of this task is to provide technology for the specification and integration of advanced processors into the Space Station Freedom data management system environment through computer performance measurement tools, simulators, and an extended testbed facility. The approach focuses on five categories: (1) user requirements--determine the suitability of existing computer technologies and systems for real-time requirements of NASA missions; (2) system performance analysis--characterize the effects of languages, architectures, and commercially available hardware on real-time benchmarks; (3) system architecture--expand NASA's capability to solve problems with integrated numeric and symbolic requirements using advanced multiprocessor architectures; (4) parallel Ada technology--extend Ada software technology to utilize parallel architectures more efficiently; and (5) testbed--extend in-house testbed to support system performance and system analysis studies.

  4. Implementation of a parallel unstructured Euler solver on shared and distributed memory architectures

    NASA Technical Reports Server (NTRS)

    Mavriplis, D. J.; Das, Raja; Saltz, Joel; Vermeland, R. E.

    1992-01-01

    An efficient three dimensional unstructured Euler solver is parallelized on a Cray Y-MP C90 shared memory computer and on an Intel Touchstone Delta distributed memory computer. This paper relates the experiences gained and describes the software tools and hardware used in this study. Performance comparisons between two differing architectures are made.

  5. A shared neural ensemble links distinct contextual memories encoded close in time

    NASA Astrophysics Data System (ADS)

    Cai, Denise J.; Aharoni, Daniel; Shuman, Tristan; Shobe, Justin; Biane, Jeremy; Song, Weilin; Wei, Brandon; Veshkini, Michael; La-Vu, Mimi; Lou, Jerry; Flores, Sergio E.; Kim, Isaac; Sano, Yoshitake; Zhou, Miou; Baumgaertel, Karsten; Lavi, Ayal; Kamata, Masakazu; Tuszynski, Mark; Mayford, Mark; Golshani, Peyman; Silva, Alcino J.

    2016-06-01

    Recent studies suggest that a shared neural ensemble may link distinct memories encoded close in time. According to the memory allocation hypothesis, learning triggers a temporary increase in neuronal excitability that biases the representation of a subsequent memory to the neuronal ensemble encoding the first memory, such that recall of one memory increases the likelihood of recalling the other memory. Here we show in mice that the overlap between the hippocampal CA1 ensembles activated by two distinct contexts acquired within a day is higher than when they are separated by a week. Several findings indicate that this overlap of neuronal ensembles links two contextual memories. First, fear paired with one context is transferred to a neutral context when the two contexts are acquired within a day but not across a week. Second, the first memory strengthens the second memory within a day but not across a week. Older mice, known to have lower CA1 excitability, do not show the overlap between ensembles, the transfer of fear between contexts, or the strengthening of the second memory. Finally, in aged mice, increasing cellular excitability and activating a common ensemble of CA1 neurons during two distinct context exposures rescued the deficit in linking memories. Taken together, these findings demonstrate that contextual memories encoded close in time are linked by directing storage into overlapping ensembles. Alteration of these processes by ageing could affect the temporal structure of memories, thus impairing efficient recall of related information.

  6. Method and apparatus for single-stepping coherence events in a multiprocessor system under software control

    DOEpatents

    Blumrich, Matthias A.; Salapura, Valentina

    2010-11-02

    An apparatus and method are disclosed for single-stepping coherence events in a multiprocessor system under software control in order to monitor the behavior of a memory coherence mechanism. Single-stepping coherence events in a multiprocessor system is made possible by adding one or more step registers. By accessing these step registers, one or more coherence requests are processed by the multiprocessor system. The step registers determine if the snoop unit will operate by proceeding in a normal execution mode, or operate in a single-step mode.

  7. DiFX: A Software Correlator for Very Long Baseline Interferometry Using Multiprocessor Computing Environments

    NASA Astrophysics Data System (ADS)

    Deller, A. T.; Tingay, S. J.; Bailes, M.; West, C.

    2007-03-01

    We describe the development of an FX-style correlator for very long baseline interferometry (VLBI), implemented in software and intended to run in multiprocessor computing environments, such as large clusters of commodity machines (Beowulf clusters) or computers specifically designed for high-performance computing, such as multiprocessor shared-memory machines. We outline the scientific and practical benefits for VLBI correlation, these chiefly being due to the inherent flexibility of software and the fact that the highly parallel and scalable nature of the correlation task is well suited to a multiprocessor computing environment. We suggest scientific applications where such an approach to VLBI correlation is most suited and will give the best returns. We report detailed results from the Distributed FX (DiFX) software correlator running on the Swinburne supercomputer (a Beowulf cluster of ~300 commodity processors), including measures of the performance of the system. For example, to correlate all Stokes products for a 10 antenna array with an aggregate bandwidth of 64 MHz per station, and using typical time and frequency resolution, currently requires an order of 100 desktop-class compute nodes. Due to the effect of Moore's law on commodity computing performance, the total number and cost of compute nodes required to meet a given correlation task continues to decrease rapidly with time. We show detailed comparisons between DiFX and two existing hardware-based correlators: the Australian Long Baseline Array S2 correlator and the NRAO Very Long Baseline Array correlator. In both cases, excellent agreement was found between the correlators. Finally, we describe plans for the future operation of DiFX on the Swinburne supercomputer for both astrophysical and geodetic science.

  8. Principles for problem aggregation and assignment in medium scale multiprocessors

    NASA Technical Reports Server (NTRS)

    Nicol, David M.; Saltz, Joel H.

    1987-01-01

    One of the most important issues in parallel processing is the mapping of workload to processors. This paper considers a large class of problems having a high degree of potential fine grained parallelism, and execution requirements that are either not predictable, or are too costly to predict. The main issues in mapping such a problem onto medium scale multiprocessors are those of aggregation and assignment. We study a method of parameterized aggregation that makes few assumptions about the workload. The mapping of aggregate units of work onto processors is uniform, and exploits locality of workload intensity to balance the unknown workload. In general, a finer aggregate granularity leads to a better balance at the price of increased communication/synchronization costs; the aggregation parameters can be adjusted to find a reasonable granularity. The effectiveness of this scheme is demonstrated on three model problems: an adaptive one-dimensional fluid dynamics problem with message passing, a sparse triangular linear system solver on both a shared memory and a message-passing machine, and a two-dimensional time-driven battlefield simulation employing message passing. Using the model problems, the tradeoffs are studied between balanced workload and the communication/synchronization costs. Finally, an analytical model is used to explain why the method balances workload and minimizes the variance in system behavior.
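
    The granularity tradeoff described above can be illustrated with a toy calculation. This is a sketch, not the paper's method: the wrapped (round-robin) assignment of chunks to processors is assumed as one example of a uniform mapping, the boundary count is only a proxy for communication/synchronization cost, and the workload values are invented for illustration.

```python
def wrapped_assignment(work, num_procs, grain):
    """Aggregate `work` into chunks of `grain` cells, deal chunks round-robin.

    Returns (imbalance, boundaries): imbalance is max load / mean load, and
    boundaries counts chunk borders, a stand-in for communication cost.
    """
    loads = [0.0] * num_procs
    num_chunks = 0
    for start in range(0, len(work), grain):
        loads[num_chunks % num_procs] += sum(work[start:start + grain])
        num_chunks += 1
    imbalance = max(loads) / (sum(loads) / num_procs)
    return imbalance, num_chunks - 1


# Invented workload: cheap cells followed by a localized expensive region.
work = [1] * 48 + [10] * 16

for grain in (16, 4, 1):
    imbalance, boundaries = wrapped_assignment(work, num_procs=4, grain=grain)
    print(grain, round(imbalance, 2), boundaries)
```

    With this workload, a grain of 16 leaves one processor holding the whole expensive region, while grains of 4 and 1 both balance perfectly; the grain of 1 pays for that with about four times as many boundaries, so the intermediate grain is the reasonable choice.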

  9. Interference due to shared features between action plans is influenced by working memory span.

    PubMed

    Fournier, Lisa R; Behmer, Lawrence P; Stubblefield, Alexandra M

    2014-12-01

    In this study, we examined the interactions between the action plans that we hold in memory and the actions that we carry out, asking whether the interference due to shared features between action plans is due to selection demands imposed on working memory. Individuals with low and high working memory spans learned arbitrary motor actions in response to two different visual events (A and B), presented in a serial order. They planned a response to the first event (A) and while maintaining this action plan in memory they then executed a speeded response to the second event (B). Afterward, they executed the action plan for the first event (A) maintained in memory. Speeded responses to the second event (B) were delayed when it shared an action feature (feature overlap) with the first event (A), relative to when it did not (no feature overlap). The size of the feature-overlap delay was greater for low-span than for high-span participants. This indicates that interference due to overlapping action plans is greater when fewer working memory resources are available, suggesting that this interference is due to selection demands imposed on working memory. Thus, working memory plays an important role in managing current and upcoming action plans, at least for newly learned tasks. Also, managing multiple action plans is compromised in individuals who have low versus high working memory spans.

  10. A system for simulating shared memory in heterogeneous distributed-memory networks with specialization for robotics applications

    SciTech Connect

    Jones, J.P.; Bangs, A.L.; Butler, P.L.

    1991-01-01

    Hetero Helix is a programming environment which simulates shared memory on a heterogeneous network of distributed-memory computers. The machines in the network may vary with respect to their native operating systems and internal representation of numbers. Hetero Helix presents a simple programming model to developers, and also considers the needs of designers, system integrators, and maintainers. The key software technology underlying Hetero Helix is the use of a "compiler" which analyzes the data structures in shared memory and automatically generates code which translates data representations from the format native to each machine into a common format, and vice versa. The design of Hetero Helix was motivated in particular by the requirements of robotics applications. Hetero Helix has been used successfully in an integration effort involving 27 CPUs in a heterogeneous network and a body of software totaling roughly 100,00 lines of code. 25 refs., 6 figs.
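
    The idea of generating translators from a description of the shared data structures can be sketched with Python's struct module. This is not Hetero Helix code: the record layout, the function make_translators, and the choice of network byte order as the common format are all assumptions made for illustration, and building closures at run time stands in for the code the abstract's "compiler" would generate.

```python
import struct

# Hypothetical shared-memory record: (field name, struct format code).
FIELDS = [("joint_id", "i"), ("angle", "d"), ("torque", "d")]


def make_translators(fields, native_order):
    """Build translators for one machine from a field description.

    native_order is "<" (little-endian) or ">" (big-endian). Using struct's
    standard sizes is a simplification of real native layouts.
    """
    codes = "".join(code for _, code in fields)
    native = struct.Struct(native_order + codes)
    common = struct.Struct("!" + codes)  # assumed common format: network order

    def to_common(raw):
        return common.pack(*native.unpack(raw))

    def from_common(raw):
        return native.pack(*common.unpack(raw))

    return native, to_common, from_common


# One little-endian and one big-endian machine sharing the same record.
little_native, little_to_common, _ = make_translators(FIELDS, "<")
big_native, _, big_from_common = make_translators(FIELDS, ">")

record = (3, 1.5, -0.25)
wire = little_to_common(little_native.pack(*record))  # leaves machine A
received = big_native.unpack(big_from_common(wire))   # arrives at machine B
print(received)
```

    Routing every transfer through one common format means each machine needs only two translators, rather than one per pair of machine types.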

  11. Sequoia: A fault-tolerant tightly coupled multiprocessor for transaction processing

    SciTech Connect

    Bernstein, P.A.

    1988-02-01

    The Sequoia computer is a tightly coupled multiprocessor, and thus attains the performance advantages of this style of architecture. It avoids most of the fault-tolerance disadvantages of tight coupling by using a new fault-tolerance design. The Sequoia architecture is similar to other multimicroprocessor architectures, such as those of Encore and Sequent, in that it gives dozens of microprocessors shared access to a large main memory. It resembles the Stratus architecture in its extensive use of hardware fault-detection techniques. It resembles Stratus and Auragen in its ability to quickly recover all processes after a single point failure, transparently to the user. However, Sequoia is unique in its combination of a large-scale tightly coupled architecture with a hardware approach to fault tolerance. This article gives an overview of how the hardware architecture and operating systems (OS) work together to provide a high degree of fault tolerance with good system performance.

  12. HyperForest: A high performance multi-processor architecture for real-time intelligent systems

    SciTech Connect

    Garcia, P. Jr.; Rebeil, J.P.; Pollard, H.

    1997-04-01

    Intelligent Systems are characterized by the intensive use of computer power. The computer revolution of the last few years is what has made possible the development of the first generation of Intelligent Systems. Software for second generation Intelligent Systems will be more complex and will require more powerful computing engines in order to meet real-time constraints imposed by new robots, sensors, and applications. A multiprocessor architecture was developed that merges the advantages of message-passing and shared-memory structures: expandability and real-time compliance. The HyperForest architecture will provide an expandable real-time computing platform for computationally intensive Intelligent Systems and open the doors for the application of these systems to more complex tasks in environmental restoration and cleanup projects, flexible manufacturing systems, and DOE's own production and disassembly activities.

  13. Embedded Multiprocessor Technology for VHSIC Insertion

    NASA Technical Reports Server (NTRS)

    Hayes, Paul J.

    1990-01-01

    Viewgraphs on embedded multiprocessor technology for VHSIC insertion are presented. The objective was to develop multiprocessor system technology providing user-selectable fault tolerance, increased throughput, and ease of application representation for concurrent operation. The approach was to develop graph management mapping theory for proper performance, model multiprocessor performance, and demonstrate performance in selected hardware systems.

  14. A Byzantine resilient processor with an encoded fault-tolerant shared memory

    NASA Technical Reports Server (NTRS)

    Butler, Bryan; Harper, Richard

    1990-01-01

    The memory requirements for ultra-reliable computers are expected to increase due to future increases in mission functionality and operating-system requirements. This increase will have a negative effect on the reliability and cost of the system. Increased memory size will also reduce the ability to reintegrate a channel after a transient fault, since the time required to reintegrate a channel in a conventional fault-tolerant processor is dominated by memory realignment time. A Byzantine Resilient Fault-Tolerant Processor with Fault-Tolerant Shared Memory (FTP/FTSM) is presented as a solution to these problems. The FTSM uses an encoded memory system, which reduces the memory requirement by one-half compared to a conventional quad-FTP design. This increases the reliability and decreases the cost of the system. The realignment problem is also addressed by the FTSM. Because any single error is corrected upon a read from the FTSM, a faulty channel's corrupted memory does not need realignment before reintegration of the faulty channel. A combination of correct-on-access and background scrubbing is proposed to prevent the accumulation of transient errors in the memory. With a hardware-implemented scrubber, the scrubbing cycle time, and therefore the memory fault latency, can be upper-bounded at a small value. This technique increases the reliability of the memory system and facilitates validation of its reliability model.
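
    Correct-on-access combined with background scrubbing can be sketched with a small single-error-correcting memory model. The abstract does not name the code used by the FTSM, so a Hamming(7,4) code is swapped in here purely as a stand-in; the class EncodedMemory and its methods are hypothetical.

```python
DATA_POS = (3, 5, 6, 7)   # codeword positions that hold the 4 data bits
PARITY_POS = (1, 2, 4)    # codeword positions that hold parity bits


def syndrome(word):
    """XOR of the positions of all set bits; 0 for a valid codeword."""
    s = 0
    for pos in range(1, 8):
        if word[pos]:
            s ^= pos
    return s


def encode(nibble):
    """Encode a 4-bit value as a Hamming(7,4) codeword (index 0 unused)."""
    word = [0] * 8
    for i, pos in enumerate(DATA_POS):
        word[pos] = (nibble >> i) & 1
    s = syndrome(word)
    for pos in PARITY_POS:
        word[pos] = 1 if s & pos else 0
    return word


class EncodedMemory:
    """Toy memory that corrects any single-bit error when a cell is read."""

    def __init__(self):
        self.cells = {}

    def write(self, addr, nibble):
        self.cells[addr] = encode(nibble)

    def read(self, addr):
        word = self.cells[addr]
        s = syndrome(word)
        if s:
            word[s] ^= 1  # correct on access and keep the repaired word
        return sum(word[pos] << i for i, pos in enumerate(DATA_POS))

    def scrub(self):
        """Background scrubbing: touch every cell so errors cannot pile up."""
        for addr in self.cells:
            self.read(addr)

    def inject_fault(self, addr, pos):
        """Flip one bit to model a transient error."""
        self.cells[addr][pos] ^= 1


mem = EncodedMemory()
mem.write(0, 0b1011)
mem.inject_fault(0, 6)
print(mem.read(0) == 0b1011)
```

    A single-error-correcting code fails once two errors land in the same word, which is why the scrub pass matters: it bounds how long a first error can sit uncorrected.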

  15. Automatic Generation of Directive-Based Parallel Programs for Shared Memory Parallel Systems

    NASA Technical Reports Server (NTRS)

    Jin, Hao-Qiang; Yan, Jerry; Frumkin, Michael

    2000-01-01

    The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. Due to its ease of programming and its good performance, the technique has become very popular. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate directive-based, OpenMP, parallel programs. We outline techniques used in the implementation of the tool and present test results on the NAS parallel benchmarks and ARC3D, a CFD application. This work demonstrates the great potential of using computer-aided tools to quickly port parallel programs and also achieve good performance.

  16. Socially shared mourning: construction and consumption of collective memory

    NASA Astrophysics Data System (ADS)

    Harju, Anu

    2015-04-01

    Social media, such as YouTube, is increasingly a site of collective remembering where personal tributes to celebrity figures become sites of public mourning. YouTube, especially, is rife with celebrity commemorations. Examining fans' online mourning practices on YouTube, this paper examines video tributes dedicated to the late Steve Jobs, with a focus on collective remembering and collective construction of memory. Combining netnography with critical discourse analysis, the analysis focuses on the user comments where the past unfolds in interaction and meanings are negotiated and contested. The paper argues that celebrity death may, for avid fans, be a source of disenfranchised grief, a type of grief characterised by inadequate social support, usually arising from lack of empathy for the loss. The paper sheds light on the functions digital memorials have for mourning fans (and fandom) and argues that social media sites have come to function as spaces of negotiation, legitimisation and alleviation of disenfranchised grief. It is also suggested that when it comes to disenfranchised grief, and grief work generally, the concept of community be widened to include communities of weak ties, a typical form of communal belonging on social media.

  17. High Performance Programming Using Explicit Shared Memory Model on Cray T3D

    NASA Technical Reports Server (NTRS)

    Simon, Horst D.; Saini, Subhash; Grassi, Charles

    1994-01-01

    The Cray T3D system is the first-phase system in Cray Research, Inc.'s (CRI) three-phase massively parallel processing (MPP) program. This system features a heterogeneous architecture that closely couples DEC's Alpha microprocessors and CRI's parallel-vector technology, i.e., the Cray Y-MP and Cray C90. An overview of the Cray T3D hardware and available programming models is presented. Under the Cray Research adaptive Fortran (CRAFT) model, four programming methods (data parallel, work sharing, message-passing using PVM, and explicit shared memory model) are offered to users. However, at this time the data parallel and work sharing programming models are not yet available to the user community. The differences between standard PVM and CRI's PVM are highlighted with performance measurements such as latencies and communication bandwidths. We have found that neither standard PVM nor CRI's PVM fully exploits the hardware capabilities of the T3D. The reasons for the poor performance of PVM as a native message-passing library are presented. This is illustrated by the performance of the NAS Parallel Benchmarks (NPB) programmed in the explicit shared memory model on the Cray T3D. In general, the performance of standard PVM is about 4 to 5 times lower than that obtained using the explicit shared memory model. This degradation in performance is also seen on the CM-5, where the performance of applications using the native message-passing library CMMD is also about 4 to 5 times lower than that obtained using data parallel methods. The issues involved in programming in the explicit shared memory model (such as barriers, synchronization, invalidating the data cache, aligning the data cache, etc.) are discussed. Comparative performance of NPB using the explicit shared memory programming model on the Cray T3D and other highly parallel systems such as the TMC CM-5, Intel Paragon, Cray C90, IBM-SP1, etc. is presented.

  18. A new shared-memory programming paradigm for molecular dynamics simulations on the Intel Paragon

    SciTech Connect

    D'Azevedo, E.F.; Romine, C.H.

    1994-12-01

    This report describes the use of shared memory emulation with DOLIB (Distributed Object Library) to simplify parallel programming on the Intel Paragon. A molecular dynamics application is used as an example to illustrate the use of the DOLIB shared memory library. SOTON-PAR, a parallel molecular dynamics code with explicit message-passing using a Lennard-Jones 6-12 potential, is rewritten using DOLIB primitives. The resulting code has no explicit message primitives and resembles a serial code. The new code can perform dynamic load balancing and achieves better performance than the original parallel code with explicit message-passing.

  19. Forgetting our personal past: socially shared retrieval-induced forgetting of autobiographical memories.

    PubMed

    Stone, Charles B; Barnier, Amanda J; Sutton, John; Hirst, William

    2013-11-01

    People often talk to others about their personal past. These discussions are inherently selective. Selective retrieval of memories in the course of a conversation may induce forgetting of unmentioned but related memories for both speakers and listeners (Cuc, Koppel, & Hirst, 2007). Cuc et al. (2007) defined the forgetting on the part of the speaker as within-individual retrieval-induced forgetting (WI-RIF) and the forgetting on the part of the listener as socially shared retrieval-induced forgetting (SS-RIF). However, if the forgetting associated with WI-RIF and SS-RIF is to be taken seriously as a mechanism that shapes both individual and shared memories, this mechanism must be demonstrated with meaningful material and in ecologically valid groups. In our first 2 experiments we extended SS-RIF from unemotional, experimenter-contrived material to the emotional and unemotional autobiographical memories of strangers (Experiment 1) and intimate couples (Experiment 2) when merely overhearing the speaker selectively practice memories. We then extended these results to the context of a free-flowing conversation (Experiments 3 and 4). In all 4 experiments we found WI-RIF and SS-RIF regardless of the emotional valence or individual ownership of the memories. We discuss our findings in terms of the role of conversational silence in shaping both our personal and shared pasts.

  20. A shared neural ensemble links distinct contextual memories encoded close in time

    PubMed Central

    Cai, Denise J.; Aharoni, Daniel; Shuman, Tristan; Shobe, Justin; Biane, Jeremy; Song, Weilin; Wei, Brandon; Veshkini, Michael; La-Vu, Mimi; Lou, Jerry; Flores, Sergio; Kim, Isaac; Sano, Yoshitake; Zhou, Miou; Baumgaertel, Karsten; Lavi, Ayal; Kamata, Masakazu; Tuszynski, Mark; Mayford, Mark; Golshani, Peyman; Silva, Alcino J.

    2016-01-01

    Recent studies suggest the hypothesis that a shared neural ensemble may link distinct memories encoded close in time [1–13]. According to the memory allocation hypothesis [1,2], learning triggers a temporary increase in neuronal excitability [14–16] that biases the representation of a subsequent memory to the neuronal ensemble encoding the first memory, such that recall of one memory increases the likelihood of recalling the other memory. Accordingly, we report that the overlap between the hippocampal CA1 ensembles activated by two distinct contexts acquired within a day is higher than when they are separated by a week. Multiple convergent findings indicate that this overlap of neuronal ensembles links two contextual memories. First, fear paired with one context is transferred to a neutral context when the two are acquired within a day but not across a week. Second, the first memory strengthens the second memory within a day but not across a week. Older mice, known to have lower CA1 excitability [16,17], do not show the overlap between ensembles, the transfer of fear between contexts, or the strengthening of the second memory. Finally, in aged animals, increasing cellular excitability and activating a common ensemble of CA1 neurons during two distinct context exposures rescued the deficit in linking memories. Taken together, these findings demonstrate that contextual memories encoded close in time are linked by directing storage into overlapping ensembles. Alteration of these processes by aging could affect the temporal structure of memories, thus impairing efficient recall of related information. PMID:27251287

  1. Functions of Memory Sharing and Mother-Child Reminiscing Behaviors: Individual and Cultural Variations

    ERIC Educational Resources Information Center

    Kulkofsky, Sarah; Wang, Qi; Koh, Jessie Bee Kim

    2009-01-01

    This study examined maternal beliefs about the functions of memory sharing and the relations between these beliefs and mother-child reminiscing behaviors in a cross-cultural context. Sixty-three European American and 47 Chinese mothers completed an open-ended questionnaire concerning their beliefs about the functions of parent-child memory…

  2. LU Factorization with Partial Pivoting for a Multi-CPU, Multi-GPU Shared Memory System

    SciTech Connect

    Kurzak, Jakub; Luszczek, Piotr; Faverge, Mathieu; Dongarra, Jack

    2012-03-01

    LU factorization with partial pivoting is a canonical numerical procedure and the main component of the High Performance LINPACK benchmark. This article presents an implementation of the algorithm for a hybrid, shared memory, system with standard CPU cores and GPU accelerators. Performance in excess of one TeraFLOPS is achieved using four AMD Magny Cours CPUs and four NVIDIA Fermi GPUs.

  3. Using memory in the Cedar system

    SciTech Connect

    McGrath, R.E.; Emrath, P.

    1987-01-01

    The design of the virtual memory system for the Cedar multiprocessor under construction at the University of Illinois is discussed. The Cedar architecture features a hierarchy of memory, some shared by all processors, and some shared by subsets of processors. The Xylem operating system is based on Alliant Computer Systems CONCENTRIX(TM) operating system, which is based on 4.2BSD UNIX(TM). Xylem supports multi-tasking and demand paging of parts of the memory hierarchy into a linear virtual address space. Memory may be private to a task or shared between all the tasks. The locality and attributes of a page may be modified during the execution of a program. Examples of how these mechanisms can be used are discussed. 14 figs.

  4. Visual and Spatial Working Memory Are Not that Dissociated after All: A Time-Based Resource-Sharing Account

    ERIC Educational Resources Information Center

    Vergauwe, Evie; Barrouillet, Pierre; Camos, Valerie

    2009-01-01

    Examinations of interference between visual and spatial materials in working memory have suggested domain- and process-based fractionations of visuo-spatial working memory. The present study examined the role of central time-based resource sharing in visuo-spatial working memory and assessed its role in obtained interference patterns. Visual and…

  5. Parallel calculations on shared memory, NUMA-based computers using MATLAB

    NASA Astrophysics Data System (ADS)

    Krotkiewski, Marcin; Dabrowski, Marcin

    2014-05-01

    Achieving satisfactory computational performance in numerical simulations on modern computer architectures can be a complex task. Multi-core design makes it necessary to parallelize the code. Efficient parallelization on NUMA (Non-Uniform Memory Access) shared memory architectures necessitates explicit placement of the data in the memory close to the CPU that uses it. In addition, using more than 8 CPUs (~100 cores) requires a cluster solution of interconnected nodes, which involves (expensive) communication between the processors. It takes significant effort to overcome these challenges even when programming in low-level languages, which give the programmer full control over data placement and work distribution. Instead, many modelers use high-level tools such as MATLAB, which severely limit the optimization/tuning options available. Nonetheless, the advantage of programming simplicity and a large available code base can tip the scale in favor of MATLAB. We investigate whether MATLAB can be used for efficient, parallel computations on modern shared memory architectures. A common approach to performance optimization of MATLAB programs is to identify a bottleneck and migrate the corresponding code block to a MEX file implemented in, e.g., C. Instead, we aim at achieving scalable parallel performance of MATLAB's core functionality. Some of MATLAB's internal functions (e.g., bsxfun, sort, BLAS3, operations on vectors) are multi-threaded. Achieving high parallel efficiency of those may potentially improve the performance of a significant portion of MATLAB's code base. Since we do not have MATLAB's source code, our performance tuning relies on the tools provided by the operating system alone. Most importantly, we use custom memory allocation routines, thread-to-CPU binding, and memory page migration.
The performance tests are carried out on multi-socket shared memory systems (2- and 4-way Intel-based computers), as well as a Distributed Shared Memory machine with 96 CPU

  6. A multiprocessor operating system simulator

    SciTech Connect

    Johnston, G.M.; Campbell, R.H. (Dept. of Computer Science)

    1988-01-01

    This paper describes a multiprocessor operating system simulator that was developed by the authors in the Fall of 1987. The simulator was built in response to the need to provide students with an environment in which to build and test operating system concepts as part of the coursework of a third-year undergraduate operating systems course. Written in C++, the simulator uses the co-routine style task package that is distributed with the AT&T C++ Translator to provide a hierarchy of classes that represents a broad range of operating system software and hardware components. The class hierarchy closely follows that of the Choices family of operating systems for loosely and tightly coupled multiprocessors. During an operating system course, these classes are refined and specialized by students in homework assignments to facilitate experimentation with different aspects of operating system design and policy decisions. The current implementation runs on the IBM RT PC under 4.3bsd UNIX.

  7. A Multiprocessor Operating System Simulator

    NASA Technical Reports Server (NTRS)

    Johnston, Gary M.; Campbell, Roy H.

    1988-01-01

    This paper describes a multiprocessor operating system simulator that was developed by the authors in the Fall semester of 1987. The simulator was built in response to the need to provide students with an environment in which to build and test operating system concepts as part of the coursework of a third-year undergraduate operating systems course. Written in C++, the simulator uses the co-routine style task package that is distributed with the AT&T C++ Translator to provide a hierarchy of classes that represents a broad range of operating system software and hardware components. The class hierarchy closely follows that of the 'Choices' family of operating systems for loosely- and tightly-coupled multiprocessors. During an operating system course, these classes are refined and specialized by students in homework assignments to facilitate experimentation with different aspects of operating system design and policy decisions. The current implementation runs on the IBM RT PC under 4.3bsd UNIX.

  8. Data traffic reduction schemes for Cholesky factorization on asynchronous multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Naik, Vijay K.; Patrick, Merrell L.

    1989-01-01

    Communication requirements of Cholesky factorization of dense and sparse symmetric, positive definite matrices are analyzed. The communication requirement is characterized by the data traffic generated on multiprocessor systems with local and shared memory. Lower bound proofs are given to show that when the load is uniformly distributed, the data traffic associated with factoring an $n \times n$ dense matrix using $n^{\alpha}$ ($\alpha \le 2$) processors is $\Omega(n^{2+\alpha/2})$. For $n \times n$ sparse matrices representing a $\sqrt{n} \times \sqrt{n}$ regular grid graph, the data traffic is shown to be $\Omega(n^{1+\alpha/2})$, $\alpha \le 1$. Partitioning schemes that are variations of the block assignment scheme are described, and it is shown that the data traffic generated by these schemes is asymptotically optimal. The schemes allow efficient use of up to $O(n^{2})$ processors in the dense case and up to $O(n)$ processors in the sparse case before the total data traffic reaches the maximum value of $O(n^{3})$ and $O(n^{3/2})$, respectively. It is shown that the block-based partitioning schemes allow better utilization of the data accessed from shared memory, and thus reduce the data traffic more than schemes based on column-wise wrap-around assignment.
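
    As a quick consistency check on the bounds quoted in this abstract, the sketch below evaluates the exponent of n in each lower bound (2 + alpha/2 for the dense case, 1 + alpha/2 for the sparse case) and confirms that the largest allowed alpha reproduces the stated maxima of n cubed and n to the 3/2. The function names are hypothetical, and the lower limit alpha >= 0 is an assumption; the abstract states only the upper limits.

```python
from fractions import Fraction


def dense_traffic_exponent(alpha):
    """Exponent e such that the dense lower bound is Omega(n**e).

    Uses e = 2 + alpha/2 for n**alpha processors, as quoted in the abstract.
    The range check 0 <= alpha <= 2 assumes alpha is non-negative.
    """
    alpha = Fraction(alpha)
    if not 0 <= alpha <= 2:
        raise ValueError("dense case expects 0 <= alpha <= 2")
    return 2 + alpha / 2


def sparse_traffic_exponent(alpha):
    """Exponent e such that the sparse (grid graph) lower bound is Omega(n**e).

    Uses e = 1 + alpha/2, with 0 <= alpha <= 1 (lower limit assumed).
    """
    alpha = Fraction(alpha)
    if not 0 <= alpha <= 1:
        raise ValueError("sparse case expects 0 <= alpha <= 1")
    return 1 + alpha / 2


if __name__ == "__main__":
    # At the largest processor counts the bounds meet the stated maxima.
    print(dense_traffic_exponent(2))   # exponent for n**2 processors
    print(sparse_traffic_exponent(1))  # exponent for n processors
```

    With alpha = 2 the dense exponent is 3, and with alpha = 1 the sparse exponent is 3/2, matching the maximum traffic values given in the abstract.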

  9. Reproducibility in a multiprocessor system

    SciTech Connect

    Bellofatto, Ralph A; Chen, Dong; Coteus, Paul W; Eisley, Noel A; Gara, Alan; Gooding, Thomas M; Haring, Rudolf A; Heidelberger, Philip; Kopcsay, Gerard V; Liebsch, Thomas A; Ohmacht, Martin; Reed, Don D; Senger, Robert M; Steinmacher-Burow, Burkhard; Sugawara, Yutaka

    2013-11-26

    Fixing a problem is usually greatly aided if the problem is reproducible. To ensure reproducibility of a multiprocessor system, the following aspects are proposed: a deterministic system start state, a single system clock, phase alignment of clocks in the system, system-wide synchronization events, reproducible execution of system components, deterministic chip interfaces, zero-impact communication with the system, precise stop of the system, and a scan of the system state.

  10. Vascular system modeling in parallel environment - distributed and shared memory approaches.

    PubMed

    Jurczuk, Krzysztof; Kretowski, Marek; Bezy-Wendling, Johanne

    2011-07-01

    This paper presents two approaches in parallel modeling of vascular system development in internal organs. In the first approach, new parts of tissue are distributed among processors and each processor is responsible for perfusing its assigned parts of tissue to all vascular trees. Communication between processors is accomplished by passing messages, and therefore, this algorithm is perfectly suited for distributed memory architectures. The second approach is designed for shared memory machines. It parallelizes the perfusion process during which individual processing units perform calculations concerning different vascular trees. The experimental results, performed on a computing cluster and multicore machines, show that both algorithms provide a significant speedup.

  11. Shared mushroom body circuits underlie visual and olfactory memories in Drosophila

    PubMed Central

    Vogt, Katrin; Schnaitmann, Christopher; Dylla, Kristina V; Knapek, Stephan; Aso, Yoshinori; Rubin, Gerald M; Tanimoto, Hiromu

    2014-01-01

    In nature, animals form memories associating reward or punishment with stimuli from different sensory modalities, such as smells and colors. It is unclear, however, how distinct sensory memories are processed in the brain. We established appetitive and aversive visual learning assays for Drosophila that are comparable to the widely used olfactory learning assays. These assays share critical features, such as reinforcing stimuli (sugar reward and electric shock punishment), and allow direct comparison of the cellular requirements for visual and olfactory memories. We found that the same subsets of dopamine neurons drive formation of both sensory memories. Furthermore, distinct yet partially overlapping subsets of mushroom body intrinsic neurons are required for visual and olfactory memories. Thus, our results suggest that distinct sensory memories are processed in a common brain center. Such centralization of related brain functions is an economical design that avoids the repetition of similar circuit motifs. DOI: http://dx.doi.org/10.7554/eLife.02395.001 PMID:25139953

  12. IMPACC: A Tightly Integrated MPI+OpenACC Framework Exploiting Shared Memory Parallelism

    SciTech Connect

    Lee, Seyong; Vetter, Jeffrey S

    2016-01-01

    We propose IMPACC, an MPI+OpenACC framework for heterogeneous accelerator clusters. IMPACC tightly integrates MPI and OpenACC, while exploiting the shared memory parallelism in the target system. IMPACC dynamically adapts the input MPI+OpenACC applications on the target heterogeneous accelerator clusters to fully exploit target system-specific features. IMPACC provides programmers with a unified virtual address space, automatic NUMA-friendly task-device mapping, efficient integrated communication routines, seamless streamlining of asynchronous executions, and transparent memory sharing. We have implemented IMPACC and evaluated its performance using three heterogeneous accelerator systems, including the Titan supercomputer. Results show that IMPACC can achieve easier programming, higher performance, and better scalability than the current MPI+OpenACC model.

  13. Fault tolerant onboard packet switch architecture for communication satellites: Shared memory per beam approach

    NASA Technical Reports Server (NTRS)

    Shalkhauser, Mary Jo; Quintana, Jorge A.; Soni, Nitin J.

    1994-01-01

    The NASA Lewis Research Center is developing a multichannel communication signal processing satellite (MCSPS) system which will provide low data rate, direct to user, commercial communications services. The focus of current space segment developments is a flexible, high-throughput, fault tolerant onboard information switching processor. This information switching processor (ISP) is a destination-directed packet switch which performs both space and time switching to route user information among numerous user ground terminals. Through both industry study contracts and in-house investigations, several packet switching architectures were examined. A contention-free approach, the shared memory per beam architecture, was selected for implementation. The shared memory per beam architecture, fault tolerance insertion, implementation, and demonstration plans are described.

  14. Immune and nervous systems share molecular and functional similarities: memory storage mechanism.

    PubMed

    Habibi, L; Ebtekar, M; Jameie, S B

    2009-04-01

    One of the most complex and important features of both the nervous and immune systems is their data storage and retrieval capability. Both systems encounter a common and complex challenge on how to overcome the cumbersome task of data management. Because each neuron makes many synapses with other neurons, they are capable of receiving data from thousands of synaptic connections. The immune system B and T cells have to deal with a similar level of complexity because of their unlimited task of recognizing foreign antigens. As for the complexity of memory storage, it has been proposed that both systems may share a common set of molecular mechanisms. Here, we review the molecular bases of memory storage in neurons and immune cells based on recent studies and findings. The expression of certain molecules and mechanisms shared between the two systems, including cytokine networks and cell surface receptors, is reviewed. Intracellular signaling similarities and certain mechanisms such as diversity, memory storage, and their related molecular properties are briefly discussed. Moreover, two similar genetic mechanisms used by both systems are discussed, putting forward the idea that DNA recombination may be an underlying mechanism involved in CNS memory storage.

  15. Multiprocessor computer overset grid method and apparatus

    DOEpatents

    Barnette, Daniel W.; Ober, Curtis C.

    2003-01-01

    A multiprocessor computer overset grid method and apparatus comprises associating points in each overset grid with processors and using mapped interpolation transformations to communicate intermediate values between processors assigned base and target points of the interpolation transformations. The method allows a multiprocessor computer to operate with effective load balance on overset grid applications.

  16. Global arrays: A portable "shared-memory" programming model for distributed memory computers

    SciTech Connect

    Harrison, R.J.; Nieplocha, J.; Littlefield, R.J.

    1994-11-01

    Portability, efficiency, and ease of coding are all important considerations in choosing the programming model for a scalable parallel application. The message-passing programming model is widely used because of its portability, yet some applications are too complex to code in it while also trying to maintain a balanced computation load and avoid redundant computations. The shared-memory programming model simplifies coding, but it is not portable and often provides little control over interprocessor data transfer costs. This paper describes a new approach, called Global Arrays (GA), that combines the better features of both other models, leading to both simple coding and efficient execution. The key concept of GA is that it provides a portable interface through which each process in a MIMD parallel program can asynchronously access logical blocks of physically distributed matrices, with no need for explicit cooperation by other processes. The authors have implemented GA libraries on a variety of computer systems, including the Intel DELTA and Paragon, the IBM SP-1 (all message-passers), the Kendall Square KSR-2 (a nonuniform access shared-memory machine), and networks of Unix workstations. They discuss the design and implementation of these libraries, report their performance, illustrate the use of GA in the context of computational chemistry applications, and describe the use of a GA performance visualization tool.
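
    The key concept described above, access to logical blocks of a physically distributed matrix without the caller knowing which process owns the data, can be illustrated with a toy single-process emulation. This is only a sketch under stated assumptions: `ToyGlobalArray` and its methods are hypothetical names, the row-block distribution is an assumed layout, and nothing here reflects the actual GA library interface or its one-sided communication.

```python
class ToyGlobalArray:
    """Emulates a rows x cols matrix split row-wise across `nprocs` owners.

    Each owner holds a contiguous chunk of rows in its own local list;
    callers address the matrix with global indices only.
    """

    def __init__(self, rows, cols, nprocs):
        self.rows, self.cols = rows, cols
        self.chunk = -(-rows // nprocs)  # ceiling division: rows per owner
        self.local = [
            [[0.0] * cols
             for _ in range(min(self.chunk, max(0, rows - p * self.chunk)))]
            for p in range(nprocs)
        ]

    def owner(self, row):
        """Index of the (emulated) process that stores a global row."""
        return row // self.chunk

    def put(self, r0, c0, block):
        """Write a logical block whose top-left corner is (r0, c0)."""
        if r0 < 0 or c0 < 0 or r0 + len(block) > self.rows:
            raise IndexError("block exceeds row range")
        for i, src in enumerate(block):
            if c0 + len(src) > self.cols:
                raise IndexError("block exceeds column range")
            p, li = divmod(r0 + i, self.chunk)  # owner, local row index
            self.local[p][li][c0:c0 + len(src)] = src

    def get(self, r0, r1, c0, c1):
        """Read the logical block [r0, r1) x [c0, c1) as a list of rows."""
        out = []
        for r in range(r0, r1):
            p, li = divmod(r, self.chunk)
            out.append(self.local[p][li][c0:c1])
        return out


if __name__ == "__main__":
    ga = ToyGlobalArray(rows=5, cols=4, nprocs=2)
    ga.put(2, 1, [[1, 2], [3, 4]])   # block spans rows owned by 0 and 1
    print(ga.owner(2), ga.owner(3))
    print(ga.get(2, 4, 1, 3))
```

    The block written by `put` straddles two owners, yet the caller uses only global indices; in a real distributed setting the per-owner copies would become remote transfers.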

  17. Coscheduling Technique for Symmetric Multiprocessor Clusters

    SciTech Connect

    Yoo, A B; Jette, M A

    2000-09-18

    Coscheduling is essential for obtaining good performance in a time-shared symmetric multiprocessor (SMP) cluster environment. However, the most common technique, gang scheduling, has limitations such as poor scalability and vulnerability to faults, mainly due to explicit synchronization between its components. A decentralized approach called dynamic coscheduling (DCS) has been shown to be effective for networks of workstations (NOW), but this technique is not suitable for the workloads on a very large SMP cluster with thousands of processors. Furthermore, its implementation can be prohibitively expensive for such a large-scale machine. In this paper, the authors propose a novel coscheduling technique based on the DCS approach which can achieve coscheduling on very large SMP clusters in a scalable, efficient, and cost-effective way. In the proposed technique, each local scheduler achieves coscheduling based upon message traffic between the components of parallel jobs. Message trapping is carried out at the user level, eliminating the need for unsupported hardware or device-level programming. A sending process attaches its status to outgoing messages so local schedulers on remote nodes can make more intelligent scheduling decisions. Once scheduled, processes are guaranteed some minimum period of time to execute. This provides an opportunity to synchronize the parallel job's components across all nodes and achieve good program performance. The results from a performance study reveal that the proposed technique is a promising approach that can reduce response time significantly over uncoordinated time-sharing and batch scheduling.

  18. Parallel Reduction of Large Radar Interferometry Scenes on a Mid-scale, Symmetric Multiprocessor Mainframe Computer

    NASA Astrophysics Data System (ADS)

    Harcke, L. J.; Zebker, H. A.

    2006-12-01

    We report on experiences in processing repeat-orbit interferometry data sets on a mid-scale multiprocessor mainframe computer. Newer applications of interferometric and polarimetric data processing, such as permanent scatterer deformation monitoring, require the generation of many tens of repeat-pass interferometry data pairs, perhaps 30 to 50, to provide sufficient input to the deformation model. Moving existing radar processing techniques toward massively parallel computation provides a path to coping with such large data sets, which can consist of 30 to 50 gigabytes (GB) of raw data. In June 2006, the Stanford School of Earth Sciences dedicated a new computation center for general research use. Two large machines compose the center: a single-node, symmetric multiprocessor (SMP) machine with 48 processor cores and a single 192 GB memory, and a 64-node distributed cluster containing 128 processor cores with at least 2 GB of memory per node. Distributed processing of the matched filter for synthetic aperture radar image formation requires a high communication-to-computation ratio. Experiments performed over a decade ago on distributed memory supercomputers, and repeated a half-decade ago on commodity workstation clusters, both demonstrated saturation of inter-node communication links. For this reason, we chose to parallelize the interferometric processor on the shared memory computer using the OpenMP programming standard. We find, not unexpectedly, that the input/output stage of processing standard 100-by-100 kilometer ERS-1 scenes quickly dominates the total computation time, and that only modest improvements in processing time are achieved after 8 to 16 processor cores are brought to bear on a single data set. The input and output data sit in single, serially accessed disk files, creating a bottleneck for overall throughput.
This points to a scheme for efficient partitioning of mid-size (24- to 48-core) machines for reducing large Earth science data sets, where 3 to

  19. Multiprocessor system with multiple concurrent modes of execution

    SciTech Connect

    Ahn, Daniel; Ceze, Luis H.; Chen, Dong; Gara, Alan; Heidelberger, Philip; Ohmacht, Martin

    2016-11-22

    A multiprocessor system supports multiple concurrent modes of speculative execution. Speculation identification numbers (IDs) are allocated to speculative threads from a pool of available numbers. The pool is divided into domains, with each domain being assigned to a mode of speculation. Modes of speculation include TM, TLS, and rollback. Allocation of the IDs is carried out with respect to a central state table and using hardware pointers. The IDs are used for writing different versions of speculative results in different ways of a set in a cache memory.
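
    The allocation scheme in this abstract (a pool of speculation IDs divided into domains, one per speculation mode, tracked in a central state table) can be sketched in software as follows. This is an illustrative model only: the class name, the domain sizes, and the use of Python lists in place of hardware pointers are all assumptions, not details from the patent.

```python
class SpeculationIdPool:
    """Toy model of a speculation-ID pool partitioned into per-mode domains."""

    def __init__(self, domains):
        # domains maps a mode name to the IDs reserved for that mode,
        # e.g. {"TM": range(0, 4), "TLS": range(4, 6)} (sizes are made up).
        self._free = {mode: list(ids) for mode, ids in domains.items()}
        # Central state table: ID -> (mode, owning thread).
        self._table = {}

    def allocate(self, mode, thread):
        """Hand out the next free ID in the mode's domain, or None if empty."""
        free = self._free[mode]
        if not free:
            return None
        spec_id = free.pop(0)
        self._table[spec_id] = (mode, thread)
        return spec_id

    def release(self, spec_id):
        """Return an ID to the domain it was allocated from."""
        mode, _thread = self._table.pop(spec_id)
        self._free[mode].append(spec_id)

    def mode_of(self, spec_id):
        """Look up which speculation mode an in-use ID belongs to."""
        return self._table[spec_id][0]


if __name__ == "__main__":
    pool = SpeculationIdPool(
        {"TM": range(0, 4), "TLS": range(4, 6), "rollback": range(6, 8)}
    )
    a = pool.allocate("TM", "thread-0")
    b = pool.allocate("TLS", "thread-1")
    print(a, b, pool.mode_of(b))
```

    Because each mode draws only from its own domain, exhausting the IDs of one mode (for example TLS) does not prevent threads in another mode from obtaining an ID.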

  20. Simulating an aerospace multiprocessor. [for space guidance computers]

    NASA Technical Reports Server (NTRS)

    Mallach, E. G.

    1976-01-01

    The paper describes a simulator which was used to evaluate the architecture of an aerospace multiprocessor. The simulator models interactions among the processors, memories, the central data bus, and a possible 'job stack'. Special features of the simulator are discussed, including the use of explicitly coded and individually distinguishable 'job models' instead of a statistically defined 'job mix' and a specialized Job Model Definition Language to automate the detailed coding of the models. Some results are presented which show that when the simulator was employed in conjunction with queuing theory and Markov-process analysis, more insight into system behavior was obtained than would have been with any one technique alone.

  1. Multiprocessor system with multiple concurrent modes of execution

    DOEpatents

    Ahn, Daniel; Ceze, Luis H; Chen, Dong; Gara, Alan; Heidelberger, Philip; Ohmacht, Martin

    2013-12-31

    A multiprocessor system supports multiple concurrent modes of speculative execution. Speculation identification numbers (IDs) are allocated to speculative threads from a pool of available numbers. The pool is divided into domains, with each domain being assigned to a mode of speculation. Modes of speculation include TM, TLS, and rollback. Allocation of the IDs is carried out with respect to a central state table and using hardware pointers. The IDs are used for writing different versions of speculative results in different ways of a set in a cache memory.

  2. Shared and Distributed Memory Parallel Security Analysis of Large-Scale Source Code and Binary Applications

    SciTech Connect

    Quinlan, D; Barany, G; Panas, T

    2007-08-30

    Many forms of security analysis on large scale applications can be substantially automated but the size and complexity can exceed the time and memory available on conventional desktop computers. Most commercial tools are understandably focused on such conventional desktop resources. This paper presents research work on the parallelization of security analysis of both source code and binaries within our Compass tool, which is implemented using the ROSE source-to-source open compiler infrastructure. We have focused on both shared and distributed memory parallelization of the evaluation of rules implemented as checkers for a wide range of secure programming rules, applicable to desktop machines, networks of workstations and dedicated clusters. While Compass as a tool focuses on source code analysis and reports violations of an extensible set of rules, the binary analysis work uses the exact same infrastructure but is less well developed into an equivalent final tool.

  3. Exploiting Processor Groups to Extend Scalability of the GA Shared Memory Programming Model

    SciTech Connect

    Nieplocha, Jarek; Krishnan, Manoj Kumar; Palmer, Bruce J.; Tipparaju, Vinod; Zhang, Yeliang

    2005-05-04

    Exploiting processor groups is becoming increasingly important for programming next-generation high-end systems composed of tens or hundreds of thousands of processors. This paper discusses the requirements, functionality and development of multilevel-parallelism based on processor groups in the context of the Global Array (GA) shared memory programming model. The main effort involves management of shared data, rather than interprocessor communication. Experimental results for the NAS NPB Conjugate Gradient benchmark and a molecular dynamics (MD) application are presented for a Linux cluster with Myrinet and illustrate the value of the proposed approach for improving scalability. While the original GA version of the CG benchmark lagged MPI, the processor-group version outperforms MPI in all cases, except for a few points on the smallest problem size. Similarly, the group version of the MD application improves execution time by 58% on 32 processors.

  4. An implementation of SISAL for distributed-memory architectures

    SciTech Connect

    Beard, Patrick C.

    1995-06-01

    This thesis describes a new implementation of the implicitly parallel functional programming language SISAL, for massively parallel processor supercomputers. The Optimizing SISAL Compiler (OSC), developed at Lawrence Livermore National Laboratory, was originally designed for shared-memory multiprocessor machines and has been adapted to distributed-memory architectures. OSC has been relatively portable between shared-memory architectures, because they are architecturally similar, and OSC generates portable C code. However, distributed-memory architectures are not standardized -- each has a different programming model. Distributed-memory SISAL depends on a layer of software that provides a portable, distributed, shared-memory abstraction. This layer is provided by Split-C, a dialect of the C programming language developed at U.C. Berkeley, which has demonstrated good performance on distributed-memory architectures. Split-C provides important capabilities for good performance: support for program-specific distributed data structures, and split-phase memory operations. Distributed data structures help achieve good memory locality, while split-phase memory operations help tolerate the longer communication latencies inherent in distributed-memory architectures. The distributed-memory SISAL compiler and run-time system take advantage of these capabilities. The result of these efforts is a compiler that runs identically on the Thinking Machines Connection Machine (CM-5) and the Meiko Computing Surface (CS-2).

  5. Multiprocessor performance modeling with ADAS

    NASA Technical Reports Server (NTRS)

    Hayes, Paul J.; Andrews, Asa M.

    1989-01-01

    A graph managing strategy referred to as the Algorithm to Architecture Mapping Model (ATAMM) appears useful for the time-optimized execution of application algorithm graphs in embedded multiprocessors and for the performance prediction of graph designs. This paper reports the modeling of ATAMM in the Architecture Design and Assessment System (ADAS) to make an independent verification of ATAMM's performance prediction capability and to provide a user framework for the evaluation of arbitrary algorithm graphs. Following an overview of ATAMM and its major functional rules are descriptions of the ADAS model of ATAMM, methods to enter an arbitrary graph into the model, and techniques to analyze the simulation results. The performance of a 7-node graph example is evaluated using the ADAS model and verifies the ATAMM concept by substantiating previously published performance results.

  6. Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing

    NASA Astrophysics Data System (ADS)

    Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide

    2015-09-01

    The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.

  7. An experimental distributed microprocessor implementation with a shared memory communications and control medium

    NASA Technical Reports Server (NTRS)

    Mejzak, R. S.

    1980-01-01

    The distributed processing concept is defined in terms of control primitives, variables, and structures and their use in performing a decomposed discrete Fourier transform (DFT) application function. The design assumes interprocessor communications to be anonymous. In this scheme, all processors can access an entire common database by employing control primitives. Access to selected areas within the common database is random, enforced by a hardware lock, and determined by task and subtask pointers. This enables the number of processors to be varied in the configuration without any modifications to the control structure. Decompositional elements of the DFT application function in terms of tasks and subtasks are also described. The experimental hardware configuration consists of IMSAI 8080 chassis, which are independent 8-bit microcomputer units. These chassis are linked together to form a multiple processing system by means of a shared memory facility. This facility consists of hardware which provides a bus structure to enable up to six microcomputers to be interconnected. It provides polling and arbitration logic so that only one processor has access to shared memory at any one time.
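
    The control scheme in this abstract (anonymous workers claiming subtasks through a locked pointer in shared memory) can be sketched as follows. This is an illustrative assumption, not the report's implementation: Python threads stand in for the microcomputers, a `threading.Lock` stands in for the hardware lock, the function name `shared_memory_dft` is hypothetical, and only the subtask pointer is serialized here, whereas the hardware described serializes every shared-memory access.

```python
import cmath
import threading


def shared_memory_dft(samples, num_workers):
    """Compute a DFT with workers that claim one output bin (subtask) at a time."""
    n = len(samples)
    result = [0j] * n          # shared output area
    next_bin = [0]             # shared subtask pointer
    lock = threading.Lock()    # stands in for the hardware lock

    def worker():
        while True:
            with lock:         # only one worker reads/advances the pointer at a time
                k = next_bin[0]
                if k >= n:
                    return     # no subtasks left
                next_bin[0] = k + 1
            # Subtask: one DFT output bin, computed outside the lock.
            result[k] = sum(
                x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(samples)
            )

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return result


spectrum = shared_memory_dft([1, 0, 0, 0], num_workers=3)
```

    The worker code never refers to the number of workers, which mirrors the abstract's point that processors can be added or removed without changing the control structure.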

  8. ATAMM enhancement and multiprocessor performance evaluation

    NASA Technical Reports Server (NTRS)

    Stoughton, John W.; Mielke, Roland R.; Som, Sukhamoy; Obando, Rodrigo; Malekpour, Mahyar R.; Jones, Robert L., III; Mandala, Brij Mohan V.

    1991-01-01

    ATAMM (Algorithm To Architecture Mapping Model) enhancement and multiprocessor performance evaluation are discussed. The following topics are included: the ATAMM model; ATAMM enhancement; ADM (Advanced Development Model) implementation of ATAMM; and ATAMM support tools.

  9. Optimal eigenvalue computation on a mesh multiprocessor

    SciTech Connect

    Crivelli, S.; Jessup, E. R.

    1993-01-01

    In this paper, we compare the costs of computing a single eigenvalue of a symmetric tridiagonal matrix by serial bisection and by parallel multisection on a mesh multiprocessor. We show how the optimal method for computing one eigenvalue depends on such variables as the matrix order and parameters of the multiprocessor used. We present the results of experiments on the 520-processor Intel Touchstone Delta to support our analysis.
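
    The two methods compared in this abstract can be illustrated with a short sketch based on the standard Sturm-sequence eigenvalue count. The function names, the Gershgorin starting interval, and the tolerance are assumptions chosen for the example; the cut points are evaluated serially here, whereas on a multiprocessor each count could run on its own processor. With `points=1` the routine reduces to ordinary bisection.

```python
def count_below(diag, off, x):
    """Number of eigenvalues of the symmetric tridiagonal matrix that are < x."""
    count = 0
    q = 1.0
    for i, a in enumerate(diag):
        b2 = off[i - 1] ** 2 if i > 0 else 0.0
        q = a - x - b2 / q
        if q == 0.0:
            q = 1e-30          # nudge away from zero to avoid division by zero
        if q < 0.0:
            count += 1
    return count


def kth_eigenvalue(diag, off, k, points=1, tol=1e-12):
    """Locate the k-th smallest eigenvalue (k starts at 0) by multisection."""
    n = len(diag)
    radius = [
        (abs(off[i - 1]) if i > 0 else 0.0) + (abs(off[i]) if i < n - 1 else 0.0)
        for i in range(n)
    ]
    # Gershgorin bounds enclose every eigenvalue.
    lo = min(a - r for a, r in zip(diag, radius))
    hi = max(a + r for a, r in zip(diag, radius))
    while hi - lo > tol:
        step = (hi - lo) / (points + 1)
        cuts = [lo + step * (j + 1) for j in range(points)]
        counts = [count_below(diag, off, x) for x in cuts]  # parallelizable step
        new_lo, new_hi = lo, hi
        for x, c in zip(cuts, counts):
            if c > k:          # more than k eigenvalues lie below x
                new_hi = x
                break
            new_lo = x
        lo, hi = new_lo, new_hi
    return 0.5 * (lo + hi)


# Example matrix: diagonal 2, off-diagonal -1 (order 3).
smallest = kth_eigenvalue([2.0, 2.0, 2.0], [-1.0, -1.0], k=0)
middle = kth_eigenvalue([2.0, 2.0, 2.0], [-1.0, -1.0], k=1, points=3)
```

    Each multisection step shrinks the interval by a factor of `points + 1` instead of 2, at the cost of `points` counts per step, which is the trade-off the abstract says depends on the matrix order and machine parameters.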

  10. Testing and operating a multiprocessor chip with processor redundancy

    DOEpatents

    Bellofatto, Ralph E; Douskey, Steven M; Haring, Rudolf A; McManus, Moyra K; Ohmacht, Martin; Schmunkamp, Dietmar; Sugavanam, Krishnan; Weatherford, Bryan J

    2014-10-21

    A system and method for improving the yield rate of a multiprocessor semiconductor chip that includes primary processor cores and one or more redundant processor cores. A first tester conducts a first test on one or more processor cores, and encodes results of the first test in an on-chip non-volatile memory. A second tester conducts a second test on the processor cores, and encodes results of the second test in an external non-volatile storage device. An override bit of a multiplexer is set if a processor core fails the second test. In response to the override bit, the multiplexer selects a physical-to-logical mapping of processor IDs according to one of: the encoded results in the memory device or the encoded results in the external storage device. On-chip logic configures the processor cores according to the selected physical-to-logical mapping.

  11. gpuSPHASE-A shared memory caching implementation for 2D SPH using CUDA

    NASA Astrophysics Data System (ADS)

    Winkler, Daniel; Meister, Michael; Rezavand, Massoud; Rauch, Wolfgang

    2017-04-01

    Smoothed particle hydrodynamics (SPH) is a meshless Lagrangian method that has been successfully applied to computational fluid dynamics (CFD), solid mechanics and many other multi-physics problems. Using the method to solve transport phenomena in process engineering requires the simulation of several days to weeks of physical time. Based on the high computational demand of CFD such simulations in 3D need a computation time of years so that a reduction to a 2D domain is inevitable. In this paper gpuSPHASE, a new open-source 2D SPH solver implementation for graphics devices, is developed. It is optimized for simulations that must be executed with thousands of frames per second to be computed in reasonable time. A novel caching algorithm for Compute Unified Device Architecture (CUDA) shared memory is proposed and implemented. The software is validated and the performance is evaluated for the well established dambreak test case.

  12. Iterative algorithms for tridiagonal matrices on a WSI-multiprocessor

    SciTech Connect

    Gajski, D.D.; Sameh, A.H.; Wisniewski, J.A.

    1982-01-01

    With the rapid advances in semiconductor technology, the construction of Wafer Scale Integration (WSI)-multiprocessors consisting of a large number of processors is now feasible. We illustrate the implementation of some basic linear algebra algorithms on such multiprocessors.

  13. Parallel Fock matrix construction with distributed shared memory model for the FMO-MO method.

    PubMed

    Umeda, Hiroaki; Inadomi, Yuichi; Watanabe, Toshio; Yagi, Toru; Ishimoto, Takayoshi; Ikegami, Tsutomu; Tadano, Hiroto; Sakurai, Tetsuya; Nagashima, Umpei

    2010-10-01

    A parallel Fock matrix construction program for the FMO-MO method has been developed with the distributed shared memory model. To construct a large-sized Fock matrix during FMO-MO calculations, a distributed parallel algorithm was designed to make full use of local memory to reduce communication, and was implemented on the Global Array toolkit. A benchmark calculation for a small system indicates that the parallelization efficiency of the matrix construction portion is as high as 93% at 1,024 processors. A large FMO-MO application on the epidermal growth factor receptor (EGFR) protein (17,246 atoms and 96,234 basis functions) was also carried out at the HF/6-31G level of theory, with the frontier orbitals being extracted by a Sakurai-Sugiura eigensolver. It takes 11.3 h for the FMO calculation, 49.1 h for the Fock matrix construction, and 10 min to extract 94 eigen-components on a PC cluster system using 256 processors.

  14. Aho-Corasick String Matching on Shared and Distributed Memory Parallel Architectures

    SciTech Connect

    Tumeo, Antonino; Villa, Oreste; Chavarría-Miranda, Daniel

    2012-03-01

    String matching is at the core of many critical applications, including network intrusion detection systems, search engines, virus scanners, spam filters, DNA and protein sequencing, and data mining. For all of these applications string matching requires a combination of (sometimes all) the following characteristics: high and/or predictable performance, support for large data sets and flexibility of integration and customization. Many software based implementations targeting conventional cache-based microprocessors fail to achieve high and predictable performance requirements, while Field-Programmable Gate Array (FPGA) implementations and dedicated hardware solutions fail to support large data sets (dictionary sizes) and are difficult to integrate and customize. The advent of multicore, multithreaded, and GPU-based systems is opening the possibility for software based solutions to reach very high performance at a sustained rate. This paper compares several software-based implementations of the Aho-Corasick string searching algorithm for high performance systems. We discuss the implementation of the algorithm on several types of shared-memory high-performance architectures (Niagara 2, large x86 SMPs and Cray XMT), distributed memory with homogeneous processing elements (InfiniBand cluster of x86 multicores) and heterogeneous processing elements (InfiniBand cluster of x86 multicores with NVIDIA Tesla C10 GPUs). We describe in detail how each solution achieves the objectives of supporting large dictionaries, sustaining high performance, and enabling customization and flexibility using various data sets.
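
    For readers unfamiliar with the algorithm named in this abstract, a minimal sequential Aho-Corasick sketch is given below. It is a textbook-style baseline with hypothetical function names, not any of the parallel implementations the paper compares.

```python
from collections import deque


def build_automaton(patterns):
    """Build the goto, failure and output tables for a set of patterns."""
    goto = [{}]       # goto[state][char] -> next state
    fail = [0]        # failure link of each state
    output = [[]]     # patterns recognised on entering each state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(pat)
    # Breadth-first pass: depth-1 states keep failure link 0.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure state.
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output


def search(text, automaton):
    """Return (start_index, pattern) for every occurrence found in text."""
    goto, fail, output = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in output[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits


automaton = build_automaton(["he", "she", "his", "hers"])
matches = search("ushers", automaton)
```

    The text is scanned once regardless of how many patterns are in the dictionary, which is why the dictionary size mainly affects memory footprint rather than the number of steps per input character.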

  15. Multiprocessor smalltalk: Implementation, performance, and analysis

    SciTech Connect

    Pallas, J.I.

    1990-01-01

    Multiprocessor Smalltalk demonstrates the value of object-oriented programming on a multiprocessor. Its implementation and analysis shed light on three areas: concurrent programming in an object-oriented language without special extensions, implementation techniques for adapting to multiprocessors, and performance factors in the resulting system. Adding parallelism to Smalltalk code is easy, because programs already use control abstractions like iterators. Smalltalk's basic control and concurrency primitives (lambda expressions, processes and semaphores) can be used to build parallel control abstractions, including parallel iterators, parallel objects, atomic objects, and futures. Language extensions for concurrency are not required. This implementation demonstrates that it is possible to build an efficient parallel object-oriented programming system and illustrates techniques for doing so. Three modification tools (serialization, replication, and reorganization) adapted the Berkeley Smalltalk interpreter to the Firefly multiprocessor. Multiprocessor Smalltalk's performance shows that the combination of multiprocessing and object-oriented programming can be effective: speedups (relative to the original serial version) exceed 2.0 for five processors on all the benchmarks; the median efficiency is 48%. Analysis shows both where performance is lost and how to improve and generalize the experimental results. Changes in the interpreter to support concurrency add at most 12% overhead; better access to per-process variables could eliminate much of that. Changes in the user code to express concurrency add as much as 70% overhead; this overhead could be reduced to 54% if blocks (lambda expressions) were reentrant. Performance is also lost when the program cannot keep all five processors busy.

  16. Shared neuroanatomical substrates of impaired phonological working memory across reading disability and autism

    PubMed Central

    Lu, Chunming; Qi, Zhenghan; Harris, Adrianne; Weil, Lisa Wisman; Han, Michelle; Halverson, Kelly; Perrachione, Tyler K.; Kjelgaard, Margaret; Wexler, Kenneth; Tager-Flusberg, Helen; Gabrieli, John D. E.

    2015-01-01

    Background: Individuals with reading disability or individuals with autism spectrum disorder (ASD) are characterized, respectively, by their difficulties in reading or social communication, but both groups often have impaired phonological working memory (PWM). It is not known whether the impaired PWM reflects distinct or shared neuroanatomical abnormalities in these two diagnostic groups. Methods: White-matter structural connectivity via diffusion weighted imaging was examined in sixty-four children, ages 5-17 years, with reading disability, ASD, or typical development (TD), who were matched in age, gender, intelligence, and diffusion data quality. Results: Children with reading disability and children with ASD exhibited reduced PWM compared to children with TD. The two diagnostic groups showed altered white-matter microstructure in the temporo-parietal portion of the left arcuate fasciculus (AF) and in the temporo-occipital portion of the right inferior longitudinal fasciculus (ILF), as indexed by reduced fractional anisotropy and increased radial diffusivity. Moreover, the structural integrity of the right ILF was positively correlated with PWM ability in the two diagnostic groups, but not in the TD group. Conclusions: These findings suggest that impaired PWM is transdiagnostically associated with shared neuroanatomical abnormalities in ASD and reading disability. Microstructural characteristics in left AF and right ILF may play important roles in the development of PWM. The right ILF may support a compensatory mechanism for children with impaired PWM. PMID:26949750

  17. Performance and Application of Parallel OVERFLOW Codes on Distributed and Shared Memory Platforms

    NASA Technical Reports Server (NTRS)

    Djomehri, M. Jahed; Rizk, Yehia M.

    1999-01-01

    The presentation discusses recent studies on the performance of the two parallel versions of the aerodynamics CFD code, OVERFLOW_MPI and _MLP. Developed at NASA Ames, the serial version, OVERFLOW, is a multidimensional Navier-Stokes flow solver based on overset (Chimera) grid technology. The code has recently been parallelized in two ways. One is based on the explicit message-passing interface (MPI) across processors and uses the _MPI communication package. This approach is primarily suited for distributed memory systems and workstation clusters. The second, termed the multi-level parallel (MLP) method, is simple and uses shared memory for all communications. The _MLP code is suitable for distributed-shared memory systems. For both methods, the message passing takes place across the processors or processes at the advancement of each time step. This procedure is, in effect, the Chimera boundary conditions update, which is done in an explicit "Jacobi" style. In contrast, the update in the serial code is done in more of a "Gauss-Seidel" fashion. The programming effort for the _MPI code is greater than for the _MLP code; the former requires modification of the outer and some inner shells of the serial code, whereas the latter focuses only on the outer shell of the code. The _MPI version offers a great deal of flexibility in distributing grid zones across a specified number of processors in order to achieve load balancing. The approach is capable of partitioning zones across multiple processors or sending each zone and/or cluster of several zones into a single processor. The message passing across the processors consists of Chimera boundary and/or an overlap of "halo" boundary points for each partitioned zone. The MLP version is a new coarse-grain parallel concept at the zonal and intra-zonal levels. A grouping strategy is used to distribute zones into several groups forming sub-processes which will run in parallel. The total volume of grid points in each

  18. A combined PLC and CPU approach to multiprocessor control

    SciTech Connect

    Harris, J.J.; Broesch, J.D.; Coon, R.M.

    1995-10-01

    A sophisticated multiprocessor control system has been developed for use in the E-Power Supply System Integrated Control (EPSSIC) on the DIII-D tokamak. EPSSIC provides control and interlocks for the ohmic heating coil power supply and its associated systems. Of particular interest is the architecture of this system: both a Programmable Logic Controller (PLC) and a Central Processor Unit (CPU) have been combined on a standard VME bus. The PLC and CPU input and output signals are routed through signal conditioning modules, which provide the necessary voltage and ground isolation. Additionally these modules adapt the signal levels to that of the VME I/O boards. One set of I/O signals is shared between the two processors. The resulting multiprocessor system provides a number of advantages: redundant operation for mission critical situations, flexible communications using conventional TCP/IP protocols, the simplicity of ladder logic programming for the majority of the control code, and an easily maintained and expandable non-proprietary system.

  19. Techniques for Improving the Performance of Sparse Matrix Factorization on Multiprocessor Workstations

    DTIC Science & Technology

    1990-06-01

    This paper looks at the problem of factoring large sparse systems of equations on high-performance multiprocessor... factorization codes achieve only a small fraction of this potential. A major limiting factor is the cost of memory accesses performed during the factorization

  20. Multiprocessor switch with selective pairing

    DOEpatents

    Gara, Alan; Gschwind, Michael K; Salapura, Valentina

    2014-03-11

    System, method and computer program product for a multiprocessing system to offer selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessor or processor cores to provide one highly reliable thread (or thread group). Each set of paired microprocessor or processor cores that provides one highly reliable thread connects with system components such as a memory "nest" (or memory hierarchy), an optional system controller, an optional interrupt controller, optional I/O or peripheral devices, etc. The memory nest is attached to a selective pairing facility via a switch or a bus

  1. Parallel simulated annealing algorithms for cell placement on hypercube multiprocessors

    NASA Technical Reports Server (NTRS)

    Banerjee, Prithviraj; Jones, Mark Howard; Sargent, Jeff S.

    1990-01-01

    Two parallel algorithms for standard cell placement using simulated annealing are developed to run on distributed-memory message-passing hypercube multiprocessors. The cells can be mapped in a two-dimensional area of a chip onto processors in an n-dimensional hypercube in two ways, such that both small and large cell exchange and displacement moves can be applied. The computation of the cost function in parallel among all the processors in the hypercube is described, along with a distributed data structure that needs to be stored in the hypercube to support the parallel cost evaluation. A novel tree broadcasting strategy is used extensively for updating cell locations in the parallel environment. A dynamic parallel annealing schedule estimates the errors due to interacting parallel moves and adapts the rate of synchronization automatically. Two novel approaches in controlling error in parallel algorithms are described: heuristic cell coloring and adaptive sequence control.

  2. MLP: A Parallel Programming Alternative to MPI for New Shared Memory Parallel Systems

    NASA Technical Reports Server (NTRS)

    Taft, James R.

    1999-01-01

    Recent developments at the NASA Ames Research Center's NAS Division have demonstrated that the new generation of NUMA based Symmetric Multi-Processing systems (SMPs), such as the Silicon Graphics Origin 2000, can successfully execute legacy vector oriented CFD production codes at sustained rates far exceeding processing rates possible on dedicated 16-CPU Cray C90 systems. This high level of performance is achieved via shared memory based Multi-Level Parallelism (MLP). This programming approach, developed at NAS and outlined below, is distinct from the message passing paradigm of MPI. It offers parallelism at both the fine and coarse grained level, with communication latencies that are approximately 50-100 times lower than typical MPI implementations on the same platform. Such latency reductions offer the promise of performance scaling to very large CPU counts. The method draws on, but is also distinct from, the newly defined OpenMP specification, which uses compiler directives to support a limited subset of multi-level parallel operations. The NAS MLP method is general, and applicable to a large class of NASA CFD codes.

  3. Spaceborne VHSIC multiprocessor system for AI applications

    NASA Technical Reports Server (NTRS)

    Lum, Henry, Jr.; Shrobe, Howard E.; Aspinall, John G.

    1988-01-01

    A multiprocessor system, under design for space-station applications, makes use of the latest generation symbolic processor and packaging technology. The result will be a compact, space-qualified system two to three orders of magnitude more powerful than present-day symbolic processing systems.

  4. Energy efficient low power shared-memory Fast Fourier Transform (FFT) processor with dynamic voltage scaling

    NASA Astrophysics Data System (ADS)

    Fitrio, D.; Singh, J.; Stojcevski, A.

    2005-12-01

    Reduction of power dissipation in CMOS circuits needs to be addressed for portable battery-powered devices. Selection of an appropriate transistor library to minimise leakage current, implementation of low power design architectures, power management implementation, and the choice of chip packaging all have an impact on power dissipation and are important considerations in the design and implementation of integrated circuits for low power applications. Energy-efficient architecture is highly desirable for battery operated systems, which operate in a wide variety of operating scenarios. Energy-efficient design aims to reconfigure its own architecture to scale down energy consumption depending upon the throughput and quality requirements. An energy efficient system should be able to decide its minimum power requirements by dynamically scaling its own operating frequency, supply voltage or threshold voltage according to a variety of operating scenarios. The increasing product demand for application specific integrated circuits or processors for independent portable devices has influenced designers to implement dedicated processors with ultra low power requirements. One of these dedicated processors is a Fast Fourier Transform (FFT) processor, which is widely used in signal processing for numerous applications, such as wireless telecommunication and biomedical applications, where the demand for extended battery life is extremely high. This paper presents the design and performance analysis of a low power shared memory FFT processor incorporating dynamic voltage scaling. Dynamic voltage scaling enables power supply scaling into various supply voltage levels. The concept behind the proposed solution is that the speed of the main logic core can be adjusted according to the input load or the amount of processor computation, "just enough" to meet the requirement. The design was implemented using 0.12 μm ST-Microelectronic 6-metal layer CMOS dual-process technology in Cadence Analogue

  5. Thread mapping using system-level model for shared memory multicores

    NASA Astrophysics Data System (ADS)

    Mitra, Reshmi

    Exploring thread-to-core mapping options for a parallel application on a multicore architecture is computationally very expensive. For the same algorithm, the mapping strategy (MS) with the best response time may change with data size and thread counts. The primary challenge is to design a fast, accurate and automatic framework for exploring these MSs for large data-intensive applications. This is to ensure that the users can explore the design space within reasonable machine hours, without a thorough understanding of how the code interacts with the platform. Response time is related to the cycles per instruction retired (CPI), taking into account both active and sleep states of the pipeline. This work establishes a hybrid approach, based on a Markov Chain Model (MCM) and a Model Tree (MT), for system-level steady state CPI prediction. It is designed for shared memory multicore processors with coarse-grained multithreading. The thread status is represented by the MCM states. The program characteristics are modeled as the transition probabilities, representing the system moving between active and suspended thread states. The MT model extrapolates these probabilities for the actual application size (AS) from the smaller AS performance. This aspect of the framework, along with the use of mathematical expressions for the actual AS performance information, results in a tremendous reduction in the CPI prediction time. The framework is validated using an electromagnetics application. The average performance prediction error for steady state CPI results with 12 different MSs is less than 1%. The total run time of the model is on the order of minutes, whereas the actual application execution time is on the order of days.
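
    The idea of deriving a steady-state CPI from a Markov chain of thread states can be illustrated with a deliberately small example. Everything numeric here is assumed: the two-state chain, the transition probabilities, the active-state CPI of 1.5, and the formula that divides it by the fraction of time spent active are illustrative choices, not the model or data from this work.

```python
def steady_state(transition, iterations=10_000, tol=1e-13):
    """Stationary distribution of a row-stochastic matrix by power iteration."""
    n = len(transition)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, dist)) < tol:
            return nxt
        dist = nxt
    return dist


# Hypothetical per-cycle transition probabilities (assumed values).
p_suspend = 0.2   # active -> suspended
p_resume = 0.6    # suspended -> active
chain = [
    [1 - p_suspend, p_suspend],   # row 0: thread currently active
    [p_resume, 1 - p_resume],     # row 1: thread currently suspended
]
pi_active, pi_suspended = steady_state(chain)

# Assumption: instructions retire only in the active state at 1.5 cycles each,
# so suspended cycles inflate the effective CPI.
active_cpi = 1.5
effective_cpi = active_cpi / pi_active
```

    For a two-state chain the stationary probability of the active state also has the closed form `p_resume / (p_suspend + p_resume)`, which gives a quick check on the iteration.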

  6. Design and evaluation of a fault-tolerant multiprocessor using hardware recovery blocks

    NASA Technical Reports Server (NTRS)

    Lee, Y. H.; Shin, K. G.

    1982-01-01

    A fault-tolerant multiprocessor with a rollback recovery mechanism is discussed. The rollback mechanism is based on the hardware recovery block which is a hardware equivalent to the software recovery block. The hardware recovery block is constructed by consecutive state-save operations and several state-save units in every processor and memory module. When a fault is detected, the multiprocessor reconfigures itself to replace the faulty component and then the process originally assigned to the faulty component retreats to one of the previously saved states in order to resume fault-free execution. A mathematical model is proposed to calculate both the coverage of multi-step rollback recovery and the risk of restart. A performance evaluation in terms of task execution time is also presented.
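
    The rollback behaviour described in this abstract can be sketched in software as a bounded history of saved states. The class name `RecoveryBlock`, the number of state-save units, and the dictionary used as process state are assumptions for illustration; the paper's mechanism is implemented in hardware inside each processor and memory module.

```python
import copy
from collections import deque


class RecoveryBlock:
    """Software stand-in for a hardware recovery block with a few state-save units."""

    def __init__(self, initial_state, save_units=3):
        self.state = initial_state
        # Only the most recent `save_units` states are retained.
        self.saved = deque([copy.deepcopy(initial_state)], maxlen=save_units)

    def save(self):
        """Perform one state-save operation."""
        self.saved.append(copy.deepcopy(self.state))

    def rollback(self, steps=1):
        """Retreat `steps` saved states; return False if a restart would be needed."""
        if steps > len(self.saved):
            return False                  # not enough saved states left
        for _ in range(steps - 1):
            self.saved.pop()              # discard states newer than the target
        self.state = copy.deepcopy(self.saved[-1])
        return True


block = RecoveryBlock({"pc": 0}, save_units=2)
for pc in (10, 20, 30):
    block.state["pc"] = pc
    block.save()
block.state["pc"] = 35                    # a fault is detected at this point
recovered = block.rollback(steps=2)       # retreat two saved states
```

    With only two save units the oldest states are overwritten, so a rollback that needs to go further back fails and returns `False`; that case corresponds to the restart whose risk the abstract's mathematical model estimates.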

  7. Satisfiability Test with Synchronous Simulated Annealing on the Fujitsu AP1000 Massively-Parallel Multiprocessor

    NASA Technical Reports Server (NTRS)

    Sohn, Andrew; Biswas, Rupak

    1996-01-01

    Solving the hard Satisfiability Problem is time consuming even for modest-sized problem instances. Solving the Random L-SAT Problem is especially difficult due to the ratio of clauses to variables. This report presents a parallel synchronous simulated annealing method for solving the Random L-SAT Problem on a large-scale distributed-memory multiprocessor. In particular, we use a parallel synchronous simulated annealing procedure, called Generalized Speculative Computation, which guarantees the same decision sequence as sequential simulated annealing. To demonstrate the performance of the parallel method, we have selected problem instances varying in size from 100-variables/425-clauses to 5000-variables/21,250-clauses. Experimental results on the AP1000 multiprocessor indicate that our approach can satisfy 99.9 percent of the clauses while giving almost a 70-fold speedup on 500 processors.

  8. Fault detection, isolation and reconfiguration in FTMP Methods and experimental results. [fault tolerant multiprocessor

    NASA Technical Reports Server (NTRS)

    Lala, J. H.

    1983-01-01

    The Fault-Tolerant Multiprocessor (FTMP) is a highly reliable computer designed to meet a goal of 10 to the -10th failures per hour and built with the objective of flying an active-control transport aircraft. Fault detection, identification, and recovery software is described, and experimental results obtained by injecting faults at the pin level in the FTMP are presented. Over 21,000 faults were injected in the CPU, memory, bus interface circuits, and error detection, masking, and error reporting circuits of one LRU of the multiprocessor. Detection, isolation, and reconfiguration times were recorded for each fault, and the results were found to agree well with earlier assumptions made in reliability modeling.

  9. Modeling the Performance of the Concert Multiprocessor.

    DTIC Science & Technology

    1987-05-01

    ... Merchant, S.S., "The Design and Performance Analysis of an Arbiter for a Multiprocessor Shared-Memory System", Report #IlDS-Tl-[-1396... REPORT DOCUMENTATION PAGE. Report security classification: Unclassified... Distribution/availability of report: Approved for public release; distribution is...

  10. Associative-memory representations emerge as shared spatial patterns of theta activity spanning the primate temporal cortex

    PubMed Central

    Nakahara, Kiyoshi; Adachi, Ken; Kawasaki, Keisuke; Matsuo, Takeshi; Sawahata, Hirohito; Majima, Kei; Takeda, Masaki; Sugiyama, Sayaka; Nakata, Ryota; Iijima, Atsuhiko; Tanigawa, Hisashi; Suzuki, Takafumi; Kamitani, Yukiyasu; Hasegawa, Isao

    2016-01-01

    Highly localized neuronal spikes in primate temporal cortex can encode associative memory; however, whether memory formation involves area-wide reorganization of ensemble activity, which often accompanies rhythmicity, or just local microcircuit-level plasticity, remains elusive. Using high-density electrocorticography, we capture local-field potentials spanning the monkey temporal lobes, and show that the visual pair-association (PA) memory is encoded in spatial patterns of theta activity in areas TE, 36, and, partially, in the parahippocampal cortex, but not in the entorhinal cortex. The theta patterns elicited by learned paired associates are distinct between pairs, but similar within pairs. This pattern similarity, emerging through novel PA learning, allows a machine-learning decoder trained on theta patterns elicited by a particular visual item to correctly predict the identity of those elicited by its paired associate. Our results suggest that the formation and sharing of widespread cortical theta patterns via learning-induced reorganization are involved in the mechanisms of associative memory representation. PMID:27282247

  11. Distinct and shared cognitive functions mediate event- and time-based prospective memory impairment in normal ageing

    PubMed Central

    Gonneaud, Julie; Kalpouzos, Grégoria; Bon, Laetitia; Viader, Fausto; Eustache, Francis; Desgranges, Béatrice

    2011-01-01

    Prospective memory (PM) is the ability to remember to perform an action at a specific point in the future. Regarded as multidimensional, PM involves several cognitive functions that are known to be impaired in normal aging. In the present study, we set out to investigate the cognitive correlates of PM impairment in normal aging. Manipulating cognitive load, we assessed event- and time-based PM, as well as several cognitive functions, including executive functions, working memory and retrospective episodic memory, in healthy subjects spanning the whole of adulthood. We found that normal aging was characterized by PM decline in all conditions and that event-based PM was more sensitive to the effects of aging than time-based PM. Whatever the conditions, PM was linked to inhibition and processing speed. However, while event-based PM was mainly mediated by binding and retrospective memory processes, time-based PM was mainly related to inhibition. The only distinction between high- and low-load PM cognitive correlates lies in an additional, but marginal, correlation between updating and the high-load PM condition. The association of distinct cognitive functions, as well as shared mechanisms with event- and time-based PM confirms that each type of PM relies on a different set of processes. PMID:21678154

  12. An Analysis of an Improved Bus-Based Multiprocessor Architecture

    NASA Technical Reports Server (NTRS)

    Ricks, Kenneth G.; Wells, B. Earl

    1998-01-01

    This paper analyses the effectiveness of a hybrid multiprocessing/multicomputing architecture that is based upon a single-board-computer multiprocessor (SBCM) architecture. Based upon empirical analysis using discrete event simulations and Monte Carlo techniques, this hybrid architecture, called the enhanced single-board-computer multiprocessor (ESBCM), is shown to have improved performance and scalability characteristics over current SBCM designs.

  13. Multi-core and Many-core Shared-memory Parallel Raycasting Volume Rendering Optimization and Tuning

    SciTech Connect

    Howison, Mark

    2012-01-31

    Given the computing industry trend of increasing processing capacity by adding more cores to a chip, the focus of this work is tuning the performance of a staple visualization algorithm, raycasting volume rendering, for shared-memory parallelism on multi-core CPUs and many-core GPUs. Our approach is to vary tunable algorithmic settings, along with known algorithmic optimizations and two different memory layouts, and measure performance in terms of absolute runtime and L2 memory cache misses. Our results indicate there is a wide variation in runtime performance on all platforms, as much as 254% for the tunable parameters we test on multi-core CPUs and 265% on many-core GPUs, and the optimal configurations vary across platforms, often in a non-obvious way. For example, our results indicate the optimal configurations on the GPU occur at a crossover point between those that maintain good cache utilization and those that saturate computational throughput. This result is likely to be extremely difficult to predict with an empirical performance model for this particular algorithm because it has an unstructured memory access pattern that varies locally for individual rays and globally for the selected viewpoint. Our results also show that optimal parameters on modern architectures are markedly different from those in previous studies run on older architectures. And, given the dramatic performance variation across platforms for both optimal algorithm settings and performance results, there is a clear benefit for production visualization and analysis codes to adopt a strategy for performance optimization through auto-tuning. These benefits will likely become more pronounced in the future as the number of cores per chip and the cost of moving data through the memory hierarchy both increase.

  14. Partitioning of regular computation on multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Lee, Fung Fung

    1988-01-01

    Problem partitioning of regular computation over two dimensional meshes on multiprocessor systems is examined. The regular computation model considered involves repetitive evaluation of values at each mesh point with local communication. The computational workload and the communication pattern are the same at each mesh point. The regular computation model arises in numerical solutions of partial differential equations and simulations of cellular automata. Given a communication pattern, a systematic way to generate a family of partitions is presented. The influence of various partitioning schemes on performance is compared on the basis of computation to communication ratio.
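
    The computation-to-communication ratio used as the basis of comparison can be illustrated with a small sketch. It assumes an n-by-n mesh, p processors, a nearest-neighbour (5-point) communication pattern, and interior blocks only; the function names and the two partition shapes are illustrative choices, not the family of partitions generated in the report.

```python
import math


def strip_ratio(n: int, p: int) -> float:
    """Ratio for horizontal strips: each processor owns n/p full rows."""
    if n % p != 0:
        raise ValueError("n must be divisible by p")
    rows = n // p
    computation = rows * n      # mesh points updated per iteration
    communication = 2 * n       # top and bottom boundary rows exchanged
    return computation / communication


def square_ratio(n: int, p: int) -> float:
    """Ratio for square blocks: processors form a sqrt(p)-by-sqrt(p) grid."""
    q = math.isqrt(p)
    if q * q != p or n % q != 0:
        raise ValueError("p must be a perfect square whose root divides n")
    side = n // q
    computation = side * side   # mesh points updated per iteration
    communication = 4 * side    # four boundary edges exchanged
    return computation / communication


if __name__ == "__main__":
    n, p = 1024, 16
    print("strips :", strip_ratio(n, p))
    print("squares:", square_ratio(n, p))
```

    Under these assumptions the strip ratio is n/(2p) and the square-block ratio is n/(4*sqrt(p)), so square blocks do more computation per communicated point whenever p exceeds 4.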

  15. Partitioning of regular computation on multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Lee, Fung F.

    1990-01-01

    Problem partitioning of regular computation over two dimensional meshes on multiprocessor systems is examined. The regular computation model considered involves repetitive evaluation of values at each mesh point with local communication. The computational workload and the communication pattern are the same at each mesh point. The regular computation model arises in numerical solutions of partial differential equations and simulations of cellular automata. Given a communication pattern, a systematic way to generate a family of partitions is presented. The influence of various partitioning schemes on performance is compared on the basis of computation to communication ratio.

  16. Partitioning of regular computation on multiprocessor systems

    SciTech Connect

    Lee, F. (Computer Systems Lab.)

    1990-07-01

    Problem partitioning of regular computation over two-dimensional meshes on multiprocessor systems is examined. The regular computation model considered involves repetitive evaluation of values at each mesh point with local communication. The computational workload and the communication pattern are the same at each mesh point. The regular computation model arises in numerical solutions of partial differential equations and simulations of cellular automata. Given a communication pattern, a systematic way to generate a family of partitions is presented. The influence of various partitioning schemes on performance is compared on the basis of computation to communication ratio.

  17. Shared Etiology of Phonological Memory and Vocabulary Deficits in School-Age Children

    ERIC Educational Resources Information Center

    Peterson, Robin L.; Pennington, Bruce F.; Samuelsson, Stefan; Byrne, Brian; Olson, Richard K.

    2013-01-01

    Purpose: The goal of this study was to investigate the etiologic basis for the association between deficits in phonological memory (PM) and vocabulary in school-age children. Method: Children with deficits in PM or vocabulary were identified within the International Longitudinal Twin Study (ILTS; Samuelsson et al., 2005). The ILTS includes 1,045…

  18. Parallel algorithms for geometric connected component labeling on a hypercube multiprocessor

    NASA Technical Reports Server (NTRS)

    Belkhale, K. P.; Banerjee, P.

    1992-01-01

    Different algorithms for the geometric connected component labeling (GCCL) problem are defined, each of which involves d stages of message passing, for a d-dimensional hypercube. The major idea is that in each stage a hypercube multiprocessor increases its knowledge of domain. The algorithms under consideration include the QUAD algorithm for a small number of processors and the Overlap Quad algorithm for a large number of processors, subject to the locality of the connected sets. These algorithms differ in their run time, memory requirements, and message complexity. They were implemented on an Intel iPSC2/D4/MX hypercube.

  19. Neural substrates of shared attention as social memory: A hyperscanning functional magnetic resonance imaging study.

    PubMed

    Koike, Takahiko; Tanabe, Hiroki C; Okazaki, Shuntaro; Nakagawa, Eri; Sasaki, Akihiro T; Shimada, Koji; Sugawara, Sho K; Takahashi, Haruka K; Yoshihara, Kazufumi; Bosch-Bayard, Jorge; Sadato, Norihiro

    2016-01-15

    During a dyadic social interaction, two individuals can share visual attention through gaze, directed to each other (mutual gaze) or to a third person or an object (joint attention). Shared attention is fundamental to dyadic face-to-face interaction, but how attention is shared, retained, and neurally represented in a pair-specific manner has not been well studied. Here, we conducted a two-day hyperscanning functional magnetic resonance imaging study in which pairs of participants performed a real-time mutual gaze task followed by a joint attention task on the first day, and mutual gaze tasks several days later. The joint attention task enhanced eye-blink synchronization, which is believed to be a behavioral index of shared attention. When the same participant pairs underwent mutual gaze without joint attention on the second day, enhanced eye-blink synchronization persisted, and this was positively correlated with inter-individual neural synchronization within the right inferior frontal gyrus. Neural synchronization was also positively correlated with enhanced eye-blink synchronization during the previous joint attention task session. Consistent with the Hebbian association hypothesis, the right inferior frontal gyrus had been activated both by initiating and responding to joint attention. These results indicate that shared attention is represented and retained by pair-specific neural synchronization that cannot be reduced to the individual level.

  20. SHARE and Share Alike

    ERIC Educational Resources Information Center

    Baird, Jeffrey Marshall

    2006-01-01

    This article describes a reading comprehension program adopted at J. E. Cosgriff Memorial Catholic School in Salt Lake City, Utah. The program is called SHARE: Students Helping Achieve Reading Excellence, and involves seventh and eighth grade students teaching first and second graders reading comprehension strategies learned in middle school…

  1. A multiprocessor airborne lidar data system

    NASA Technical Reports Server (NTRS)

    Wright, C. W.; Bailey, S. A.; Heath, G. E.; Piazza, C. R.

    1988-01-01

    A new multiprocessor data acquisition system was developed for the existing Airborne Oceanographic Lidar (AOL). This implementation simultaneously utilizes five single-board 68010 microcomputers, the UNIX System V operating system, and the real-time executive VRTX. The original data acquisition system was implemented on a Hewlett Packard HP 21-MX 16-bit minicomputer using a multi-tasking real-time operating system and a mixture of assembly and FORTRAN languages. The present collection of data sources produces data at widely varied rates and requires varied amounts of burdensome real-time processing and formatting. It was decided to replace the aging HP 21-MX minicomputer with a multiprocessor system. A new and flexible recording format was devised and implemented to accommodate the constantly changing sensor configuration. A central feature of this data system is the minimization of non-remote sensing bus traffic. Therefore, it is highly desirable that each micro be capable of functioning as much as possible on-card or via private peripherals. The bus is used primarily for the transfer of remote sensing data to or from the buffer queue.

  2. A multiprocessor airborne lidar data system

    NASA Astrophysics Data System (ADS)

    Wright, C. W.; Bailey, S. A.; Heath, G. E.; Piazza, C. R.

    A new multiprocessor data acquisition system was developed for the existing Airborne Oceanographic Lidar (AOL). This implementation simultaneously utilizes five single-board 68010 microcomputers, the UNIX System V operating system, and the real-time executive VRTX. The original data acquisition system was implemented on a Hewlett Packard HP 21-MX 16-bit minicomputer using a multi-tasking real-time operating system and a mixture of assembly and FORTRAN languages. The present collection of data sources produces data at widely varied rates and requires varied amounts of burdensome real-time processing and formatting. It was decided to replace the aging HP 21-MX minicomputer with a multiprocessor system. A new and flexible recording format was devised and implemented to accommodate the constantly changing sensor configuration. A central feature of this data system is the minimization of non-remote sensing bus traffic. Therefore, it is highly desirable that each micro be capable of functioning as much as possible on-card or via private peripherals. The bus is used primarily for the transfer of remote sensing data to or from the buffer queue.

  3. LDRD final report : managing shared memory data distribution in hybrid HPC applications.

    SciTech Connect

    Merritt, Alexander M.; Pedretti, Kevin Thomas Tauke

    2010-09-01

    MPI is the dominant programming model for distributed memory parallel computers, and is often used as the intra-node programming model on multi-core compute nodes. However, application developers are increasingly turning to hybrid models that use threading within a node and MPI between nodes. In contrast to MPI, most current threaded models do not require application developers to deal explicitly with data locality. With increasing core counts and deeper NUMA hierarchies seen in the upcoming LANL/SNL 'Cielo' capability supercomputer, data distribution poses an upper boundary on intra-node scalability within threaded applications. Data locality therefore has to be identified at runtime using static memory allocation policies such as first-touch or next-touch, or specified by the application user at launch time. We evaluate several existing techniques for managing data distribution using micro-benchmarks on an AMD 'Magny-Cours' system with 24 cores among 4 NUMA domains and argue for the adoption of a dynamic runtime system implemented at the kernel level, employing a novel page table replication scheme to gather per-NUMA domain memory access traces.

  4. Autobiographical Memory Sharing in Everyday Life: Characteristics of a Good Story

    ERIC Educational Resources Information Center

    Baron, Jacqueline M.; Bluck, Susan

    2009-01-01

    Storytelling is a ubiquitous human activity that occurs across the lifespan as part of everyday life. Studies from three disparate literatures suggest that older adults (as compared to younger adults) are (a) less likely to recall story details, (b) more likely to go off-target when sharing stories, and, in contrast, (c) more likely to receive…

  5. Shared Memory Performance of Multi-Computer Terminals in Distributed Information Systems.

    ERIC Educational Resources Information Center

    Reddi, Arumalla V.

    1984-01-01

    Presents a system model for transmission of input data that is coming from terminals of users in a limited user resource-sharing environment. Performance of a mini/microcomputer receiving mixture of picture-phone terminal data is analyzed with constant service times, synchronous transmission, and single-server interruptions through first-order…

  6. Insertion of coherence requests for debugging a multiprocessor

    DOEpatents

    Blumrich, Matthias A.; Salapura, Valentina

    2010-02-23

    A method and system are disclosed to insert coherence events in a multiprocessor computer system, and to present those coherence events to the processors of the multiprocessor computer system for analysis and debugging purposes. The coherence events are inserted in the computer system by adding one or more special insert registers. By writing into the insert registers, coherence events are inserted in the multiprocessor system as if they were generated by the normal coherence protocol. Once these coherence events are processed, the processing of coherence events can continue in the normal operation mode.

  7. Memory

    MedlinePlus

    ... it has to decide what is worth remembering. Memory is the process of storing and then remembering this information. There are different types of memory. Short-term memory stores information for a few ...

  8. Fault diagnosis in sparse multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Blough, Douglas M.; Sullivan, Gregory F.; Masson, Gerald M.

    1988-01-01

    The problem of fault diagnosis in multiprocessor systems is considered under a uniformly probabilistic model in which processors are faulty with probability p. This work focuses on minimizing the number of tests that must be conducted in order to correctly diagnose the state of every processor in the system with high probability. A diagnosis algorithm that can correctly diagnose the state of every processor with probability approaching one in a class of systems performing slightly greater than a linear number of tests is presented. A nearly matching lower bound on the number of tests required to achieve correct diagnosis in arbitrary systems is also proven. The number of tests required under this probabilistic model is shown to be significantly less than under a bounded-size fault set model. Because the number of tests that must be conducted is a measure of the diagnosis overhead, these results represent a dramatic improvement in the performance of system-level diagnosis technique.

  9. Vectorization of a multiprocessor multifrontal code

    SciTech Connect

    Amestoy, P.R.; Duff, I.S.

    1989-01-01

    The authors describe design changes that enhance the vectorization of a multiprocessor version of a multifrontal code for the direct solution of large sparse sets of linear equations. These changes employ techniques used with success in full Gaussian elimination and are based on the use of matrix-vector and matrix-matrix kernels as implemented in the Level 2 and Level 3 BLAS. They illustrate the performance of the improved code by runs on the IBM 3090/VF, the ETA-10P, and the CRAY-2. Although their experiments are principally on a single processor of these machines, they briefly consider the influence of multiprocessing. Speedup factors of more than 11 are obtained, and the modified code performs at over 200 MFLOPS on standard structures problems on one processor of the CRAY-2.

  10. Hardware for a real-time multiprocessor simulator

    NASA Technical Reports Server (NTRS)

    Blech, R. A.; Arpasi, D. J.

    1984-01-01

    The hardware for a real time multiprocessor simulator (RTMPS) developed at the NASA Lewis Research Center is described. The RTMPS is a multiple microprocessor system used to investigate the application of parallel processing concepts to real time simulation. It is designed to provide flexible data exchange paths between processors by using off the shelf microcomputer boards and minimal customized interfacing. A dedicated operator interface allows easy setup of the simulator and quick interpreting of simulation data. Simulations for the RTMPS are coded in a NASA designed real time multiprocessor language (RTMPL). This language is high level and geared to the multiprocessor environment. A real time multiprocessor operating system (RTMPOS) has also been developed that provides a user friendly operator interface. The RTMPS and supporting software are currently operational and are being evaluated at Lewis. The results of this evaluation will be used to specify the design of an optimized parallel processing system for real time simulation of dynamic systems.

  11. VME rollback hardware for time warp multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Robb, Michael J.; Buzzell, Calvin A.

    1992-01-01

    The purpose of the research effort is to develop and demonstrate innovative hardware to implement specific rollback and timing functions required for efficient queue management and precision timekeeping in multiprocessor discrete event simulations. The previously completed phase 1 effort demonstrated the technical feasibility of building hardware modules which eliminate the state saving overhead of the Time Warp paradigm used in distributed simulations on multiprocessor systems. The current phase 2 effort will build multiple pre-production rollback hardware modules integrated with a network of Sun workstations, and the integrated system will be tested by executing a Time Warp simulation. The rollback hardware will be designed to interface with the greatest number of multiprocessor systems possible. The authors believe that the rollback hardware will provide for significant speedup of large scale discrete event simulation problems and allow multiprocessors using Time Warp to dramatically increase performance.

  12. The change probability effect: incidental learning, adaptability, and shared visual working memory resources.

    PubMed

    van Lamsweerde, Amanda E; Beck, Melissa R

    2011-12-01

    Statistical properties in the visual environment can be used to improve performance on visual working memory (VWM) tasks. The current study examined the ability to incidentally learn that a change is more likely to occur to a particular feature dimension (shape, color, or location) and use this information to improve change detection performance for that dimension (the change probability effect). Participants completed a change detection task in which one change type was more probable than others. Change probability effects were found for color and shape changes, but not location changes, and intentional strategies did not improve the effect. Furthermore, the change probability effect developed and adapted to new probability information quickly. Finally, in some conditions, an improvement in change detection performance for a probable change led to an impairment in change detection for improbable changes.

  13. Directions in parallel programming: HPF, shared virtual memory and object parallelism in pC++

    NASA Technical Reports Server (NTRS)

    Bodin, Francois; Priol, Thierry; Mehrotra, Piyush; Gannon, Dennis

    1994-01-01

    Fortran and C++ are the dominant programming languages used in scientific computation. Consequently, extensions to these languages are the most popular for programming massively parallel computers. We discuss two such approaches to parallel Fortran and one approach to C++. The High Performance Fortran Forum has designed HPF with the intent of supporting data parallelism on Fortran 90 applications. HPF works by asking the user to help the compiler distribute and align the data structures with the distributed memory modules in the system. Fortran-S takes a different approach in which the data distribution is managed by the operating system and the user provides annotations to indicate parallel control regions. In the case of C++, we look at pC++ which is based on a concurrent aggregate parallel model.

  14. File-System Workload on a Scientific Multiprocessor

    NASA Technical Reports Server (NTRS)

    Kotz, David; Nieuwejaar, Nils

    1995-01-01

    Many scientific applications have intense computational and I/O requirements. Although multiprocessors have permitted astounding increases in computational performance, the formidable I/O needs of these applications cannot be met by current multiprocessors and their I/O subsystems. To prevent I/O subsystems from forever bottlenecking multiprocessors and limiting the range of feasible applications, new I/O subsystems must be designed. The successful design of computer systems (both hardware and software) depends on a thorough understanding of their intended use. A system designer optimizes the policies and mechanisms for the cases expected to be most common in the user's workload. In the case of multiprocessor file systems, however, designers have been forced to build file systems based only on speculation about how they would be used, extrapolating from file-system characterizations of general-purpose workloads on uniprocessor and distributed systems or scientific workloads on vector supercomputers (see sidebar on related work). To help these system designers, in June 1993 we began the Charisma Project, so named because the project sought to characterize I/O in scientific multiprocessor applications from a variety of production parallel computing platforms and sites. The Charisma project is unique in recording individual read and write requests in live, multiprogramming, parallel workloads (rather than from selected or nonparallel applications). In this article, we present the first results from the project: a characterization of the file-system workload on an iPSC/860 multiprocessor running production, parallel scientific applications at NASA's Ames Research Center.

  15. A class hierarchical, object-oriented approach to virtual memory management

    NASA Technical Reports Server (NTRS)

    Russo, Vincent F.; Campbell, Roy H.; Johnston, Gary M.

    1989-01-01

    The Choices family of operating systems exploits class hierarchies and object-oriented programming to facilitate the construction of customized operating systems for shared memory and networked multiprocessors. The software is being used in the Tapestry laboratory to study the performance of algorithms, mechanisms, and policies for parallel systems. Described here are the architectural design and class hierarchy of the Choices virtual memory management system. The software and hardware mechanisms and policies of a virtual memory system implement a memory hierarchy that exploits the trade-off between response times and storage capacities. In Choices, the notion of a memory hierarchy is captured by abstract classes. Concrete subclasses of those abstractions implement a virtual address space, segmentation, paging, physical memory management, secondary storage, and remote (that is, networked) storage. Captured in the notion of a memory hierarchy are classes that represent memory objects. These classes provide a storage mechanism that contains encapsulated data and have methods to read or write the memory object. Each of these classes provides specializations to represent the memory hierarchy.
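
    The idea of capturing a memory hierarchy with abstract classes and concrete specializations can be sketched as follows. This is only an illustration of the pattern described in the abstract: the class names (`MemoryObject`, `PhysicalMemory`, `SecondaryStorage`) and the helper `copy_region` are hypothetical and are not taken from the Choices source.

```python
from abc import ABC, abstractmethod


class MemoryObject(ABC):
    """Abstract memory object: encapsulated data with read and write methods."""

    @abstractmethod
    def read(self, offset: int, length: int) -> bytes: ...

    @abstractmethod
    def write(self, offset: int, data: bytes) -> None: ...


class PhysicalMemory(MemoryObject):
    """Fixed-size, fast level of the hierarchy (hypothetical specialization)."""

    def __init__(self, size: int):
        self._data = bytearray(size)

    def read(self, offset, length):
        if offset < 0 or offset + length > len(self._data):
            raise IndexError("read outside physical memory")
        return bytes(self._data[offset:offset + length])

    def write(self, offset, data):
        if offset < 0 or offset + len(data) > len(self._data):
            raise IndexError("write outside physical memory")
        self._data[offset:offset + len(data)] = data


class SecondaryStorage(MemoryObject):
    """Sparse, larger level of the hierarchy; unwritten bytes read as zero."""

    def __init__(self):
        self._bytes = {}

    def read(self, offset, length):
        return bytes(self._bytes.get(offset + i, 0) for i in range(length))

    def write(self, offset, data):
        for i, value in enumerate(data):
            self._bytes[offset + i] = value


def copy_region(src: MemoryObject, dst: MemoryObject, offset: int, length: int) -> None:
    """Move data between any two levels using only the abstract interface."""
    dst.write(offset, src.read(offset, length))
```

    Because `copy_region` depends only on the abstract interface, the same code could move data between any pair of levels, which is the kind of reuse a class hierarchy is meant to provide.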

  16. Memory Benchmarks for SMP-Based High Performance Parallel Computers

    SciTech Connect

    Yoo, A B; de Supinski, B; Mueller, F; Mckee, S A

    2001-11-20

    As the speed gap between CPU and main memory continues to grow, memory accesses increasingly dominate the performance of many applications. The problem is particularly acute for symmetric multiprocessor (SMP) systems, where the shared memory may be accessed concurrently by a group of threads running on separate CPUs. Unfortunately, several key issues governing memory system performance in current systems are not well understood. Complex interactions between the levels of the memory hierarchy, buses or switches, DRAM back-ends, system software, and application access patterns can make it difficult to pinpoint bottlenecks and determine appropriate optimizations, and the situation is even more complex for SMP systems. To partially address this problem, we formulated a set of multi-threaded microbenchmarks for characterizing and measuring the performance of the underlying memory system in SMP-based high-performance computers. We report our use of these microbenchmarks on two important SMP-based machines. This paper has four primary contributions. First, we introduce a microbenchmark suite to systematically assess and compare the performance of different levels in SMP memory hierarchies. Second, we present a new tool based on hardware performance monitors to determine a wide array of memory system characteristics, such as cache sizes, quickly and easily; by using this tool, memory performance studies can be targeted to the full spectrum of performance regimes with many fewer data points than is otherwise required. Third, we present experimental results indicating that the performance of applications with large memory footprints remains largely constrained by memory. Fourth, we demonstrate that thread-level parallelism further degrades memory performance, even for the latest SMPs with hardware prefetching and switch-based memory interconnects.

  17. Memory.

    ERIC Educational Resources Information Center

    McKean, Kevin

    1983-01-01

    Discusses current research (including that involving amnesiacs and snails) into the nature of the memory process, differentiating between and providing examples of "fact" memory and "skill" memory. Suggests that three brain parts (thalamus, fornix, mammillary body) are involved in the memory process. (JN)

  18. Memory performance of Prolog architectures

    SciTech Connect

    Tick, E.

    1988-01-01

    Memory Performance of Prolog Architectures addresses these problems and reports dynamic data and instruction referencing characteristics of both sequential and parallel Prolog architectures and corresponding uniprocessor and multiprocessor memory-hierarchy performance tradeoffs. Computer designers and logic programmers will find this work to be a valuable reference with many practical applications. Memory Performance of Prolog Architectures will also serve as an important textbook for graduate level courses in computer architecture and/or performance analysis.

  19. Exploration of SMP-Aware DAO Memory Performance Issues-Final Report 2002

    SciTech Connect

    de Supinski, B R; Yoo, A; McKee, S A; Schulz, M; Mohan, T

    2003-02-04

    The performance of many LLNL applications is dominated by the cost of main memory accesses. Worse, many current trends in computer architecture will lead to substantial degradation of the percentage of peak performance obtained by these codes. This project yields novel techniques that alleviate this problem in SMP-based systems, which are common at LLNL. Further, our techniques will complement other emerging mechanisms for improving memory system performance, such as processor-in-memory. The exploration of existing dynamic access ordering (DAO) mechanisms adapted to SMPs and the development of new memory performance optimization techniques will lead to significant improvements in run times for LLNL applications on future computing platforms, effectively increasing the size of the platform. In this project, we have focused on a range of techniques to overcome the performance bottleneck of current multiprocessor systems and to increase the single-node efficiency. These efforts include the design and implementation of a toolset to analyze memory access patterns of applications, the exploration of regularity metrics and their use to classify code behavior, and a set of microbenchmarks to assess and quantify the performance of SMP memory systems. We will make these tools available to the general laboratory user community to help the evaluation and optimization of LLNL applications. In addition, we explored the use of Dynamic Access Ordering (DAO) techniques in the realm of shared memory multiprocessors. The most critical part of the latter is the need to maintain coherence among reordered accesses due to possible aliasing. We have worked on several design alternatives to guarantee consistency in such systems without changing the user environment. This guarantees that such novel memory systems will be directly applicable for existing and future HPC codes at LLNL.

  20. Multidisciplinary Simulation Acceleration using Multiple Shared-Memory Graphical Processing Units

    NASA Astrophysics Data System (ADS)

    Kemal, Jonathan Yashar

    For purposes of optimizing and analyzing turbomachinery and other designs, the unsteady Favre-averaged flow-field differential equations for an ideal compressible gas can be solved in conjunction with the heat conduction equation. We solve all equations using the finite-volume multiple-grid numerical technique, with the dual time-step scheme used for unsteady simulations. Our numerical solver code targets CUDA-capable Graphical Processing Units (GPUs) produced by NVIDIA. Making use of MPI, our solver can run across networked compute nodes, where each MPI process can use either a GPU or a Central Processing Unit (CPU) core for primary solver calculations. We use NVIDIA Tesla C2050/C2070 GPUs based on the Fermi architecture, and compare our resulting performance against Intel Xeon X5690 CPUs. Solver routines converted to CUDA typically run about 10 times faster on a GPU for sufficiently dense computational grids. We used a conjugate cylinder computational grid and ran a turbulent steady flow simulation using 4 increasingly dense computational grids. Our densest computational grid is divided into 13 blocks each containing 1033x1033 grid points, for a total of 13.87 million grid points or 1.07 million grid points per domain block. To obtain overall speedups, we compare the execution time of the solver's iteration loop, including all resource intensive GPU-related memory copies. Comparing the performance of 8 GPUs to that of 8 CPUs, we obtain an overall speedup of about 6.0 when using our densest computational grid. This amounts to an 8-GPU simulation running about 39.5 times faster than a single-CPU simulation.
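
    The grid-size and speedup figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract; the implied 8-CPU over 1-CPU scaling is a derived quantity that the abstract does not report, and it assumes both speedup ratios refer to the same grid and the same timed iteration loop.

```python
# Figures quoted in the abstract.
BLOCKS = 13
POINTS_PER_SIDE = 1033
SPEEDUP_8GPU_VS_8CPU = 6.0
SPEEDUP_8GPU_VS_1CPU = 39.5
N_CPUS = 8

# Grid size: 13 blocks of 1033 x 1033 points.
points_per_block = POINTS_PER_SIDE ** 2
total_points = BLOCKS * points_per_block

# Derived, not stated in the abstract: if both ratios use the same
# 8-GPU run time, their quotient is the 8-CPU over 1-CPU speedup.
implied_cpu_speedup = SPEEDUP_8GPU_VS_1CPU / SPEEDUP_8GPU_VS_8CPU
implied_cpu_efficiency = implied_cpu_speedup / N_CPUS

print(f"points per block: {points_per_block / 1e6:.2f} million")
print(f"total points:     {total_points / 1e6:.2f} million")
print(f"implied 8-CPU speedup over 1 CPU: {implied_cpu_speedup:.2f}")
print(f"implied CPU parallel efficiency:  {implied_cpu_efficiency:.2f}")
```

    Under that assumption the two quoted ratios are mutually consistent with the CPU runs scaling at somewhat below linear efficiency.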

  1. A fault-tolerant multiprocessor architecture for aircraft, volume 1. [autopilot configuration

    NASA Technical Reports Server (NTRS)

    Smith, T. B.; Hopkins, A. L.; Taylor, W.; Ausrotas, R. A.; Lala, J. H.; Hanley, L. D.; Martin, J. H.

    1978-01-01

    A fault-tolerant multiprocessor architecture is reported. This architecture, together with a comprehensive information system architecture, has important potential for future aircraft applications. A preliminary definition and assessment of a suitable multiprocessor architecture for such applications is developed.

  2. Prefetching in file systems for MIMD multiprocessors

    NASA Technical Reports Server (NTRS)

    Kotz, David F.; Ellis, Carla Schlatter

    1990-01-01

    The question of whether prefetching blocks of the file into the block cache can effectively reduce overall execution time of a parallel computation, even under favorable assumptions, is considered. Experiments have been conducted with an interleaved file system testbed on the Butterfly Plus multiprocessor. Results of these experiments suggest that (1) the hit ratio, the accepted measure in traditional caching studies, may not be an adequate measure of performance when the workload consists of parallel computations and parallel file access patterns, (2) caching with prefetching can significantly improve the hit ratio and the average time to perform an I/O (input/output) operation, and (3) an improvement in overall execution time has been observed in most cases. In spite of these gains, prefetching sometimes results in increased execution times (a negative result, given the optimistic nature of the study). The authors explore why it is not trivial to translate savings on individual I/O requests into consistently better overall performance and identify the key problems that need to be addressed in order to improve the potential of prefetching techniques in the environment.

  3. FTMP (Fault Tolerant Multiprocessor) programmer's manual

    NASA Technical Reports Server (NTRS)

    Feather, F. E.; Liceaga, C. A.; Padilla, P. A.

    1986-01-01

    The Fault Tolerant Multiprocessor (FTMP) computer system was constructed using the Rockwell/Collins CAPS-6 processor. It is installed in the Avionics Integration Research Laboratory (AIRLAB) of NASA Langley Research Center. It is hosted by AIRLAB's System 10, a VAX 11/750, for the loading of programs and experimentation. The FTMP support software includes a cross compiler for a high level language called Automated Engineering Design (AED) System, an assembler for the CAPS-6 processor assembly language, and a linker. Access to this support software is through an automated remote access facility on the VAX which relieves the user of the burden of learning how to use the IBM 4381. This manual is a compilation of information about the FTMP support environment. It explains the FTMP software and support environment along with many of the finer points of running programs on FTMP. This will be helpful to the researcher trying to run an experiment on FTMP and even to the person probing FTMP with fault injections. Much of the information in this manual can be found in other sources; we are only attempting to bring together the basic points in a single source. If the reader should need points clarified, there is a list of support documentation in the back of this manual.

  4. Selection in spatial working memory is independent of perceptual selective attention, but they interact in a shared spatial priority map.

    PubMed

    Hedge, Craig; Oberauer, Klaus; Leonards, Ute

    2015-11-01

    We examined the relationship between the attentional selection of perceptual information and of information in working memory (WM) through four experiments, using a spatial WM-updating task. Participants remembered the locations of two objects in a matrix and worked through a sequence of updating operations, each mentally shifting one dot to a new location according to an arrow cue. Repeatedly updating the same object in two successive steps is typically faster than switching to the other object; this object switch cost reflects the shifting of attention in WM. In Experiment 1, the arrows were presented in random peripheral locations, drawing perceptual attention away from the selected object in WM. This manipulation did not eliminate the object switch cost, indicating that the mechanisms of perceptual selection do not underlie selection in WM. Experiments 2a and 2b corroborated the independence of selection observed in Experiment 1, but showed a benefit to reaction times when the placement of the arrow cue was aligned with the locations of relevant objects in WM. Experiment 2c showed that the same benefit also occurs when participants are not able to mark an updating location through eye fixations. Together, these data can be accounted for by a framework in which perceptual selection and selection in WM are separate mechanisms that interact through a shared spatial priority map.

  5. Evaluation of the Cedar memory system: Configuration of 16 by 16

    NASA Technical Reports Server (NTRS)

    Gallivan, K.; Jalby, W.; Wijshoff, H.

    1991-01-01

    Some basic results on the performance of the Cedar multiprocessor system are presented. Empirical results on the 16-processor, 16-memory-bank system configuration, which show the behavior of the Cedar system under different modes of operation, are presented.

  6. Real-Time Multiprocessor Programming Language (RTMPL) user's manual

    NASA Technical Reports Server (NTRS)

    Arpasi, D. J.

    1985-01-01

    A real-time multiprocessor programming language (RTMPL) has been developed to provide for high-order programming of real-time simulations on systems of distributed computers. RTMPL is a structured, engineering-oriented language. The RTMPL utility supports a variety of multiprocessor configurations and types by generating assembly language programs according to user-specified targeting information. Many programming functions are assumed by the utility (e.g., data transfer and scaling) to reduce the programming chore. This manual describes RTMPL from a user's viewpoint. Source generation, applications, utility operation, and utility output are detailed. An example simulation is generated to illustrate many RTMPL features.

  7. The left superior temporal gyrus is a shared substrate for auditory short-term memory and speech comprehension: evidence from 210 patients with stroke.

    PubMed

    Leff, Alexander P; Schofield, Thomas M; Crinion, Jennifer T; Seghier, Mohamed L; Grogan, Alice; Green, David W; Price, Cathy J

    2009-12-01

    Competing theories of short-term memory function make specific predictions about the functional anatomy of auditory short-term memory and its role in language comprehension. We analysed high-resolution structural magnetic resonance images from 210 stroke patients and employed a novel voxel based analysis to test the relationship between auditory short-term memory and speech comprehension. Using digit span as an index of auditory short-term memory capacity we found that the structural integrity of a posterior region of the superior temporal gyrus and sulcus predicted auditory short-term memory capacity, even when performance on a range of other measures was factored out. We show that the integrity of this region also predicts the ability to comprehend spoken sentences. Our results therefore support cognitive models that posit a shared substrate between auditory short-term memory capacity and speech comprehension ability. The method applied here will be particularly useful for modelling structure-function relationships within other complex cognitive domains.

  8. Job-mix modeling and system analysis of an aerospace multiprocessor.

    NASA Technical Reports Server (NTRS)

    Mallach, E. G.

    1972-01-01

    An aerospace guidance computer organization, consisting of multiple processors and memory units attached to a central time-multiplexed data bus, is described. A job mix for this type of computer is obtained by analysis of Apollo mission programs. Multiprocessor performance is then analyzed using: 1) queuing theory, under certain 'limiting case' assumptions; 2) Markov process methods; and 3) system simulation. Results of the analyses indicate: 1) Markov process analysis is a useful and efficient predictor of simulation results; 2) efficient job execution is not seriously impaired even when the system is so overloaded that new jobs are inordinately delayed in starting; 3) job scheduling is significant in determining system performance; and 4) a system having many slow processors may or may not perform better than a system of equal power having few fast processors, but will not perform significantly worse.

  9. Development and evaluation of a Fault-Tolerant Multiprocessor (FTMP) computer. Volume 2: FTMP software

    NASA Technical Reports Server (NTRS)

    Lala, J. H.; Smith, T. B., III

    1983-01-01

    The software developed for the Fault-Tolerant Multiprocessor (FTMP) is described. The FTMP executive is a timer-interrupt driven dispatcher that schedules iterative tasks which run at 3.125, 12.5, and 25 Hz. Major tasks which run under the executive include system configuration control, flight control, and display. The flight control task includes autopilot and autoland functions for a jet transport aircraft. System displays include status displays of all hardware elements (processors, memories, I/O ports, buses), failure log displays showing transient and hard faults, and an autopilot display. All software is in a higher order language (AED, an ALGOL derivative). The executive is a fully distributed general purpose executive which automatically balances the load among available processor triads. Provisions for graceful performance degradation under processing overload are an integral part of the scheduling algorithms.
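
    The three task rates quoted above divide evenly into one another (25 / 12.5 = 2 and 25 / 3.125 = 8), which is the property a frame-based rate-group dispatcher relies on. The sketch below is a generic illustration of that idea, not the FTMP executive itself: the abstract does not describe the dispatcher's internals, and the group names `fast`, `medium`, and `slow` and both helper functions are hypothetical.

```python
from fractions import Fraction

# Rates from the abstract; the group names are hypothetical labels.
RATE_GROUPS_HZ = {
    "fast": Fraction("25"),
    "medium": Fraction("12.5"),
    "slow": Fraction("3.125"),
}


def frame_divisors(rates_hz):
    """Map each rate group to 'run every N-th base frame'.

    The base frame rate is the fastest rate; every other rate must
    divide it evenly for a simple frame-counter scheduler to work.
    """
    base = max(rates_hz.values())
    divisors = {}
    for name, rate in rates_hz.items():
        ratio = base / rate
        if ratio.denominator != 1:
            raise ValueError(f"{name}: rate does not divide the base rate")
        divisors[name] = int(ratio)
    return divisors


def groups_due(frame, divisors):
    """Return the rate groups that should run on the given frame count."""
    return [name for name, n in divisors.items() if frame % n == 0]


divisors = frame_divisors(RATE_GROUPS_HZ)
major_cycle = max(divisors.values())  # frames until the pattern repeats

# Print one full major cycle of the schedule.
for frame in range(major_cycle):
    print(frame, groups_due(frame, divisors))
```

    With these rates the schedule repeats every 8 base frames, i.e. every 0.32 s at a 25 Hz tick. A real executive would also have to handle overload and load balancing across processors, which this sketch leaves out.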

  10. Fault tree models for fault tolerant hypercube multiprocessors

    NASA Technical Reports Server (NTRS)

    Boyd, Mark A.; Tuazon, Jezus O.

    1991-01-01

    Three candidate fault tolerant hypercube architectures are modeled, their reliability analyses are compared, and the resulting implications of these methods of incorporating fault tolerance into hypercube multiprocessors are discussed. In the course of performing the reliability analyses, the use of HARP and fault trees in modeling sequence dependent system behaviors is demonstrated.

  11. Techniques and tools for efficiently modeling multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Carpenter, T.; Yalamanchili, S.

    1990-01-01

    System-level tools and methodologies associated with an integrated approach to the development of multiprocessor systems are examined. Tools for capturing initial program structure, automated program partitioning, automated resource allocation, and high-level modeling of the combined application and resource are discussed. The primary language focus of the current implementation is Ada, although the techniques should be appropriate for other programming paradigms.

  12. Characterizing parallel file-access patterns on a large-scale multiprocessor

    NASA Technical Reports Server (NTRS)

    Purakayastha, Apratim; Ellis, Carla Schlatter; Kotz, David; Nieuwejaar, Nils; Best, Michael

    1994-01-01

    Rapid increases in the computational speeds of multiprocessors have not been matched by corresponding performance enhancements in the I/O subsystem. To satisfy the large and growing I/O requirements of some parallel scientific applications, we need parallel file systems that can provide high-bandwidth and high-volume data transfer between the I/O subsystem and thousands of processors. Design of such high-performance parallel file systems depends on a thorough grasp of the expected workload. So far there have been no comprehensive usage studies of multiprocessor file systems. Our CHARISMA project intends to fill this void. The first results from our study involve an iPSC/860 at NASA Ames. This paper presents results from a different platform, the CM-5 at the National Center for Supercomputing Applications. The CHARISMA studies are unique because we collect information about every individual read and write request and about the entire mix of applications running on the machines. The results of our trace analysis lead to recommendations for parallel file system design. First, the file system should support efficient concurrent access to many files, and I/O requests from many jobs under varying load conditions. Second, it must efficiently manage large files kept open for long periods. Third, it should expect to see small requests, predominantly sequential access patterns, application-wide synchronous access, no concurrent file-sharing between jobs, appreciable byte and block sharing between processes within jobs, and strong interprocess locality. Finally, the trace data suggest that node-level write caches and collective I/O request interfaces may be useful in certain environments.

  13. Queueing analysis of a canonical model of real-time multiprocessors

    NASA Technical Reports Server (NTRS)

    Krishna, C. M.; Shin, K. G.

    1983-01-01

    A logical classification of multiprocessor structures from the point of view of control applications is presented. A computation of the response time distribution for a canonical model of a real time multiprocessor is presented. The multiprocessor is approximated by a blocking model. Two separate models are derived: one created from the system's point of view, and the other from the point of view of an incoming task.

  14. Memories.

    ERIC Educational Resources Information Center

    Brand, Judith, Ed.

    1998-01-01

    This theme issue of the journal "Exploring" covers the topic of "memories" and describes an exhibition at San Francisco's Exploratorium that ran from May 22, 1998 through January 1999 and that contained over 40 hands-on exhibits, demonstrations, artworks, images, sounds, smells, and tastes that demonstrated and depicted the biological,…

  15. Analysis of a Multiprocessor Guidance Computer. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Maltach, E. G.

    1969-01-01

    The design of the next generation of spaceborne digital computers is described. It analyzes a possible multiprocessor computer configuration. For the analysis, a set of representative space computing tasks was abstracted from the Lunar Module Guidance Computer programs as executed during the lunar landing, from the Apollo program. This computer performs at this time about 24 concurrent functions, with iteration rates from 10 times per second to once every two seconds. These jobs were tabulated in a machine-independent form, and statistics of the overall job set were obtained. It was concluded, based on a comparison of simulation and Markov results, that the Markov process analysis is accurate in predicting overall trends and in configuration comparisons, but does not provide useful detailed information in specific situations. Using both types of analysis, it was determined that the job scheduling function is a critical one for efficiency of the multiprocessor. It is recommended that research into the area of automatic job scheduling be performed.

  16. Memory System Technologies for Future High-End Computing Systems

    SciTech Connect

    McKee, S A; de Supinski, B R; Mueller, F; Tyson, G S

    2003-05-16

    Our ability to solve Grand Challenge Problems in computing hinges on the development of reliable and efficient High-End Computing systems. Unfortunately, the increasing gap between memory and processor speeds remains one of the major bottlenecks in modern architectures. Uniprocessor nodes still suffer, but symmetric multiprocessor nodes--where access to physical memory is shared among all processors--are among the hardest hit. In the latter case, the memory system must juggle multiple working sets and maintain memory coherence, on top of simply responding to access requests. To illustrate the severity of the current situation, consider two important examples: even the high-performance parallel supercomputers in use at Department of Energy National labs observe single-processor utilization rates as low as 5%, and transaction processing commercial workloads see utilizations of at most about 33%. A wealth of research demonstrates that traditional memory systems are incapable of bridging the processor/memory performance gap, and the problem continues to grow. The success of future High-End Computing platforms therefore depends on our developing hardware and software technologies to dramatically relieve the memory bottleneck. In order to take better advantage of the tremendous computing power of modern microprocessors and future High-End systems, we consider it crucial to develop the hardware for intelligent, adaptable memory systems; the middleware and OS modifications to manage them; and the compiler technology and performance tools to exploit them. Taken together, these will provide the foundations for meeting the requirements of future generations of performance-critical, parallel systems based on either uniprocessor or SMP nodes (including PIM organizations). 
    We feel that such solutions should not be vendor-specific, but should be sufficiently general and adaptable such that the technologies could be leveraged by any commercial vendor of High-End Computing systems.

  17. Plasma physics modeling and the Cray-2 multiprocessor

    SciTech Connect

    Killeen, J.

    1985-01-01

    The importance of computer modeling in the magnetic fusion energy research program is discussed. The need for the most advanced supercomputers is described. To meet the demand for more powerful scientific computers to solve larger and more complicated problems, the computer industry is developing multiprocessors. The role of the Cray-2 in plasma physics modeling is discussed with some examples. 28 refs., 2 figs., 1 tab.

  18. Modelling parallel programs and multiprocessor architectures with AXE

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Fineman, Charles E.

    1991-01-01

    AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user-interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior are described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.

  19. Modeling and measurement of fault-tolerant multiprocessors

    NASA Technical Reports Server (NTRS)

    Shin, K. G.; Woodbury, M. H.; Lee, Y. H.

    1985-01-01

    The workload effects on computer performance are addressed first for a highly reliable unibus multiprocessor used in real-time control. As an approach to studying these effects, a modified Stochastic Petri Net (SPN) is used to describe the synchronous operation of the multiprocessor system. From this model the vital components affecting performance can be determined. However, because of the complexity in solving the modified SPN, a simpler model, i.e., a closed priority queuing network, is constructed that represents the same critical aspects. The use of this model for a specific application requires the partitioning of the workload into job classes. It is shown that the steady state solution of the queuing model directly produces useful results. The use of this model in evaluating an existing system, the Fault Tolerant Multiprocessor (FTMP) at the NASA AIRLAB, is outlined with some experimental results. Also addressed is the technique of measuring fault latency, an important microscopic system parameter. Most related works have assumed no or a negligible fault latency and then performed approximate analyses. To eliminate this deficiency, a new methodology for indirectly measuring fault latency is presented.

  20. SYMNET: an optical interconnection network for scalable high-performance symmetric multiprocessors.

    PubMed

    Louri, Ahmed; Kodi, Avinash Karanth

    2003-06-10

    We address the primary limitation of the bandwidth to satisfy the demands for address transactions in future cache-coherent symmetric multiprocessors (SMPs). It is widely known that the bus speed and the coherence overhead limit the snoop/address bandwidth needed to broadcast address transactions to all processors. As a solution, we propose a scalable address subnetwork called symmetric multiprocessor network (SYMNET) in which address requests and snoop responses of SMPs are implemented optically. SYMNET not only has the ability to pipeline address requests, but also multiple address requests from different processors can propagate through the address subnetwork simultaneously. This is in contrast with all electrical bus-based SMPs, where only a single request is broadcast on the physical address bus at any given point in time. The simultaneous propagation of multiple address requests in SYMNET increases the available address bandwidth and lowers the latency of the network, but the preservation of cache coherence can no longer be maintained with the usual fast snooping protocols. A modified snooping cache-coherence protocol, coherence in SYMNET (COSYM) is introduced to solve the coherence problem. We evaluated SYMNET with a subset of Splash-2 benchmarks and compared it with the electrical bus-based MOESI (modified, owned, exclusive, shared, invalid) protocol. Our simulation studies have shown a 5-66% improvement in execution time for COSYM as compared with MOESI for various applications. Simulations have also shown that the average latency for a transaction to complete by use of COSYM protocol was 5-78% better than the MOESI protocol. SYMNET can scale up to hundreds of processors while still using fast snooping-based cache-coherence protocols, and additional performance gains may be attained with further improvement in optical device technology.

  1. Shared Attention.

    PubMed

    Shteynberg, Garriy

    2015-09-01

    Shared attention is extremely common. In stadiums, public squares, and private living rooms, people attend to the world with others. Humans do so across all sensory modalities-sharing the sights, sounds, tastes, smells, and textures of everyday life with one another. The potential for attending with others has grown considerably with the emergence of mass media technologies, which allow for the sharing of attention in the absence of physical co-presence. In the last several years, studies have begun to outline the conditions under which attending together is consequential for human memory, motivation, judgment, emotion, and behavior. Here, I advance a psychological theory of shared attention, defining its properties as a mental state and outlining its cognitive, affective, and behavioral consequences. I review empirical findings that are uniquely predicted by shared-attention theory and discuss the possibility of integrating shared-attention, social-facilitation, and social-loafing perspectives. Finally, I reflect on what shared-attention theory implies for living in the digital world.

  2. Development and evaluation of a Fault-Tolerant Multiprocessor (FTMP) computer. Volume 3: FTMP test and evaluation

    NASA Technical Reports Server (NTRS)

    Lala, J. H.; Smith, T. B., III

    1983-01-01

    The experimental test and evaluation of the Fault-Tolerant Multiprocessor (FTMP) is described. Major objectives of this exercise include expanding the validation envelope, building confidence in the system, revealing any weaknesses in the architectural concepts and in their execution in hardware and software, and in general, stressing the hardware and software. To this end, pin-level faults were injected into one LRU of the FTMP and the FTMP response was measured in terms of fault detection, isolation, and recovery times. A total of 21,055 stuck-at-0, stuck-at-1 and invert-signal faults were injected in the CPU, memory, bus interface circuits, Bus Guardian Units, and voters and error latches. Of these, 17,418 were detected. At least 80 percent of undetected faults are estimated to be on unused pins. The multiprocessor identified all detected faults correctly and recovered successfully in each case. Total recovery time for all faults averaged a little over one second. This can be reduced to half a second by including appropriate self-tests.
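
    The fault-injection counts above imply a raw detection coverage that the abstract does not state explicitly. The short calculation below derives it from the two quoted totals. The adjusted figure is an illustrative lower bound only: it assumes that exactly the estimated minimum of 80 percent of undetected faults sat on unused pins and that such faults can be excluded from the denominator, neither of which the abstract claims.

```python
# Counts quoted in the abstract.
INJECTED = 21_055
DETECTED = 17_418
MIN_UNUSED_PIN_SHARE = 0.80  # "at least 80 percent of undetected faults"

undetected = INJECTED - DETECTED
raw_coverage = DETECTED / INJECTED

# Assumption for illustration: faults on unused pins are excluded
# from the denominator. Using the minimum share gives a lower bound.
min_unused_pin_faults = MIN_UNUSED_PIN_SHARE * undetected
adjusted_coverage_lower_bound = DETECTED / (INJECTED - min_unused_pin_faults)

print(f"undetected faults:        {undetected}")
print(f"raw detection coverage:   {raw_coverage:.1%}")
print(f"adjusted coverage (min.): {adjusted_coverage_lower_bound:.1%}")
```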

  3. CUDA optimization strategies for compute- and memory-bound neuroimaging algorithms.

    PubMed

    Lee, Daren; Dinov, Ivo; Dong, Bin; Gutman, Boris; Yanovsky, Igor; Toga, Arthur W

    2012-06-01

    As neuroimaging algorithms and technology continue to grow faster than CPU performance in complexity and image resolution, data-parallel computing methods will be increasingly important. The high performance, data-parallel architecture of modern graphical processing units (GPUs) can reduce computational times by orders of magnitude. However, its massively threaded architecture introduces challenges when GPU resources are exceeded. This paper presents optimization strategies for compute- and memory-bound algorithms for the CUDA architecture. For compute-bound algorithms, the registers are reduced through variable reuse via shared memory and the data throughput is increased through heavier thread workloads and maximizing the thread configuration for a single thread block per multiprocessor. For memory-bound algorithms, fitting the data into the fast but limited GPU resources is achieved through reorganizing the data into self-contained structures and employing a multi-pass approach. Memory latencies are reduced by selecting memory resources whose cache performance is optimized for the algorithm's access patterns. We demonstrate the strategies on two computationally expensive algorithms and achieve optimized GPU implementations that perform up to 6× faster than unoptimized ones. Compared to CPU implementations, we achieve peak GPU speedups of 129× for the 3D unbiased nonlinear image registration technique and 93× for the non-local means surface denoising algorithm.

  4. Transactional memories: A new abstraction for parallel processing

    SciTech Connect

    Fasel, J.H.; Lubeck, O.M.; Agrawal, D.; Bruno, J.L.; El Abbadi, A.

    1997-12-01

    This is the final report of a three-year, Laboratory Directed Research and Development (LDRD) project at Los Alamos National Laboratory (LANL). Current distributed memory multiprocessor computer systems make the development of parallel programs difficult. From a programmer's perspective, it would be most desirable if the underlying hardware and software could provide the programming abstraction commonly referred to as sequential consistency--a single address space and multiple threads; but enforcement of sequential consistency limits opportunities for architectural and operating system performance optimizations, leading to poor performance. Recently, Herlihy and Moss have introduced a new abstraction called transactional memories for parallel programming. The programming model is shared memory with multiple threads. However, data consistency is obtained through the use of transactions rather than mutual exclusion based on locking. The transaction approach permits the underlying system to exploit the potential parallelism in transaction processing. The authors explore the feasibility of designing parallel programs using the transaction paradigm for data consistency and a barrier type of thread synchronization.
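
    The contrast drawn above between transactions and lock-based mutual exclusion can be illustrated with a small optimistic read-compute-commit loop. This is a software toy, not the hardware mechanism of Herlihy and Moss or the system studied in the report: `VersionedCell`, `atomically`, and `worker` are hypothetical names, and the sketch uses a short internal lock to make the commit step atomic, where a hardware transactional memory would rely on the cache-coherence protocol instead.

```python
import threading


class VersionedCell:
    """A shared value with a version counter (hypothetical helper)."""

    def __init__(self, value):
        self._value = value
        self._version = 0
        # Guards only the brief snapshot and commit steps, not user code.
        self._guard = threading.Lock()

    def read(self):
        """Return a consistent (value, version) snapshot."""
        with self._guard:
            return self._value, self._version

    def try_commit(self, expected_version, new_value):
        """Install new_value only if nobody committed since the snapshot."""
        with self._guard:
            if self._version != expected_version:
                return False  # conflict: another transaction won
            self._value = new_value
            self._version += 1
            return True


def atomically(cell, update):
    """Run update() as a transaction, retrying on conflict."""
    while True:
        value, version = cell.read()
        if cell.try_commit(version, update(value)):
            return


def worker(cell, iterations):
    for _ in range(iterations):
        atomically(cell, lambda v: v + 1)


cell = VersionedCell(0)
threads = [threading.Thread(target=worker, args=(cell, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_value, final_version = cell.read()
print(final_value, final_version)
```

    The update function runs outside any lock, so concurrent transactions proceed in parallel and only conflicting ones retry. That is the property the abstract attributes to the transaction approach.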

  5. Combined shared and distributed memory ab-initio computations of molecular-hydrogen systems in the correlated state: Process pool solution and two-level parallelism

    NASA Astrophysics Data System (ADS)

    Biborski, Andrzej; Kądzielawa, Andrzej P.; Spałek, Józef

    2015-12-01

    An efficient computational scheme devised for investigating ground-state properties of electronically correlated systems is presented. As an example, the (H2)n chain is considered, with the long-range electron-electron interactions taken into account. The implemented procedure covers: (i) single-particle Wannier wave-function basis construction in the correlated state, (ii) microscopic parameters calculation, and (iii) ground state energy optimization. The optimization loop is based on a highly effective process-pool solution, a specific root-workers approach. Hierarchical, two-level parallelism was applied: both shared (by use of Open Multi-Processing) and distributed (by use of Message Passing Interface) memory models were utilized. We discuss in detail how this approach yields a substantial increase in calculation speed, reaching a factor of 300 for the fully parallelized solution. The scheme elaborated in detail reflects the situation in which the most demanding task is the single-particle basis optimization.

  6. Fast decision tree-based method to index large DNA-protein sequence databases using hybrid distributed-shared memory programming model.

    PubMed

    Jaber, Khalid Mohammad; Abdullah, Rosni; Rashid, Nur'Aini Abdul

    2014-01-01

    In recent times, the size of biological databases has increased significantly, along with continuous growth in the number of users and the rate of queries, such that some databases have reached terabyte size. There is therefore an increasing need to access databases at the fastest rates possible. In this paper, the decision tree indexing model (PDTIM) was parallelised using a hybrid of distributed and shared memory on a resident database, with horizontal and vertical growth through Message Passing Interface (MPI) and POSIX Thread (PThread), to accelerate the index building time. The PDTIM was implemented using 1, 2, 4 and 5 processors on 1, 2, 3 and 4 threads respectively. The results show that the hybrid technique improved the speedup compared to a sequential version. It can be concluded from the results that the proposed PDTIM is appropriate for large data sets in terms of index building time.

  7. Experience with a Genetic Algorithm Implemented on a Multiprocessor Computer

    NASA Technical Reports Server (NTRS)

    Plassman, Gerald E.; Sobieszczanski-Sobieski, Jaroslaw

    2000-01-01

    Numerical experiments were conducted to find out the extent to which a Genetic Algorithm (GA) may benefit from a multiprocessor implementation, considering, on one hand, that analyses of individual designs in a population are independent of each other so that they may be executed concurrently on separate processors, and, on the other hand, that there are some operations in a GA that cannot be so distributed. The algorithm experimented with was based on a Gaussian distribution rather than bit exchange in the GA reproductive mechanism, and the test case was a hub frame structure of up to 1080 design variables. The experimentation engaging up to 128 processors confirmed expectations of radical elapsed time reductions compared to a conventional single processor implementation. It also demonstrated that the time spent in the non-distributable parts of the algorithm and the attendant cross-processor communication may have a very detrimental effect on the efficient utilization of the multiprocessor machine and on the number of processors that can be used effectively in a concurrent manner. Three techniques were devised and tested to mitigate that effect, raising efficiency above 99 percent.
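
    The detrimental effect of non-distributable work described above can be made concrete with Amdahl's law. The abstract does not name this model, so its use here is an assumption, and the function names are hypothetical. The sketch treats the non-distributable share of the single-processor run time as a fixed fraction and ignores communication cost.

```python
def amdahl_speedup(serial_fraction, processors):
    """Ideal speedup when serial_fraction of the work cannot be distributed."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def parallel_efficiency(serial_fraction, processors):
    """Speedup divided by processor count (1.0 means perfect utilization)."""
    return amdahl_speedup(serial_fraction, processors) / processors


if __name__ == "__main__":
    # Illustrative fractions only; they are not taken from the report.
    for fraction in (0.0, 0.001, 0.01, 0.05):
        eff = parallel_efficiency(fraction, 128)
        print(f"serial fraction {fraction:.3f}: efficiency on 128 processors = {eff:.3f}")
```

    Under this model even a small non-distributable fraction caps the efficiency at large processor counts, which is consistent with the report's emphasis on shrinking that fraction.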

  8. Instrumentation, performance visualization, and debugging tools for multiprocessors

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Fineman, Charles E.; Hontalas, Philip J.

    1991-01-01

    The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessor architectures. However, without effective means to monitor (and visualize) program execution, debugging and tuning parallel programs become intractably difficult as program complexity increases with the number of processors. Research on performance evaluation tools for multiprocessors is being carried out at ARC. Besides investigating new techniques for instrumenting, monitoring, and presenting the state of parallel program execution in a coherent and user-friendly manner, prototypes of software tools are being incorporated into the run-time environments of various hardware testbeds to evaluate their impact on user productivity. Our current tool set, the Ames Instrumentation Systems (AIMS), incorporates features from various software systems developed in academia and industry. The execution of FORTRAN programs on the Intel iPSC/860 can be automatically instrumented and monitored. Performance data collected in this manner can be displayed graphically on workstations supporting X-Windows. We have successfully compared various parallel algorithms for computational fluid dynamics (CFD) applications in collaboration with scientists from the Numerical Aerodynamic Simulation Systems Division. By performing these comparisons, we show that performance monitors and debuggers such as AIMS are practical and can illuminate the complex dynamics that occur within parallel programs.

  9. Scalable Multiprocessor for High-Speed Computing in Space

    NASA Technical Reports Server (NTRS)

    Lux, James; Lang, Minh; Nishimoto, Kouji; Clark, Douglas; Stosic, Dorothy; Bachmann, Alex; Wilkinson, William; Steffke, Richard

    2004-01-01

    A report discusses the continuing development of a scalable multiprocessor computing system for hard real-time applications aboard a spacecraft. "Hard realtime applications" signifies applications, like real-time radar signal processing, in which the data to be processed are generated at "hundreds" of pulses per second, each pulse "requiring" millions of arithmetic operations. In these applications, the digital processors must be tightly integrated with analog instrumentation (e.g., radar equipment), and data input/output must be synchronized with analog instrumentation, controlled to within fractions of a microsecond. The scalable multiprocessor is a cluster of identical commercial-off-the-shelf generic DSP (digital-signal-processing) computers plus generic interface circuits, including analog-to-digital converters, all controlled by software. The processors are computers interconnected by high-speed serial links. Performance can be increased by adding hardware modules and correspondingly modifying the software. Work is distributed among the processors in a parallel or pipeline fashion by means of a flexible master/slave control and timing scheme. Each processor operates under its own local clock; synchronization is achieved by broadcasting master time signals to all the processors, which compute offsets between the master clock and their local clocks.
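
    The synchronization step described above, in which each processor computes an offset between the broadcast master time and its own local clock, might look like the following minimal sketch. The function names and the integer tick unit are hypothetical, and the sketch assumes the broadcast delay is negligible or already calibrated out; the report's actual scheme is not given in the abstract.

```python
def compute_offset(master_ticks, local_ticks_at_receipt):
    """Offset to add to a local clock reading to express it in master time."""
    return master_ticks - local_ticks_at_receipt


def to_master_time(local_ticks, offset):
    """Convert a later local clock reading into master time."""
    return local_ticks + offset


if __name__ == "__main__":
    # A processor whose local clock reads 997_500 when the master
    # broadcasts 1_000_000 stores an offset of 2_500 ticks.
    offset = compute_offset(1_000_000, 997_500)
    print(to_master_time(998_000, offset))
```

    Keeping a per-processor offset rather than resetting the local clock lets each processor keep running on its own oscillator while still timestamping events on a common scale.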

  10. Cache directory look-up re-use as conflict check mechanism for speculative memory requests

    DOEpatents

    Ohmacht, Martin

    2013-09-10

    In a cache memory, energy and other efficiencies can be realized by saving a result of a cache directory lookup for sequential accesses to a same memory address. Where the cache is a point of coherence for speculative execution in a multiprocessor system, with directory lookups serving as the point of conflict detection, such saving becomes particularly advantageous.

  11. Memory loss

    MedlinePlus


  12. A shared, flexible neural map architecture reflects capacity limits in both visual short-term memory and enumeration.

    PubMed

    Knops, André; Piazza, Manuela; Sengupta, Rakesh; Eger, Evelyn; Melcher, David

    2014-07-23

    Human cognition is characterized by severe capacity limits: we can accurately track, enumerate, or hold in mind only a small number of items at a time. It remains debated whether capacity limitations across tasks are determined by a common system. Here we measure brain activation of adult subjects performing either a visual short-term memory (vSTM) task consisting of holding in mind precise information about the orientation and position of a variable number of items, or an enumeration task consisting of assessing the number of items in those sets. We show that task-specific capacity limits (three to four items in enumeration and two to three in vSTM) are neurally reflected in the activity of the posterior parietal cortex (PPC): an identical set of voxels in this region, commonly activated during the two tasks, changed its overall response profile reflecting task-specific capacity limitations. These results, replicated in a second experiment, were further supported by multivariate pattern analysis in which we could decode the number of items presented over a larger range during enumeration than during vSTM. Finally, we simulated our results with a computational model of PPC using a saliency map architecture in which the level of mutual inhibition between nodes gives rise to capacity limitations and reflects the task-dependent precision with which objects need to be encoded (high precision for vSTM, lower precision for enumeration). Together, our work supports the existence of a common, flexible system underlying capacity limits across tasks in PPC that may take the form of a saliency map.

  13. Allo-HLA Cross-Reactivities of CMV-, FLU- and VZV-Specific Memory T Cells Are Shared by Different Individuals.

    PubMed

    van den Heuvel, H; Heutinck, K M; van der Meer-Prins, E M W; Yong, S L; van Miert, P P M C; Anholts, J D H; Franke-van Dijk, M E I; Zhang, X Q; Roelen, D L; Ten Berge, R J M; Claas, F H J

    2017-03-23

    Virus-specific T cells can recognize allogeneic HLA (allo-HLA) through TCR cross-reactivity. The allospecificity often differs per individual ("private cross-reactivity"), but can also be shared by multiple individuals ("public cross-reactivity"). However, only a few examples of the latter have been described. Since these could facilitate alloreactivity prediction in transplantation, we aimed to identify novel public cross-reactivities of human virus-specific CD8+ T cells directed against allo-HLA by assessing their reactivity in mixed-lymphocyte reactions. Further characterization was done by studying TCR usage with primer-based DNA sequencing, cytokine production with enzyme-linked immunosorbent assays (ELISAs), and cytotoxicity with (51) Chromium-release assays. We identified three novel public allo-HLA cross-reactivities of human virus-specific CD8(+) T cells. CMV B35/IPS CD8(+) T cells cross-reacted with HLA-B51 and/or HLA-B58/B57 (23% of tetramer-positive individuals), FLU A2/GIL CD8(+) T cells with HLA-B38 (90% of tetramer-positive individuals) and VZV A2/ALW CD8(+) T cells with HLA-B55 (two unrelated individuals). Cross-reactivity was tested against different cell types including endothelial and epithelial cells. All cross-reactive T cells expressed a memory phenotype, emphasizing the importance for transplantation. We conclude that public allo-HLA cross-reactivity of virus-specific memory T cells is not uncommon, which may create novel opportunities for alloreactivity prediction and risk estimation in transplantation.

  14. Parallel algorithm of VLBI software correlator under multiprocessor environment

    NASA Astrophysics Data System (ADS)

    Zheng, Weimin; Zhang, Dong

    2007-11-01

    The correlator is the key signal processing equipment of a Very Long Baseline Interferometry (VLBI) synthetic aperture telescope. It receives the mass data collected by the VLBI observatories and produces the visibility function of the target, which can be used for spacecraft positioning, baseline length measurement, synthesis imaging, and other scientific applications. VLBI data correlation is both data-intensive and computation-intensive. This paper presents the algorithms of two parallel software correlators for multiprocessor environments. A near real-time correlator for spacecraft tracking adopts pipelining and thread-parallel technology and runs on SMP (Symmetric Multiple Processor) servers. Another high-speed prototype correlator using a mixed Pthreads and MPI (Message Passing Interface) parallel algorithm is realized on a small Beowulf cluster platform. Both correlators feature a flexible structure, scalability, and the ability to correlate data from 10 stations.

  15. Evict on write, a management strategy for a prefetch unit and/or first level cache in a multiprocessor system with speculative execution

    DOEpatents

    Gara, Alan; Ohmacht, Martin

    2014-09-16

    In a multiprocessor system with at least two levels of cache, a speculative thread may run on a core processor in parallel with other threads. When the thread seeks to do a write to main memory, this access is to be written through the first level cache to the second level cache. After the write-through, the corresponding line is deleted from the first level cache and/or prefetch unit, so that any further accesses to the same location in main memory have to be retrieved from the second level cache. The second level cache keeps track of multiple versions of data, where more than one speculative thread is running in parallel, while the first level cache does not have any of the versions during speculation. A switch allows choosing between modes of operation of a speculation blind first level cache.

  16. A Tensor Product Formulation of Strassen's Matrix Multiplication Algorithm with Memory Reduction

    DOE PAGES

    Kumar, B.; Huang, C. -H.; Sadayappan, P.; ...

    1995-01-01

    In this article, we present a program generation strategy of Strassen's matrix multiplication algorithm using a programming methodology based on tensor product formulas. In this methodology, block recursive programs such as the fast Fourier Transforms and Strassen's matrix multiplication algorithm are expressed as algebraic formulas involving tensor products and other matrix operations. Such formulas can be systematically translated to high-performance parallel/vector codes for various architectures. In this article, we present a nonrecursive implementation of Strassen's algorithm for shared memory vector processors such as the Cray Y-MP. A previous implementation of Strassen's algorithm synthesized from tensor product formulas required working storage of size O(7^n) for multiplying 2^n × 2^n matrices. We present a modified formulation in which the working storage requirement is reduced to O(4^n). The modified formulation exhibits sufficient parallelism for efficient implementation on a shared memory multiprocessor. Performance results on a Cray Y-MP8/64 are presented.
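
    To put the storage reduction above in perspective, the short worked calculation below compares the two growth rates. It assumes the constants hidden by the O-notation are comparable, which the abstract does not state; the choice n = 10 is purely illustrative.

```latex
% A 2^n x 2^n matrix holds (2^n)^2 = 4^n entries, so O(4^n) working
% storage is proportional to the size of the operands themselves.
\begin{align*}
\text{entries per operand} &= (2^n)^2 = 4^n, \\
\frac{7^n}{4^n} &= \left(\tfrac{7}{4}\right)^{n} = 1.75^{\,n}, \\
n = 10:\quad 4^{10} &= 1\,048\,576, \qquad 7^{10} = 282\,475\,249, \qquad
\frac{7^{10}}{4^{10}} \approx 269 .
\end{align*}
```

    Under that assumption, the earlier formulation would need roughly 269 times more working storage than the modified one for 1024 × 1024 matrices, and the gap widens by a factor of 1.75 with each doubling of the matrix dimension.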

  17. Distribution and reliability in a multiprocessor operating system

    SciTech Connect

    Sindhu, P.S.

    1984-01-01

    This thesis explores whether distributed hardware can be used to build reliable systems that are practical. It deals mostly with issues in the design and construction of the MEDUSA operating system, built to run on the distributed multiprocessor Cm*. The goal is to exploit the redundancy, physical distribution, and powerful communication facilities within Cm* to produce a robust system - one that can function satisfactorily in spite of hardware failure, bad data, heavy demands on its services, and other disruptive occurrences. The thesis demonstrates that, given appropriate hardware, it is possible to provide robustness without significantly sacrificing performance or greatly increasing complexity. Factors important to this success were 1) the treatment of reliability as a probabilistic rather than an absolute guarantee, which resulted in a structure whose components explicitly acknowledge and cope with failures in one another; 2) the adoption of a systematic approach to exception processing, which made the complexity of recovery manageable; and 3) the exploitation of system knowledge in the design of reliability mechanisms, which made it possible to achieve robustness while retaining good normal-case performance.

  18. MULTIPROCESSOR AND DISTRIBUTED PROCESSING BIBLIOGRAPHIC DATA BASE SOFTWARE SYSTEM

    NASA Technical Reports Server (NTRS)

    Miya, E. N.

    1994-01-01

    Multiprocessors and distributed processing are undergoing increased scientific scrutiny for many reasons. It is more and more difficult to keep track of the existing research in these fields. This package consists of a large machine-readable bibliographic data base which, in addition to the usual keyword searches, can be used for producing citations, indexes, and cross-references. The data base is compiled from smaller existing multiprocessing bibliographies, and tables of contents from journals and significant conferences. There are approximately 4,000 entries covering topics such as parallel and vector processing, networks, supercomputers, fault-tolerant computers, and cellular automata. Each entry is represented by 21 fields including keywords, author, referencing book or journal title, volume and page number, and date and city of publication. The data base contains UNIX 'refer' formatted ASCII data and can be implemented on any computer running under the UNIX operating system. The data base requires approximately one megabyte of secondary storage. The documentation for this program is included with the distribution tape, although it can be purchased for the price below. This bibliography was compiled in 1985 and updated in 1988.

  19. Efficient diagnosis of multiprocessor systems under probabilistic models

    NASA Technical Reports Server (NTRS)

    Blough, Douglas M.; Sullivan, Gregory F.; Masson, Gerald M.

    1989-01-01

    The problem of fault diagnosis in multiprocessor systems is considered under a probabilistic fault model. The focus is on minimizing the number of tests that must be conducted in order to correctly diagnose the state of every processor in the system with high probability. A diagnosis algorithm that can correctly diagnose the state of every processor with probability approaching one in a class of systems performing slightly greater than a linear number of tests is presented. A nearly matching lower bound on the number of tests required to achieve correct diagnosis in arbitrary systems is also proven. Lower and upper bounds on the number of tests required for regular systems are also presented. A class of regular systems which includes hypercubes is shown to be correctly diagnosable with high probability. In all cases, the number of tests required under this probabilistic model is shown to be significantly less than under a bounded-size fault set model. Because the number of tests that must be conducted is a measure of the diagnosis overhead, these results represent a dramatic improvement in the performance of system-level diagnosis techniques.

  20. Ordered fast fourier transforms on a massively parallel hypercube multiprocessor

    NASA Technical Reports Server (NTRS)

    Tong, Charles; Swarztrauber, Paul N.

    1989-01-01

    Design alternatives for ordered Fast Fourier Transformation (FFT) algorithms were examined on massively parallel hypercube multiprocessors such as the Connection Machine. Particular emphasis is placed on reducing communication, which is known to dominate the overall computing time. To this end, the order and computational phases of the FFT were combined, and the sequence-to-processor maps that reduce communication were used. The class of ordered transforms is expanded to include any FFT in which the order of the transform is the same as that of the input sequence. Two such orderings are examined, namely, standard-order and A-order, which can be implemented with equal ease on the Connection Machine, where orderings are determined by geometries and priorities. If the sequence has N = 2^r elements and the hypercube has P = 2^d processors, then a standard-order FFT can be implemented with d + r/2 + 1 parallel transmissions. An A-order sequence can be transformed with 2d - r/2 parallel transmissions, which is r - d + 1 fewer than the standard order. A parallel method for computing the trigonometric coefficients is presented that does not use trigonometric functions or interprocessor communication. A performance of 0.9 GFLOPS was obtained for an A-order transform on the Connection Machine.
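
    The saving quoted above follows directly from the two transmission counts. In the worked check below, the symbols T_std and T_A are labels introduced here for the standard-order and A-order counts, and the values r = 20 and d = 16 are illustrative assumptions rather than figures from the report.

```latex
\begin{align*}
T_{\mathrm{std}} &= d + \tfrac{r}{2} + 1, \qquad T_{A} = 2d - \tfrac{r}{2}, \\
T_{\mathrm{std}} - T_{A}
  &= \left(d + \tfrac{r}{2} + 1\right) - \left(2d - \tfrac{r}{2}\right)
   = r - d + 1 . \\[4pt]
\text{Example } (r = 20,\ d = 16):\quad
T_{\mathrm{std}} &= 16 + 10 + 1 = 27, \qquad T_{A} = 32 - 10 = 22, \\
T_{\mathrm{std}} - T_{A} &= 5 = 20 - 16 + 1 .
\end{align*}
```

    Since r is at least d whenever every processor holds at least one element, the difference r - d + 1 is always positive, so the A-order transform never needs more transmissions than the standard order.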

  1. Solution of open region electromagnetic scattering problems on hypercube multiprocessors

    SciTech Connect

    Gedney, S.D.

    1991-01-01

    This thesis focuses on development of parallel algorithms that exploit hypercube multiprocessor computers for the solution of the scattering of electromagnetic fields by bodies situated in an unbounded space. Initially, algorithms based on the method of moments are investigated for coarse-grained MIMD hypercubes as well as fine-grained MIMD and SIMD hypercubes. It is shown that by exploiting the architecture of each hypercube, supercomputer performance can be obtained using the JPL Mark III hypercube and the Thinking Machine's CM2. Second, the use of the finite-element method for solution of the scattering by bodies of composite materials is presented. For finite bodies situated in an unbounded space, use of an absorbing boundary condition is investigated. A method known as the mixed-χ formulation is presented, which reduces the mesh density in the regions away from the scatterer, enhancing the use of an absorbing boundary condition. The scattering by troughs or slots is also investigated using a combined FEM/MoM formulation. This method is extended to the problem of the diffraction of electromagnetic waves by thick conducting and/or dielectric gratings. Finally, the adaptation of the FEM method onto a coarse-grained hypercube is presented.

  2. Performance and economy of a fault-tolerant multiprocessor

    NASA Technical Reports Server (NTRS)

    Lala, J. H.; Smith, C. J.

    1979-01-01

    The FTMP (Fault-Tolerant Multiprocessor) is one of two central aircraft fault-tolerant architectures now in the prototype phase under NASA sponsorship. The intended application of the computer includes such critical real-time tasks as 'fly-by-wire' active control and completely automatic Category III landings of commercial aircraft. The FTMP architecture is briefly described and it is shown that it is a viable solution to the multi-faceted problems of safety, speed, and cost. Three job dispatch strategies are described, and their results with respect to job-starting delay are presented. The first strategy is a simple First-Come-First-Serve (FCFS) job dispatch executive. The other two schedulers are an adaptive FCFS and an interrupt driven scheduler. Three failure modes are discussed, and the FTMP survival probability in the face of random hard failures is evaluated. It is noted that the hourly cost of operating two FTMPs in a transport aircraft can be as little as one-to-two percent of the total flight-hour cost of the aircraft.
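
    A First-Come-First-Serve dispatch policy and its job-starting delay, as mentioned above, can be sketched in a few lines. This is a generic illustration, not the FTMP executive: the function name, the (arrival, run time) job format, and the assumption of identical, always-available processors are all hypothetical.

```python
import heapq


def fcfs_start_delays(jobs, processors):
    """Return the start delay of each job under FCFS dispatch.

    jobs: iterable of (arrival_time, run_time) pairs.
    processors: number of identical processors (an assumption).
    Delays are returned in order of arrival.
    """
    # Min-heap of the times at which each processor next becomes free.
    free_at = [0] * processors
    heapq.heapify(free_at)

    delays = []
    for arrival, run_time in sorted(jobs):
        earliest_free = heapq.heappop(free_at)
        start = max(arrival, earliest_free)  # wait if no processor is idle
        delays.append(start - arrival)
        heapq.heappush(free_at, start + run_time)
    return delays


if __name__ == "__main__":
    # Three 5-unit jobs arriving at t = 0, 1, 2 on two processors:
    # the third job must wait until the first one finishes at t = 5.
    print(fcfs_start_delays([(0, 5), (1, 5), (2, 5)], processors=2))
```

    An adaptive or interrupt-driven scheduler would differ in how it reacts to arrivals while jobs are running; this sketch only captures the simplest of the three strategies the abstract lists.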

  3. Validation of a fault-tolerant multiprocessor: Baseline experiments and workload implementation

    NASA Technical Reports Server (NTRS)

    Feather, Frank; Siewiorek, Daniel; Segall, Zary

    1985-01-01

    In the future, aircraft must employ highly reliable multiprocessors in order to achieve flight safety. Such computers must be experimentally validated before they are deployed. This project outlines a methodology for validating reliable multiprocessors. The methodology begins with baseline experiments, each of which tests a single phenomenon. As experiments progress, tools for performance testing are developed. The methodology is used, in part, on the Fault Tolerant Multiprocessor (FTMP) at NASA-Langley's AIRLAB facility. Experiments are designed to evaluate the fault-free performance of the system. Presented are the results of interrupt baseline experiments performed on FTMP. Interrupt-causing exception conditions were tested, and several were found to have unimplemented interrupt handling software while one had an unimplemented interrupt vector. A synthetic workload model for real-time multiprocessors is then developed as an application-level performance analysis tool. Details of the workload implementation and calibration are presented. Both the experimental methodology and the synthetic workload model are general enough to be applicable to reliable multiprocessors besides FTMP.

  4. Modeling techniques in a parallelizing compiler for the B-HIVE multiprocessor system

    NASA Technical Reports Server (NTRS)

    Kim, Sukil; Agrawal, Dharma P.; Mauney, Jon; Leu, Ja-Song

    1989-01-01

    The parallelizing compiler for the B-HIVE loosely-coupled multiprocessor system uses a medium grain model to minimize the communication overhead. A medium grain model is shown to be an optimum way of merging fine grain operations into parallel tasks such that the parallelism obtained at the grain level is retained and communication overhead is decreased. A new communication model is introduced in this paper, allowing additional overlap between computation and communication. Simulation results indicate that the medium grain communication model shows promise for automatic parallelization for a loosely-coupled multiprocessor system.

  5. Memory management and compiler support for rapid recovery from failures in computer systems

    NASA Technical Reports Server (NTRS)

    Fuchs, W. K.

    1991-01-01

    This paper describes recent developments in the use of memory management and compiler technology to support rapid recovery from failures in computer systems. The techniques described include cache coherence protocols for user transparent checkpointing in multiprocessor systems, compiler-based checkpoint placement, compiler-based code modification for multiple instruction retry, and forward recovery in distributed systems utilizing optimistic execution.

  6. Meeting the memory challenges of brain-scale network simulation.

    PubMed

    Kunkel, Susanne; Potjans, Tobias C; Eppler, Jochen M; Plesser, Hans Ekkehard; Morrison, Abigail; Diesmann, Markus

    2011-01-01

    The development of high-performance simulation software is crucial for studying the brain connectome. Using connectome data to generate neurocomputational models requires software capable of coping with models on a variety of scales: from the microscale, investigating plasticity, and dynamics of circuits in local networks, to the macroscale, investigating the interactions between distinct brain regions. Prior to any serious dynamical investigation, the first task of network simulations is to check the consistency of data integrated in the connectome and constrain ranges for yet unknown parameters. Thanks to distributed computing techniques, it is possible today to routinely simulate local cortical networks of around 10^5 neurons with up to 10^9 synapses on clusters and multi-processor shared-memory machines. However, brain-scale networks are orders of magnitude larger than such local networks, in terms of numbers of neurons and synapses as well as in terms of computational load. Such networks have been investigated in individual studies, but the underlying simulation technologies have neither been described in sufficient detail to be reproducible nor made publicly available. Here, we discover that as the network model sizes approach the regime of meso- and macroscale simulations, memory consumption on individual compute nodes becomes a critical bottleneck. This is especially relevant on modern supercomputers such as the Blue Gene/P architecture where the available working memory per CPU core is rather limited. We develop a simple linear model to analyze the memory consumption of the constituent components of neuronal simulators as a function of network size and the number of cores used. This approach has multiple benefits. The model enables identification of key contributing components to memory saturation and prediction of the effects of potential improvements to code before any implementation takes place. As a consequence, development cycles can be shorter and

  7. SIFT - Multiprocessor architecture for Software Implemented Fault Tolerance flight control and avionics computers

    NASA Technical Reports Server (NTRS)

    Forman, P.; Moses, K.

    1979-01-01

    A brief description of a SIFT (Software Implemented Fault Tolerance) Flight Control Computer with emphasis on implementation is presented. A multiprocessor system that relies on software-implemented fault detection and reconfiguration algorithms is described. A high level of reliability and fault tolerance is achieved by the replication of computing tasks among processing units.

  8. Implementation of multigrid methods for solving Navier-Stokes equations on a multiprocessor system

    NASA Technical Reports Server (NTRS)

    Naik, Vijay K.; Taasan, Shlomo

    1987-01-01

    Presented are schemes for implementing multigrid algorithms on message based MIMD multiprocessor systems. To address the various issues involved, a nontrivial problem of solving the 2-D incompressible Navier-Stokes equations is considered as the model problem. Three different multigrid algorithms are considered. Results from implementing these algorithms on an Intel iPSC are presented.

  9. Evaluation of the impact chip multiprocessors have on SNL application performance.

    SciTech Connect

    Doerfler, Douglas W.

    2009-10-01

    This report describes trans-organizational efforts to investigate the impact of chip multiprocessors (CMPs) on the performance of important Sandia application codes. The impact of CMPs on the performance and applicability of Sandia's system software was also investigated. The goal of the investigation was to make algorithmic and architectural recommendations for next generation platform acquisitions.

  10. A distributed multiprocessor system designed for real-time image processing

    NASA Astrophysics Data System (ADS)

    Yin, Zhiyi; Heng, Wei

    2008-11-01

    In real-time image processing, a large amount of data must be processed at very high speed. To address the problems faced in real-time image processing, a distributed multiprocessor system is proposed in this paper. In the design of the distributed multiprocessor system, processing tasks are allocated to various processes, which are bound to different CPUs. Several designs are discussed; making full use of every process is very important to the system's performance. Furthermore, the implementation problems center on inter-process communication, synchronization, and stability. System analysis and performance tests both show that the distributed multiprocessor system improves performance in several respects, including delay, throughput rate, stability, and scalability. The system can also be expanded easily in both software and hardware. In short, the distributed multiprocessor system designed for real-time image processing, based on distributed algorithms, not only improves performance in these respects but is also low in cost and easy to expand.

  11. Performance of the butterfly processor-memory interconnection in a vector environment

    NASA Astrophysics Data System (ADS)

    Brooks, E. D., III

    1985-02-01

    A fundamental hurdle impeding the development of large N common memory multiprocessors is the performance limitation in the switch connecting the processors to the memory modules. Multistage networks currently considered for this connection have a memory latency which grows like $\alpha \log_2 N$. For scientific computing, it is natural to look for a multiprocessor architecture that will enable the use of vector operations to mask memory latency. The problem to be overcome here is the chaotic behavior introduced by conflicts occurring in the switch. The performance of the butterfly or indirect binary n-cube network in a vector processing environment is examined. A simple modification of the standard 2x2 switch node used in such networks which adaptively removes chaotic behavior during a vector operation is described.

  12. Apparatus for multiprocessor-based control of a multiagent robot

    NASA Technical Reports Server (NTRS)

    Peters, II, Richard Alan (Inventor)

    2009-01-01

    An architecture for robot intelligence enables a robot to learn new behaviors and create new behavior sequences autonomously and interact with a dynamically changing environment. Sensory information is mapped onto a Sensory Ego-Sphere (SES) that rapidly identifies important changes in the environment and functions much like short term memory. Behaviors are stored in a DBAM that creates an active map from the robot's current state to a goal state and functions much like long term memory. A dream state converts recent activities stored in the SES and creates or modifies behaviors in the DBAM.

  13. Validation of fault-free behavior of a reliable multiprocessor system - FTMP: A case study. [Fault-Tolerant Multi-Processor avionics

    NASA Technical Reports Server (NTRS)

    Clune, E.; Segall, Z.; Siewiorek, D.

    1984-01-01

    A program of experiments has been conducted at NASA-Langley to test the fault-free performance of a Fault-Tolerant Multiprocessor (FTMP) avionics system for next-generation aircraft. Baseline measurements of an operating FTMP system were obtained with respect to the following parameters: instruction execution time, frame size, and the variation of clock ticks. The mechanisms of frame stretching were also investigated. The experimental results are summarized in a table. Areas of interest for future tests are identified, with emphasis given to the implementation of a synthetic workload generation mechanism on FTMP.

  14. Safe and Efficient Support for Embedded Multi-Processors in ADA

    NASA Astrophysics Data System (ADS)

    Ruiz, Jose F.

    2010-08-01

    New software demands increasing processing power, and multi-processor platforms are spreading as the answer to achieve the required performance. Embedded real-time systems are also subject to this trend, but in the case of real-time mission-critical systems, the properties of reliability, predictability and analyzability are also paramount. The Ada 2005 language defined a subset of its tasking model, the Ravenscar profile, that provides the basis for the implementation of deterministic and time analyzable applications on top of a streamlined run-time system. This Ravenscar tasking profile, originally designed for single processors, has proven remarkably useful for modelling verifiable real-time single-processor systems. This paper proposes a simple extension to the Ravenscar profile to support multi-processor systems using a fully partitioned approach. The implementation of this scheme is simple, and it can be used to develop applications amenable to schedulability analysis.

  15. Method for wiring allocation and switch configuration in a multiprocessor environment

    DOEpatents

    Aridor, Yariv; Domany, Tamar; Frachtenberg, Eitan; Gal, Yoav; Shmueli, Edi; Stockmeyer, legal representative, Robert E.; Stockmeyer, Larry Joseph

    2008-07-15

    A method for wiring allocation and switch configuration in a multiprocessor computer, the method including employing depth-first tree traversal to determine a plurality of paths among a plurality of processing elements allocated to a job along a plurality of switches and wires in a plurality of D-lines, and selecting one of the paths in accordance with at least one selection criterion.
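    The abstract names depth-first traversal followed by selection under a criterion, without giving details. The following is a minimal sketch of that general idea only, not the patented method: the adjacency-dictionary topology, the node names, and the "fewest hops" criterion are all assumptions made for illustration.

```python
def dfs_paths(graph, start, goal, path=None):
    """Yield every simple path from start to goal by depth-first traversal.

    graph is a hypothetical adjacency dict: node -> list of neighbours
    (processing elements and switches joined by wires).
    """
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for nxt in graph.get(start, ()):
        if nxt not in path:  # never revisit a node on the current path
            yield from dfs_paths(graph, nxt, goal, path)


def select_path(paths, criterion=len):
    """Pick the path that minimises the criterion (assumed: fewest nodes)."""
    return min(paths, key=criterion, default=None)
```

    Passing a different `criterion` function (for example, one that counts already-allocated wires) changes the selection without touching the traversal.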

  16. Expert Systems on Multiprocessor Architectures. Volume 2. Technical Reports

    DTIC Science & Technology

    1991-06-01

    interpretations of continuous streams of errorful data, a class of applications which currently run too slowly on serial blackboard systems to be of practical...for distributed memory multi-processor machines [Rice 88] [Nii 88]. 2-16 other KS executions. KSs contain condition-action rules that can read the black ...KSs which interprets the data up the blackboard's levels of abstraction. KSs in Cage can be executed in parallel with or without synchronization at

  17. Expert Systems on Multiprocessor Architectures. Volume 3. Technical Reports

    DTIC Science & Technology

    1991-06-01

    track consisting of scan times 100, 110, 120, ..., 150. Suppose that the rate of data arrival is high, causing message order to be scrambled, and that...[Figure 1: garbled in extraction; recoverable labels: shared memory and distributed memory multi-processor machines]...where there are one or more streams of continuous input data, the problem appears as scrambled data arrival - the data may be out of temporal sequence

  18. Dynamic power management for UML modeled applications on multiprocessor SoC

    NASA Astrophysics Data System (ADS)

    Kukkala, Petri; Arpinen, Tero; Setälä, Mikko; Hännikäinen, Marko; Hämäläinen, Timo D.

    2007-02-01

    The paper presents a novel scheme of dynamic power management for UML modeled applications that are executed on a multiprocessor System-on-Chip (SoC) in a distributed manner. The UML models for both application and architecture are designed according to a well-defined UML profile for embedded system design, called TUT-Profile. Application processes are considered as elementary units of distributed execution, and their mapping on a multiprocessor SoC can be dynamically changed at run-time. Our approach to dynamic power management balances utilized processor resources against current workload at run-time by (1) observing the processor and workload statistics, (2) re-evaluating the amount of required resources (i.e. the number of active processors), and (3) re-mapping the application processes to the minimum set of active processors. The inactive processors are set to a power-save state by using clock-gating. The approach integrates well-known power management techniques tightly with the UML based design of embedded systems in a novel way. We evaluated the dynamic power management with a WLAN terminal implemented on a multiprocessor SoC on Altera Stratix II FPGA containing up to five Nios II processors and dedicated hardware accelerators. Measurements showed up to 21% savings in the power consumption of the whole FPGA board.
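    The three-step loop in the abstract (observe, re-evaluate, re-map) can be sketched as follows. This is an illustrative simplification and not the authors' implementation: the load units, the `capacity_per_cpu` parameter, the greedy placement rule, and all function names are assumptions. In particular, sizing by total load alone does not guarantee that every processor stays under capacity.

```python
from math import ceil


def required_processors(total_load, capacity_per_cpu, max_cpus):
    """Step 2: smallest processor count whose combined capacity covers the load."""
    needed = ceil(total_load / capacity_per_cpu)
    return max(1, min(max_cpus, needed))


def remap(process_loads, n_active):
    """Step 3: place processes, heaviest first, on the least-loaded active CPU."""
    cpu_load = [0.0] * n_active
    mapping = {}
    for name, load in sorted(process_loads.items(), key=lambda kv: -kv[1]):
        target = min(range(n_active), key=cpu_load.__getitem__)
        mapping[name] = target
        cpu_load[target] += load
    return mapping


def power_manage(process_loads, capacity_per_cpu=1.0, max_cpus=5):
    """Return (active CPU count, process -> CPU mapping, CPUs to clock-gate)."""
    total = sum(process_loads.values())  # step 1: observed workload statistics
    n_active = required_processors(total, capacity_per_cpu, max_cpus)
    mapping = remap(process_loads, n_active)
    gated = list(range(n_active, max_cpus))  # inactive CPUs enter power-save
    return n_active, mapping, gated
```

    The default of five processors mirrors the "up to five Nios II processors" in the abstract; the process names and load values a caller supplies are hypothetical.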

  19. RTMPL: A structured programming and documentation utility for real-time multiprocessor simulations

    NASA Technical Reports Server (NTRS)

    Arpasi, D. J.

    1984-01-01

    The NASA Lewis Research Center is developing and evaluating experimental hardware and software systems to help meet future needs for real time simulations of air-breathing propulsion systems. The Real Time Multiprocessor Simulator (RTMPS) project is aimed at developing a prototype simulator system that uses multiple microprocessors to achieve the desired computing speed and accuracy at relatively low cost. Software utilities are being developed to provide engineering-level programming and interactive operation of the simulator. Two major software development efforts were undertaken in the RTMPS project. A real time multiprocessor operating system was developed to provide for interactive operation of the simulator. The second effort was aimed at developing a structured, high-level, engineering-oriented programming language and translator that would facilitate the programming of the simulator. The Real Time Multiprocessor Programming Language (RTMPL) allows the user to describe simulation tasks for each processor in a straight-forward, structured manner. The RTMPL utility acts as an assembly language programmer, translating the high-level simulation description into time-efficient assembly language code for the processors. The utility sets up all of the interfaces between the simulator hardware, firmware, and operating system.

  20. Sharing code.

    PubMed

    Kubilius, Jonas

    2014-01-01

    Sharing code is becoming increasingly important in the wake of Open Science. In this review I describe and compare two popular code-sharing utilities, GitHub and Open Science Framework (OSF). GitHub is a mature, industry-standard tool but lacks focus towards researchers. In comparison, OSF offers a one-stop solution for researchers but a lot of functionality is still under development. I conclude by listing alternative lesser-known tools for code and materials sharing.

  1. Myrmics Memory Allocator

    SciTech Connect

    Lymperis, S.

    2011-09-23

    MMA is a stand-alone memory management system for MPI clusters. It implements a shared Partitioned Global Address Space, where multiple MPI processes request objects from the allocator and the latter provides them with system-wide unique memory addresses for each object. It provides applications with an intuitive way of managing the memory system in a unified way, thus enabling easier writing of irregular application code.

  2. Utilizing Predictors for Efficient Thermal Management in Multiprocessor SoCs

    DTIC Science & Technology

    2009-10-01

    CMT SPARC processor,” in Proc. ISSCC, 2008, pp. 82–83. [34] A. Wald and J. Wolfowitz, “Optimum character of the sequential probability ratio test...values in...not introduce inaccuracy. $y_t + \sum_{i=1}^{p} a_i y_{t-i} = e_t + \sum_{i=1}^{q} c_i e_{t-i}$ (1) An ARMA(p, q) model is described by (1). In the equation, $y_t$ is the value
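    As a hedged illustration of equation (1) in the snippet above, the sketch below computes a one-step-ahead ARMA(p, q) prediction by solving (1) for the current value and setting the unknown current error term to zero. The function name, argument layout, and the zero-mean error assumption are mine; the snippet does not show how the paper fits or applies the model.

```python
def arma_predict(y_hist, e_hist, a, c):
    """One-step-ahead ARMA(p, q) prediction using the sign convention of (1).

    y_hist[-1] is y_{t-1}, y_hist[-2] is y_{t-2}, and so on (at least p values).
    e_hist[-1] is e_{t-1}, and so on (at least q past errors).
    a and c hold the coefficients a_1..a_p and c_1..c_q.
    Assumes the expected value of the current error e_t is zero.
    """
    ar = sum(a_i * y_hist[-i] for i, a_i in enumerate(a, start=1))
    ma = sum(c_i * e_hist[-i] for i, c_i in enumerate(c, start=1))
    # Rearranging (1): y_t = -sum(a_i * y_{t-i}) + e_t + sum(c_i * e_{t-i})
    return -ar + ma
```

    After each new observation, the prediction error (observed value minus prediction) would be appended to `e_hist` before the next call.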

  3. Programmable controller with a multiprocessor-based high speed interactive language system

    SciTech Connect

    Matsuzaki, K.; Hata, S.; Ohkochi, O.; Okamura, M.; Sugimoto, N.

    1983-01-01

    A multiprocessor-based programmable controller (PC) capable of sequence control and data processing has been developed. This PC consists of a custom processor for a relay ladder program and a 68000 16-bit microprocessor for a BASIC program. The BASIC program is executed by an interpreter which is an order of magnitude faster than a conventional interpreter of a personal computer. The relay ladder program and the BASIC program can activate and communicate with each other. Although the controller features more control functions than conventional PCs, it can be easily operated interactively on site. 2 references.

  4. A high speed multi-tasking, multi-processor telemetry system

    SciTech Connect

    Wu, Kung Chris

    1996-12-31

    This paper describes a small, lightweight, multitasking, multiprocessor telemetry system capable of collecting 32 channels of differential signals at a sampling rate of 6.25 kHz per channel. The system is designed to collect data from remote wind turbine research sites and transfer the data via wireless communication. A description of operational theory, hardware components, and itemized cost is provided. Synchronization with other data acquisition systems and test data on data transmission rates are also given. 11 refs., 7 figs., 4 tabs.

  5. ScalaBLAST 2.0: Rapid and robust BLAST calculations on multiprocessor systems

    SciTech Connect

    Oehmen, Christopher S.; Baxter, Douglas J.

    2013-03-15

    BLAST remains one of the most widely used tools in computational biology. The rate at which new sequence data is available continues to grow exponentially, driving the emergence of new fields of biological research. At the same time multicore systems and conventional clusters are more accessible. ScalaBLAST has been designed to run on conventional multiprocessor systems with an eye to extreme parallelism, enabling parallel BLAST calculations using over 16,000 processing cores with a portable, robust, fault-resilient design. ScalaBLAST 2.0 source code can be freely downloaded from http://omics.pnl.gov/software/ScalaBLAST.php.

  6. Multi-objective two-stage multiprocessor flow shop scheduling - a subgroup particle swarm optimisation approach

    NASA Astrophysics Data System (ADS)

    Huang, Rong-Hwa; Yang, Chang-Lin; Hsu, Chun-Ting

    2015-12-01

    Compared with other economically important production systems, the flow shop production system is popular in real manufacturing environments. This study focuses on the flow shop with multiprocessor scheduling problem (FSMP), and develops an improved particle swarm optimisation heuristic to solve it. Additionally, this study designs an integer programming model to perform effectiveness and robustness testing on the proposed heuristic. Experimental results demonstrate a 10% to 50% improvement in the effectiveness of the proposed heuristic in small-scale problem tests, and a 10% to 40% improvement in the robustness of the heuristic in large-scale problem tests, indicating extremely satisfactory performance.

  7. A simple executive for a fault-tolerant, real-time multiprocessor.

    NASA Technical Reports Server (NTRS)

    Filene, R. J.; Green, A. I.

    1971-01-01

    Description of a simple executive for operation with a fault-tolerant multiprocessor that is oriented toward application in an environment where the primary function is to provide real-time control. The primary executive function is to accept requests for jobs placed by other jobs or from peripheral equipment and then schedule their initiation in accordance with the request parameters. The executive is also brought into action when a processor fails, so that appropriate disposition may be made of the job that was running on the failed processor. Many architectural features intended to support this executive concept are included.

  8. Energy-efficient fault tolerance in multiprocessor real-time systems

    NASA Astrophysics Data System (ADS)

    Guo, Yifeng

    Recent progress in multiprocessor/multicore systems has important implications for real-time system design and operation. From vehicle navigation to space applications as well as industrial control systems, the trend is to deploy multiple processors in real-time systems: systems with 4 to 8 processors are common, and it is expected that many-core systems with dozens of processing cores will be available in the near future. For such systems, in addition to the general temporal requirements common to all real-time systems, two additional operational objectives are seen as critical: energy efficiency and fault tolerance. An intriguing dimension of the problem is that energy efficiency and fault tolerance are typically conflicting objectives, because tolerating faults (e.g., permanent/transient) often requires extra resources with high energy consumption potential. In this dissertation, various techniques for energy-efficient fault tolerance in multiprocessor real-time systems have been investigated. First, the Reliability-Aware Power Management (RAPM) framework, which can preserve the system reliability with respect to transient faults when Dynamic Voltage Scaling (DVS) is applied for energy savings, is extended to support parallel real-time applications with precedence constraints. Next, the traditional Standby-Sparing (SS) technique for dual processor systems, which takes both transient and permanent faults into consideration while saving energy, is generalized to support multiprocessor systems with an arbitrary number of identical processors. Observing the inefficient usage of slack time in the SS technique, a Preference-Oriented Scheduling Framework is designed to address the problem where tasks are given preferences for being executed as soon as possible (ASAP) or as late as possible (ALAP). A preference-oriented earliest deadline (POED) scheduler is proposed and its application in multiprocessor systems for energy-efficient fault tolerance is

  9. Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 4: FTMP executive summary

    NASA Technical Reports Server (NTRS)

    Smith, T. B., III; Lala, J. H.

    1984-01-01

    The FTMP architecture is a high reliability computer concept modeled after a homogeneous multiprocessor architecture. Elements of the FTMP are operated in tight synchronism with one another and hardware fault-detection and fault-masking is provided which is transparent to the software. Operating system design and user software design is thus greatly simplified. Performance of the FTMP is also comparable to that of a simplex equivalent due to the efficiency of fault handling hardware. The FTMP project constructed an engineering module of the FTMP, programmed the machine and extensively tested the architecture through fault injection and other stress testing. This testing confirmed the soundness of the FTMP concepts.

  10. Animal models of source memory.

    PubMed

    Crystal, Jonathon D

    2016-01-01

    Source memory is the aspect of episodic memory that encodes the origin (i.e., source) of information acquired in the past. Episodic memory (i.e., our memories for unique personal past events) typically involves source memory because those memories focus on the origin of previous events. Source memory is at work when, for example, someone tells a favorite joke to a person while avoiding retelling the joke to the friend who originally shared the joke. Importantly, source memory permits differentiation of one episodic memory from another because source memory includes features that were present when the different memories were formed. This article reviews recent efforts to develop an animal model of source memory using rats. Experiments are reviewed which suggest that source memory is dissociated from other forms of memory. The review highlights strengths and weaknesses of a number of animal models of episodic memory. Animal models of source memory may be used to probe the biological bases of memory. Moreover, these models can be combined with genetic models of Alzheimer's disease to evaluate pharmacotherapies that ultimately have the potential to improve memory.

  11. Sharing code

    PubMed Central

    Kubilius, Jonas

    2014-01-01

    Sharing code is becoming increasingly important in the wake of Open Science. In this review I describe and compare two popular code-sharing utilities, GitHub and Open Science Framework (OSF). GitHub is a mature, industry-standard tool but lacks focus towards researchers. In comparison, OSF offers a one-stop solution for researchers but a lot of functionality is still under development. I conclude by listing alternative lesser-known tools for code and materials sharing. PMID:25165519

  12. Mapping virtual addresses to different physical addresses for value disambiguation for thread memory access requests

    DOEpatents

    Gala, Alan; Ohmacht, Martin

    2014-09-02

    A multiprocessor system includes nodes. Each node includes a data path that includes a core, a TLB, and a first level cache implementing disambiguation. The system also includes at least one second level cache and a main memory. For thread memory access requests, the core uses an address associated with an instruction format of the core. The first level cache uses an address format related to the size of the main memory plus an offset corresponding to hardware thread meta data. The second level cache uses a physical main memory address plus software thread meta data to store the memory access request. The second level cache accesses the main memory using the physical address with neither the offset nor the thread meta data after resolving speculation. In short, this system includes mapping of a virtual address to different physical addresses for value disambiguation for different threads.

  13. Closed-form solutions of performability. [modeling of a degradable buffer/multiprocessor system

    NASA Technical Reports Server (NTRS)

    Meyer, J. F.

    1981-01-01

    Methods which yield closed form performability solutions for continuous valued variables are developed. The models are similar to those employed in performance modeling (i.e., Markovian queueing models) but are extended so as to account for variations in structure due to faults. In particular, the modeling of a degradable buffer/multiprocessor system is considered whose performance Y is the (normalized) average throughput rate realized during a bounded interval of time. To avoid known difficulties associated with exact transient solutions, an approximate decomposition of the model is employed permitting certain submodels to be solved in equilibrium. These solutions are then incorporated in a model with fewer transient states and by solving the latter, a closed form solution of the system's performability is obtained. In conclusion, some applications of this solution are discussed and illustrated, including an example of design optimization.

  14. Commodity multi-processor systems in the ATLAS level-2 trigger

    SciTech Connect

    Abolins, M.; Blair, R.; Bock, R.; Bogaerts, A.; Dawson, J.; Ermoline, Y.; Hauser, R.; Kugel, A.; Lay, R.; Muller, M.; Noffz, K.-H.; Pope, B.; Schlereth, J.; Werner, P.

    2000-05-23

    Low cost SMP (Symmetric Multi-Processor) systems provide substantial CPU and I/O capacity. These features together with the ease of system integration make them an attractive and cost effective solution for a number of real-time applications in event selection. In ATLAS the authors consider them as intelligent input buffers (active ROB complex), as event flow supervisors or as powerful processing nodes. Measurements of the performance of one off-the-shelf commercial 4-processor PC with two PCI buses, equipped with commercial FPGA based data source cards (microEnable) and running commercial software are presented and mapped on such applications together with a long-term program of work. The SMP systems may be considered as an important building block in future data acquisition systems.

  15. Dynamic modelling and estimation of the error due to asynchronism in a redundant asynchronous multiprocessor system

    NASA Technical Reports Server (NTRS)

    Huynh, Loc C.; Duval, R. W.

    1986-01-01

    The use of Redundant Asynchronous Multiprocessor Systems to achieve ultrareliable Fault Tolerant Control Systems shows great promise. Their development has been hampered by the inability to determine whether differences in the outputs of redundant CPUs are due to failures or to accrued error built up by slight differences in CPU clock intervals. This study derives an analytical dynamic model of the difference between redundant CPUs due to differences in their clock intervals and uses this model with on-line parameter identification to identify the differences in the clock intervals. Because this methodology accurately tracks errors due to asynchronism, it can generate an error signal with the effect of asynchronism removed, and this signal may be used to detect and isolate actual system failures.

  16. Programmable Optoelectronic Multiprocessors And Their Comparison With Symbolic Substitution For Digital Optical Computing

    NASA Astrophysics Data System (ADS)

    Kiamilev, F.; Esener, Sadik C.; Paturi, R.; Fainmar, Y.; Mercier, P.; Guest, C. C.; Lee, Sing H.

    1989-04-01

    This paper introduces programmable arrays of optically inter-connected electronic processors and compares them with conventional symbolic substitution (SS) systems. The comparison is made on the basis of computational efficiency, speed, size, energy utilization, programmability, and fault tolerance. The small grain size and space-invariant connections of SS lead to poor computational efficiency, difficult programming, and difficult incorporation of fault tolerance. Reliance on optical gates as its fundamental building elements is shown to give poor energy utilization. Programmable optoelectronic multiprocessor (POEM) systems, on the other hand, provide the architectural flexibility for good computational efficiency, use an energy-efficient combination of technologies, and support traditional programming methodologies and fault tolerance. Although the inherent clock speed of POEM systems is slower than that of SS systems, for most problems they will provide greater computational throughput. This comparison does not take into account the recent addition of crossover interconnect and space-variant masks to the SS architecture.

  17. Simulating a small turboshaft engine in real-time multiprocessor simulator (RTMPS) environment

    NASA Technical Reports Server (NTRS)

    Milner, E. J.; Arpasi, D. J.

    1986-01-01

    A Real-Time Multiprocessor Simulator (RTMPS) has been developed at NASA Lewis Research Center. The RTMPS uses parallel microprocessors to achieve computing speeds needed for real-time engine simulation. This report describes the use of the RTMPS system to simulate a small turboshaft engine. The process of programming the engine equations and distributing them over one, two, and four processors is discussed. Steady-state and transient results from the RTMPS simulation are compared with results from a main-frame-based simulation. Processor execution times and the associated execution time savings for the two and four processor cases are presented using actual data obtained from the RTMPS system. Included is a discussion of why the minimum achievable calculation time for the turboshaft engine model was attained using four processors. Finally, future enhancements to the RTMPS system are discussed including the development of a generalized partitioning algorithm to automatically distribute the system equations among the processors in optimum fashion.

  18. Analysis of Photonic Networks for a Chip Multiprocessor Using Scientific Applications

    SciTech Connect

    Kamil, Shoaib A; Hendry, Gilbert; Biberman, Aleksandr; Chan, Johnnie; Lee, Benjamin G.; Mohiyuddin, Marghoob; Jain, Ankit; Bergman, Keren; Carloni, Luca; Kubiatowicz, John; Oliker, Leonid; Shalf, John

    2009-01-31

    As multiprocessors scale to unprecedented numbers of cores in order to sustain performance growth, it is vital that these gains are not nullified by high energy consumption from inter-core communication. With recent advances in 3D Integration CMOS technology, the possibility for realizing hybrid photonic-electronic networks-on-chip warrants investigating real application traces on functionally comparable photonic and electronic network designs. We present a comparative analysis using both synthetic benchmarks as well as real applications, run through detailed cycle accurate models implemented under the OMNeT++ discrete event simulation environment. Results show that when utilizing standard process-to-processor mapping methods, this hybrid network can achieve 75X improvement in energy efficiency for synthetic benchmarks and up to 37X improvement for real scientific applications, defined as network performance per energy spent, over an electronic mesh for large messages across a variety of communication patterns.

  19. Partitioning strategy for efficient nonlinear finite element dynamic analysis on multiprocessor computers

    NASA Technical Reports Server (NTRS)

    Noor, Ahmed K.; Peters, Jeanne M.

    1989-01-01

    A computational procedure is presented for the nonlinear dynamic analysis of unsymmetric structures on vector multiprocessor systems. The procedure is based on a novel hierarchical partitioning strategy in which the response of the unsymmetric structure is approximated by a combination of symmetric and antisymmetric response vectors (modes), each obtained by using only a fraction of the degrees of freedom of the original finite element model. The three key elements of the procedure which result in a high degree of concurrency throughout the solution process are: (1) mixed (or primitive variable) formulation with independent shape functions for the different fields; (2) operator splitting or restructuring of the discrete equations at each time step to delineate the symmetric and antisymmetric vectors constituting the response; and (3) two level iterative process for generating the response of the structure. An assessment is made of the effectiveness of the procedure on the CRAY X-MP/4 computers.

  20. Dynamic Scheduling Real-Time Task Using Primary-Backup Overloading Strategy for Multiprocessor Systems

    NASA Astrophysics Data System (ADS)

    Sun, Wei; Yu, Chen; Défago, Xavier; Inoguchi, Yasushi

    The scheduling of real-time tasks with fault-tolerant requirements has been an important problem in multiprocessor systems. The primary-backup (PB) approach is often used as a fault-tolerant technique to guarantee the deadlines of tasks despite the presence of faults. In this paper we propose a dynamic PB-based task scheduling approach, wherein an allocation parameter is used to search the available time slots for a newly arriving task, and the previously scheduled tasks can be re-scheduled when there is no available time slot for the newly arriving task. In order to improve the schedulability we also propose an overloading strategy for PB-overloading and Backup-backup (BB) overloading. Our proposed task scheduling algorithm is compared with some existing scheduling algorithms in the literature through simulation studies. The results show that the task rejection ratio of our real-time task scheduling algorithm is almost 50% lower than that of the compared algorithms.

  1. Characterizing parallel file-access patterns on a large-scale multiprocessor

    NASA Technical Reports Server (NTRS)

    Purakayastha, A.; Ellis, Carla; Kotz, David; Nieuwejaar, Nils; Best, Michael L.

    1995-01-01

    High-performance parallel file systems are needed to satisfy tremendous I/O requirements of parallel scientific applications. The design of such high-performance parallel file systems depends on a comprehensive understanding of the expected workload, but so far there have been very few usage studies of multiprocessor file systems. This paper is part of the CHARISMA project, which intends to fill this void by measuring real file-system workloads on various production parallel machines. In particular, we present results from the CM-5 at the National Center for Supercomputing Applications. Our results are unique because we collect information about nearly every individual I/O request from the mix of jobs running on the machine. Analysis of the traces leads to various recommendations for parallel file-system design.

  2. Debugging Fortran on a shared memory machine

    SciTech Connect

    Allen, T.R.; Padua, D.A.

    1987-01-01

    Debugging on a parallel processor is more difficult than debugging on a serial machine because errors in a parallel program may introduce nondeterminism. The approach to parallel debugging presented here attempts to reduce the problem of debugging on a parallel machine to that of debugging on a serial machine by automatically detecting nondeterminism. 20 refs., 6 figs.

  3. Mechanical memory

    DOEpatents

    Gilkey, Jeffrey C.; Duesterhaus, Michelle A.; Peter, Frank J.; Renn, Rosemarie A.; Baker, Michael S.

    2006-08-15

    A first-in-first-out (FIFO) microelectromechanical memory apparatus (also termed a mechanical memory) is disclosed. The mechanical memory utilizes a plurality of memory cells, with each memory cell having a beam which can be bowed in either of two directions of curvature to indicate two different logic states for that memory cell. The memory cells can be arranged around a wheel which operates as a clocking actuator to serially shift data from one memory cell to the next. The mechanical memory can be formed using conventional surface micromachining, and can be formed as either a nonvolatile memory or as a volatile memory.

  4. Mechanical memory

    DOEpatents

    Gilkey, Jeffrey C.; Duesterhaus, Michelle A.; Peter, Frank J.; Renn, Rosemarie A.; Baker, Michael S.

    2006-05-16

    A first-in-first-out (FIFO) microelectromechanical memory apparatus (also termed a mechanical memory) is disclosed. The mechanical memory utilizes a plurality of memory cells, with each memory cell having a beam which can be bowed in either of two directions of curvature to indicate two different logic states for that memory cell. The memory cells can be arranged around a wheel which operates as a clocking actuator to serially shift data from one memory cell to the next. The mechanical memory can be formed using conventional surface micromachining, and can be formed as either a nonvolatile memory or as a volatile memory.

  5. Asynchronous and corrected-asynchronous numerical solutions of parabolic PDEs on MIMD multiprocessors

    NASA Technical Reports Server (NTRS)

    Amitai, Dganit; Averbuch, Amir; Itzikowitz, Samuel; Turkel, Eli

    1991-01-01

    A major problem in achieving significant speed-up on parallel machines is the overhead involved with synchronizing the concurrent processes. Removing the synchronization constraint has the potential of speeding up the computation. The authors present asynchronous (AS) and corrected-asynchronous (CA) finite difference schemes for the multi-dimensional heat equation. Although the discussion concentrates on the Euler scheme for the solution of the heat equation, it has the potential for being extended to other schemes and other parabolic partial differential equations (PDEs). These schemes are analyzed and implemented on the shared memory multi-user Sequent Balance machine. Numerical results for one- and two-dimensional problems are presented. It is shown experimentally that the synchronization penalty can be about 50 percent of run time: in most cases, the asynchronous scheme runs twice as fast as the parallel synchronous scheme. In general, the efficiency of the parallel schemes increases with processor load, with the time level, and with the problem dimension. The efficiency of the AS scheme may reach 90 percent and over, but it provides accurate results only for steady-state values. The CA scheme, on the other hand, is less efficient, but provides more accurate results for intermediate (non-steady-state) values.
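
    For orientation, the synchronous baseline that the abstract compares against can be sketched as an explicit Euler finite-difference update for the 1-D heat equation u_t = alpha * u_xx. This is only a minimal serial sketch under stated assumptions: the function names, the fixed boundary values, and the parameter r = alpha * dt / dx**2 are illustrative choices, not taken from the paper, and the AS and CA schemes themselves are not reproduced here.

```python
def euler_heat_step(u, r):
    """One explicit Euler step for u_t = alpha * u_xx on a 1-D grid.

    u : list of grid values; the two end values are held fixed.
    r : alpha * dt / dx**2 (hypothetical parameter name); the explicit
        scheme is stable only for r <= 0.5.
    """
    new = list(u)
    for i in range(1, len(u) - 1):
        # Second-difference approximation of u_xx at grid point i.
        new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def solve_heat(u0, r, steps):
    """Advance the grid `steps` time levels, one full sweep per level."""
    u = list(u0)
    for _ in range(steps):
        u = euler_heat_step(u, r)
    return u
```

    In a parallel synchronous version, each processor would own a block of grid points and wait at a barrier after every sweep; that barrier is the synchronization cost the abstract refers to.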

  6. Event parallelism: Distributed memory parallel computing for high energy physics experiments

    SciTech Connect

    Nash, T.

    1989-05-01

    This paper describes the present and expected future development of distributed memory parallel computers for high energy physics experiments. It covers the use of event parallel microprocessor farms, particularly at Fermilab, including both ACP multiprocessors and farms of MicroVAXES. These systems have proven very cost effective in the past. A case is made for moving to the more open environment of UNIX and RISC processors. The 2nd Generation ACP Multiprocessor System, which is based on powerful RISC systems, is described. Given the promise of still more extraordinary increases in processor performance, a new emphasis on point to point, rather than bussed, communication will be required. Developments in this direction are described. 6 figs.

  7. System and method for memory allocation in a multiclass memory system

    DOEpatents

    Loh, Gabriel; Meswani, Mitesh; Ignatowski, Michael; Nutter, Mark

    2016-06-28

    A system for memory allocation in a multiclass memory system includes a processor coupleable to a plurality of memories sharing a unified memory address space, and a library store to store a library of software functions. The processor identifies a type of a data structure in response to a memory allocation function call to the library for allocating memory to the data structure. Using the library, the processor allocates portions of the data structure among multiple memories of the multiclass memory system based on the type of the data structure.

  8. Abnormal fault-recovery characteristics of the fault-tolerant multiprocessor uncovered using a new fault-injection methodology

    NASA Technical Reports Server (NTRS)

    Padilla, Peter A.

    1991-01-01

    An investigation was made in AIRLAB of the fault handling performance of the Fault Tolerant MultiProcessor (FTMP). Fault handling errors detected during fault injection experiments were characterized. In these fault injection experiments, the FTMP disabled a working unit instead of the faulted unit once in every 500 faults, on the average. System design weaknesses allow active faults to exercise a part of the fault management software that handles Byzantine or lying faults. Byzantine faults behave such that the faulted unit points to a working unit as the source of errors. The design's problems involve: (1) the design and interface between the simplex error detection hardware and the error processing software, (2) the functional capabilities of the FTMP system bus, and (3) the communication requirements of a multiprocessor architecture. These weak areas in the FTMP's design increase the probability that, for any hardware fault, a good line replacement unit (LRU) is mistakenly disabled by the fault management software.

  9. Sharing values, sharing a vision

    SciTech Connect

    Not Available

    1993-12-31

    Teamwork, partnership and shared values emerged as recurring themes at the Third Technology Transfer/Communications Conference. The program drew about 100 participants who sat through a packed two days to find ways for their laboratories and facilities to better help American business and the economy. Co-hosts were the Lawrence Livermore National Laboratory and the Lawrence Berkeley Laboratory, where most meetings took place. The conference followed traditions established at the First Technology Transfer/Communications Conference, conceived of and hosted by the Pacific Northwest Laboratory in May 1992 in Richland, Washington, and the second conference, hosted by the National Renewable Energy Laboratory in January 1993 in Golden, Colorado. As at the other conferences, participants at the third session represented the fields of technology transfer, public affairs and communications. They came from Department of Energy headquarters and DOE offices, laboratories and production facilities. Contained in this report are the keynote address, panel discussion, workshops, and presentations in technology transfer.

  10. 3-dimensional magnetotelluric inversion including topography using deformed hexahedral edge finite elements and direct solvers parallelized on symmetric multiprocessor computers - Part II: direct data-space inverse solution

    NASA Astrophysics Data System (ADS)

    Kordy, M.; Wannamaker, P.; Maris, V.; Cherkaev, E.; Hill, G.

    2016-01-01

    Following the creation described in Part I of a deformable edge finite-element simulator for 3-D magnetotelluric (MT) responses using direct solvers, in Part II we develop an algorithm named HexMT for 3-D regularized inversion of MT data including topography. Direct solvers parallelized on large-RAM, symmetric multiprocessor (SMP) workstations are used also for the Gauss-Newton model update. By exploiting the data-space approach, the computational cost of the model update becomes much less in both time and computer memory than the cost of the forward simulation. In order to regularize using the second norm of the gradient, we factor the matrix related to the regularization term and apply its inverse to the Jacobian, which is done using the MKL PARDISO library. For dense matrix multiplication and factorization related to the model update, we use the PLASMA library which shows very good scalability across processor cores. A synthetic test inversion using a simple hill model shows that including topography can be important; in this case depression of the electric field by the hill can cause false conductors at depth or mask the presence of resistive structure. With a simple model of two buried bricks, a uniform spatial weighting for the norm of model smoothing recovered more accurate locations for the tomographic images compared to weightings which were a function of parameter Jacobians. We implement joint inversion for static distortion matrices tested using the Dublin secret model 2, for which we are able to reduce nRMS to ~1.1 while avoiding oscillatory convergence. Finally we test the code on field data by inverting full impedance and tipper MT responses collected around Mount St Helens in the Cascade volcanic chain. Among several prominent structures, the north-south trending, eruption-controlling shear zone is clearly imaged in the inversion.

  11. Reusing existing resources for testing a multi-processor system-on-chip

    NASA Astrophysics Data System (ADS)

    Lee, Seung Eun

    2013-03-01

    In this article, we propose a test strategy for a multi-processor system-on-chip and model the test time for distributed Intellectual Property (IP) cores. The proposed test methodology uses the existing on-chip resources, IP cores and network elements in network-on-chip. The use of embedded IP cores as a built-in self-test (BIST) module completes the test much faster than an external test and provides flexibility in the test program. Moreover, the reuse of the existing network resources as a test medium eliminates additional test access mechanism (TAM) wires in the design and increases test parallelism, reducing the area and test time. Based on the proposed test methodology, we evaluate the test time for distributed IP cores. First, we define the model for a distributed IP core with four parameters in the context of test purposes. Next, the required test time is derived. Finally, we show the characteristics of IP cores for parallel testing, which provide useful information for test scheduling.

  12. Reconfigurable fault-tolerant multiprocessor system for real-time control

    SciTech Connect

    Kao, M.L.

    1986-01-01

    Real-time control applications place stringent constraints on the computers controlling them, since the failure of a computer could result in costly damage and even loss of human lives. Fault-tolerant computers, therefore, have always been in high demand in critical avionic and aerospace applications. However, the use of redundancy techniques to achieve fault tolerance in industrial applications has only recently become feasible due to the rapid decrease in cost and increase in performance of microprocessors. As more and more robots are being built to replace human beings in dangerous and difficult tasks, the need for a reliable computer for robotics control increases. This need, in particular, motivated the research described in this dissertation: the design and implementation of a reconfigurable fault-tolerant multiprocessor system (the FREMP system). The FREMP system consists of four processing units (PUs) and three common parallel buses. Each PU is a combination of an Intel 86/30 single board computer and a custom fault detection/masking circuit board (FDM board). A combined hardware/software scheme was devised to detect faults and correct errors. This scheme has been shown to be more efficient than software voting while maintaining the flexibility of software approaches. Time-frame scheduling was adopted to schedule tasks for execution.

  13. Fault-free behavior of reliable multiprocessor systems: FTMP experiments in AIRLAB

    NASA Technical Reports Server (NTRS)

    Clune, E.; Segall, Z.; Siewiorek, D.

    1985-01-01

    This report describes a set of experiments which were implemented on the Fault tolerant Multi-Processor (FTMP) at NASA/Langley's AIRLAB facility. These experiments are part of an effort to formulate and evaluate validation methodologies for fault-tolerant computers. This report deals with the measurement of single parameters (baselines) of a fault free system. The initial set of baseline experiments led to the following conclusions: (1) the system clock is constant and independent of workload in the tested cases; (2) the instruction execution times are constant; (3) the R4 frame size is 40 ms with some variation; (4) the frame stretching mechanism has some flaws in its implementation that allow the possibility of an infinite stretching of frame duration. Future experiments are planned. Some will broaden the results of these initial experiments. Others will measure the system more dynamically. The implementation of a synthetic workload generation mechanism for FTMP is planned to enhance the experimental environment of the system.

  14. Optimal Scheme for Search State Space and Scheduling on Multiprocessor Systems

    NASA Astrophysics Data System (ADS)

    Youness, Hassan A.; Sakanushi, Keishi; Takeuchi, Yoshinori; Salem, Ashraf; Wahdan, Abdel-Moneim; Imai, Masaharu

    A scheduling algorithm aims to minimize the overall execution time of the program by properly allocating and arranging the execution order of the tasks on the core processors such that the precedence constraints among the tasks are preserved. In this paper, we present a new scheduling algorithm that applies geometric analysis of the Task Precedence Graph (TPG) within an A* search. It uses a computationally efficient cost function to guide the search, together with pruning techniques that reduce complexity, to produce an optimal solution for the allocation/scheduling problem of a parallel application on parallel and multiprocessor architectures. The main goal of this work is to significantly reduce the search space while achieving an optimal or near-optimal solution. We implemented the algorithm on general task graph problems that are used in most related search work and obtained the optimal schedule with a small number of states. The proposed algorithm reduced the search space of the exhaustive search by at least 50%. The viability and potential of the proposed algorithm are demonstrated by an illustrative example.

  15. Memory Matters

    MedlinePlus

    ... different parts. Some of them are important for memory. The hippocampus (say: hih-puh-KAM-pus) is one of the more important parts of the brain that processes memories. Old information and new information, or memories, are ...

  16. Microsupercomputers: Design and implementation. Technical progress report, November 1988-March 1989

    SciTech Connect

    Hennessy, J.L.; Horowitz, M.A.

    1989-03-01

    Contents: (1) parallel processor architecture; (2) parallel software; (3) unit processor architecture; (4) computer-aided design tools; (5) very large scale integration. Keywords: scalable shared memory multiprocessors, high performance cache design.

  17. Bipartite memory network architectures for parallel processing

    SciTech Connect

    Smith, W.; Kale, L.V. (Dept. of Computer Science)

    1990-01-01

    Parallel architectures are broadly classified as either shared memory or distributed memory architectures. In this paper, the authors propose a third family of architectures, called bipartite memory network architectures. In this architecture, processors and memory modules constitute a bipartite graph, where each processor is allowed to access a small subset of the memory modules, and each memory module allows access from a small set of processors. The architecture is particularly suitable for computations requiring dynamic load balancing. The authors explore the properties of this architecture by examining the Perfect Difference set based topology for the graph. Extensions of this topology are also suggested.
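
    One way to picture a perfect-difference-set wiring is sketched below. The rule used here, that processor p connects to memory modules (p + d) mod n for each d in the set, is an assumption made for illustration and may differ from the construction in the paper; the function names are hypothetical. With the perfect difference set {0, 1, 3} modulo 7, every pair of distinct processors ends up sharing exactly one memory module.

```python
def is_perfect_difference_set(ds, n):
    """True if every nonzero residue mod n occurs exactly once as a
    difference (a - b) mod n of two distinct elements of ds."""
    diffs = sorted((a - b) % n for a in ds for b in ds if a != b)
    return diffs == list(range(1, n))


def bipartite_topology(ds, n):
    """Map each processor p (0..n-1) to the memory modules it can access.

    Assumed wiring rule: processor p reaches module (p + d) mod n for d in ds.
    """
    return {p: sorted((p + d) % n for d in ds) for p in range(n)}


def shared_modules(topology, p, q):
    """Memory modules reachable from both processor p and processor q."""
    return set(topology[p]) & set(topology[q])
```

    For example, `bipartite_topology([0, 1, 3], 7)` gives each of the 7 processors 3 memory modules, and each module is reachable from 3 processors.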

  18. The FORCE - A highly portable parallel programming language

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    This paper explains why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.

  19. Efficient mapping algorithms for scheduling robot inverse dynamics computation on a multiprocessor system

    NASA Technical Reports Server (NTRS)

    Lee, C. S. G.; Chen, C. L.

    1989-01-01

    Two efficient mapping algorithms for scheduling the robot inverse dynamics computation consisting of m computational modules with precedence relationship to be executed on a multiprocessor system consisting of p identical homogeneous processors with processor and communication costs to achieve minimum computation time are presented. An objective function is defined in terms of the sum of the processor finishing time and the interprocessor communication time. The minimax optimization is performed on the objective function to obtain the best mapping. This mapping problem can be formulated as a combination of the graph partitioning and the scheduling problems; both have been known to be NP-complete. Thus, to speed up the searching for a solution, two heuristic algorithms were proposed to obtain fast but suboptimal mapping solutions. The first algorithm utilizes the level and the communication intensity of the task modules to construct an ordered priority list of ready modules and the module assignment is performed by a weighted bipartite matching algorithm. For a near-optimal mapping solution, the problem can be solved by the heuristic algorithm with simulated annealing. These proposed optimization algorithms can solve various large-scale problems within a reasonable time. Computer simulations were performed to evaluate and verify the performance and the validity of the proposed mapping algorithms. Finally, experiments for computing the inverse dynamics of a six-jointed PUMA-like manipulator based on the Newton-Euler dynamic equations were implemented on an NCUBE/ten hypercube computer to verify the proposed mapping algorithms. Computer simulation and experimental results are compared and discussed.

  20. Polynomial algorithms for multiprocessor scheduling with a small number of job lengths

    SciTech Connect

    McCormick, S.T.; Smallwood, S.R.; Spieksma, F.C.R.

    1997-06-01

    The following problem was originally motivated by a question arising in scheduling maintenance periods for aircraft. Each maintenance period is a job, and the maintenance facilities are machines. In this context, there are very few different types of maintenances performed, so it is natural to consider the problem with only a small, fixed number C of different types of jobs. Each job type has a processing time, and each machine is available for the same length of time. A machine can handle at most one job at a time, all jobs are released at time zero, there are no due dates or precedence constraints, and preemption is not allowed. The question is whether it is possible to finish all jobs. We call this problem the Multiprocessor Scheduling Problem with C job lengths (MSPC). Scheduling problems such as MSPC where we can partition the jobs into a relatively few types such that all jobs of each type are identical are often called high-multiplicity problems. High-multiplicity problems are interesting because their input is very compact: the input to MSPC consists of only 2C + 2 numbers. For the case C = 2 we present a polynomial-time algorithm. We show that this algorithm produces a schedule that uses at most three different one-machine schedules, the minimum possible number. Further, we extend this algorithm to the case of machine-dependent deadlines and to a multi-parametric case. Finally, we discuss why our approach appears not to extend to the case C > 2.
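
    The compact input described above (C processing times, C job counts, the number of machines, and the common machine availability, i.e. 2C + 2 numbers) can be made concrete with a small feasibility check. The sketch below is a brute-force search over one-machine schedules with memoization; it is emphatically not the polynomial-time algorithm of the paper, its running time grows exponentially with the job counts, and all names are hypothetical.

```python
from functools import lru_cache
from itertools import product


def mspc_feasible(lengths, counts, machines, horizon):
    """Decide whether all jobs fit on `machines` identical machines.

    lengths[i] : processing time of job type i
    counts[i]  : number of jobs of type i
    horizon    : time each machine is available
    """
    # A one-machine schedule is a vector k with sum(k[i] * lengths[i]) <= horizon.
    patterns = [
        k for k in product(*(range(c + 1) for c in counts))
        if any(k) and sum(ki * p for ki, p in zip(k, lengths)) <= horizon
    ]

    @lru_cache(maxsize=None)
    def can_finish(remaining, machines_left):
        if not any(remaining):
            return True          # every job has been placed
        if machines_left == 0:
            return False         # jobs remain but no machine is left
        for k in patterns:
            if all(ki <= ri for ki, ri in zip(k, remaining)):
                rest = tuple(ri - ki for ri, ki in zip(remaining, k))
                if can_finish(rest, machines_left - 1):
                    return True
        return False

    return can_finish(tuple(counts), machines)
```

    For instance, with job lengths (3, 5), two jobs of each type, and two machines, the instance is feasible when each machine is available for 8 time units (one job of each type per machine) but not when it is available for only 7.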

  1. Process Management and Exception Handling in Multiprocessor Operating Systems Using Object-Oriented Design Techniques. Revised Sep. 1988

    NASA Technical Reports Server (NTRS)

    Russo, Vincent; Johnston, Gary; Campbell, Roy

    1988-01-01

    The programming of the interrupt handling mechanisms, process switching primitives, scheduling mechanism, and synchronization primitives of an operating system for a multiprocessor requires both efficient code in order to support the needs of high-performance or real-time applications and careful organization to facilitate maintenance. Although many advantages have been claimed for object-oriented class hierarchical languages and their corresponding design methodologies, the application of these techniques to the design of the primitives within an operating system has not been widely demonstrated. To investigate the role of class hierarchical design in systems programming, the authors have constructed the Choices multiprocessor operating system architecture using the C++ programming language. During the implementation, it was found that many operating system design concerns can be represented advantageously using a class hierarchical approach, including: the separation of mechanism and policy; the organization of an operating system into layers, each of which represents an abstract machine; and the notions of process and exception management. In this paper, we discuss an implementation of the low-level primitives of this system and outline the strategy by which we developed our solution.

  2. Parallel processing of real-time dynamic systems simulation on OSCAR (Optimally SCheduled Advanced multiprocessoR)

    NASA Technical Reports Server (NTRS)

    Kasahara, Hironori; Honda, Hiroki; Narita, Seinosuke

    1989-01-01

    Parallel processing of real-time dynamic systems simulation on a multiprocessor system named OSCAR is presented. In the simulation of dynamic systems, generally, the same calculations are repeated every time step. However, we cannot apply the Do-all or Do-across techniques to parallel processing of the simulation, since there exist data dependencies from the end of an iteration to the beginning of the next iteration and, furthermore, data input and data output are required every sampling time period. Therefore, parallelism inside the calculation required for a single time step, or a large basic block which consists of arithmetic assignment statements, must be used. In the proposed method, near fine grain tasks, each of which consists of one or more floating point operations, are generated to extract the parallelism from the calculation and assigned to processors by using optimal static scheduling at compile time in order to reduce the large run time overhead caused by the use of near fine grain tasks. The practicality of the scheme is demonstrated on OSCAR (Optimally SCheduled Advanced multiprocessoR), which has been developed to extract advantageous features of static scheduling algorithms to the maximum extent.

  3. Memory Matters

    MedlinePlus

    ... blood vessel (which carries the blood) bursts. continue Brain Injuries Affect Memory At any age, an injury to ... with somebody's memory. Some people who recover from brain injuries need to learn old things all over again, ...

  4. SHARING EDUCATIONAL SERVICES.

    ERIC Educational Resources Information Center

    Catskill Area Project in Small School Design, Oneonta, NY.

    SHARED SERVICES, A COOPERATIVE SCHOOL RESOURCE PROGRAM, IS DEFINED IN DETAIL. INCLUDED IS A DISCUSSION OF THEIR NEED, ADVANTAGES, GROWTH, DESIGN, AND OPERATION. SPECIFIC PROCEDURES FOR OBTAINING STATE AID IN SHARED SERVICES, EFFECTS OF SHARED SERVICES ON THE SCHOOL, AND HINTS CONCERNING SHARED SERVICES ARE DESCRIBED. CHARACTERISTICS OF THE SMALL…

  5. State recovery and lockstep execution restart in a system with multiprocessor pairing

    DOEpatents

    Gara, Alan; Gschwind, Michael K; Salapura, Valentina

    2014-01-21

    System, method and computer program product for a multiprocessing system to offer selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessor or processor cores to provide one highly reliable thread (or thread group). Each pair of microprocessor or processor cores that provides one highly reliable thread connects with system components such as a memory "nest" (or memory hierarchy), an optional system controller, an optional interrupt controller, optional I/O or peripheral devices, etc. The memory nest is attached to the selective pairing facility via a switch or a bus. Each selectively paired processor core includes a transactional execution facility, wherein the system is configured to enable processor rollback to a previous state and reinitialize lockstep execution in order to recover from an incorrect execution when an incorrect execution has been detected by the selective pairing facility.

  6. Memory Dysfunction

    PubMed Central

    Matthews, Brandy R.

    2015-01-01

    Purpose of Review: This article highlights the dissociable human memory systems of episodic, semantic, and procedural memory in the context of neurologic illnesses known to adversely affect specific neuroanatomic structures relevant to each memory system. Recent Findings: Advances in functional neuroimaging and refinement of neuropsychological and bedside assessment tools continue to support a model of multiple memory systems that are distinct yet complementary and to support the potential for one system to be engaged as a compensatory strategy when a counterpart system fails. Summary: Episodic memory, the ability to recall personal episodes, is the subtype of memory most often perceived as dysfunctional by patients and informants. Medial temporal lobe structures, especially the hippocampal formation and associated cortical and subcortical structures, are most often associated with episodic memory loss. Episodic memory dysfunction may present acutely, as in concussion; transiently, as in transient global amnesia (TGA); subacutely, as in thiamine deficiency; or chronically, as in Alzheimer disease. Semantic memory refers to acquired knowledge about the world. Anterior and inferior temporal lobe structures are most often associated with semantic memory loss. The semantic variant of primary progressive aphasia (svPPA) is the paradigmatic disorder resulting in predominant semantic memory dysfunction. Working memory, associated with frontal lobe function, is the active maintenance of information in the mind that can be potentially manipulated to complete goal-directed tasks. Procedural memory, the ability to learn skills that become automatic, involves the basal ganglia, cerebellum, and supplementary motor cortex. Parkinson disease and related disorders result in procedural memory deficits. Most memory concerns warrant bedside cognitive or neuropsychological evaluation and neuroimaging to assess for specific neuropathologies and guide treatment. PMID:26039844

  7. Memory protection

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.

    1988-01-01

    Accidental overwriting of files or of memory regions belonging to other programs, browsing of personal files by superusers, Trojan horses, and viruses are examples of breakdowns in workstations and personal computers that would be significantly reduced by memory protection. Memory protection is the capability of an operating system and supporting hardware to delimit segments of memory, to control whether segments can be read from or written into, and to confine accesses of a program to its segments alone. The absence of memory protection in many operating systems today is the result of a bias toward a narrow definition of performance as maximum instruction-execution rate. A broader definition, including the time to get the job done, makes clear that cost of recovery from memory interference errors reduces expected performance. The mechanisms of memory protection are well understood, powerful, efficient, and elegant. They add to performance in the broad sense without reducing instruction execution rate.
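
    The three capabilities named in the abstract (delimiting segments, controlling read and write access, and confining a program to its own segments) can be illustrated with a small software model. This is only a sketch under stated assumptions: real memory protection is enforced by hardware together with the operating system, and the `Segment`, `ProtectionFault`, and `check_access` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    base: int        # first address of the segment
    length: int      # number of addressable units in the segment
    readable: bool
    writable: bool


class ProtectionFault(Exception):
    """Raised when an access violates the program's segment permissions."""


def check_access(segments, address, write):
    """Return the segment containing `address` if the access is permitted.

    `segments` is the list of segments belonging to one program; any
    address outside all of them is rejected, which models confinement.
    """
    for seg in segments:
        if seg.base <= address < seg.base + seg.length:
            if write and not seg.writable:
                raise ProtectionFault(f"write to read-only address {address}")
            if not write and not seg.readable:
                raise ProtectionFault(f"read from unreadable address {address}")
            return seg
    raise ProtectionFault(f"address {address} is outside the program's segments")
```

    A program whose segments are a read-only region at addresses 0 to 99 and a read-write region at 100 to 149 could read address 10 and write address 120, while a write to address 10 or any access to address 200 would raise `ProtectionFault`.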

  8. Collaboratively Sharing Scientific Data

    NASA Astrophysics Data System (ADS)

    Wang, Fusheng; Vergara-Niedermayr, Cristobal

    Scientific research is becoming increasingly reliant on multi-disciplinary, multi-institutional collaboration through sharing experimental data. Indeed, data sharing is mandated by government research agencies such as NIH. The major hurdles for data sharing come from: i) the lack of data sharing infrastructure to make data sharing convenient for users; ii) users’ fear of losing control of their data; iii) difficulty in sharing schemas and incompatible data from sharing partners; and iv) inconsistent data under schema evolution. In this paper, we develop a collaborative data sharing system, SciPort, to support consistency-preserving data sharing among multiple distributed organizations. The system first provides a Central Server based lightweight data integration architecture, so data and schemas can be conveniently shared across multiple organizations. Through distributed schema management, schema sharing and evolution are made possible, while data consistency is maintained and data compatibility is enforced. With this data sharing system, distributed sites can now consistently share their research data and their associated schemas with much convenience and flexibility. SciPort has been successfully used for data sharing in biomedical research, clinical trials and large scale research collaboration.

  9. Modeling and performance evaluation of the DPS25 packet switching multiprocessor system

    NASA Astrophysics Data System (ADS)

    Mitropoulos, Spyridon

    1987-05-01

    A packet switching communication network is evaluated via queue network modeling theory. Direct access memory systems, busses linking the network processors, commutation and signal processors, and virtual reference circuits are considered. All models are developed with the help of the QNAP2 program and listings of the models are given. The method is illustrated with results from simulations at the various system levels.

  10. Children's Working Memory: Investigating Performance Limitations in Complex Span Tasks

    ERIC Educational Resources Information Center

    Conlin, J.A.; Gathercole, S.E.; Adams, J.W.

    2005-01-01

    Three experiments investigated the roles of resource-sharing and intrinsic memory demands in complex working memory span performance in 7- and 9-year-olds. In Experiment 1, the processing complexity of arithmetic operations was varied under conditions in which processing times were equivalent. Memory span did not differ as a function of processing…

  11. The Structure of Memory: Fixed or Flexible? Structural Learning Series.

    ERIC Educational Resources Information Center

    Scandura, Joseph M.

    Most current information processing theories of cognition and memory share one common feature: the structure (state-space) of memory is fixed and retrieval from memory involves searching through that structure. Learning, where it is treated at all, involves transforming one such structure into another. This form of representation is questioned and…

  12. A Formal Model of Capacity Limits in Working Memory

    ERIC Educational Resources Information Center

    Oberauer, Klaus; Kliegl, Reinhold

    2006-01-01

    A mathematical model of working-memory capacity limits is proposed on the key assumption of mutual interference between items in working memory. Interference is assumed to arise from overwriting of features shared by these items. The model was fit to time-accuracy data of memory-updating tasks from four experiments using nonlinear mixed effect…

  13. Memories Are Made of This

    ERIC Educational Resources Information Center

    Chang, Christine

    2010-01-01

    In this article, the author shares her memories of Sally Smith, the founder of The Lab School of Washington, where she works as the director of Occupational Therapy. When the author first met Smith, Smith asked her what brought her to The Lab School at that point in her career. She told Smith that her background was rather eclectic, since she…

  14. The influence of visual feedback from the recent past on the programming of grip aperture is grasp-specific, shared between hands, and mediated by sensorimotor memory not task set.

    PubMed

    Tang, Rixin; Whitwell, Robert L; Goodale, Melvyn A

    2015-05-01

    Goal-directed movements, such as reaching out to grasp an object, are necessarily constrained by the spatial properties of the target such as its size, shape, and position. For example, during a reach-to-grasp movement, the peak width of the aperture formed by the thumb and fingers in flight (peak grip aperture, PGA) is linearly related to the target's size. Suppressing vision throughout the movement (visual open loop) has a small though significant effect on this relationship. Visual open loop conditions also produce a large increase in the PGA compared to when vision is available throughout the movement (visual closed loop). Curiously, this differential effect of the availability of visual feedback is influenced by the presentation order: the difference in PGA between closed- and open-loop trials is smaller when these trials are intermixed (an effect we have called 'homogenization'). Thus, grasping movements are affected not only by the availability of visual feedback (closed loop or open loop) but also by what happened on the previous trial. It is not clear, however, whether this carry-over effect is mediated through motor (or sensorimotor) memory or through the interference of different task sets for closed-loop and open-loop feedback that determine when the movements are fully specified. We reasoned that sensorimotor memory, but not a task set for closed and open loop feedback, would be specific to the type of response. We tested this prediction in a condition in which pointing to targets was alternated with grasping those same targets. Critically, in this condition, when pointing was performed in open loop, grasping was always performed in closed loop (and vice versa). Despite the fact that closed- and open-loop trials were alternating in this condition, we found no evidence for homogenization of the PGA. Homogenization did occur, however, in a follow-up experiment in which grasping movements and visual feedback were alternated between the left and the right

  15. Generalized quantum secret sharing

    SciTech Connect

    Singh, Sudhir Kumar; Srikanth, R.

    2005-01-01

    We explore a generalization of quantum secret sharing (QSS) in which classical shares play a complementary role to quantum shares, exploring further consequences of an idea first studied by Nascimento, Mueller-Quade, and Imai [Phys. Rev. A 64, 042311 (2001)]. We examine three ways, termed inflation, compression, and twin thresholding, by which the proportion of classical shares can be augmented. This has the important application that it reduces quantum (information processing) players by replacing them with their classical counterparts, thereby making quantum secret sharing considerably easier and less expensive to implement in a practical setting. In compression, a QSS scheme is turned into an equivalent scheme with fewer quantum players, compensated for by suitable classical shares. In inflation, a QSS scheme is enlarged by adding only classical shares and players. In a twin-threshold scheme, we invoke two separate thresholds for classical and quantum shares based on the idea of information dilution.

  16. Flashbulb Memories

    PubMed Central

    Hirst, William; Phelps, Elizabeth A.

    2015-01-01

    We review and analyze the key theories, debates, findings, and omissions of the existing literature on flashbulb memories (FBMs), including what factors affect their formation, retention, and degree of confidence. We argue that FBMs do not require special memory mechanisms and are best characterized as involving both forgetting and mnemonic distortions, despite a high level of confidence. Factual memories for FBM-inducing events generally follow a similar pattern. Although no necessary and sufficient factors straightforwardly account for FBM retention, media attention particularly shapes memory for the events themselves. FBMs are best characterized in terms of repetitions, even of mnemonic distortions, whereas event memories evidence corrections. The bearing of this literature on social identity and traumatic memories is also discussed. PMID:26997762

  17. Virtual memory

    NASA Technical Reports Server (NTRS)

    Denning, P. J.

    1986-01-01

    Virtual memory was conceived as a way to automate overlaying of program segments. Modern computers have very large main memories, but need automatic solutions to the relocation and protection problems. Virtual memory serves this need as well and is thus useful in computers of all sizes. The history of the idea is traced, showing how it has become a widespread, little noticed feature of computers today.

  18. Skilled Memory.

    DTIC Science & Technology

    1980-11-06

    ...Morse code (Bryan & Harter, 1899). In every case, memory performance of the expert seems to violate the established limits of short-term memory. How is...of immediate memory. Quarterly Journal of Experimental Psychology, 1958, 10, 12-21. Bryan, W. L., & Harter, N. Psychological Review, 1899, 6, 345-375...

  19. Memory in health and in schizophrenia.

    PubMed

    Gur, Ruben C; Gur, Raquel E

    2013-12-01

    Memory is an important capacity needed for survival in a changing environment, and its principles are shared across species. These principles have been studied since the inception of behavioral science, and more recently neuroscience has helped understand brain systems and mechanisms responsible for enabling aspects of memory. Here we outline the history of work on memory and its neural underpinning, and describe the major dimensions of memory processing that have been evaluated by cognitive neuroscience, focusing on episodic memory. We present evidence in healthy populations for sex differences (females outperforming in verbal and face memory) and age effects (slowed memory processes with age). We then describe deficits associated with schizophrenia. Impairment in schizophrenia is more severe in patients with negative symptoms, especially flat affect, who also show deficits in measures of social cognition. This evidence implicates medial temporal and frontal regions in schizophrenia.

  20. Memory in health and in schizophrenia

    PubMed Central

    Gur, Ruben C.; Gur, Raquel E.

    2013-01-01

    Memory is an important capacity needed for survival in a changing environment, and its principles are shared across species. These principles have been studied since the inception of behavioral science, and more recently neuroscience has helped understand brain systems and mechanisms responsible for enabling aspects of memory. Here we outline the history of work on memory and its neural underpinning, and describe the major dimensions of memory processing that have been evaluated by cognitive neuroscience, focusing on episodic memory. We present evidence in healthy populations for sex differences—females outperforming in verbal and face memory, and age effects—slowed memory processes with age. We then describe deficits associated with schizophrenia. Impairment in schizophrenia is more severe in patients with negative symptoms—especially flat affect—who also show deficits in measures of social cognition. This evidence implicates medial temporal and frontal regions in schizophrenia. PMID:24459407

  1. DMA shared byte counters in a parallel computer

    DOEpatents

    Chen, Dong; Gara, Alan G.; Heidelberger, Philip; Vranas, Pavlos

    2010-04-06

    A parallel computer system is constructed as a network of interconnected compute nodes. Each of the compute nodes includes at least one processor, a memory and a DMA engine. The DMA engine includes a processor interface for interfacing with the at least one processor, DMA logic, a memory interface for interfacing with the memory, a DMA network interface for interfacing with the network, injection and reception byte counters, injection and reception FIFO metadata, and status registers and control registers. The injection FIFOs maintain memory locations of the injection FIFO metadata memory locations including its current head and tail, and the reception FIFOs maintain the reception FIFO metadata memory locations including its current head and tail. The injection byte counters and reception byte counters may be shared between messages.
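
The last sentence of the abstract above can be illustrated with a small model of a byte counter shared between messages. This is only a sketch under stated assumptions: the class name `SharedByteCounter`, its methods, and the convention that the counter counts down to zero when every sharing message has fully arrived are hypothetical and are not taken from the patent abstract.

```python
class SharedByteCounter:
    """Hypothetical model of one DMA byte counter shared between messages."""

    def __init__(self):
        # Bytes still outstanding across all messages that share this counter.
        self.remaining = 0

    def add_message(self, nbytes):
        # A message joins the counter by adding its length (assumed convention).
        if nbytes < 0:
            raise ValueError("message length must be non-negative")
        self.remaining += nbytes

    def on_packet(self, nbytes):
        # The DMA logic decrements the counter as each packet payload is stored.
        if nbytes < 0 or nbytes > self.remaining:
            raise ValueError("packet does not match the outstanding byte count")
        self.remaining -= nbytes

    def all_done(self):
        # Zero means every message sharing the counter has completed.
        return self.remaining == 0


# Two messages (512 and 256 bytes) share one counter; packets are 256 bytes.
counter = SharedByteCounter()
counter.add_message(512)
counter.add_message(256)
for size in (256, 256, 256):
    counter.on_packet(size)
```

Sharing one counter means the processor polls a single value instead of one per message, at the cost of not knowing which individual message finished first.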

  2. Episodic Memories

    ERIC Educational Resources Information Center

    Conway, Martin A.

    2009-01-01

    An account of episodic memories is developed that focuses on the types of knowledge they represent, their properties, and the functions they might serve. It is proposed that episodic memories consist of "episodic elements," summary records of experience often in the form of visual images, associated to a "conceptual frame" that provides a…

  3. Collaging Memories

    ERIC Educational Resources Information Center

    Wallach, Michele

    2011-01-01

    Even middle school students can have memories of their childhoods, of an earlier time. The art of Romare Bearden and the writings of Paul Auster can be used to introduce ideas about time and memory to students and inspire works of their own. Bearden is an exceptional role model for young artists, not only because of his astounding art, but also…

  4. Memory conformity affects inaccurate memories more than accurate memories.

    PubMed

    Wright, Daniel B; Villalba, Daniella K

    2012-01-01

    After controlling for initial confidence, inaccurate memories were shown to be more easily distorted than accurate memories. In two experiments groups of participants viewed 50 stimuli and were then presented with these stimuli plus 50 fillers. During this test phase participants reported their confidence that each stimulus was originally shown. This was followed by computer-generated responses from a bogus participant. After being exposed to this response participants again rated the confidence of their memory. The computer-generated responses systematically distorted participants' responses. Memory distortion depended on initial memory confidence, with uncertain memories being more malleable than confident memories. This effect was moderated by whether the participant's memory was initially accurate or inaccurate. Inaccurate memories were more malleable than accurate memories. The data were consistent with a model describing two types of memory (i.e., recollective and non-recollective memories), which differ in how susceptible these memories are to memory distortion.

  5. A comparison of multiprocessor scheduling methods for iterative data flow architectures

    NASA Technical Reports Server (NTRS)

    Storch, Matthew

    1993-01-01

    A comparative study is made between the Algorithm to Architecture Mapping Model (ATAMM) and three other related multiprocessing models from the published literature. The primary focus of all four models is the non-preemptive scheduling of large-grain iterative data flow graphs as required in real-time systems, control applications, signal processing, and pipelined computations. Important characteristics of the models such as injection control, dynamic assignment, multiple node instantiations, static optimum unfolding, range-chart guided scheduling, and mathematical optimization are identified. The models from the literature are compared with the ATAMM for performance, scheduling methods, memory requirements, and complexity of scheduling and design procedures.

  6. Eliminating Useless Messages in Write-Update Protocols on Scalable Multiprocessors.

    DTIC Science & Technology

    1994-10-01

    ...between successive write operations to the data [Eggers and Katz, 1988]. The disadvantage of WU is that every write operation to shared data requires...

  7. A multilevel nonvolatile magnetoelectric memory

    NASA Astrophysics Data System (ADS)

    Shen, Jianxin; Cong, Junzhuang; Shang, Dashan; Chai, Yisheng; Shen, Shipeng; Zhai, Kun; Sun, Young

    2016-09-01

    The coexistence and coupling between magnetization and electric polarization in multiferroic materials provide extra degrees of freedom for creating next-generation memory devices. A variety of concepts of multiferroic or magnetoelectric memories have been proposed and explored in the past decade. Here we propose a new principle to realize a multilevel nonvolatile memory based on the multiple states of the magnetoelectric coefficient (α) of multiferroics. Because the states of α depend on the relative orientation between magnetization and polarization, one can reach different levels of α by controlling the ratio of up and down ferroelectric domains with external electric fields. Our experiments in a device made of the PMN-PT/Terfenol-D multiferroic heterostructure confirm that the states of α can be well controlled between positive and negative by applying selective electric fields. Consequently, two-level, four-level, and eight-level nonvolatile memory devices are demonstrated at room temperature. This kind of multilevel magnetoelectric memory retains all the advantages of ferroelectric random access memory but overcomes the drawback of destructive reading of polarization. In contrast, the reading of α is nondestructive and highly efficient in a parallel way, with an independent reading coil shared by all the memory cells.

  8. A multilevel nonvolatile magnetoelectric memory

    PubMed Central

    Shen, Jianxin; Cong, Junzhuang; Shang, Dashan; Chai, Yisheng; Shen, Shipeng; Zhai, Kun; Sun, Young

    2016-01-01

    The coexistence and coupling between magnetization and electric polarization in multiferroic materials provide extra degrees of freedom for creating next-generation memory devices. A variety of concepts of multiferroic or magnetoelectric memories have been proposed and explored in the past decade. Here we propose a new principle to realize a multilevel nonvolatile memory based on the multiple states of the magnetoelectric coefficient (α) of multiferroics. Because the states of α depend on the relative orientation between magnetization and polarization, one can reach different levels of α by controlling the ratio of up and down ferroelectric domains with external electric fields. Our experiments in a device made of the PMN-PT/Terfenol-D multiferroic heterostructure confirm that the states of α can be well controlled between positive and negative by applying selective electric fields. Consequently, two-level, four-level, and eight-level nonvolatile memory devices are demonstrated at room temperature. This kind of multilevel magnetoelectric memory retains all the advantages of ferroelectric random access memory but overcomes the drawback of destructive reading of polarization. In contrast, the reading of α is nondestructive and highly efficient in a parallel way, with an independent reading coil shared by all the memory cells. PMID:27681812

  9. Proactive quantum secret sharing

    NASA Astrophysics Data System (ADS)

    Qin, Huawang; Dai, Yuewei

    2015-11-01

    A proactive quantum secret sharing scheme is proposed, in which the participants can update their key shares periodically. In an updating period, one participant randomly generates the EPR pairs, and the other participants update their key shares and perform the corresponding unitary operations on the particles of the EPR pairs. Then, the participant who generated the EPR pairs performs the Bell-state measurement and updates his key share according to the result of the Bell-state measurement. After an updating period, each participant can change his key share, but the secret remains unchanged, and the old key shares are useless even if they have been stolen by an attacker. The proactive property of our scheme is very useful for resisting a mobile attacker.

  10. Parallel variable-band Choleski solvers for computational structural analysis applications on vector multiprocessor supercomputers

    NASA Technical Reports Server (NTRS)

    Poole, E. L.; Overman, A. L.

    1991-01-01

    A Choleski method used to solve linear systems of equations that arise in large scale structural analyses is described. The method uses a novel variable-band storage scheme and is structured to exploit fast local memory caches while minimizing data access delays between main memory and vector registers. Several parallel implementations of this method are described for the CRAY-2 and CRAY Y-MP computers demonstrating the use of microtasking and autotasking directives. A portable parallel language, FORCE, is also used for two different parallel implementations, demonstrating the use of CRAY macrotasking. Results are presented comparing the matrix factorization times for three representative structural analysis problems from runs made in both dedicated and multi-user modes on both the CRAY-2 and CRAY Y-MP computers. CPU and wall clock timings are given for the various parallel methods and are compared to single processor timings of the same algorithm. Computation rates over 1 GIGAFLOP (1 billion floating point operations per second) on a four processor CRAY-2 and over 2 GIGAFLOPS on an eight processor CRAY Y-MP are demonstrated as measured by wall clock time in a dedicated environment. Reduced wall clock times for the parallel methods relative to the single processor implementation of the same Choleski algorithm are also demonstrated for runs made in multi-user mode.

  11. Commissioning of SharePlan: The Liverpool Experience

    NASA Astrophysics Data System (ADS)

    Xing, Aitang; Deshpande, Shrikant; Arumugam, Sankar; George, Armia; Holloway, Lois; Goozee, Gary

    2014-03-01

    SharePlan is a treatment planning system developed by Raysearch Laboratories AB to enable creation of a linear accelerator intensity modulated radiotherapy (IMRT) plan as a backup for a Tomotherapy plan. A 6MV Elekta Synergy Linear accelerator photon beam was modelled in SharePlan. The beam model was validated using Matrix Evolution, a 2D ion chamber array, for two head-neck and three prostate plans using 3%/3mm Gamma criteria. For 39 IMRT beams, the minimum and maximum Gamma pass rates are 95.4% and 98.7%. SharePlan is able to generate backup IMRT plans which are deliverable on a traditional linear accelerator and accurate in terms of clinical criteria. During use of SharePlan, however, an out-of-memory error frequently occurred and SharePlan was forced to be closed. This error occurred occasionally at any of these steps: loading the Tomotherapy plan into SharePlan, generating the IMRT plan, selecting the optimal plan, approving the plan and setting up a QA plan. The out-of-memory error was caused by memory leakage in one or more of the C/C++ functions implemented in SharePlan fluence engine, dose engine or optimizer, as acknowledged by the manufacturer. Because of the interruption caused by out-of-memory errors, SharePlan has not been implemented in our clinic although accuracy has been verified. A new software program is now being provided to our centre to replace SharePlan.

  12. Support for non-locking parallel reception of packets belonging to a single memory reception FIFO

    DOEpatents

    Chen, Dong [Yorktown Heights, NY; Heidelberger, Philip [Yorktown Heights, NY; Salapura, Valentina [Yorktown Heights, NY; Senger, Robert M [Yorktown Heights, NY; Steinmacher-Burow, Burkhard [Boeblingen, DE; Sugawara, Yutaka [Yorktown Heights, NY]

    2011-01-27

    A method and apparatus for distributed parallel messaging in a parallel computing system. A plurality of DMA engine units are configured in a multiprocessor system to operate in parallel, one DMA engine unit for transferring a current packet received at a network reception queue to a memory location in a memory FIFO (rmFIFO) region of a memory. A control unit implements logic to determine whether any prior received packet destined for that rmFIFO is still in a process of being stored in the associated memory by another DMA engine unit of the plurality, and prevent the one DMA engine unit from indicating completion of storing the current received packet in the reception memory FIFO (rmFIFO) until all prior received packets destined for that rmFIFO are completely stored by the other DMA engine units. Thus, there is provided non-locking support so that multiple packets destined for a single rmFIFO are transferred and stored in parallel to predetermined locations in a memory.
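
The completion rule described in this abstract can be sketched as follows: each arriving packet is assigned a predetermined region of the reception FIFO, engines may store packets in any order, and completion is indicated only for the prefix of arrivals that is fully stored. The class `ReceptionFifo`, its method names, and the use of two tail offsets are assumptions made for illustration. The model is single-threaded and ignores wrap-around, so it shows the ordering rule only, not the hardware mechanism.

```python
class ReceptionFifo:
    """Hypothetical model of one rmFIFO filled by several DMA engines."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.reserved_tail = 0   # next free offset, assigned at packet arrival
        self.committed_tail = 0  # bytes whose storage has been indicated complete
        self.pending = []        # [start, end, stored] slots in arrival order

    def reserve(self, length):
        # Give the packet a predetermined location, in arrival order.
        start = self.reserved_tail
        if start + length > len(self.memory):
            raise MemoryError("rmFIFO is full")
        self.reserved_tail = start + length
        slot = [start, start + length, False]
        self.pending.append(slot)
        return slot

    def store(self, slot, payload):
        # Engines may call this in any order; no lock guards the data copy.
        start, end, _ = slot
        if len(payload) != end - start:
            raise ValueError("payload length does not match the reserved slot")
        self.memory[start:end] = payload
        slot[2] = True
        # Indicate completion only while all prior packets are fully stored.
        while self.pending and self.pending[0][2]:
            self.committed_tail = self.pending.pop(0)[1]


fifo = ReceptionFifo(64)
first = fifo.reserve(4)
second = fifo.reserve(4)
fifo.store(second, b"BBBB")             # later packet finishes first
tail_after_second = fifo.committed_tail
fifo.store(first, b"AAAA")              # earlier packet finishes; both commit
tail_after_first = fifo.committed_tail
```

In this usage the committed tail stays at 0 after the second packet is stored and moves to 8 only once the first packet is also in place, so a reader never sees a gap.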

  13. Models, Norms and Sharing.

    ERIC Educational Resources Information Center

    Harris, Mary B.

    To investigate the effect of modeling on altruism, 156 third and fifth grade children were exposed to a model who either shared with them, gave to a charity, or refused to share. The test apparatus, identified as a game, consisted of a box with signal lights and a chute through which marbles were dispensed. Subjects and the model played the game…

  14. Shared Parenting Dysfunction.

    ERIC Educational Resources Information Center

    Turkat, Ira Daniel

    2002-01-01

    Joint custody of children is the most prevalent court ordered arrangement for families of divorce. A growing body of literature indicates that many parents engage in behaviors that are incompatible with shared parenting. This article provides specific criteria for a definition of the Shared Parenting Dysfunction. Clinical aspects of the phenomenon…

  15. Intelligence Sharing in Bosnia

    DTIC Science & Technology

    2007-11-02

    ...increases with the demands of near-real-time accurate intelligence for operational decision-making. Given this environment, intelligence-sharing requirements across an ad hoc coalition...operating system providing actionable near-real-time intelligence to commanders for coalition synchronization and the requirement to protect national...

  16. Collective memory: a perspective from (experimental) clinical psychology.

    PubMed

    Wessel, Ineke; Moulds, Michelle L

    2008-04-01

    This paper considers the concept of collective memory from an experimental clinical psychology perspective. Exploration of the term collective reveals a broad distinction between literatures that view collective memories as a property of groups (collectivistic memory) and those that regard these memories as a property of individuals who are, to a greater or lesser extent, an integral part of their social environment (social memory). First, we argue that the understanding of collectivistic memory phenomena may benefit from drawing parallels with current psychological models such as the self-memory system theory of individualistic autobiographical memory. Second, we suggest that the social memory literature may inform the study of trauma-related disorders. We argue that a factual focus induced by collaborative remembering may be beneficial to natural recovery in the immediate aftermath of trauma, and propose that shared remembering techniques may provide a useful addition to the treatment of post-traumatic stress disorder.

  17. Rearview Memories

    ERIC Educational Resources Information Center

    Gross, Gwen E.

    2008-01-01

    In this article, the author shares her experience when she was still a student until she became a superintendent. In her 17th year in the superintendency, the author finds the joys of her work all around her, grateful to be bestowed with the gift of leadership. She shares with colleagues a few especially meaningful moments from her professional…

  18. Memory Systems Do Not Divide on Consciousness: Reinterpreting Memory in Terms of Activation and Binding

    PubMed Central

    Reder, Lynne M.; Park, Heekyeong; Kieffaber, Paul D.

    2009-01-01

    There is a popular hypothesis that performance on implicit and explicit memory tasks reflects 2 distinct memory systems. Explicit memory is said to store those experiences that can be consciously recollected, and implicit memory is said to store experiences and affect subsequent behavior but to be unavailable to conscious awareness. Although this division based on awareness is a useful taxonomy for memory tasks, the authors review the evidence that the unconscious character of implicit memory does not necessitate that it be treated as a separate system of human memory. They also argue that some implicit and explicit memory tasks share the same memory representations and that the important distinction is whether the task (implicit or explicit) requires the formation of a new association. The authors review and critique dissociations from the behavioral, amnesia, and neuroimaging literatures that have been advanced in support of separate explicit and implicit memory systems by highlighting contradictory evidence and by illustrating how the data can be accounted for using a simple computational memory model that assumes the same memory representation for those disparate tasks. PMID:19210052

  19. Nonpreemptive run-time scheduling issues on a multitasked, multiprogrammed multiprocessor with dependencies, bidimensional tasks, folding and dynamic graphs

    SciTech Connect

    Miller, Allan Ray

    1987-05-01

    Increases in high speed hardware have mandated studies in software techniques to exploit the parallel capabilities. This thesis examines the effects a run-time scheduler has on a multiprocessor. The model consists of directed, acyclic graphs, generated from serial FORTRAN benchmark programs by the parallel compiler Parafrase. A multitasked, multiprogrammed environment is created. Dependencies are generated by the compiler. Tasks are bidimensional, i.e., they may specify both time and processor requests. Processor requests may be folded into execution time by the scheduler. The graphs may arrive at arbitrary time intervals. The general case is NP-hard, thus, a variety of heuristics are examined by a simulator. Multiprogramming demonstrates a greater need for a run-time scheduler than does monoprogramming for a variety of reasons, e.g., greater stress on the processors, a larger number of independent control paths, more variety in the task parameters, etc. The dynamic critical path series of algorithms perform well. Dynamic critical volume did not add much. Unfortunately, dynamic critical path maximizes turnaround time as well as throughput. Two schedulers are presented which balance throughput and turnaround time. The first requires classification of jobs by type; the second requires selection of a ratio value which is dependent upon system parameters. 45 refs., 19 figs., 20 tabs.

  20. A parallel row-based algorithm with error control for standard-cell placement on a hypercube multiprocessor

    NASA Technical Reports Server (NTRS)

    Sargent, Jeff Scott

    1988-01-01

    A new row-based parallel algorithm for standard-cell placement targeted for execution on a hypercube multiprocessor is presented. Key features of this implementation include a dynamic simulated-annealing schedule, row-partitioning of the VLSI chip image, and two novel approaches to controlling error in parallel cell-placement algorithms: Heuristic Cell-Coloring and Adaptive (Parallel Move) Sequence Control. Heuristic Cell-Coloring identifies sets of noninteracting cells that can be moved repeatedly, and in parallel, with no buildup of error in the placement cost. Adaptive Sequence Control allows multiple parallel cell moves to take place between global cell-position updates. This feedback mechanism is based on an error bound derived analytically from the traditional annealing move-acceptance profile. Placement results are presented for real industry circuits, and the performance of an implementation on the Intel iPSC/2 Hypercube is summarized. The algorithm runs 5 to 16 times faster than a previous program developed for the Hypercube, while producing placements of equivalent quality. An integrated place and route program for the Intel iPSC/2 Hypercube is currently being developed.

  1. Cueing others' memories.

    PubMed

    Tullis, Jonathan G; Benjamin, Aaron S

    2015-05-01

    Many situations require us to generate external cues to support later retrieval from memory. For instance, we create file names in order to cue our memory to a file's contents, and instructors create lecture slides to remember what points to make during classes. We even generate cues for others when we remind friends of shared experiences or send colleagues a computer file that is named in such a way so as to remind them of its contents. Here we explore how and how well learners tailor retrieval cues for different intended recipients. Across three experiments, subjects generated verbal cues for a list of target words for themselves or for others. Learners generated cues for others by increasing the normative cue-to-target associative strength but also by increasing the number of other words their cues point to, relative to cues that they generated for themselves. This strategy was effective: such cues supported higher levels of recall for others than cues generated for oneself. Generating cues for others also required more time than generating cues for oneself. Learners responded to the differential demands of cue generation for others by effortfully excluding personal, episodic knowledge and including knowledge that they estimate to be broadly shared.

  2. Share with thy neighbors

    NASA Astrophysics Data System (ADS)

    Chandra, Surendar; Yu, Xuwen

    2007-01-01

    Peer to peer (P2P) systems are traditionally designed to scale to a large number of nodes. However, we focus on scenarios where the sharing is effected only among neighbors. Localized sharing is particularly attractive in scenarios where wide area network connectivity is undesirable, expensive or unavailable. On the other hand, local neighbors may not offer the wide variety of objects possible in a much larger system. The goal of this paper is to investigate a P2P system that shares contents with its neighbors. We analyze the sharing behavior of Apple iTunes users in a university setting. iTunes restricts the sharing of audio and video objects to peers within the same LAN sub-network. We show that users are already making a significant amount of content available for local sharing. We show that these systems are not appropriate for applications that require access to a specific object. We argue that mechanisms that allow the user to specify classes of interesting objects are better suited for these systems. Mechanisms such as Bloom filters can allow each peer to summarize the contents available in the neighborhood, reducing network search overhead. This research can form the basis for future storage systems that utilize the shared storage available in neighbors and build a probabilistic storage for local consumption.
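
To make the Bloom filter idea in this abstract concrete, here is a minimal sketch of how a peer might summarize the objects its neighbors share. The class `BloomFilter`, the SHA-256 based hashing, the default sizes, and the item names are illustrative assumptions; the paper's actual design is not described in the abstract.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive one bit position per hash by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def merge(self, other):
        # Union of two summaries; both must use the same parameters.
        if (self.num_bits, self.num_hashes) != (other.num_bits, other.num_hashes):
            raise ValueError("filters must share num_bits and num_hashes")
        self.bits |= other.bits


# Each peer summarizes its own library; a neighbor merges the summaries.
peer_a = BloomFilter()
peer_a.add("track-001")
peer_b = BloomFilter()
peer_b.add("track-002")
neighborhood = BloomFilter()
neighborhood.merge(peer_a)
neighborhood.merge(peer_b)
```

A peer can then check `neighborhood.might_contain(name)` locally and query the network only when the answer is positive, which is where the reduction in search overhead would come from.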

  3. System and method for programmable bank selection for banked memory subsystems

    DOEpatents

    Blumrich, Matthias A.; Chen, Dong; Gara, Alan G.; Giampapa, Mark E.; Hoenicke, Dirk; Ohmacht, Martin; Salapura, Valentina; Sugavanam, Krishnan

    2010-09-07

    A programmable memory system and method for enabling one or more processor devices access to shared memory in a computing environment, the shared memory including one or more memory storage structures having addressable locations for storing data. The system comprises: one or more first logic devices associated with a respective one or more processor devices, each first logic device for receiving physical memory address signals and programmable for generating a respective memory storage structure select signal upon receipt of pre-determined address bit values at selected physical memory address bit locations; and, a second logic device responsive to each of the respective select signals for generating an address signal used for selecting a memory storage structure for processor access. The system thus enables each processor device of a computing environment memory storage access distributed across the one or more memory storage structures.
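
The idea of choosing a memory bank from programmable address bit locations can be sketched in a few lines. This is a simplified illustration, not the patented logic: the function `make_bank_selector`, the bit positions, and the four-bank layout are assumptions chosen for the example.

```python
def make_bank_selector(bit_positions):
    """Return a function mapping a physical address to a bank index.

    bit_positions lists the address bits used for selection, with the
    least-significant bit of the bank index first (hypothetical convention).
    """
    positions = tuple(bit_positions)
    if len(set(positions)) != len(positions) or any(p < 0 for p in positions):
        raise ValueError("bit positions must be distinct and non-negative")

    def select(address):
        bank = 0
        for i, pos in enumerate(positions):
            # Copy address bit `pos` into bit `i` of the bank index.
            bank |= ((address >> pos) & 1) << i
        return bank

    return select


# Same four banks, two different programmings of the selector.
low_bits = make_bank_selector([7, 8])     # interleave banks every 128 bytes
high_bits = make_bank_selector([20, 21])  # each bank holds a 1 MiB region
```

With `low_bits`, consecutive 128-byte blocks rotate across the banks, spreading sequential accesses; with `high_bits`, the same addresses all fall in bank 0, which keeps a contiguous region in one bank.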

  4. Memory Network For Distributed Data Processors

    NASA Technical Reports Server (NTRS)

    Bolen, David; Jensen, Dean; Millard, ED; Robinson, Dave; Scanlon, George

    1992-01-01

    Universal Memory Network (UMN) is modular, digital data-communication system enabling computers with differing bus architectures to share 32-bit-wide data between locations up to 3 km apart with less than one millisecond of latency. Makes it possible to design sophisticated real-time and near-real-time data-processing systems without data-transfer "bottlenecks". This enterprise network permits transmission of volume of data equivalent to an encyclopedia each second. Facilities benefiting from Universal Memory Network include telemetry stations, simulation facilities, power-plants, and large laboratories or any facility sharing very large volumes of data. Main hub of UMN is reflection center including smaller hubs called Shared Memory Interfaces.

  5. Fear Memory.

    PubMed

    Izquierdo, Ivan; Furini, Cristiane R G; Myskiw, Jociane C

    2016-04-01

    Fear memory is the best-studied form of memory. It was thoroughly investigated in the past 60 years mostly using two classical conditioning procedures (contextual fear conditioning and fear conditioning to a tone) and one instrumental procedure (one-trial inhibitory avoidance). Fear memory is formed in the hippocampus (contextual conditioning and inhibitory avoidance), in the basolateral amygdala (inhibitory avoidance), and in the lateral amygdala (conditioning to a tone). The circuitry involves, in addition, the pre- and infralimbic ventromedial prefrontal cortex, the central amygdala subnuclei, and the dentate gyrus. Fear learning models, notably inhibitory avoidance, have also been very useful for the analysis of the biochemical mechanisms of memory consolidation as a whole. These studies have capitalized on in vitro observations on long-term potentiation and other kinds of plasticity. The effect of a very large number of drugs on fear learning has been intensively studied, often as a prelude to the investigation of effects on anxiety. The extinction of fear learning involves to an extent a reversal of the flow of information in the mentioned structures and is used in the therapy of posttraumatic stress disorder and fear memories in general.

  6. Accelerating Spectrum Sharing Technologies

    SciTech Connect

    Juan D. Deaton; Lynda L. Brighton; Rangam Subramanian; Hussein Moradi; Jose Loera

    2013-09-01

    Spectrum sharing potentially holds the promise of solving the emerging spectrum crisis. However, technology innovators face the conundrum of developing spectrum sharing technologies without the ability to experiment and test with real incumbent systems. Interference with operational incumbents can prevent critical services, and the cost of deploying and operating an incumbent system can be prohibitive. Thus, the lack of incumbent systems and frequency authorization for technology incubation and demonstration has stymied spectrum sharing research. To this end, industry, academia, and regulators all require a test facility for validating hypotheses and demonstrating functionality without affecting operational incumbent systems. This article proposes a four-phase program supported by our spectrum accountability architecture. We propose that our comprehensive experimentation and testing approach for technology incubation and demonstration will accelerate the development of spectrum sharing technologies.

  7. A Sharing Proposition.

    ERIC Educational Resources Information Center

    Sturgeon, Julie

    2002-01-01

    Describes how the University of Vermont and St. Michael's College in Burlington, Vermont cooperated to share a single card access system. Discusses the planning, financial, and marketplace advantages of the cooperation. (EV)

  8. Balancing Loads Among Parallel Data Processors

    NASA Technical Reports Server (NTRS)

    Baffes, Paul Thomas

    1990-01-01

    Heuristic algorithm minimizes amount of memory used by multiprocessor system. Distributes load of many identical, short computations among multiple parallel digital data processors, each of which has its own (local) memory. Each processor operates on distinct and independent set of data in larger shared memory. As integral part of load-balancing scheme, total amount of space used in shared memory minimized. Possible applications include artificial neural networks or image processors for which "pipeline" and vector methods of load balancing inappropriate.

  9. Common molecular mechanisms in explicit and implicit memory.

    PubMed

    Barco, Angel; Bailey, Craig H; Kandel, Eric R

    2006-06-01

    Cellular and molecular studies of both implicit and explicit memory suggest that experience-dependent modulation of synaptic strength and structure is a fundamental mechanism by which these memories are encoded and stored within the brain. In this review, we focus on recent advances in our understanding of two types of memory storage: (i) sensitization in Aplysia, a simple form of implicit memory, and (ii) formation of explicit spatial memories in the mouse hippocampus. These two processes share common molecular mechanisms that have been highly conserved through evolution.

  10. Programming distributed memory architectures using Kali

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush; Vanrosendale, John

    1990-01-01

    Programming nonshared memory systems is more difficult than programming shared memory systems, in part because of the relatively low level of current programming environments for such machines. A new programming environment is presented, Kali, which provides a global name space and allows direct access to remote data values. In order to retain efficiency, Kali provides a system of annotations, allowing the user to control those aspects of the program critical to performance, such as data distribution and load balancing. The primitives and constructs provided by the language are described, and some of the issues raised in translating a Kali program for execution on distributed memory systems are also discussed.

  11. Fueling Memories

    PubMed Central

    Powell, Jonathan D.; Pollizzi, Kristen

    2012-01-01

    A hallmark of the adaptive immune response is rapid and robust activation upon rechallenge. In the current issue of Immunity van der Windt et al. (2012) provide an important link between mitochondrial respiratory capacity and the development of CD8+ T cell memory. PMID:22284413

  12. Childhood Memories.

    ERIC Educational Resources Information Center

    Soto, Lourdes Diaz

    2001-01-01

    Describes how artwork can be a valuable catalyst for discussions in preservice education classes, allowing students to explore how their work as educators relates to their childhood memories and can be shaped by childhood experiences. Examines an art exhibition in which diverse artists depicted autobiographical text in their paintings. Discusses…

  13. Retracing Memories

    ERIC Educational Resources Information Center

    Harrison, David L.

    2005-01-01

    There are plenty of paths to poetry but few are as accessible as retracing one's own memories. When students are asked to write about something they remember, they are given the gift of choosing from events that are important enough to recall. They remember because what happened was funny or scary or embarrassing or heartbreaking or silly.…

  14. Hollow memories

    NASA Astrophysics Data System (ADS)

    2014-04-01

    A hollow-core optical fibre filled with warm caesium atoms can temporarily store the properties of photons. Michael Sprague from the University of Oxford, UK, explains to Nature Photonics how this optical memory could be a useful building block for fibre-based quantum optics.

  15. Programming parallel architectures - The BLAZE family of languages

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush

    1989-01-01

    This paper gives an overview of the various approaches to programming multiprocessor architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive, since they remove much of the burden of exploiting parallel architectures from the user. This paper also describes recent work in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described.

  16. Creativity and psychopathology: a shared vulnerability model.

    PubMed

    Carson, Shelley H

    2011-03-01

    Creativity is considered a positive personal trait. However, highly creative people have demonstrated elevated risk for certain forms of psychopathology, including mood disorders, schizophrenia spectrum disorders, and alcoholism. A model of shared vulnerability explains the relation between creativity and psychopathology. This model, supported by recent findings from neuroscience and molecular genetics, suggests that the biological determinants conferring risk for psychopathology interact with protective cognitive factors to enhance creative ideation. Elements of shared vulnerability include cognitive disinhibition (which allows more stimuli into conscious awareness), an attentional style driven by novelty salience, and neural hyperconnectivity that may increase associations among disparate stimuli. These vulnerabilities interact with superior meta-cognitive protective factors, such as high IQ, increased working memory capacity, and enhanced cognitive flexibility, to enlarge the range and depth of stimuli available in conscious awareness to be manipulated and combined to form novel and original ideas.

  17. Reciprocal food sharing in the vampire bat

    NASA Astrophysics Data System (ADS)

    Wilkinson, Gerald S.

    1984-03-01

    Behavioural reciprocity can be evolutionarily stable [1-3]. Initial increase in frequency depends, however, on reciprocal altruists interacting predominantly with other reciprocal altruists either by associating within kin groups or by having sufficient memory to recognize and not aid nonreciprocators. Theory thus suggests that reciprocity should evolve more easily among animals which live in kin groups. Data are available separating reciprocity from nepotism only for unrelated nonhuman animals [4]. Here, I show that food sharing by regurgitation of blood among wild vampire bats (Desmodus rotundus) depends equally and independently on degree of relatedness and an index of opportunity for reciprocation. That reciprocity operates within groups containing both kin and nonkin is supported further with data on the availability of blood-sharing occasions, estimates of the economics of sharing blood, and experiments which show that unrelated bats will reciprocally exchange blood in captivity.

  18. Coordinating Shared Activities

    NASA Technical Reports Server (NTRS)

    Clement, Bradley

    2004-01-01

    Shared Activity Coordination (ShAC) is a computer program for planning and scheduling the activities of an autonomous team of interacting spacecraft and exploratory robots. ShAC could also be adapted to such terrestrial uses as helping multiple factory managers work toward competing goals while sharing such common resources as floor space, raw materials, and transports. ShAC iteratively invokes the Continuous Activity Scheduling Planning Execution and Replanning (CASPER) program to replan and propagate changes to other planning programs in an effort to resolve conflicts. A domain-expert specifies which activities and parameters thereof are shared and reports the expected conditions and effects of these activities on the environment. By specifying these conditions and effects differently for each planning program, the domain-expert subprogram defines roles that each spacecraft plays in a coordinated activity. The domain-expert subprogram also specifies which planning program has scheduling control over each shared activity. ShAC enables sharing of information, consensus over the scheduling of collaborative activities, and distributed conflict resolution. As the other planning programs incorporate new goals and alter their schedules in the changing environment, ShAC continually coordinates to respond to unexpected events.

  19. Labia Majora Share

    PubMed Central

    Lee, Hanjing; Yap, Yan Lin; Low, Jeffrey Jen Hui

    2017-01-01

    Defects involving specialised areas with characteristic anatomical features, such as the nipple, upper eyelid, and lip, benefit greatly from the use of sharing procedures. The vulva, a complex 3-dimensional structure, can also be reconstructed through a sharing procedure drawing upon the contralateral vulva. In this report, we present the interesting case of a patient with chronic, massive, localised lymphedema of her left labia majora that was resected in 2011. Five years later, she presented with squamous cell carcinoma over the left vulva region, which is rarely associated with chronic lymphedema. To the best of our knowledge, our management of the radical vulvectomy defect with a labia majora sharing procedure is novel and has not been previously described. The labia majora flap presented in this report is a shared flap; that is, a transposition flap based on the dorsal clitoral artery, which has consistent vascular anatomy, making this flap durable and reliable. This procedure epitomises the principle of replacing like with like, does not interfere with leg movement or patient positioning, has minimal donor site morbidity, and preserves other locoregional flap options for future reconstruction. One limitation is the need for a lax contralateral vulva. This labia majora sharing procedure is a viable option in carefully selected patients. PMID:28194353

  20. CaLRS: A Critical-Aware Shared LLC Request Scheduling Algorithm on GPGPU

    PubMed Central

    Ma, Jianliang; Meng, Jinglei; Chen, Tianzhou; Wu, Minghui

    2015-01-01

    Ultra-high thread-level parallelism in modern GPUs usually generates numerous simultaneous memory requests, so there are always plenty of memory requests waiting at each bank of the shared LLC (L2 in this paper) and global memory. For global memory, various schedulers have already been developed to adjust the request sequence, but we find that little work has focused on the service sequence at the shared LLC. We measured that requests from a large number of GPU applications always queue at the LLC banks for service, which provides an opportunity to optimize the service order at the LLC. By adjusting the service order of GPU memory requests, we can improve the schedulability of the SM. We therefore propose a critical-aware shared LLC request scheduling algorithm (CaLRS) in this paper. How the priority of a memory request is represented is critical for CaLRS. We use the number of memory requests that originate from the same warp but have not been serviced when they arrive at the shared LLC bank to represent the criticality of each warp. Experiments show that the proposed scheme effectively boosts SM schedulability by promoting the scheduling priority of memory requests with high criticality, and thereby indirectly improves GPU performance. PMID:25729772
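
    The criticality idea described above can be illustrated with a small software model. This is a simplified sketch, not the CaLRS algorithm from the paper: it assumes a single bank queue, takes a warp's criticality to be its count of still-pending requests in that queue, and breaks ties by arrival order. The function name and the request tuples are hypothetical.

```python
from collections import Counter


def schedule_bank_queue(queue):
    """Order pending requests at one LLC bank by warp criticality.

    queue: list of (warp_id, address) tuples in arrival order.
    A warp's criticality is modeled here as the number of its requests
    still pending in this queue (a simplification of the paper's metric).
    """
    pending = Counter(warp for warp, _ in queue)
    remaining = list(queue)
    order = []
    while remaining:
        # Serve the request whose warp has the most pending requests.
        # max() returns the first maximum, so ties keep arrival order.
        best = max(range(len(remaining)), key=lambda i: pending[remaining[i][0]])
        warp, addr = remaining.pop(best)
        pending[warp] -= 1
        order.append((warp, addr))
    return order


# Hypothetical queue: warp 1 has three outstanding requests, warps 0 and 2 one each.
requests = [(0, 0x10), (1, 0x20), (1, 0x24), (2, 0x30), (1, 0x28)]
service_order = schedule_bank_queue(requests)
```

    In this model the requests of warp 1 are promoted while it has more outstanding requests than the other warps; once the counts are equal, the queue falls back to arrival order.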

  1. Shared care (comanagement).

    PubMed

    Montero Ruiz, E

    2016-01-01

    Surgical departments have increasing difficulties in caring for their hospitalised patients due to the patients' advanced age and comorbidity, the growing specialisation in medical training and the strong political-healthcare pressure that a healthcare organisation places on them, where surgical acts take precedence over other activities. The pressure exerted by these departments on the medical area and the deficient response by the interconsultation system have led to the development of a different healthcare organisation model: Shared care, which includes perioperative medicine. In this model, 2 different specialists share the responsibility and authority in caring for hospitalised surgical patients. Internal Medicine is the most appropriate specialty for shared care. Internists who exercise this responsibility should have certain characteristics and must overcome a number of concerns from the surgeon and anaesthesiologist.

  2. Collaborate, compete and share

    NASA Astrophysics Data System (ADS)

    Pugliese, Emanuele; Castellano, Claudio; Marsili, Matteo; Pietronero, Luciano

    2009-02-01

    We introduce and study a model of an interacting population of agents who collaborate in groups which compete for limited resources. Groups are formed by randomly matching agents, and their worth is determined by the sum of the efforts deployed by agents in group formation. Agents, on their side, have to share their effort between contributing to their group’s chances to outcompete other groups and resource sharing among partners, when the group is successful. A simple implementation of this strategic interaction gives rise to static and evolutionary properties with a very rich phenomenology. A robust emerging feature is the separation of the population between agents who invest mainly in the success of their group and agents who concentrate on getting the largest share of their group’s profits.

  3. Cache Sharing and Isolation Tradeoffs in Multicore Mixed-Criticality Systems

    DTIC Science & Technology

    2015-05-01

    common practice today to allow hardware components such as last-level caches (LLCs) and memory controllers to be shared across cores; this can be...which can be cleared to steer LLC accesses from this core to certain ways of the LLC. Under page coloring, pages of physical memory are assigned...above. The Level-C measurements also need to be taken in a system under load to account for the impact of concurrent evictions and memory-bus

  4. Developmental Differences in the Use of Recognition Memory Rejection Mechanisms

    ERIC Educational Resources Information Center

    Odegard, Timothy N.; Jenkins, Kara M.; Koen, Joshua D.

    2010-01-01

    The current experiment examined the use of plausibility judgments by children to reject distractors presented on "yes/no" recognition memory tests. Participants studied two lists of word pairs that shared either a categorical or rhyme association, which constituted the global nature of the two study conditions. During the recognition memory tests,…

  5. An Emulation Tool for Simulating Matrix Operations on an SIMD (Single Instruction Stream Multiple Data Stream) Multiprocessor.

    DTIC Science & Technology

    1987-10-01

    4.3 DATA TYPES dataval = integer; {Data type of array} matrixdim = 31; {Max rows/cols} matrix = array[matrixdim, matrixdim] of dataval; {data...local A, B, R memories.} of dataval; shiftmem = array[0..SHREGMAX] {Shift register memory.} of dataval; cpmem = array[0..CPMEMMAX] {The Central Memory...} of dataval; instruction = record opcode: integer; {Operation code.} op1: integer; {First operand.} op2: integer; {Second operand.} op3: integer; {Third operand

  6. Multiparty quantum secret sharing

    SciTech Connect

    Zhang Zhanjun; Li Yong; Man Zhongxiao

    2005-04-01

    Based on a quantum secure direct communication (QSDC) protocol [Phys. Rev. A 69 052319 (2004)], we propose an (n,n)-threshold scheme of multiparty quantum secret sharing of classical messages (QSSCM) using only single photons. We take advantage of this multiparty QSSCM scheme to establish a scheme of multiparty secret sharing of quantum information (SSQI), in which the original qubit can be reconstructed only if all quantum information receivers collaborate. A general idea is also proposed for constructing multiparty SSQI schemes from any QSSCM scheme.
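
    For readers unfamiliar with the (n,n)-threshold property, a purely classical analogue may help. The sketch below is not the quantum protocol of the paper: it uses XOR-based secret splitting, and the function names and the example message are assumptions made for illustration.

```python
import secrets


def split_secret(message: bytes, n: int):
    """Split message into n shares; all n shares are needed to recover it."""
    if n < 2:
        raise ValueError("need at least two parties")
    # The first n-1 shares are uniformly random byte strings.
    shares = [secrets.token_bytes(len(message)) for _ in range(n - 1)]
    # The last share is the message XORed with all the random shares.
    last = message
    for share in shares:
        last = bytes(a ^ b for a, b in zip(last, share))
    return shares + [last]


def recover_secret(shares):
    """XOR all shares together to reconstruct the original message."""
    result = bytes(len(shares[0]))
    for share in shares:
        result = bytes(a ^ b for a, b in zip(result, share))
    return result


shares = split_secret(b"classical message", 3)
recovered = recover_secret(shares)
```

    Any subset of fewer than n shares is statistically independent of the message, which mirrors the requirement that all receivers must collaborate.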

  7. Shared Memory Architecture and Explored Alternatives for Interoperability

    DTIC Science & Technology

    2007-12-01

    [Extracted slide and figure labels: services orchestration and choreography; service provider, broker, and requestor (discovery, publish, seek, interact); programming in-the-large; Threat CAS, Urban CAS, Time Sensitive Strike; orchestration of analytical activities; choreography of agent interactions; ACETEF stimulators, JIMM and MFS interfaces, EOIR; ACETEF interface library, linked base class, threat generator.]

  8. Shared direct memory access on the Explorer 2-LX

    NASA Technical Reports Server (NTRS)

    Musgrave, Jeffrey L.

    1990-01-01

    Advances in expert system technology and artificial intelligence have provided a framework for applying automated intelligence to the solution of problems which were generally perceived as intractable using more classical approaches. As a result, hybrid architectures and parallel processing capability have become more common in computing environments. The Texas Instruments Explorer II-LX is an example of a machine which combines a symbolic processing environment and a computationally oriented environment in a single chassis for integrated problem solutions. This user's manual is an attempt to make these capabilities more accessible to a wider range of engineers and programmers with problems well suited to solution in such an environment.

  9. Sharing the atom bomb

    SciTech Connect

    Chace, J.

    1996-01-01

    Shaken by the devastation of Hiroshima and Nagasaki and fearful that the American atomic monopoly would spark an arms race, Dean Acheson led a push in 1946 to place the bomb, and indeed all atomic energy, under international control. But as the memories of wartime collaboration faded, relations between the superpowers grew increasingly tense, and the confrontational atmosphere undid his proposal. Had Acheson succeeded, the Cold War might not have been. 2 figs.

  10. The Sharing Tree: Preschool Children Learn to Share.

    ERIC Educational Resources Information Center

    Wolf, Arlene; Fine, Elaine

    1996-01-01

    This article describes a learning activity in which preschool children learn cooperative skills and metacognitive strategies as they master sharing strategies guided by leaves on a "sharing tree." Leaf colors (red, yellow, green) cue the child to stop, slow down and think about sharing and playing with others, and go ahead with a sharing activity.…

  11. Memories of the holocaust.

    PubMed

    Unger, Samuel

    2006-03-01

    As Alpha Omegans, we are united not only by our profession but also by a mission to educate ourselves, and others, about preserving our Jewish heritage. It was with this mission in mind that the Alpha Omegan invited me to share with my fraters a very personal, and painful, account of my boyhood in Poland, where I survived the Holocaust. Among the many gruesome episodes I encountered during the war, two remain vivid in my memories. Although this is not an easy story for me to tell, it is one that ultimately gives me great strength, especially as I prepare to disclose it among my dear friends and colleagues of Alpha Omega. May we never forget what some of us lost, what we regained and why we have chosen to build our personal and professional lives in ways that honor our history.

  12. The biochemistry of memory.

    PubMed

    Stock, Jeffry B; Zhang, Sherry

    2013-09-09

    Almost fifty years ago, Julius Adler initiated a program of research to gain insights into the basic biochemistry of intelligent behavior by studying the molecular mechanisms that underlie the chemotactic responses of Escherichia coli. All living organisms share elements of a common biochemistry for metabolism, growth and heredity - why not intelligence? Neurobiologists have demonstrated that this is the case for nervous systems in animals ranging from worms to man. Motile unicellular organisms such as E. coli exhibit rudimentary behaviors that can be loosely described in terms of cognitive phenomena such as memory and learning. Adler's initiative at least raised the prospect that, because of the numerous experimental advantages provided by E. coli, it would be the first organism whose behavior could be understood at molecular resolution.

  13. Sharing Research Results

    ERIC Educational Resources Information Center

    Ashbrook, Peggy

    2011-01-01

    There are many ways to share a collection of data and students' thinking about that data. Explaining the results of science inquiry is important--working scientists and amateurs both contribute information to the body of scientific knowledge. Students can collect data about an activity that is already happening in a classroom (e.g., the qualities…

  14. Shared Governance of Schools.

    ERIC Educational Resources Information Center

    Thomas, M. Donald

    Shared decision-making can help schools keep sight of their true goals. In the educational sector the conflicts that arise in collective bargaining disputes can be destructive to the organization. Schools require more than the mere coexistence of labor and management. They require cooperation and strong, supportive relationships. To establish a…

  15. Learning to Share

    ERIC Educational Resources Information Center

    Raths, David

    2010-01-01

    In the tug-of-war between researchers and IT for supercomputing resources, a centralized approach can help both sides get more bang for their buck. As 2010 began, the University of Washington was preparing to launch its first shared high-performance computing cluster, a 1,500-node system called Hyak, dedicated to research activities. Like other…

  16. Shared decision making

    MedlinePlus

    ... communicate openly and build a relationship of trust. Alternative Names Patient-centered care References Agency for Healthcare Research and Quality. The SHARE Approach. Updated September 2016. www.ahrq.gov/professionals/education/curriculum-tools/shareddecisionmaking/index.html . Accessed October 19, ...

  17. Illegal File Sharing 101

    ERIC Educational Resources Information Center

    Wada, Kent

    2008-01-01

    Much of higher education's unease arises from the cost of dealing with illegal file sharing. Illinois State University, for example, calculated a cost of $76 to process a first claim of copyright infringement and $146 for a second. Responses range from simply passing along claims to elaborate programs architected with specific goals in mind.…

  18. Share the Power.

    ERIC Educational Resources Information Center

    Mitchell, James E.

    1990-01-01

    Site-based management cannot work without the school board's active involvement and determined support. Suggestions are offered from School District 12, Adams County, Colorado, which has been moving away from a centralized administrative system to shared decision-making. (MLF)

  19. Shared Decision Making.

    ERIC Educational Resources Information Center

    Lashway, Larry

    1997-01-01

    In shared decision making (SDM), principals collaborate with teachers and sometimes parents to take actions aimed at improving instruction and school climate. While research on SDM outcomes is still inconclusive, the literature shows that SDM brings both benefits and problems, and that the principal is a key figure. This brief offers a sampling of…

  20. Processing Demand and Short-Term Memory: The Response-Prefix Effect

    ERIC Educational Resources Information Center

    Jahnke, John C.; Nowaczyk, Ronald H.

    1977-01-01

    Seven-digit strings were presented for immediate recall. Before recall, subjects either read or retrieved from memory a single item (response prefix). Results were seen in terms of the sharing of the limited capacity of an active memory system by the memory series, the response prefix, and the operations to retrieve and emit the items. (Editor/RK)

  1. Focus on emotion as a catalyst of memory updating during reconsolidation.

    PubMed

    Stein, Maria; Rohde, Kristina Barbara; Henke, Katharina

    2015-01-01

    We share the idea of Lane et al. that successful psychotherapy exerts its effects through memory reconsolidation. To support it, we add further evidence that a behavioral interference may trigger memory update during reconsolidation. Furthermore, we propose that - in addition to replacing maladaptive emotions - new emotions experienced in the therapeutic process catalyze reconsolidation of the updated memory structure.

  2. Libraries and Development Environments for Monte Carlo Simulations of Lattice Gauge Theories on Parallel Computers

    NASA Astrophysics Data System (ADS)

    Decker, K. M.; Jayewardena, C.; Rehmann, R.

    We describe the library lgtlib, and lgttool, the corresponding development environment for Monte Carlo simulations of lattice gauge theory on multiprocessor vector computers with shared memory. We explain why distributed memory parallel processor (DMPP) architectures are particularly appealing for compute-intensive scientific applications, and introduce the design of a general application and program development environment system for scientific applications on DMPP architectures.

  3. Close Associations and Memory in Brainwriting Groups

    ERIC Educational Resources Information Center

    Coskun, Hamit

    2011-01-01

    The present experiment examined whether or not the type of associations (close (e.g. apple-pear) and distant (e.g. apple-fish) word associations) and memory instruction (paying attention to the ideas of others) had effects on the idea generation performances in the brainwriting paradigm in which all participants shared their ideas by using paper…

  4. Test Sequence Priming in Recognition Memory

    ERIC Educational Resources Information Center

    Johns, Elizabeth E.; Mewhort, D. J. K.

    2009-01-01

    The authors examined priming within the test sequence in 3 recognition memory experiments. A probe primed its successor whenever both probes shared a feature with the same studied item ("interjacent priming"), indicating that the study item, like the probe, is central to the decision. Interjacent priming occurred even when the 2 probes did…

  5. The Precategorical Nature of Visual Short-Term Memory

    ERIC Educational Resources Information Center

    Quinlan, Philip T.; Cohen, Dale J.

    2016-01-01

    We conducted a series of recognition experiments that assessed whether visual short-term memory (VSTM) is sensitive to shared category membership of to-be-remembered (tbr) images of common objects. In Experiment 1 some of the tbr items shared the same basic level category (e.g., hand axe): Such items were no better retained than others. In the…

  6. Policy enabled information sharing system

    DOEpatents

    Jorgensen, Craig R.; Nelson, Brian D.; Ratheal, Steve W.

    2014-09-02

    A technique for dynamically sharing information includes executing a sharing policy indicating when to share a data object responsive to the occurrence of an event. The data object is created by formatting a data file to be shared with a receiving entity. The data object includes a file data portion and a sharing metadata portion. The data object is encrypted and then automatically transmitted to the receiving entity upon occurrence of the event. The sharing metadata portion includes metadata characterizing the data file and referenced in connection with the sharing policy to determine when to automatically transmit the data object to the receiving entity.

  7. [Neural correlates of memory].

    PubMed

    Fujii, Toshikatsu

    2013-01-01

    Memory can be divided into several types, although all of them involve three successive processes: encoding, storage, and retrieval. In terms of the duration of retention, neurologists classify memory into immediate, recent, and remote memories, whereas psychologists classify memory into short-term and long-term memories. In terms of the content, episodic, semantic, and procedural memories are considered to be different types of memory. Furthermore, researchers on memory have proposed relatively new concepts of memory, i.e., working memory and prospective memory. This article first provides explanations for these several types of memory. Next, neuropsychological characteristics of amnesic syndrome are briefly outlined. Finally, how several different types of memory are affected (or preserved) in patients with amnesic syndrome is described.

  8. The Developmental Influence of Primary Memory Capacity on Working Memory and Academic Achievement

    PubMed Central

    2015-01-01

    In this study, we investigate the development of primary memory capacity among children. Children between the ages of 5 and 8 completed 3 novel tasks (split span, interleaved lists, and a modified free-recall task) that measured primary memory by estimating the number of items in the focus of attention that could be spontaneously recalled in serial order. These tasks were calibrated against traditional measures of simple and complex span. Clear age-related changes in these primary memory estimates were observed. There were marked individual differences in primary memory capacity, but each novel measure was predictive of simple span performance. Among older children, each measure shared variance with reading and mathematics performance, whereas for younger children, the interleaved lists task was the strongest single predictor of academic ability. We argue that these novel tasks have considerable potential for the measurement of primary memory capacity and provide new, complementary ways of measuring the transient memory processes that predict academic performance. The interleaved lists task also shared features with interference control tasks, and our findings suggest that young children have a particular difficulty in resisting distraction and that variance in the ability to resist distraction is also shared with measures of educational attainment. PMID:26075630

  9. Dynamic Load Balancing for Adaptive Computations on Distributed-Memory Machines

    NASA Technical Reports Server (NTRS)

    1999-01-01

    Dynamic load balancing is central to adaptive mesh-based computations on large-scale parallel computers. The principal investigator has investigated various issues in the dynamic load balancing problem under NASA JOVE and JAG grants. The major accomplishments of the project are two graph partitioning algorithms and a load balancing framework. The S-HARP dynamic graph partitioner is the fastest of the known dynamic graph partitioners to date. It can partition a graph of over 100,000 vertices in 0.25 seconds on a 64-processor Cray T3E distributed-memory multiprocessor while maintaining the scalability of over 16-fold speedup. Other known and widely used dynamic graph partitioners take over a second or two while giving low scalability of a few fold speedup on 64 processors. These results have been published in journals and peer-reviewed flagship conferences.

  10. Elastomeric load sharing device

    NASA Technical Reports Server (NTRS)

    Isabelle, Charles J. (Inventor); Kish, Jules G. (Inventor); Stone, Robert A. (Inventor)

    1992-01-01

    An elastomeric load sharing device, interposed in combination between a driven gear and a central drive shaft to facilitate balanced torque distribution in split power transmission systems, includes a cylindrical elastomeric bearing and a plurality of elastomeric bearing pads. The elastomeric bearing and bearing pads comprise one or more layers, each layer including an elastomer having a metal backing strip secured thereto. The elastomeric bearing is configured to have a high radial stiffness and a low torsional stiffness and is operative to radially center the driven gear and to minimize torque transfer through the elastomeric bearing. The bearing pads are configured to have a low radial and torsional stiffness and a high axial stiffness and are operative to compressively transmit torque from the driven gear to the drive shaft. The elastomeric load sharing device has spring rates that compensate for mechanical deviations in the gear train assembly to provide balanced torque distribution between complementary load paths of split power transmission systems.

  11. Shared Health Governance

    PubMed Central

    Ruger, Jennifer Prah

    2014-01-01

    Health and Social Justice (Ruger 2009a) developed the “health capability paradigm,” a conception of justice and health in domestic societies. This idea undergirds an alternative framework of social cooperation called “shared health governance” (SHG). SHG puts forth a set of moral responsibilities, motivational aspirations, and institutional arrangements, and apportions roles for implementation in striving for health justice. This article develops further the SHG framework and explains its importance and implications for governing health domestically. PMID:21745082

  12. Sharing our knowledge.

    PubMed

    Griffiths, Matt

    2017-01-25

    There are more than 70,000 nurse prescribers in the UK, many of whom have years of experience that should be shared with trainee prescribers. In November, the General Pharmaceutical Council launched a consultation on whether pharmacist independent prescribers (PIPs) should be able to mentor trainee PIPs. This discussion, which closes on 1 February, should be expanded to our own and other professional groups, because we could all gain so much from each other.

  13. Intelligence Sharing in Counterproliferation

    DTIC Science & Technology

    2007-09-01

    Routledge, 2004), 75. 3 Dieter Mahncke, Wyn Rees, and Wayne C. Thompson, Redefining transatlantic security relations: The Challenge of Change...Redefining Transatlantic Security Relations, by Dieter Mahncke, Wyn Rees, and Wayne C. Thompson, it is argued that these differences coupled with...Regarding Weapons of Mass Destruction” by U.S. Senators Laurence Silbermann and Charles Robb, “the information sharing problem manifested itself in

  14. University Reactor Sharing Program

    SciTech Connect

    Dr. W.D. Reece

    1999-09-01

    The University Reactor Sharing Program provides funding for reactor experimentation to institutions that do not normally have access to a research reactor. Research projects supported by the program range from dating geological material to producing high-current superconducting magnets. The funding also gives small colleges and universities the opportunity to use the facility for teaching courses in nuclear processes, specifically neutron activation analysis and gamma spectroscopy.

  15. Efficient quantum secret sharing

    NASA Astrophysics Data System (ADS)

    Qin, Huawang; Dai, Yuewei

    2016-05-01

    An efficient quantum secret sharing scheme is proposed, in which the dealer generates some single particles and then uses the operations of quantum-controlled-not and Hadamard gate to encode a determinate secret into these particles. The participants get their shadows by performing the single-particle measurements on their particles, and even the dealer cannot know their shadows. Compared to the existing schemes, our scheme is more practical within the present technologies.

  16. Processor-Group Aware Runtime Support for Shared- and Global-Address Space Models

    SciTech Connect

    Krishnan, Manoj Kumar; Tipparaju, Vinod; Palmer, Bruce; Nieplocha, Jarek

    2004-12-07

    Exploiting multilevel parallelism using processor groups is becoming increasingly important for programming on high-end systems. This paper describes group-aware run-time support for shared-/global-address space programming models. The current effort has been undertaken in the context of the Aggregate Remote Memory Copy Interface (ARMCI) [5], a portable runtime system used as a communication layer for Global Arrays [6], Co-Array Fortran (CAF) [9], GPSHMEM [10], Co-Array Python [11], and also end-user applications. The paper describes the management of shared memory, integration of shared memory communication and RDMA on clusters with SMP nodes, and registration. These are all required for efficient multi-method and multi-protocol communication on modern systems. Focus is placed on techniques for supporting process groups while maximizing communication performance and efficiently managing global memory system-wide.

  17. Toward worldwide data sharing

    NASA Astrophysics Data System (ADS)

    Walker, Raymond; Joy, Steven; King, Todd

    2012-07-01

    Over the past decade the nature of space science research has changed dramatically. Earlier investigators could carry out meaningful research by looking at observations from a single instrument on a single spacecraft. Today that is rapidly changing, and researchers regularly use data from multiple instruments on multiple spacecraft as well as observations from ground observatories. Increasingly those observations come from missions flown by many countries. Recent advances in distributed data management have made it possible for researchers located around the world to access and use data from multiple nations. With virtual observatory technology it no longer matters where data are housed; they can be freely accessed wherever they reside. In this presentation we will discuss two initiatives designed to make space science data accessible worldwide. One is the International Planetary Data Alliance (IPDA) and the other is the Heliophysics Data and Model Consortium (HDMC). In both cases the key to worldwide data sharing is adopting common metadata standards. In this talk we will review how these two groups are addressing worldwide data sharing and their progress in achieving their goals. IPDA and HDMC are two of several efforts to promote broad-based data sharing. Talks in the remainder of the symposium will discuss this in more detail.

  18. Synapsin determines memory strength after punishment- and relief-learning.

    PubMed

    Niewalda, Thomas; Michels, Birgit; Jungnickel, Roswitha; Diegelmann, Sören; Kleber, Jörg; Kähne, Thilo; Gerber, Bertram

    2015-05-13

    Adverse life events can induce two kinds of memory with opposite valence, dependent on timing: "negative" memories for stimuli preceding them and "positive" memories for stimuli experienced at the moment of "relief." Such punishment memory and relief memory are found in insects, rats, and man. For example, fruit flies (Drosophila melanogaster) avoid an odor after odor-shock training ("forward conditioning" of the odor), whereas after shock-odor training ("backward conditioning" of the odor) they approach it. Do these timing-dependent associative processes share molecular determinants? We focus on the role of Synapsin, a conserved presynaptic phosphoprotein regulating the balance between the reserve pool and the readily releasable pool of synaptic vesicles. We find that a lack of Synapsin leaves task-relevant sensory and motor faculties unaffected. In contrast, both punishment memory and relief memory scores are reduced. These defects reflect a true lessening of associative memory strength, as distortions in nonassociative processing (e.g., susceptibility to handling, adaptation, habituation, sensitization), discrimination ability, and changes in the time course of coincidence detection can be ruled out as alternative explanations. Reductions in punishment- and relief-memory strength are also observed upon an RNAi-mediated knock-down of Synapsin, and are rescued both by acutely restoring Synapsin and by locally restoring it in the mushroom bodies of mutant flies. Thus, both punishment memory and relief memory require the Synapsin protein and in this sense share genetic and molecular determinants. We note that corresponding molecular commonalities between punishment memory and relief memory in humans would constrain pharmacological attempts to selectively interfere with excessive associative punishment memories, e.g., after traumatic experiences.

  19. Address tracing for parallel machines

    NASA Technical Reports Server (NTRS)

    Stunkel, Craig B.; Janssens, Bob; Fuchs, W. Kent

    1991-01-01

    Recently implemented parallel system address-tracing methods based on several metrics are surveyed. The issues specific to collection of traces for both shared and distributed memory parallel computers are highlighted. Five general categories of address-trace collection methods are examined: hardware-captured, interrupt-based, simulation-based, altered microcode-based, and instrumented program-based traces. The problems unique to shared memory and distributed memory multiprocessors are examined separately.

  20. [Neuroscience and collective memory: memory schemas linking brain, societies and cultures].

    PubMed

    Legrand, Nicolas; Gagnepain, Pierre; Peschanski, Denis; Eustache, Francis

    2015-01-01

    During the last two decades, the effect of intersubjective relationships on cognition has been an emerging topic in cognitive neuroscience, leading through a so-called "social turn" to the formation of new domains that integrate society and culture into this research area. Such inquiry has recently been extended to collective memory studies. Collective memory refers to shared representations that are constitutive of the identity of a group and distributed among all its members, who are connected by a common history. After briefly describing these developments in the study of the human brain and behavior, we review recent research that has brought together cognitive psychology, neuroscience and social sciences in collective memory studies. Using the reemerging concept of memory schema, we propose a theoretical framework that accounts for the formation of collective memories, with a specific focus on the encoding of historical events. We suggest that (1) although the concept of schema has mainly been used to describe a rather passive framework of knowledge, such a structure may also be involved more actively in the understanding of significant collective events, and (2) although some schema research has restricted itself to the individual level of inquiry, there is a strong coherence between memory and cultural frameworks. Integrating the neural basis and properties of memory schemas into collective memory studies may pave the way toward a better understanding of the reciprocal interaction between individual memories and cultural resources such as media or education.

  1. Transactive memory systems scale for couples: development and validation

    PubMed Central

    Hewitt, Lauren Y.; Roberts, Lynne D.

    2015-01-01

    People in romantic relationships can develop shared memory systems by pooling their cognitive resources, allowing each person access to more information but with less cognitive effort. Research examining such memory systems in romantic couples largely focuses on remembering word lists or performing lab-based tasks, but these types of activities do not capture the processes underlying couples’ transactive memory systems, and may not be representative of the ways in which romantic couples use their shared memory systems in everyday life. We adapted an existing measure of transactive memory systems for use with romantic couples (TMSS-C), and conducted an initial validation study. In total, 397 participants who each identified as being a member of a romantic relationship of at least 3 months duration completed the study. The data provided a good fit to the anticipated three-factor structure of the components of couples’ transactive memory systems (specialization, credibility and coordination), and there was reasonable evidence of both convergent and divergent validity, as well as strong evidence of test–retest reliability across a 2-week period. The TMSS-C provides a valuable tool that can quickly and easily capture the underlying components of romantic couples’ transactive memory systems. It has potential to help us better understand this intriguing feature of romantic relationships, and how shared memory systems might be associated with other important features of romantic relationships. PMID:25999873

  2. Applicability of Transactional Memory to Modern Codes

    NASA Astrophysics Data System (ADS)

    Bihari, Barna L.

    2010-09-01

    In this paper we illustrate the features and study the applicability of transactional memory (TM) as an efficient and easy-to-use alternative for handling memory conflicts in multi-threaded physics simulations that use shared memory. The tool used for our preliminary analysis of this novel construct is IBM's freely available Software Transactional Memory (STM) system. Instead of attempting to apply it to a production-grade simulation code, we developed a much simpler test code that exhibits most of the salient features of modern unstructured mesh algorithms, but without the complicated physical models. We apply STM to two frequently used algorithms in realistic multi-physics codes. Our computational experiments indicate a good fit between these application scenarios and the TM features.
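
    To make the idea of handling memory conflicts with transactions concrete, here is a minimal sketch of optimistic software transactional memory. It is not IBM's STM system or its API; `TVar`, `Transaction` and `atomically` are hypothetical names chosen for illustration, and a single global commit lock is assumed for simplicity. Each transaction records the version of every variable it reads, buffers its writes, and commits only if none of those versions changed; otherwise it retries.

```python
import threading


class TVar:
    """Hypothetical transactional variable: a value plus a version counter."""
    def __init__(self, value):
        self.value = value
        self.version = 0


_commit_lock = threading.Lock()  # assumed: one global lock guards snapshots and commits


class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version observed at first read
        self.writes = {}  # TVar -> buffered (not yet visible) value

    def read(self, tvar):
        if tvar in self.writes:          # read-your-own-writes
            return self.writes[tvar]
        with _commit_lock:               # consistent (version, value) snapshot
            self.reads.setdefault(tvar, tvar.version)
            return tvar.value

    def write(self, tvar, value):
        self.writes[tvar] = value

    def commit(self):
        with _commit_lock:
            # Conflict: another transaction committed to something we read.
            if any(t.version != seen for t, seen in self.reads.items()):
                return False
            for t, value in self.writes.items():
                t.value = value
                t.version += 1
            return True


def atomically(body):
    """Run body(tx) repeatedly until its transaction commits without conflict."""
    while True:
        tx = Transaction()
        result = body(tx)
        if tx.commit():
            return result


if __name__ == "__main__":
    # Several threads accumulate into one shared mesh-node value, as in a scatter-add.
    node = TVar(0)

    def add_one(tx):
        tx.write(node, tx.read(node) + 1)

    workers = [threading.Thread(target=lambda: [atomically(add_one) for _ in range(200)])
               for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(node.value)
```

    The appeal for unstructured-mesh updates is that the programmer marks the conflicting update as a transaction instead of designing a locking scheme per node; the retry loop absorbs the occasional conflict.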

  3. Arousal-biased competition in perception and memory

    PubMed Central

    Mather, Mara; Sutherland, Matthew R.

    2010-01-01

    Our everyday surroundings besiege us with information. The battle is for a share of our limited attention and memory, with the brain selecting the winners and discarding the losers. Previous research shows that both bottom-up and top-down factors bias competition in favor of high priority stimuli. We propose that arousal during an event increases this bias both in perception and in long-term memory of the event. Arousal-biased competition theory provides specific predictions about when arousal will enhance and when it will impair memory for events, accounting for some puzzling contradictions in the emotional memory literature. PMID:21660127

  4. A Beginner's Guide to Memory.

    ERIC Educational Resources Information Center

    Hughes, Elizabeth M.

    1981-01-01

    This article is designed to equip the reader with the information needed to deal with questions of computer memory. Discussed are core memory; semiconductor memory; size of memory; expanding memory; charge-coupled device memories; magnetic bubble memory; and read-only and read-mostly memories. (KC)

  5. Fixed Access Network Sharing

    NASA Astrophysics Data System (ADS)

    Cornaglia, Bruno; Young, Gavin; Marchetta, Antonio

    2015-12-01

    Fixed broadband network deployments are moving inexorably to the use of Next Generation Access (NGA) technologies and architectures. These NGA deployments involve building fiber infrastructure increasingly closer to the customer in order to increase the proportion of fiber on the customer's access connection (Fibre-To-The-Home/Building/Door/Cabinet… i.e. FTTx). This increases the speed of services that can be sold and will be increasingly required to meet the demands of new generations of video services as we evolve from HDTV to "Ultra-HD TV" with 4k and 8k lines of video resolution. However, building fiber access networks is a costly endeavor. It requires significant capital in order to cover any significant geographic coverage. Hence many companies are forming partnerships and joint-ventures in order to share the NGA network construction costs. One form of such a partnership involves two companies agreeing to each build to cover a certain geographic area and then "cross-selling" NGA products to each other in order to access customers within their partner's footprint (NGA coverage area). This is tantamount to a bi-lateral wholesale partnership. The concept of Fixed Access Network Sharing (FANS) is to address the possibility of sharing infrastructure with a high degree of flexibility for all network operators involved. By providing greater configuration control over the NGA network infrastructure, the service provider has a greater ability to define the network and hence to define their product capabilities at the active layer. This gives the service provider partners greater product development autonomy plus the ability to differentiate from each other at the active network layer.

  6. Tripartite quantum state sharing.

    PubMed

    Lance, Andrew M; Symul, Thomas; Bowen, Warwick P; Sanders, Barry C; Lam, Ping Koy

    2004-04-30

    We demonstrate a multipartite protocol to securely distribute and reconstruct a quantum state. A secret quantum state is encoded into a tripartite entangled state and distributed to three players. By collaborating, any two of the three players can reconstruct the state, while individual players obtain nothing. We characterize this (2,3) threshold quantum state sharing scheme in terms of fidelity, signal transfer, and reconstruction noise. We demonstrate a fidelity averaged over all reconstruction permutations of 0.73 ± 0.04, a level achievable only using quantum resources.
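
    The (2,3) threshold property itself (any two of three players can reconstruct, one alone learns nothing) has a well-known classical counterpart in Shamir secret sharing. The sketch below is that classical analogue only, offered to illustrate the access structure; it is not the quantum protocol of the paper, and the function names and the choice of prime are assumptions made for the example.

```python
import secrets

PRIME = 2**61 - 1  # a Mersenne prime; the field size is an arbitrary choice here


def split_2_of_3(secret: int):
    """Split `secret` into three shares, any two of which recover it."""
    slope = secrets.randbelow(PRIME)          # random degree-1 coefficient
    # Share i is the point (i, f(i)) on the line f(x) = secret + slope * x.
    return [(x, (secret + slope * x) % PRIME) for x in (1, 2, 3)]


def reconstruct(share_a, share_b) -> int:
    """Recover f(0) from two distinct points by Lagrange interpolation."""
    (x1, y1), (x2, y2) = share_a, share_b
    inverse = pow((x2 - x1) % PRIME, -1, PRIME)   # modular inverse (Python 3.8+)
    return (y1 * x2 - y2 * x1) * inverse % PRIME


if __name__ == "__main__":
    shares = split_2_of_3(123456789)
    print(reconstruct(shares[0], shares[2]))
```

    A single share is one point on a line with a uniformly random slope, so it is consistent with every possible secret; this is the classical sense in which an individual player obtains nothing.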

  7. Impact of network sharing in multi-core architectures.

    SciTech Connect

    Narayanaswamy, G.; Balaji, P.; Feng, W.; Mathematics and Computer Science; Virginia Tech

    2008-01-01

    As commodity components continue to dominate the realm of high-end computing, two hardware trends have emerged as major contributors: high-speed networking technologies and multi-core architectures. Communication middleware such as the Message Passing Interface (MPI) uses the network technology for communicating between processes that reside on different physical nodes, while using shared memory for communicating between processes on different cores within the same node. Thus, two conflicting possibilities arise: (i) with the advent of multi-core architectures, the number of processes that reside on the same physical node and hence share the same physical network can potentially increase significantly, resulting in increased network usage, and (ii) given the increase in intra-node shared-memory communication for processes residing on the same node, the network usage can potentially decrease significantly. In this paper, we address these two conflicting possibilities and study the behavior of network usage in multi-core environments with sample scientific applications. Specifically, we analyze trends that result in increase or decrease of network usage, and we derive insights into application performance based on these. We also study the sharing of different resources in the system in multi-core environments and identify the contribution of the network in this mix. In addition, we study different process allocation strategies and analyze their impact on such network sharing.

  8. Memory Retrieval and Interference: Working Memory Issues

    ERIC Educational Resources Information Center

    Radvansky, Gabriel A.; Copeland, David E.

    2006-01-01

    Working memory capacity has been suggested as a factor that is involved in long-term memory retrieval, particularly when that retrieval involves a need to overcome some sort of interference (Bunting, Conway, & Heitz, 2004; Cantor & Engle, 1993). Previous work has suggested that working memory is related to the acquisition of information during…

  9. Vaccines, our shared responsibility.

    PubMed

    Pagliusi, Sonia; Jain, Rishabh; Suri, Rajinder Kumar

    2015-05-05

    The Developing Countries Vaccine Manufacturers' Network (DCVMN) held its fifteenth annual meeting from October 27-29, 2014, in New Delhi, India. The DCVMN, together with the co-organizing institution Panacea Biotec, welcomed over 240 delegates representing high-profile governmental and nongovernmental global health organizations from 36 countries. Over the three-day meeting, attendees exchanged information about their efforts to achieve their shared goal of preventing death and disability from known and emerging infectious diseases. Special praise was extended to all stakeholders involved in the success of polio eradication in South East Asia, and challenges in vaccine supply for measles-rubella immunization over the coming decades were highlighted. Innovative vaccines and vaccine delivery technologies indicated creative solutions for achieving global immunization goals. Discussions focused on three major themes: regulatory challenges for developing countries that may be overcome with better communication; global collaborations and partnerships for leveraging investments and enabling uninterrupted supply of affordable and suitable vaccines; and leading innovation in vaccines that are difficult to develop, such as dengue, Chikungunya, typhoid-conjugated and EV71, and in needle-free technologies that may speed up vaccine delivery. Moving further into the Decade of Vaccines, participants renewed their commitment to shared responsibility toward a world free of vaccine-preventable diseases.

  10. MEMORY FOR POETRY: MORE THAN MEANING?

    PubMed

    Atchley, Rachel M; Hare, Mary L

    The assumption has become that memory for words' sound patterns, or form, is rapidly lost in comparison to content. Memory for form is also assumed to be verbatim rather than schematic. Oral story-telling traditions suggest otherwise. The present experiment investigated if form can be remembered schematically in spoken poetry, a context in which form is important. We also explored if sleep could help preserve memory for form. We tested whether alliterative sound patterns could cue memory for poetry lines both immediately and after a delay of 12 hours that did or did not include sleep. Twelve alliterative poetry lines were modified into same alliteration, different alliteration, and no alliteration paraphrases. We predicted that memory for original poetry lines would be less accurate after 12 hours, same alliteration paraphrases would be falsely recognized as originals more often after 12 hours, and that the no-sleep group would make more errors. Different alliteration and no alliteration paraphrases were not expected to share this effect due to schematically different sound patterns. Our data support these hypotheses and provide evidence that memory for form is schematic in nature, retained in contexts in which form matters, and that sleep may help preserve memory for sound patterns.

  11. MEMORY FOR POETRY: MORE THAN MEANING?

    PubMed Central

    Atchley, Rachel M.; Hare, Mary L.

    2015-01-01

    The assumption has become that memory for words’ sound patterns, or form, is rapidly lost in comparison to content. Memory for form is also assumed to be verbatim rather than schematic. Oral story-telling traditions suggest otherwise. The present experiment investigated if form can be remembered schematically in spoken poetry, a context in which form is important. We also explored if sleep could help preserve memory for form. We tested whether alliterative sound patterns could cue memory for poetry lines both immediately and after a delay of 12 hours that did or did not include sleep. Twelve alliterative poetry lines were modified into same alliteration, different alliteration, and no alliteration paraphrases. We predicted that memory for original poetry lines would be less accurate after 12 hours, same alliteration paraphrases would be falsely recognized as originals more often after 12 hours, and that the no-sleep group would make more errors. Different alliteration and no alliteration paraphrases were not expected to share this effect due to schematically different sound patterns. Our data support these hypotheses and provide evidence that memory for form is schematic in nature, retained in contexts in which form matters, and that sleep may help preserve memory for sound patterns. PMID:26401226

  12. Optical memory

    DOEpatents

    Mao, Samuel S; Zhang, Yanfeng

    2013-07-02

    Optical memory comprising: a semiconductor wire, a first electrode, a second electrode, a light source, a means for producing a first voltage at the first electrode, a means for producing a second voltage at the second electrode, and a means for determining the presence of an electrical voltage across the first electrode and the second electrode exceeding a predefined voltage. The first voltage, preferably less than 0 volts, is different from said second voltage. The semiconductor wire is optically transparent and has a bandgap less than the energy produced by the light source. The light source is optically connected to the semiconductor wire. The first electrode and the second electrode are electrically insulated from each other and said semiconductor wire.

  13. 76 FR 55065 - Change in Bank Control Notices; Acquisitions of Shares of a Bank or Bank Holding Company

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-09-06

    ... Denney, Assistant Vice President) 1 Memorial Drive, Kansas City, Missouri 64198-0001: 1. Gregory J. Weed, Cheyenne Wells, Colorado; to acquire voting shares of Weed Investment Group, Inc., and thereby...

  14. Order-memory and association-memory.

    PubMed

    Caplan, Jeremy B

    2015-09-01

    Two highly studied memory functions are memory for associations (items presented in pairs, such as SALT-PEPPER) and memory for order (a list of items whose order matters, such as a telephone number). Order- and association-memory are at the root of many forms of behaviour, from wayfinding, to language, to remembering people's names. Most researchers have investigated memory for order separately from memory for associations. As an exception, associative-chaining models build an ordered list from associations between pairs of items, quite literally understanding association- and order-memory together. Alternatively, positional-coding models have been used to explain order-memory as a completely distinct function from association-memory. Both classes of model have found empirical support and both have faced serious challenges. I argue that models that combine both associative chaining and positional coding are needed. One such hybrid model, which relies on brain-activity rhythms, is promising, but remains to be tested rigorously. I consider two relatively understudied memory behaviours that demand a combination of order- and association-information: memory for the order of items within associations (is it William James or James William?) and judgments of relative order (who left the party earlier, Hermann or William?). Findings from these underexplored procedures are already difficult to reconcile with existing association-memory and order-memory models. Further work with such intermediate experimental paradigms has the potential to provide powerful findings to constrain and guide models into the future, with the aim of explaining a large range of memory functions, encompassing both association- and order-memory.

  15. Model Sharing and Collaboration using HydroShare

    NASA Astrophysics Data System (ADS)

    Goodall, J. L.; Morsy, M. M.; Castronova, A. M.; Miles, B.; Merwade, V.; Tarboton, D. G.

    2015-12-01

    HydroShare is a web-based system funded by the National Science Foundation (NSF) for sharing hydrologic data and models as resources. Resources in HydroShare can either be assigned a generic type, meaning the resource only has Dublin Core metadata properties, or one of a growing number of specific resource types with enhanced metadata profiles defined by the HydroShare development team. Examples of specific resource types in the current release of HydroShare (http://www.hydroshare.org) include time series, geographic raster, multidimensional (NetCDF), model program, and model instance. Here we describe research and development efforts in the HydroShare project for model-related resource types. This work has included efforts to define metadata profiles for common modeling resources, execute models directly through the HydroShare user interface using Docker containers, and interoperate with the third-party application SWATShare for model execution and visualization. These examples demonstrate the benefit of HydroShare to support model sharing and address collaborative problems involving modeling. The presentation will conclude with plans for future modeling-related development in HydroShare, including supporting the publication of workflow resources, enhanced metadata for additional hydrologic models, and linking model resources with other resources in HydroShare to capture model provenance.

  16. School Nurses Share a Job.

    ERIC Educational Resources Information Center

    Merwin, Elizabeth G.; Voss, Sondra

    1981-01-01

    Job sharing is a relatively new idea in which two or more people share the hours, the work, and the responsibilities of one job. Advantages and disadvantages to this situation are discussed in relation to the experiences of two nurses who shared a position as district nurse. (JN)

  17. Sharing Educational Services. PREP-13.

    ERIC Educational Resources Information Center

    Jongeward, Ray; Heesacker, Frank

    The focus of this report is on shared services in the rural setting. The kit contains three documents of useful information for any school planning a shared service activity to improve rural education. 13-A identifies 215 shared services in 50 states along with an indexing of each service by subject area and by state. 13-B is a series of 10…

  18. Fractions: How to Fair Share

    ERIC Educational Resources Information Center

    Wilson, P. Holt; Edgington, Cynthia P.; Nguyen, Kenny H.; Pescosolido, Ryan S.; Confrey, Jere

    2011-01-01

    Children learn from a very early age what it means to get their "fair share." Whether it is candy or birthday cake, many children successfully create equal-size groups or parts of a collection or whole but later struggle to create fair shares of multiple wholes, such as fairly sharing four pies among a family of seven. Recent research suggests…

  19. Shared Governance: Balancing the Euphoria.

    ERIC Educational Resources Information Center

    Guffey, J. Stephen; Rampp, Lary C.

    This paper presents an alternative view of shared governance within higher education institutions, examining the major problems encountered by institutions as they implement a shared governance model. Based on a review of the literature, it argues that shared governance, though increasingly popular in recent years, is an issue that should be…

  20. Handling debugger breakpoints in a shared instruction system

    DOEpatents

    Gooding, Thomas Michael; Shok, Richard Michael

    2014-01-21

    A debugger debugs processes that execute shared instructions so that a breakpoint set for one process will not cause a breakpoint to occur in the other processes. A breakpoint is set by recording the original instruction at the desired location and writing a trap instruction to the shared instructions at that location. When a process encounters the breakpoint, the process passes control to the debugger for breakpoint processing if the breakpoint was set at that location for that process. If the trap was not set at that location for that process, the cacheline containing the trap is copied to a small scratchpad memory, and the virtual memory mappings are changed to translate the virtual address of the cacheline to the scratchpad. The original instruction is then written to replace the trap instruction in the scratchpad, so that the process can execute the instructions in the scratchpad, thereby avoiding the trap instruction.
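
    The scheme in this abstract can be illustrated with a toy simulation. The sketch below is not the patented implementation: the class name SharedTextDebugger, the four-instruction cacheline, the string "instructions", and the dictionary standing in for virtual memory remapping are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

TRAP = "TRAP"
LINE_SIZE = 4  # instructions per cacheline (illustrative value, an assumption)


@dataclass
class SharedTextDebugger:
    """Toy model of per-process breakpoints over shared instructions."""
    shared: list                                    # instruction memory shared by all processes
    originals: dict = field(default_factory=dict)   # addr -> original instruction
    owners: dict = field(default_factory=dict)      # addr -> pids with a breakpoint there
    scratch: dict = field(default_factory=dict)     # (pid, line index) -> private cacheline copy

    def set_breakpoint(self, pid, addr):
        # Record the original instruction once, then write the trap into shared memory.
        if addr not in self.originals:
            self.originals[addr] = self.shared[addr]
            self.shared[addr] = TRAP
        self.owners.setdefault(addr, set()).add(pid)

    def fetch(self, pid, addr):
        # A remapped cacheline is read from the process's scratchpad copy.
        line, offset = divmod(addr, LINE_SIZE)
        private = self.scratch.get((pid, line))
        return private[offset] if private is not None else self.shared[addr]

    def step(self, pid, addr):
        instr = self.fetch(pid, addr)
        if instr != TRAP:
            return ("exec", instr)
        if pid in self.owners.get(addr, set()):
            return ("break", addr)  # the breakpoint belongs to this process
        # The trap belongs to another process: copy the cacheline to a scratchpad,
        # "remap" it for this pid, and restore the original instruction in the copy.
        line, offset = divmod(addr, LINE_SIZE)
        start = line * LINE_SIZE
        private = self.scratch.setdefault(
            (pid, line), list(self.shared[start:start + LINE_SIZE]))
        private[offset] = self.originals[addr]
        return ("exec", self.fetch(pid, addr))


dbg = SharedTextDebugger(shared=["load", "add", "store", "jump", "nop", "ret"])
dbg.set_breakpoint(pid=1, addr=1)
print(dbg.step(1, 1))  # process 1 owns the breakpoint and stops
print(dbg.step(2, 1))  # process 2 runs the original instruction from its scratchpad
```

    In this sketch the shared copy keeps its trap, so process 1 still stops at that address, while process 2 executes from its private copy of the cacheline. A breakpoint set for a process after its cacheline has been remapped is not handled here.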

  1. Feature-Based Memory-Driven Attentional Capture: Visual Working Memory Content Affects Visual Attention

    ERIC Educational Resources Information Center

    Olivers, Christian N. L.; Meijer, Frank; Theeuwes, Jan

    2006-01-01

    In 7 experiments, the authors explored whether visual attention (the ability to select relevant visual information) and visual working memory (the ability to retain relevant visual information) share the same content representations. The presence of singleton distractors interfered more strongly with a visual search task when it was accompanied by…

  2. Emotional memory persists longer than event memory.

    PubMed

    Kuriyama, Kenichi; Soshi, Takahiro; Fujii, Takeshi; Kim, Yoshiharu

    2010-03-01

    The interaction between amygdala-driven and hippocampus-driven activities is expected to explain why emotion enhances episodic memory recognition. However, overwhelming behavioral evidence regarding the emotion-induced enhancement of immediate and delayed episodic memory recognition has not been obtained in humans. We found that the recognition performance for event memory differs from that for emotional memory. Although event recognition deteriorated equally for episodes that were or were not emotionally salient, emotional recognition remained high for only stimuli related to emotional episodes. Recognition performance pertaining to delayed emotional memory is an accurate predictor of the context of past episodes.

  3. SHARED TECHNOLOGY TRANSFER PROGRAM

    SciTech Connect

    GRIFFIN, JOHN M.; HAUT, RICHARD C.

    2008-03-07

    The program established a collaborative process with domestic industries for the purpose of sharing Navy-developed technology. Private sector businesses were educated so as to increase their awareness of the vast amount of technologies that are available, with an initial focus on technology applications that are related to the Hydrogen, Fuel Cells and Infrastructure Technologies (Hydrogen) Program of the U.S. Department of Energy. Specifically, the project worked to increase industry awareness of the vast technology resources available to them that have been developed with taxpayer funding. NAVSEA-Carderock and the Houston Advanced Research Center teamed with Nicholls State University to catalog NAVSEA-Carderock unclassified technologies, rated the level of readiness of the technologies and established a web based catalog of the technologies. In particular, the catalog contains technology descriptions, including testing summaries and overviews of related presentations.

  4. Sharing a disparate landscape

    NASA Astrophysics Data System (ADS)

    Ali-Khan, Carolyne

    2010-06-01

    Working across boundaries of power, identity, and political geography is fraught with difficulties and contradictions. In Tali Tal and Iris Alkaher's "Collaborative environmental projects in a multicultural society: Working from within separate or mutual landscapes?" the authors describe their efforts to do this in the highly charged atmosphere of Israel. This forum article offers a response to their efforts. Writing from a framework of critical pedagogy, I use the concepts of space and time to anchor my analysis, as I examine the issue of power in this Jew/Arab collaborative environmental project. This response problematizes "sharing" in a landscape fraught with disparities. It also looks to further Tal and Alkaher's work by geographically and politically grounding it in the broader current conflict and by juxtaposing sustainability with equity.

  5. University Reactor Sharing Program

    SciTech Connect

    W.D. Reese

    2004-02-24

    Research projects supported by the program include items such as dating geological material and producing high-current superconducting magnets. The funding continues to give small colleges and universities the valuable opportunity to use the NSC for teaching courses in nuclear processes, specifically neutron activation analysis and gamma spectroscopy. The Reactor Sharing Program has supported the construction of a Fast Neutron Flux Irradiator for users at New Mexico Institute of Mining and Technology and the University of Houston. This device has been characterized and has been found to have near-optimum neutron fluxes for Ar-39/Ar-40 dating. Institution final reports and publications resulting from the use of these funds are on file at the Nuclear Science Center.

  6. Similar verbal memory impairments in schizophrenia and healthy aging. Implications for understanding of neural mechanisms.

    PubMed

    Silver, Henry; Bilker, Warren B

    2015-03-30

    Memory is impaired in schizophrenia patients but it is not clear whether this is specific to the illness and whether different types of memory (verbal and nonverbal) or memories in different cognitive domains (executive, object recognition) are similarly affected. To study relationships between memory impairments and schizophrenia we compared memory functions in 77 schizophrenia patients, 58 elderly healthy individuals and 41 young healthy individuals. Tests included verbal associative and logical memory and memory in executive and object recognition domains. We compared relationships of memory functions to each other and to other cognitive functions including psychomotor speed and verbal and spatial working memory. Compared to the young healthy group, schizophrenia patients and elderly healthy individuals showed similar severe impairment in logical memory and in the ability to learn new associations (NAL), and similar but less severe impairment in spatial working memory and executive and object memory. Verbal working memory was significantly more impaired in schizophrenia patients than in the healthy elderly. Verbal episodic memory impairment in schizophrenia may share common mechanisms with similar impairment in healthy aging. Impairment in verbal working memory in contrast may reflect mechanisms specific to schizophrenia. Study of verbal explicit memory impairment tapped by the NAL index may advance understanding of abnormal hippocampus dependent mechanisms common to schizophrenia and aging.

  7. Interference from mere thinking: mental rehearsal temporarily disrupts recall of motor memory.

    PubMed

    Yin, Cong; Wei, Kunlin

    2014-08-01

    Interference between successively learned tasks is widely investigated to study motor memory. However, how simultaneously learned motor memories interact with each other has been rarely studied despite its prevalence in daily life. Assuming that motor memory shares common neural mechanisms with declarative memory system, we made unintuitive predictions that mental rehearsal, as opposed to further practice, of one motor memory will temporarily impair the recall of another simultaneously learned memory. Subjects simultaneously learned two sensorimotor tasks, i.e., visuomotor rotation and gain. They retrieved one memory by either practice or mental rehearsal and then had their memory evaluated. We found that mental rehearsal, instead of execution, impaired the recall of unretrieved memory. This impairment was content-independent, i.e., retrieving either gain or rotation impaired the other memory. Hence, conscious recollection of one motor memory interferes with the recall of another memory. This is analogous to retrieval-induced forgetting in declarative memory, suggesting a common neural process across memory systems. Our findings indicate that motor imagery is sufficient to induce interference between motor memories. Mental rehearsal, currently widely regarded as beneficial for motor performance, negatively affects memory recall when it is exercised for a subset of memorized items.

  8. Neural Differentiation of Incorrectly Predicted Memories.

    PubMed

    Kim, Ghootae; Norman, Kenneth A; Turk-Browne, Nicholas B

    2017-02-22

    When an item is predicted in a particular context but the prediction is violated, memory for that item is weakened (Kim et al., 2014). Here, we explore what happens when such previously mispredicted items are later reencountered. According to prior neural network simulations, this sequence of events (misprediction and subsequent restudy) should lead to differentiation of the item's neural representation from the previous context (on which the misprediction was based). Specifically, misprediction weakens connections in the representation to features shared with the previous context and restudy allows new features to be incorporated into the representation that are not shared with the previous context. This cycle of misprediction and restudy should have the net effect of moving the item's neural representation away from the neural representation of the previous context. We tested this hypothesis using human fMRI by tracking changes in item-specific BOLD activity patterns in the hippocampus, a key structure for representing memories and generating predictions. In left CA2/3/DG, we found greater neural differentiation for items that were repeatedly mispredicted and restudied compared with items from a control condition that was identical except without misprediction. We also measured prediction strength in a trial-by-trial fashion and found that greater misprediction for an item led to more differentiation, further supporting our hypothesis. Therefore, the consequences of prediction error go beyond memory weakening. If the mispredicted item is restudied, the brain adaptively differentiates its memory representation to improve the accuracy of subsequent predictions and to shield it from further weakening. SIGNIFICANCE STATEMENT: Competition between overlapping memories leads to weakening of nontarget memories over time, making it easier to access target memories. However, a nontarget memory in one context might become a target memory in another context. How do such memories

  9. Memory Metals

    NASA Technical Reports Server (NTRS)

    1995-01-01

    Under contract to NASA during preparations for the space station, Memry Technologies Inc. investigated shape memory effect (SME). SME is a characteristic of certain metal alloys that can change shape in response to temperature variations. In the late 1980s and early 1990s, Memry used its NASA-acquired expertise to produce a line of home and industrial safety products, and refined the technology in the mid-1990s. Among the new products they developed are three MemrySafe units which prevent scalding from faucets. Each system contains a small valve that reacts to temperature, not pressure. When the water reaches dangerous temperatures, the unit reduces the flow to a trickle; when the scalding temperature subsides, the unit restores normal flow. Other products are the FIRECHEK 2 and 4, heat-activated shutoff valves for industrial process lines, which sense excessive heat and cut off pneumatic pressure. The newest of these products is Memry's Demand Management Water Heater which shifts the electricity requirement from peak to off-peak demands, conserving energy and money.

  10. The Effects of Task Structure on Time-sharing Efficiency and Resource Allocation Optimality

    NASA Technical Reports Server (NTRS)

    Tsang, P. S.; Wickens, C. D.

    1984-01-01

    A distinction was made between two aspects of time-sharing performance: time-sharing efficiency and attention allocation optimality. A secondary task technique was employed to evaluate the effects of the task structures of the component time-shared tasks on both aspects of time-sharing performance. Five pairs of dual tasks differing in their structural configurations were investigated. The primary task was a visual/manual tracking task that required spatial processing. The secondary task was either another tracking task or a verbal memory task with one of four different input/output configurations. Consistent with a common finding, time-sharing efficiency was observed to decrease with an increasing overlap of resources utilized by the time-shared tasks. The results also tend to support the hypothesis that resource allocation is more optimal when the time-shared tasks place heavy demands on common processing resources than when they utilize separate resources.

  11. The science of sharing and the sharing of science

    PubMed Central

    Milkman, Katherine L.; Berger, Jonah

    2014-01-01

    Why do members of the public share some scientific findings and not others? What can scientists do to increase the chances that their findings will be shared widely among nonscientists? To address these questions, we integrate past research on the psychological drivers of interpersonal communication with a study examining the sharing of hundreds of recent scientific discoveries. Our findings offer insights into (i) how attributes of a discovery and the way it is described impact sharing, (ii) who generates discoveries that are likely to be shared, and (iii) which types of people are most likely to share scientific discoveries. The results described here, combined with a review of recent research on interpersonal communication, suggest how scientists can frame their work to increase its dissemination. They also provide insights about which audiences may be the best targets for the diffusion of scientific content. PMID:25225360

  12. Data sharing in neuroimaging research

    PubMed Central

    Poline, Jean-Baptiste; Breeze, Janis L.; Ghosh, Satrajit; Gorgolewski, Krzysztof; Halchenko, Yaroslav O.; Hanke, Michael; Haselgrove, Christian; Helmer, Karl G.; Keator, David B.; Marcus, Daniel S.; Poldrack, Russell A.; Schwartz, Yannick; Ashburner, John; Kennedy, David N.

    2012-01-01

    Significant resources around the world have been invested in neuroimaging studies of brain function and disease. Easier access to this large body of work should have profound impact on research in cognitive neuroscience and psychiatry, leading to advances in the diagnosis and treatment of psychiatric and neurological disease. A trend toward increased sharing of neuroimaging data has emerged in recent years. Nevertheless, a number of barriers continue to impede momentum. Many researchers and institutions remain uncertain about how to share data or lack the tools and expertise to participate in data sharing. The use of electronic data capture (EDC) methods for neuroimaging greatly simplifies the task of data collection and has the potential to help standardize many aspects of data sharing. We review here the motivations for sharing neuroimaging data, the current data sharing landscape, and the sociological or technical barriers that still need to be addressed. The INCF Task Force on Neuroimaging Datasharing, in conjunction with several collaborative groups around the world, has started work on several tools to ease and eventually automate the practice of data sharing. It is hoped that such tools will allow researchers to easily share raw, processed, and derived neuroimaging data, with appropriate metadata and provenance records, and will improve the reproducibility of neuroimaging studies. By providing seamless integration of data sharing and analysis tools within a commodity research environment, the Task Force seeks to identify and minimize barriers to data sharing in the field of neuroimaging. PMID:22493576

  13. Contexts as Shared Commitments

    PubMed Central

    García-Carpintero, Manuel

    2015-01-01

    Contemporary semantics assumes two influential notions of context: one coming from Kaplan (1989), on which contexts are sets of predetermined parameters, and another originating in Stalnaker (1978), on which contexts are sets of propositions that are “common ground.” The latter is deservedly more popular, given its flexibility in accounting for context-dependent aspects of language beyond manifest indexicals, such as epistemic modals, predicates of taste, and so on and so forth; in fact, properly dealing with demonstratives (perhaps ultimately all indexicals) requires that further flexibility. Even if we acknowledge Lewis (1980)'s point that, in a sense, Kaplanian contexts already include common ground contexts, it is better to be clear and explicit about what contexts constitutively are. Now, Stalnaker (1978, 2002, 2014) defines context-as-common-ground as a set of propositions, but recent work shows that this is not an accurate conception. The paper explains why, and provides an alternative. The main reason is that several phenomena (presuppositional treatments of pejoratives and predicates of taste, forces other than assertion) require that the common ground includes non-doxastic attitudes such as appraisals, emotions, etc. Hence the common ground should not be taken to include merely contents (propositions), but those together with attitudes concerning them: shared commitments, as I will defend. PMID:26733087

  14. Contexts as Shared Commitments.

    PubMed

    García-Carpintero, Manuel

    2015-01-01

    Contemporary semantics assumes two influential notions of context: one coming from Kaplan (1989), on which contexts are sets of predetermined parameters, and another originating in Stalnaker (1978), on which contexts are sets of propositions that are "common ground." The latter is deservedly more popular, given its flexibility in accounting for context-dependent aspects of language beyond manifest indexicals, such as epistemic modals, predicates of taste, and so on and so forth; in fact, properly dealing with demonstratives (perhaps ultimately all indexicals) requires that further flexibility. Even if we acknowledge Lewis (1980)'s point that, in a sense, Kaplanian contexts already include common ground contexts, it is better to be clear and explicit about what contexts constitutively are. Now, Stalnaker (1978, 2002, 2014) defines context-as-common-ground as a set of propositions, but recent work shows that this is not an accurate conception. The paper explains why, and provides an alternative. The main reason is that several phenomena (presuppositional treatments of pejoratives and predicates of taste, forces other than assertion) require that the common ground includes non-doxastic attitudes such as appraisals, emotions, etc. Hence the common ground should not be taken to include merely contents (propositions), but those together with attitudes concerning them: shared commitments, as I will defend.

  15. Sharing Lessons Learned

    SciTech Connect

    Mohler, Bryan L.

    2004-09-01

    Workplace safety is inextricably tied to the culture – the leadership, management and organization – of the entire company. Nor is a safety lesson fundamentally different from any other business lesson. With these points in mind, Pacific Northwest National Laboratory recast its lessons learned program in 2000. The laboratory retained elements of a traditional lessons learned program, such as tracking and trending safety metrics, and added a best practices element to increase staff involvement in creating a safer, healthier work environment. Today, the Lessons Learned/Best Practices program offers the latest business thinking summarized from current external publications and shares better ways PNNL staff have discovered for doing things. According to PNNL strategic planning director Marilyn Quadrel, the goal is to sharpen the business acumen, project management ability and leadership skills of all staff and to capture the benefits of practices that emerge from lessons learned. A key tool in the PNNL effort to accelerate learning from past mistakes is one that can be easily implemented by other firms and tailored to their specific needs. It is the weekly placement of Lessons Learned/Best Practices articles in the lab’s internal electronic newsletter. The program is equally applicable in highly regulated environments, such as the national laboratories, and in enterprises that may have fewer external requirements imposed on their operations. And it is cost effective, using less than the equivalent of one full-time person to administer.

  16. Measuring the phenomenology of autobiographical memory: A short form of the Memory Experiences Questionnaire.

    PubMed

    Luchetti, Martina; Sutin, Angelina R

    2016-01-01

    The Memory Experiences Questionnaire (MEQ) is a theoretically driven and empirically validated 63-item self-report scale designed to measure 10 phenomenological qualities of autobiographical memories: Vividness, Coherence, Accessibility, Time Perspective, Sensory Details, Visual Perspective, Emotional Intensity, Sharing, Distancing and Valence. To develop a short form of the MEQ to use when time is limited, participants from two samples (N = 719; N = 352) retrieved autobiographical memories, rated the phenomenological experience of each memory and completed several scales measuring psychological distress. For each MEQ dimension, the number of items was reduced by one-half based on item content and item-total correlations. Each short-form scale had acceptable internal consistency (median alpha = .79), and, similar to the long-form version of the scales, the new short scales correlated with psychological distress in theoretically meaningful ways. The new short form of the MEQ has psychometric properties similar to those of the original long form and can be used when time is limited.

  17. Memory bias for negative emotional words in recognition memory is driven by effects of category membership

    PubMed Central

    White, Corey N.; Kapucu, Aycan; Bruno, Davide; Rotello, Caren M.; Ratcliff, Roger

    2014-01-01

    Recognition memory studies often find that emotional items are more likely than neutral items to be labeled as studied. Previous work suggests this bias is driven by increased memory strength/familiarity for emotional items. We explored strength and bias interpretations of this effect with the conjecture that emotional stimuli might seem more familiar because they share features with studied items from the same category. Categorical effects were manipulated in a recognition task by presenting lists with a small, medium, or large proportion of emotional words. The liberal memory bias for emotional words was only observed when a medium or large proportion of categorized words were presented in the lists. Similar, though weaker, effects were observed with categorized words that were not emotional (animal names). These results suggest that liberal memory bias for emotional items may be largely driven by effects of category membership. PMID:24303902

  18. Memory bias for negative emotional words in recognition memory is driven by effects of category membership.

    PubMed

    White, Corey N; Kapucu, Aycan; Bruno, Davide; Rotello, Caren M; Ratcliff, Roger

    2014-01-01

    Recognition memory studies often find that emotional items are more likely than neutral items to be labelled as studied. Previous work suggests this bias is driven by increased memory strength/familiarity for emotional items. We explored strength and bias interpretations of this effect with the conjecture that emotional stimuli might seem more familiar because they share features with studied items from the same category. Categorical effects were manipulated in a recognition task by presenting lists with a small, medium or large proportion of emotional words. The liberal memory bias for emotional words was only observed when a medium or large proportion of categorised words were presented in the lists. Similar, though weaker, effects were observed with categorised words that were not emotional (animal names). These results suggest that liberal memory bias for emotional items may be largely driven by effects of category membership.

  19. Relating Hippocampus to Relational Memory Processing across Domains and Delays

    PubMed Central

    Monti, Jim M.; Cooke, Gillian E.; Watson, Patrick D.; Voss, Michelle W.; Kramer, Arthur F.; Cohen, Neal J.

    2015-01-01

    The hippocampus has been implicated in a diverse set of cognitive domains and paradigms, including cognitive mapping, long-term memory, and relational memory, at long or short study–test intervals. Despite the diversity of these areas, their association with the hippocampus may rely on an underlying commonality of relational memory processing shared among them. Most studies assess hippocampal memory within just one of these domains, making it difficult to know whether these paradigms all assess a similar underlying cognitive construct tied to the hippocampus. Here we directly tested the commonality among disparate tasks linked to the hippocampus by using PCA on performance from a battery of 12 cognitive tasks that included two traditional, long-delay neuropsychological tests of memory and two laboratory tests of relational memory (one of spatial and one of visual object associations) that imposed only short delays between study and test. Also included were different tests of memory, executive function, and processing speed. Structural MRI scans from a subset of participants were used to quantify the volume of the hippocampus and other subcortical regions. Results revealed that the 12 tasks clustered into four components; critically, the two neuropsychological tasks of long-term verbal memory and the two laboratory tests of relational memory loaded onto one component. Moreover, bilateral hippocampal volume was strongly tied to performance on this component. Taken together, these data emphasize the important contribution the hippocampus makes to relational memory processing across a broad range of tasks that span multiple domains. PMID:25203273

  20. Relating hippocampus to relational memory processing across domains and delays.

    PubMed

    Monti, Jim M; Cooke, Gillian E; Watson, Patrick D; Voss, Michelle W; Kramer, Arthur F; Cohen, Neal J

    2015-02-01

    The hippocampus has been implicated in a diverse set of cognitive domains and paradigms, including cognitive mapping, long-term memory, and relational memory, at long or short study-test intervals. Despite the diversity of these areas, their association with the hippocampus may rely on an underlying commonality of relational memory processing shared among them. Most studies assess hippocampal memory within just one of these domains, making it difficult to know whether these paradigms all assess a similar underlying cognitive construct tied to the hippocampus. Here we directly tested the commonality among disparate tasks linked to the hippocampus by using PCA on performance from a battery of 12 cognitive tasks that included two traditional, long-delay neuropsychological tests of memory and two laboratory tests of relational memory (one of spatial and one of visual object associations) that imposed only short delays between study and test. Also included were different tests of memory, executive function, and processing speed. Structural MRI scans from a subset of participants were used to quantify the volume of the hippocampus and other subcortical regions. Results revealed that the 12 tasks clustered into four components; critically, the two neuropsychological tasks of long-term verbal memory and the two laboratory tests of relational memory loaded onto one component. Moreover, bilateral hippocampal volume was strongly tied to performance on this component. Taken together, these data emphasize the important contribution the hippocampus makes to relational memory processing across a broad range of tasks that span multiple domains.

  1. Memory beyond expression.

    PubMed

    Delorenzi, A; Maza, F J; Suárez, L D; Barreiro, K; Molina, V A; Stehberg, J

    2014-01-01

    The idea that memories are not invariable after the consolidation process has led to new perspectives about several mnemonic processes. In this framework, we review our studies on the modulation of memory expression during reconsolidation. We propose that during both memory consolidation and reconsolidation, neuromodulators can determine the probability that the memory trace guides behavior, i.e. they can either increase or decrease its behavioral expressibility without affecting the potential of persistent memories to be activated and become labile. Our hypothesis is based on the findings that positive modulation of memory expression during reconsolidation occurs even if memories are behaviorally unexpressed. This review discusses the original approach taken in the studies of the crab Neohelice (Chasmagnathus) granulata, which was then successfully applied to test the hypothesis in rodent fear memory. The data presented offer a new way of thinking about both weak trainings and experimental amnesia: memory retrieval can be dissociated from memory expression. Furthermore, the strategy presented here allowed us to show in human declarative memory that the periods in which long-term memory can be activated and become labile during reconsolidation exceed the periods in which that memory is expressed, providing direct evidence that conscious access to memory is not needed for reconsolidation. Specific controls based on the constraints of reminders to trigger reconsolidation allow us to distinguish between obliterated and unexpressed but activated long-term memories after amnesic treatments, weak trainings and forgetting. In the hypothesis discussed, memory expressibility (the outcome of experience-dependent changes in the potential to behave) is considered a flexible and modulable attribute of long-term memories. Expression seems to be just one of the possible fates of re-activated memories.

  2. Ageing-related stereotypes in memory: When the beliefs come true.

    PubMed

    Bouazzaoui, Badiâa; Follenfant, Alice; Ric, François; Fay, Séverine; Croizet, Jean-Claude; Atzeni, Thierry; Taconnat, Laurence

    2016-01-01

    Age-related stereotype concerns culturally shared beliefs about the inevitable decline of memory with age. In this study, stereotype priming and stereotype threat manipulations were used to explore the impact of age-related stereotype on metamemory beliefs and episodic memory performance. Ninety-two older participants who reported the same perceived memory functioning were divided into two groups: a threatened group and a non-threatened group (control). First, the threatened group was primed with an ageing stereotype questionnaire. Then, both groups were administered memory complaints and memory self-efficacy questionnaires to measure metamemory beliefs. Finally, both groups were administered the Logical Memory task to measure episodic memory; for the threatened group, the instructions were manipulated to enhance the stereotype threat. Results indicated that the threatened individuals reported more memory complaints and less memory efficacy, and had lower scores than the control group on the logical memory task. A multiple mediation analysis revealed that the stereotype threat effect on the episodic memory performance was mediated by both memory complaints and memory self-efficacy. This study revealed that stereotype threat impacts belief in one's own memory functioning, which in turn impairs episodic memory performance.

  3. Benzodiazepines and memory

    PubMed Central

    Roth, T.; Roehrs, T.; Wittig, R.; Zorick, F.

    1984-01-01

    1 Benzodiazepines possess anterograde amnesic properties, disrupting both short-term and long-term memory function. 2 The amount of amnesia is systematically related to dose effects and half-life differences among the benzodiazepines. 3 Memory deficits are found for episodic, semantic, and iconic memory function. 4 The deficits in long-term memory are probably the result of a disruption of consolidation of information in memory and not retrieval from memory. The disruption is produced by rapid sleep onset. 5 Thus the long-term amnesia is really a retrograde effect of sleep and not the anterograde effect of the drug. PMID:6151849

  4. Problems of neural memory

    NASA Astrophysics Data System (ADS)

    Mikaelian, Andrei L.

    2005-01-01

    The paper considers the neural memory of the human brain from the viewpoint of visual information processing. A model that explains the principle of data recording and storing, memory relaxation, associative remembering and other memory functions is offered. The model of associative memory is based on the methods of holography, "wave biochemistry" and autowaves. Brief consideration is given to the associative properties of holographic neural structures and the memory architecture using running chemical reactions. The paper also outlines the problem of developing artificial memory elements for restoring the brain functions and possible interface devices for coupling neurons to electronic systems.

  5. Cooperative Data Sharing: Simple Support for Clusters of SMP Nodes

    NASA Technical Reports Server (NTRS)

    DiNucci, David C.; Balley, David H. (Technical Monitor)

    1997-01-01

    Libraries like PVM and MPI send typed messages to allow for heterogeneous cluster computing. Lower-level libraries, such as GAM, provide more efficient access to communication by removing the need to copy messages between the interface and user space in some cases. Still lower-level interfaces, such as UNET, get right down to the hardware level to provide maximum performance. However, these are all still interfaces for passing messages from one process to another, and have limited utility in a shared-memory environment, due primarily to the fact that message passing is just another term for copying. This drawback is made more pertinent by today's hybrid architectures (e.g. clusters of SMPs), where it is difficult to know beforehand whether two communicating processes will share memory. As a result, even portable language tools (like HPF compilers) must either map all interprocess communication into message passing, with the accompanying performance degradation in shared-memory environments, or they must check each communication at run-time and implement the shared-memory case separately for efficiency. Cooperative Data Sharing (CDS) is a single user-level API which abstracts all communication between processes into the sharing and access coordination of memory regions, in a model which might be described as "distributed shared messages" or "large-grain distributed shared memory". As a result, the user programs to a simple latency-tolerant abstract communication specification which can be mapped efficiently to either a shared-memory or message-passing based run-time system, depending upon the available architecture. Unlike some distributed shared memory interfaces, the user still has complete control over the assignment of data to processors, the forwarding of data to its next likely destination, and the queuing of data until it is needed, so even the relatively high latency present in clusters can be accommodated. CDS does not require special use of an MMU, which

  6. Technology Mediated Information Sharing (Monitor Sharing) in Primary Care Encounters

    ERIC Educational Resources Information Center

    Asan, Onur

    2013-01-01

    The aim of this dissertation study was to identify and describe the use of electronic health records (EHRs) for information sharing between patients and clinicians in primary-care encounters and to understand work system factors influencing information sharing. Ultimately, this will promote better design of EHR technologies and effective training…

  7. Decomposing the relationship between cognitive functioning and self-referent memory beliefs in older adulthood: what's memory got to do with it?

    PubMed

    Payne, Brennan R; Gross, Alden L; Hill, Patrick L; Parisi, Jeanine M; Rebok, George W; Stine-Morrow, Elizabeth A L

    2016-08-12

    With advancing age, episodic memory performance shows marked declines along with concurrent reports of lower subjective memory beliefs. Given that normative age-related declines in episodic memory co-occur with declines in other cognitive domains, we examined the relationship between memory beliefs and multiple domains of cognitive functioning. Confirmatory bi-factor structural equation models were used to parse the shared and independent variance among factors representing episodic memory, psychomotor speed, and executive reasoning in one large cohort study (Senior Odyssey, N = 462), and replicated using another large cohort of healthy older adults (ACTIVE, N = 2802). Accounting for a general fluid cognitive functioning factor (comprised of the shared variance among measures of episodic memory, speed, and reasoning) attenuated the relationship between objective memory performance and subjective memory beliefs in both samples. Moreover, the general cognitive functioning factor was the strongest predictor of memory beliefs in both samples. These findings are consistent with the notion that dispositional memory beliefs may reflect perceptions of cognition more broadly. This may be one reason why memory beliefs have broad predictive validity for interventions that target fluid cognitive ability.

  8. Knowledge of memory functions in European and Asian American adults and children: the relation to autobiographical memory.

    PubMed

    Wang, Qi; Koh, Jessie Bee Kim; Song, Qingfang; Hou, Yubo

    2015-01-01

    This study investigated explicit knowledge of autobiographical memory functions using a newly developed questionnaire. European and Asian American adults (N = 57) and school-aged children (N = 68) indicated their agreement with 13 statements about why people think about and share memories pertaining to four broad functions-self, social, directive and emotion regulation. Children were interviewed for personal memories concurrently with the memory function knowledge assessment and again 3 months later. It was found that adults agreed to the self, social and directive purposes of memory to a greater extent than did children, whereas European American children agreed to the emotion regulation purposes of memory to a greater extent than did European American adults. Furthermore, European American children endorsed more self and emotion regulation functions than did Asian American children, whereas Asian American adults endorsed more directive functions than did European American adults. Children's endorsement of memory functions, particularly social functions, was associated with more detailed and personally meaningful memories. These findings are informative for the understanding of developmental and cultural influences on memory function knowledge and of the relation of such knowledge to autobiographical memory development.

  9. Transcranial magnetic stimulation of visual cortex in memory: cortical state, interference and reactivation of visual content in memory.

    PubMed

    van de Ven, Vincent; Sack, Alexander T

    2013-01-01

    Memory for perceptual events includes the neural representation of the sensory information at short or longer time scales. Recent transcranial magnetic stimulation (TMS) studies of human visual cortex provided evidence that sensory cortex contributes to memory functions. In this review, we provide an exhaustive overview of these studies and ascertain how well the available evidence supports the idea of a causal role of sensory cortex in memory retention and retrieval. We discuss the validity and implications of the studies using a number of methodological and theoretical criteria that are relevant for brain stimulation of visual cortex. While most studies applied TMS to visual cortex to interfere with memory functions, a handful of pioneering studies used TMS to 'reactivate' memories in visual cortex. Interestingly, similar effects of TMS on memory were found in different memory tasks, which suggests that different memory systems share a neural mechanism of memory in visual cortex. At the same time, this neural mechanism likely interacts with higher order brain areas. Based on this overview and evaluation, we provide a first attempt at an integrative framework that describes how sensory processes contribute to memory in visual cortex, and how higher order areas contribute to this mechanism.

  10. Towards Distributed Memory Parallel Program Analysis

    SciTech Connect

    Quinlan, D; Barany, G; Panas, T

    2008-06-17

    This paper presents a parallel attribute evaluation for distributed memory parallel computer architectures where previously only shared memory parallel support for this technique has been developed. Attribute evaluation is a part of how attribute grammars are used for program analysis within modern compilers. Within this work, we have extended ROSE, an open compiler infrastructure, with a distributed memory parallel attribute evaluation mechanism to support user-defined global program analysis required for some forms of security analysis which cannot be addressed by a file-by-file view of large scale applications. As a result, user-defined security analyses may now run in parallel without the user having to specify the way data is communicated between processors. The automation of communication enables an extensible open-source parallel program analysis infrastructure.

  11. The domain-specific and domain-general relationships of visuospatial working memory to reasoning ability.

    PubMed

    Shipstead, Zach; Yonehiro, Jade

    2016-10-01

    The degree to which visuospatial working memory (VSWM) is separable from working memory in general is an open question. On one hand, the construct is often researched as a unitary, domain-specific system. On the other, there is evidence that VSWM shares a common processing component with verbal memory. One might interpret this shared component as domain-general attention. We used confirmatory factor analysis to demonstrate that VSWM shares a domain-general component with verbal memory tasks and has a domain-specific component that is independent of verbal memory. Furthermore, the domain-general component was found to correlate with reasoning ability in both the visuospatial and verbal domains. The domain-specific component only correlated with reasoning ability when the tests had a strong visuospatial component. We argue that theories of VSWM need to place greater emphasis on its multiply determined nature.

  12. Different Approaches to Shared Services.

    ERIC Educational Resources Information Center

    Hall, Calvin W.

    Divided into four major sections, this collection of articles addresses the sharing of services in California by school districts or by districts and other agencies. The section on advertising and recruitment makes a case for districts to share in the purchase of employment ads or in the hiring of a recruiter. A reprint of an article about a…

  13. Food Sharing: An Evolutionary Perspective.

    ERIC Educational Resources Information Center

    Feinman, Saul

    Food altruism and the consumption of food are examined from a sociological perspective which assumes that humans share food as inclusive fitness actors. Inclusive fitness implies the representation of an individual's genes in future generations through his own or others' offspring. The discussion includes characteristics of food sharing among kin…

  14. Resource Sharing. SPEC Kit 42.

    ERIC Educational Resources Information Center

    Association of Research Libraries, Washington, DC. Office of Management Studies.

    A 1977 Association of Research Libraries (ARL) survey indicated that nearly all respondents viewed enhanced access to needed information and service capabilities as the primary benefit of resource sharing. Most responding libraries participated in more than one type of resource sharing activity, ranging from informal understandings among a few…

  15. Transforming Institutions through Shared Governance

    ERIC Educational Resources Information Center

    Bornstein, Rita

    2012-01-01

    Shared governance is a basic tenet of higher education and is frequently referred to. For shared governance to be successful, board members, administrators, and faculty members must learn to have respect for and confidence in each other, acting inclusively, transparently, and responsibly. Boards need to be active and involved, participating in…

  16. Resource Sharing in Community Colleges.

    ERIC Educational Resources Information Center

    Meyer, Frank; Hines, Edward; Lupo, Anita; Ley, Connie

    1998-01-01

    Presents a study analyzing voluntary resource sharing practices in a state population of 49 community colleges. Asserts that while resource sharing has been used primarily to solve short-term needs, it should be integrated in strategic and long-term fiscal planning. (JDI)

  17. Searching for repressed memory.

    PubMed

    McNally, Richard J

    2012-01-01

    This chapter summarizes the work of my research group on adults who report either repressed, recovered, or continuous memories of childhood sexual abuse (CSA) or who report no history of CSA. Adapting paradigms from cognitive psychology, we tested hypotheses inspired by both the "repressed memory" and "false memory" perspectives on recovered memories of CSA. We found some evidence for the false memory perspective, but no evidence for the repressed memory perspective. However, our work also suggests a third perspective on recovered memories that does not require the concept of repression. Some children do not understand their CSA when it occurs, and do not experience terror. Years later, they recall the experience, and understanding it as abuse, suffer intense distress. The memory failed to come to mind for years, partly because the child did not encode it as terrifying (i.e., traumatic), not because the person was unable to recall it.

  18. Emotional Memory Persists Longer than Event Memory

    ERIC Educational Resources Information Center

    Kuriyama, Kenichi; Soshi, Takahiro; Fujii, Takeshi; Kim, Yoshiharu

    2010-01-01

    The interaction between amygdala-driven and hippocampus-driven activities is expected to explain why emotion enhances episodic memory recognition. However, overwhelming behavioral evidence regarding the emotion-induced enhancement of immediate and delayed episodic memory recognition has not been obtained in humans. We found that the recognition…

  19. Methodology for fast detection of false sharing in threaded scientific codes

    DOEpatents

    Chung, I-Hsin; Cong, Guojing; Murata, Hiroki; Negishi, Yasushi; Wen, Hui-Fang

    2014-11-25

    A profiling tool identifies a code region with a false sharing potential. A static analysis tool classifies variables and arrays in the identified code region. A mapping detection library correlates memory access instructions in the identified code region with variables and arrays in the identified code region while a processor is running the identified code region. The mapping detection library identifies one or more instructions at risk, in the identified code region, which are subject to an analysis by a false sharing detection library. A false sharing detection library performs a run-time analysis of the one or more instructions at risk while the processor is re-running the identified code region. The false sharing detection library determines, based on the performed run-time analysis, whether two different portions of the cache memory line are accessed by the generated binary code.
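
    The final check described in this record (whether two different portions of one cache line are accessed) can be illustrated with a minimal sketch. It is not the patented tool: the trace format, the 64-byte line size, and the function name `find_false_sharing` are assumptions made for illustration, and the sketch ignores the read/write distinction that a real detector would also need.

```python
from collections import defaultdict

CACHE_LINE_BYTES = 64  # assumed line size; real hardware varies


def find_false_sharing(accesses, line_bytes=CACHE_LINE_BYTES):
    """Return sorted cache-line indices that different threads touch at disjoint byte ranges.

    accesses: iterable of (thread_id, address, size_in_bytes) tuples.
    This trace format is hypothetical. Each access is assumed not to
    straddle a cache-line boundary.
    """
    per_line = defaultdict(list)
    for thread_id, address, size in accesses:
        line = address // line_bytes      # which cache line the access falls in
        offset = address % line_bytes     # byte offset inside that line
        per_line[line].append((thread_id, offset, offset + size))

    suspects = set()
    for line, entries in per_line.items():
        for i, (t1, lo1, hi1) in enumerate(entries):
            for t2, lo2, hi2 in entries[i + 1:]:
                disjoint = hi1 <= lo2 or hi2 <= lo1
                # Different threads, same line, non-overlapping bytes:
                # the line is shared but the data is not.
                if t1 != t2 and disjoint:
                    suspects.add(line)
    return sorted(suspects)
```

    Accesses by two threads to the same bytes count as true sharing and are not flagged; only same-line, different-portion accesses are reported.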

  20. Shared prefetching to reduce execution skew in multi-threaded systems

    DOEpatents

    Eichenberger, Alexandre E; Gunnels, John A

    2013-07-16

    Mechanisms are provided for optimizing code to perform prefetching of data into a shared memory of a computing device that is shared by a plurality of threads that execute on the computing device. A memory stream of a portion of code that is shared by the plurality of threads is identified. A set of prefetch instructions is distributed across the plurality of threads. Prefetch instructions are inserted into the instruction sequences of the plurality of threads such that each instruction sequence has a separate sub-portion of the set of prefetch instructions, thereby generating optimized code. Executable code is generated based on the optimized code and stored in a storage device. The executable code, when executed, performs the prefetches associated with the distributed set of prefetch instructions in a shared manner across the plurality of threads.
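
    The distribution step in this record, where each thread receives a separate sub-portion of the prefetch set, might look like the sketch below. The round-robin policy and the name `distribute_prefetches` are assumptions for illustration; the record does not say which partitioning the mechanism uses.

```python
def distribute_prefetches(prefetch_addresses, num_threads):
    """Split one memory stream's prefetches into disjoint per-thread subsets.

    Round-robin assignment is an assumed policy, chosen only because it
    keeps consecutive prefetches on different threads.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    # Thread t takes elements t, t + num_threads, t + 2 * num_threads, ...
    return [list(prefetch_addresses[t::num_threads]) for t in range(num_threads)]
```

    Every address appears in exactly one thread's list, so the threads jointly cover the whole stream without issuing the same prefetch twice.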

  1. The FORCE: A highly portable parallel programming language

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

    1989-01-01

    Here, it is explained why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low level machine dependencies and to build machine-independent high level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared memory multiprocessor executing them.

  2. Successful Shared Governance Through Education.

    PubMed

    Brull, Stacey

    2015-01-01

    Shared governance is one way nurses can attain a healthy work environment. Having direct-care nurses involved in raising relevant clinical and operational issues and creating systematic approaches has been linked to greater levels of empowerment which is often transposed into shared governance. Nurse leaders at one hospital used a comprehensive educational strategy to implement shared governance in less than 2 years. An authoritative style of leadership and decision making does not meet the needs of today's complex health care environment; nor does it meet the needs of today's employees. The focus on a very deliberate and educational strategy for shared governance was successful in building the structures and processes needed to take a unit and division from traditional governance to shared governance in less than 2 years.

  3. Oxytocin increases willingness to socially share one's emotions.

    PubMed

    Lane, Anthony; Luminet, Olivier; Rimé, Bernard; Gross, James J; de Timary, Philippe; Mikolajczak, Moïra

    2013-01-01

    Oxytocin (OT) is a neuropeptide that is attracting growing attention from researchers interested in human emotional and social behavior. There is indeed increasing evidence that OT has a calming effect and that it facilitates pair-bonding and social interactions. Some of OT's effects are thought to be direct, but it has been suggested that OT also may have indirect effects, mediated by changes in behavior. One potentially relevant behavioral change is an increased propensity for "emotional sharing" as this behavior, like OT, is known to have both calming and bonding effects. In this study, 60 healthy young adult men were randomly assigned to receive either intranasal placebo (PL; n = 30) or oxytocin (OT; n = 30). Participants were then instructed to retrieve a painful memory. Subsequently, OT and placebo participants' willingness to disclose to another person event-related facts (factual sharing) vs. event-related emotions (emotional sharing) was evaluated. Whereas the two groups were equally willing to disclose event-related facts, oxytocin was found to specifically increase the willingness to share event-related emotions. This study provides the first evidence that OT increases people's willingness to share their emotions. Importantly, OT did not make people more talkative (word counts were comparable across the two groups) but instead increased the willingness to share the specific component that is responsible for the calming and bonding effects of social sharing: emotions. Findings suggest that OT may shape the form of social sharing so as to maximize its benefits. This might help explain the calming and bonding effects of OT.

  4. Memory and the Self

    ERIC Educational Resources Information Center

    Conway, Martin A.

    2005-01-01

    The Self-Memory System (SMS) is a conceptual framework that emphasizes the interconnectedness of self and memory. Within this framework memory is viewed as the data base of the self. The self is conceived as a complex set of active goals and associated self-images, collectively referred to as the "working self." The relationship between the…

  5. Music, memory and emotion.

    PubMed

    Jäncke, Lutz

    2008-08-08

    Because emotions enhance memory processes and music evokes strong emotions, music could be involved in forming memories, either about pieces of music or about episodes and information associated with particular music. A recent study in BMC Neuroscience has given new insights into the role of emotion in musical memory.

  6. Self-defining memories, scripts, and the life story: narrative identity in personality and psychotherapy.

    PubMed

    Singer, Jefferson A; Blagov, Pavel; Berry, Meredith; Oost, Kathryn M

    2013-12-01

    An integrative model of narrative identity builds on a dual memory system that draws on episodic memory and a long-term self to generate autobiographical memories. Autobiographical memories related to critical goals in a lifetime period lead to life-story memories, which in turn become self-defining memories when linked to an individual's enduring concerns. Self-defining memories that share repetitive emotion-outcome sequences yield narrative scripts, abstracted templates that filter cognitive-affective processing. The life story is the individual's overarching narrative that provides unity and purpose over the life course. Healthy narrative identity combines memory specificity with adaptive meaning-making to achieve insight and well-being, as demonstrated through a literature review of personality and clinical research, as well as new findings from our own research program. A clinical case study drawing on this narrative identity model is also presented with implications for treatment and research.

  7. Towards Memory-Aware Services and Browsing through Lifelogging Sensing

    PubMed Central

    Arcega, Lorena; Font, Jaime; Cetina, Carlos

    2013-01-01

    Every day we receive lots of information through our senses that is lost forever, because it lacked the strength or the repetition needed to generate a lasting memory. Combining the emerging Internet of Things and lifelogging sensors, we believe it is possible to build up a Digital Memory (Dig-Mem) in order to complement the fallible memory of people. This work shows how to realize the Dig-Mem in terms of interactions, affinities, activities, goals and protocols. We also complement this Dig-Mem with memory-aware services and a Dig-Mem browser. Furthermore, we propose an RFID Tag-Sharing technique to speed up the adoption of Dig-Mem. Experimentation reveals an improvement of the user understanding of Dig-Mem as time passes, compared to natural memories where the level of detail decreases over time. PMID:24196436

  8. Concurrent working memory load can facilitate selective attention: evidence for specialized load.

    PubMed

    Park, Soojin; Kim, Min-Shik; Chun, Marvin M

    2007-10-01

    Load theory predicts that concurrent working memory load impairs selective attention and increases distractor interference (N. Lavie, A. Hirst, J. W. de Fockert, & E. Viding). Here, the authors present new evidence that the type of concurrent working memory load determines whether load impairs selective attention or not. Working memory load was paired with a same/different matching task that required focusing on targets while ignoring distractors. When working memory items shared the same limited-capacity processing mechanisms with targets in the matching task, distractor interference increased. However, when working memory items shared processing with distractors in the matching task, distractor interference decreased, facilitating target selection. A specialized load account is proposed to describe the dissociable effects of working memory load on selective processing depending on whether the load overlaps with targets or with distractors.

  9. Nonlinear Secret Image Sharing Scheme

    PubMed Central

    Shin, Sang-Ho; Yoo, Kee-Young

    2014-01-01

    Over the past decade, most secret image sharing schemes have been based on Shamir's technique, which uses linear combination polynomial arithmetic. Although secret image sharing schemes based on Shamir's technique are efficient and scalable for various environments, they are exposed to security threats such as the Tompa-Woll attack. Renvall and Ding proposed a new secret sharing technique based on nonlinear combination polynomial arithmetic in order to solve this threat. However, it is hard to apply to secret image sharing. In this paper, we propose a (t, n)-threshold nonlinear secret image sharing scheme with a steganography concept. In order to achieve a suitable and secure secret image sharing scheme, we adapt a modified LSB embedding technique with an XOR Boolean algebra operation, define a new variable m, and change the range of the prime p in the sharing procedure. In order to evaluate the efficiency and security of the proposed scheme, we use the embedding capacity and PSNR. As a result, the average values of PSNR and embedding capacity are 44.78 (dB) and 1.74t⌈log2 m⌉ bit-per-pixel (bpp), respectively. PMID:25140334

  10. Nonlinear secret image sharing scheme.

    PubMed

    Shin, Sang-Ho; Lee, Gil-Je; Yoo, Kee-Young

    2014-01-01

    Over the past decade, most secret image sharing schemes have been based on Shamir's technique, which uses linear combination polynomial arithmetic. Although secret image sharing schemes based on Shamir's technique are efficient and scalable for various environments, they are exposed to security threats such as the Tompa-Woll attack. Renvall and Ding proposed a new secret sharing technique based on nonlinear combination polynomial arithmetic in order to solve this threat. However, it is hard to apply to secret image sharing. In this paper, we propose a (t, n)-threshold nonlinear secret image sharing scheme with a steganography concept. In order to achieve a suitable and secure secret image sharing scheme, we adapt a modified LSB embedding technique with an XOR Boolean algebra operation, define a new variable m, and change the range of the prime p in the sharing procedure. In order to evaluate the efficiency and security of the proposed scheme, we use the embedding capacity and PSNR. As a result, the average values of PSNR and embedding capacity are 44.78 (dB) and 1.74t⌈log2 m⌉ bit-per-pixel (bpp), respectively.
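
    For orientation, the linear baseline that this record contrasts itself with, Shamir's (t, n)-threshold scheme, can be sketched in a few lines. This is the classic linear scheme, not the authors' nonlinear scheme and not their LSB embedding. The prime 257 (chosen so that one 8-bit pixel value fits) and the function names are assumptions for illustration.

```python
import random

PRIME = 257  # assumed small prime above 255, so one byte (e.g. a pixel) fits


def make_shares(secret, t, n, prime=PRIME, rng=None):
    """Split `secret` into n shares such that any t of them reconstruct it."""
    if not (1 <= t <= n < prime and 0 <= secret < prime):
        raise ValueError("need 1 <= t <= n < prime and 0 <= secret < prime")
    rng = rng or random.SystemRandom()
    # Random polynomial of degree t - 1 whose constant term is the secret.
    coeffs = [secret] + [rng.randrange(prime) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo the prime
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def reconstruct(shares, prime=PRIME):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # Modular inverse via Fermat's little theorem (prime modulus).
        secret = (secret + yi * num * pow(den, prime - 2, prime)) % prime
    return secret
```

    Reconstruction is a linear combination of the share values, which is the property the record refers to as "linear combination polynomial arithmetic".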

  11. Roo: A parallel theorem prover

    SciTech Connect

    Lusk, E.L.; McCune, W.W.; Slaney, J.K.

    1991-11-01

    We describe a parallel theorem prover based on the Argonne theorem-proving system OTTER. The parallel system, called Roo, runs on shared-memory multiprocessors such as the Sequent Symmetry. We explain the parallel algorithm used and give performance results that demonstrate near-linear speedups on large problems.

  12. Parallel Lisp simulator

    SciTech Connect

    Weening, J.S.

    1988-05-01

    CSIM is a simulator for parallel Lisp, based on a continuation passing interpreter. It models a shared-memory multiprocessor executing programs written in Common Lisp, extended with several primitives for creating and controlling processes. This paper describes the structure of the simulator, measures its performance, and gives an example of its use with a parallel Lisp program.

  13. Message-passing performance of various computers

    SciTech Connect

    Dongarra, J.J.; Dunigan, T.H.

    1996-02-01

    This report compares the message-passing performance of different computer systems. Latency and bandwidth are measured on Convex, Cray, IBM, Intel, KSR, Meiko, nCUBE, NEC, SGI, and TMC multiprocessors. Communication performance is contrasted with the computational power of each system. The comparison includes both shared-memory computers and networked workstation clusters.

  14. Cache directory lookup reader set encoding for partial cache line speculation support

    DOEpatents

    Gara, Alan; Ohmacht, Martin

    2014-10-21

    In a multiprocessor system, with conflict checking implemented in a directory lookup of a shared cache memory, a reader set encoding permits dynamic recordation of read accesses. The reader set encoding includes an indication of a portion of a line read, for instance by indicating boundaries of read accesses. Different encodings may apply to different types of speculative execution.

  15. Expansible quantum secret sharing network

    NASA Astrophysics Data System (ADS)

    Sun, Ying; Xu, Sheng-Wei; Chen, Xiu-Bo; Niu, Xin-Xin; Yang, Yi-Xian

    2013-08-01

    In practical applications, member expansion is a common requirement during the development of a secret sharing network. However, there has been little consideration or discussion of network expansibility in the existing quantum secret sharing schemes. We propose an expansible quantum secret sharing scheme with relatively simple and economical quantum resources and show how to split and reconstruct the quantum secret among an expansible user group in our scheme. Its key trait, that no agent's assistance is required during the process of member expansion, can help to prevent potential threats of insider cheating. We also give a discussion on the security of this scheme from three aspects.

  16. Effects of Aging on True and False Memory Formation: An fMRI Study

    ERIC Educational Resources Information Center

    Dennis, Nancy A.; Kim, Hongkeun; Cabeza, Roberto

    2007-01-01

    Compared to young, older adults are more likely to forget events that occurred in the past as well as remember events that never happened. Previous studies examining false memories and aging have shown that these memories are more likely to occur when new items share perceptual or semantic similarities with those presented during encoding. It is…

  17. Improvement in Working Memory Is Not Related to Increased Intelligence Scores

    ERIC Educational Resources Information Center

    Colom, Roberto; Quiroga, Ma. Angeles; Shih, Pei Chun; Martinez, Kenia; Burgaleta, Miguel; Martinez-Molina, Agustin; Roman, Francisco J.; Requena, Laura; Ramirez, Isabel

    2010-01-01

    The acknowledged high relationship between working memory and intelligence suggests common underlying cognitive mechanisms and, perhaps, shared biological substrates. If this is the case, improvement in working memory by repeated exposure to challenging span tasks might be reflected in increased intelligence scores. Here we report a study in which…

  18. Developmental Change in Working Memory Strategies: From Passive Maintenance to Active Refreshing

    ERIC Educational Resources Information Center

    Camos, Valerie; Barrouillet, Pierre

    2011-01-01

    Change in strategies is often mentioned as a source of memory development. However, though performance in working memory tasks steadily improves during childhood, theories differ in linking this development to strategy changes. Whereas some theories, such as the time-based resource-sharing model, invoke the age-related increase in use and…

  19. Immunological memory is associative

    SciTech Connect

    Smith, D.J.; Forrest, S.; Perelson, A.S.

    1996-12-31

    The purpose of this paper is to show that immunological memory is an associative and robust memory that belongs to the class of sparse distributed memories. This class of memories derives its associative and robust nature by sparsely sampling the input space and distributing the data among many independent agents. Other members of this class include a model of the cerebellar cortex and Sparse Distributed Memory (SDM). First we present a simplified account of the immune response and immunological memory. Next we present SDM, and then we show the correlations between immunological memory and SDM. Finally, we show how associative recall in the immune response can be both beneficial and detrimental to the fitness of an individual.
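
    The mechanism the abstract attributes to this class of memories (sparse sampling of the input space, with data distributed over many independent storage sites) can be illustrated with a small Kanerva-style Sparse Distributed Memory. This is a minimal sketch, not the authors' model: the class name, the `flip_bits` helper, and the parameter values (256-bit words, 2000 hard locations, Hamming radius 112) are assumptions chosen only to keep the example small.

```python
import random


class SparseDistributedMemory:
    """Toy Kanerva-style SDM over n-bit words stored as Python ints."""

    def __init__(self, n_bits=256, n_locations=2000, radius=112, seed=0):
        rng = random.Random(seed)
        self.n_bits = n_bits
        self.radius = radius
        # Hard locations: random addresses that sparsely sample the input space.
        self.addresses = [rng.getrandbits(n_bits) for _ in range(n_locations)]
        # One signed counter per bit at every hard location.
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _active(self, address):
        """Indices of hard locations within the Hamming radius of address."""
        return [i for i, a in enumerate(self.addresses)
                if bin(a ^ address).count("1") <= self.radius]

    def write(self, address, data):
        """Distribute data over every active location (+1 for a 1 bit, -1 for a 0 bit)."""
        for i in self._active(address):
            row = self.counters[i]
            for b in range(self.n_bits):
                row[b] += 1 if (data >> b) & 1 else -1

    def read(self, address):
        """Sum counters over active locations and threshold each bit at zero."""
        active = self._active(address)
        word = 0
        for b in range(self.n_bits):
            if sum(self.counters[i][b] for i in active) > 0:
                word |= 1 << b
        return word


def flip_bits(word, k, n_bits, rng):
    """Return word with k distinct, randomly chosen bit positions inverted."""
    for pos in rng.sample(range(n_bits), k):
        word ^= 1 << pos
    return word
```

    Writing a pattern at its own address and then reading from a cue with a few bits flipped will typically return the stored pattern, because the noisy cue still activates many of the locations that hold it. That overlap is the source of the associative recall the abstract describes.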

  20. Flexible Kernel Memory

    PubMed Central

    Nowicki, Dimitri; Siegelmann, Hava

    2010-01-01

    This paper introduces a new model of associative memory, capable of both binary and continuous-valued inputs. Based on kernel theory, the memory model is on one hand a generalization of Radial Basis Function networks and, on the other, is in feature space, analogous to a Hopfield network. Attractors can be added, deleted, and updated on-line simply, without harming existing memories, and the number of attractors is independent of input dimension. Input vectors do not have to adhere to a fixed or bounded dimensionality; they can increase and decrease it without relearning previous memories. A memory consolidation process enables the network to generalize concepts and form clusters of input data, which outperforms many unsupervised clustering techniques; this process is demonstrated on handwritten digits from MNIST. Another process, reminiscent of memory reconsolidation is introduced, in which existing memories are refreshed and tuned with new inputs; this process is demonstrated on series of morphed faces. PMID:20552013

  1. Stochastic memory: memory enhancement due to noise.

    PubMed

    Stotland, Alexander; Di Ventra, Massimiliano

    2012-01-01

    There are certain classes of resistors, capacitors, and inductors that, when subject to a periodic input of appropriate frequency, develop hysteresis loops in their characteristic response. Here we show that the hysteresis of such memory elements can also be induced by white noise of appropriate intensity even at very low frequencies of the external driving field. We illustrate this phenomenon using a physical model of memory resistor realized by TiO2 thin films sandwiched between metallic electrodes and discuss under which conditions this effect can be observed experimentally. We also discuss its implications on existing memory systems described in the literature and the role of colored noise.

  2. Stochastic memory: Memory enhancement due to noise

    NASA Astrophysics Data System (ADS)

    Stotland, Alexander; di Ventra, Massimiliano

    2012-01-01

    There are certain classes of resistors, capacitors, and inductors that, when subject to a periodic input of appropriate frequency, develop hysteresis loops in their characteristic response. Here we show that the hysteresis of such memory elements can also be induced by white noise of appropriate intensity even at very low frequencies of the external driving field. We illustrate this phenomenon using a physical model of memory resistor realized by TiO2 thin films sandwiched between metallic electrodes and discuss under which conditions this effect can be observed experimentally. We also discuss its implications on existing memory systems described in the literature and the role of colored noise.

  3. Bilevel shared control for teleoperators

    NASA Technical Reports Server (NTRS)

    Hayati, Samad A. (Inventor); Venkataraman, Subramanian T. (Inventor)

    1992-01-01

    A shared system is disclosed for robot control including integration of the human and autonomous input modalities for an improved control. Autonomously planned motion trajectories are modified by a teleoperator to track unmodelled target motions, while nominal teleoperator motions are modified through compliance to accommodate geometric errors autonomously in the latter. A hierarchical shared system intelligently shares control over a remote robot between the autonomous and teleoperative portions of an overall control system. The architecture is hierarchical and consists of two levels. The top level represents the task level, while the bottom represents the execution level. In space applications, the performance of pure teleoperation systems depends significantly on the communication time delays between the local and the remote sites. Selection/mixing matrices are provided with entries which reflect how each input signal's modality is weighted. The shared control minimizes the detrimental effects caused by these time delays between earth and space.

  4. Information sharing promotes prosocial behaviour

    NASA Astrophysics Data System (ADS)

    Szolnoki, Attila; Perc, Matjaž

    2013-05-01

    More often than not, bad decisions are bad regardless of where and when they are made. Information sharing might thus be utilized to mitigate them. Here we show that sharing information about strategy choice between players residing on two different networks reinforces the evolution of cooperation. In evolutionary games, the strategy reflects the action of each individual that warrants the highest utility in a competitive setting. We therefore assume that identical strategies on the two networks reinforce themselves by lessening their propensity to change. Besides network reciprocity working in favour of cooperation on each individual network, we observe the spontaneous emergence of correlated behaviour between the two networks, which further deters defection. If information is shared not just between individuals but also between groups, the positive effect is even stronger, and this despite the fact that information sharing is implemented without any assumptions with regard to content.

  5. The value of shared services.

    PubMed

    Wallace, Beverly B

    2011-07-01

    A multisite shared services organization, combined with a robust business continuity plan, provides infrastructure and redundancies that mitigate risk for hospital CFOs. These structures can position providers to do the following: move essential operations out of a disaster impact zone, if necessary; allow resources to focus on immediate patient care needs; take advantage of economies of scale in temporary staffing; leverage technology; and share in investments in disaster preparedness and business continuity solutions.

  6. Split torque transmission load sharing

    NASA Technical Reports Server (NTRS)

    Krantz, T. L.; Rashidi, M.; Kish, J. G.

    1992-01-01

    Split torque transmissions are attractive alternatives to conventional planetary designs for helicopter transmissions. The split torque designs can offer lighter weight and fewer parts but have not been used extensively for lack of experience, especially with obtaining proper load sharing. Two split torque designs that use different load sharing methods have been studied. Precise indexing and alignment of the geartrain to produce acceptable load sharing has been demonstrated. An elastomeric torque splitter that has large torsional compliance and damping produces even better load sharing while reducing dynamic transmission error and noise. However, the elastomeric torque splitter as now configured is not capable over the full range of operating conditions of a fielded system. A thrust balancing load sharing device was evaluated. Friction forces that oppose the motion of the balance mechanism are significant. A static analysis suggests increasing the helix angle of the input pinion of the thrust balancing design. Also, dynamic analysis of this design predicts good load sharing and significant torsional response to accumulative pitch errors of the gears.

  7. Memory bistable mechanisms of organic memory devices

    NASA Astrophysics Data System (ADS)

    Lee, Ching-Ting; Yu, Li-Zhen; Chen, Hung-Chun

    2010-07-01

    To investigate the memory bistable mechanisms of organic memory devices, the structure of [top Au anode/9,10-di(2-naphthyl)anthracene (ADN) active layer/bottom Au cathode] was deposited using a thermal deposition system. Migration of Au atoms into the ADN active layer was observed by secondary ion mass spectrometry. A density of 9.6×10^16 cm^-3 and an energy level of 0.553 eV were calculated for the trapping centers induced by the migrated Au atoms in the ADN active layer. The induced trapping centers did not influence the carrier injection barrier height between Au and the ADN active layer. Therefore, the memory bistable behaviors of the organic memory devices were attributed to the induced trapping centers. The energy diagram was established to verify the mechanisms.

  8. The differential effects of emotional salience on direct associative and relational memory during a nap.

    PubMed

    Alger, Sara E; Payne, Jessica D

    2016-12-01

    Relational memories are formed from shared components between directly learned memory associations, flexibly linking learned information to better inform future judgments. Sleep has been found to facilitate both direct associative and relational memories. However, the impact of incorporating emotionally salient information into learned material and the interaction of emotional salience and sleep in facilitating both types of memory is unknown. Participants encoded two sets of picture pairs, with either emotionally negative or neutral objects paired with neutral faces. The same objects were present in both sets, paired with two different faces across the sets. Baseline memory for these directly paired associates was tested immediately after encoding, followed by either a 90-min nap opportunity or wakefulness. Five hours after learning, a surprise test assessed relational memory, the indirect association between two faces paired with the same object during encoding, followed by a retest of direct associative memory. Overall, negative information was remembered better than neutral for directly learned pairs. A nap facilitated both preservation of direct associative memories and formation of relational memories, compared to remaining awake. Interestingly, however, this sleep benefit was observed specifically for neutral directly paired associates, while both neutral and negative relational associations benefitted from a nap. Finally, REM sleep played opposing roles in neutral direct and relational associative memory formation, with more REM sleep leading to forgetting of direct associations but promoting relational associations, suggesting that, while not benefitting memory consolidation for directly learned details, REM sleep may foster the memory reorganization needed for relational memory.

  9. Psychophysiology of prospective memory.

    PubMed

    Rothen, Nicolas; Meier, Beat

    2014-01-01

    Prospective memory involves the self-initiated retrieval of an intention upon an appropriate retrieval cue. Cue identification can be considered as an orienting reaction and may thus trigger a psychophysiological response. Here we present two experiments in which skin conductance responses (SCRs) elicited by prospective memory cues were compared to SCRs elicited by aversive stimuli to test whether a single prospective memory cue triggers a similar SCR as an aversive stimulus. In Experiment 2 we also assessed whether cue specificity had a differential influence on prospective memory performance and on SCRs. We found that detecting a single prospective memory cue is as likely to elicit a SCR as an aversive stimulus. Missed prospective memory cues also elicited SCRs. On a behavioural level, specific intentions led to better prospective memory performance. However, on a psychophysiological level specificity had no influence. More generally, the results indicate reliable SCRs for prospective memory cues and point to psychophysiological measures as valuable approach, which offers a new way to study one-off prospective memory tasks. Moreover, the findings are consistent with a theory that posits multiple prospective memory retrieval stages.

  10. 7 CFR 1980.391 - Equity sharing.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... property. Shared equity will be the lesser of the interest assistance granted or the amount of value appreciation available for shared equity. Value appreciation available for shared equity means the market value... amount of shared equity. The RHS approval official will calculate shared equity when a borrower's...

  11. 78 FR 18451 - Education and Sharing Day, U.S.A., 2013

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-03-26

    ... help define us as a people. On Education and Sharing Day, U.S.A., we celebrate hard work, service, and... and skills a 21st-century economy demands. We need to give them every chance to work harder, learn... others as we would want to be treated. This day recalls the memory of Rabbi Menachem Mendel...

  12. Forward Association, Backward Association, and the False-Memory Illusion

    ERIC Educational Resources Information Center

    Brainerd, C. J.; Wright, Ron

    2005-01-01

    In the Deese-Roediger-McDermott false-memory illusion, forward associative strength (FAS) is unrelated to the strength of the illusion; this is puzzling, because high-FAS lists ought to share more semantic features with critical unpresented words than should low-FAS lists. The authors show that this null result is probably a truncated range…

  13. When Reasoning Modifies Memory: Schematic Assimilation Triggered by Analogical Mapping

    ERIC Educational Resources Information Center

    Vendetti, Michael S.; Wu, Aaron; Rowshanshad, Ebi; Knowlton, Barbara J.; Holyoak, Keith J.

    2014-01-01

    Analogical mapping highlights shared relations that link 2 situations, potentially at the expense of information that does not fit the dominant pattern of correspondences. To investigate whether analogical mapping can alter subsequent recognition memory for features of a source analog, we performed 2 experiments with 4-term proportional analogies…

  14. Down Memory Lane: Recollections of Lamaze International's First 50 Years

    PubMed Central

    Zwelling, Elaine

    2010-01-01

    The 42-year involvement of one member of Lamaze International is chronicled through a decade-by-decade review of personal memories. The history of Lamaze International is shared through the recollections of her roles as a childbirth educator, faculty member, and member of the board of directors. PMID:21629385

  15. Display Sharing: An Alternative Paradigm

    NASA Technical Reports Server (NTRS)

    Brown, Michael A.

    2010-01-01

    The current Johnson Space Center (JSC) Mission Control Center (MCC) Video Transport System (VTS) provides flight controllers and management the ability to meld raw video from various sources with telemetry to improve situational awareness. However, maintaining a separate infrastructure for video delivery and integration of video content with data adds significant complexity and cost to the system. When considering alternative architectures for a VTS, the current system's ability to share specific computer displays in their entirety to other locations, such as large projector systems, flight control rooms, and back supporting rooms throughout the facilities and centers, must be incorporated into any new architecture. Internet Protocol (IP)-based systems also support video delivery and integration. IP-based systems generally have an advantage in terms of cost and maintainability. Although IP-based systems are versatile, the task of sharing a computer display from one workstation to another can be time consuming for an end-user and inconvenient to administer at a system level. The objective of this paper is to present a prototype display sharing enterprise solution. Display sharing is a system which delivers image sharing across the LAN while simultaneously managing bandwidth, supporting encryption, enabling recovery and resynchronization following a loss of signal, and minimizing latency. Additional critical elements will include image scaling support, multi-sharing, ease of initial integration and configuration, integration with desktop window managers, collaboration tools, and host and recipient controls. The goal of this paper is to summarize the various elements of an IP-based display sharing system that can be used in today's control center environment.

  16. Human learning and memory.

    PubMed

    Johnson, M K; Hasher, L

    1987-01-01

    There have been several notable recent trends in the area of learning and memory. Problems with the episodic/semantic distinction have become more apparent, and new efforts have been made (exemplar models, distributed-memory models) to represent general knowledge without assuming a separate semantic system. Less emphasis is being placed on stable, prestored prototypes and more emphasis on a flexible memory system that provides the basis for a multitude of categories or frames of reference, derived on the spot as tasks demand. There is increasing acceptance of the idea that mental models are constructed and stored in memory in addition to, rather than instead of, memorial representations that are more closely tied to perceptions. This gives rise to questions concerning the conditions that permit inferences to be drawn and mental models to be constructed, and to questions concerning the similarities and differences in the nature of the representations in memory of perceived and generated information and in their functions. There has also been a swing from interest in deliberate strategies to interest in automatic, unconscious (even mechanistic!) processes, reflecting an appreciation that certain situations (e.g. recognition, frequency judgements, savings in indirect tasks, aspects of skill acquisition, etc) seem not to depend much on the products of strategic, effortful or reflective processes. There is a lively interest in relations among memory measures and attempts to characterize memory representations and/or processes that could give rise to dissociations among measures. Whether the pattern of results reflects the operation of functional subsystems of memory and, if so, what the "modules" are is far from clear. This issue has been fueled by work with amnesics and has contributed to a revival of interaction between researchers studying learning and memory in humans and those studying learning and memory in animals. 
Thus, neuroscience rivals computer science as a…

  17. Colouring in the Blanks: Memory Drawings of the 1990 Kuwait Invasion

    ERIC Educational Resources Information Center

    Pepin-Wakefield, Yvonne

    2009-01-01

    This study used drawing tasks to examine the similarities and differences between females and males who shared a collective traumatic event in early childhood. Could these childhood memories be recorded, measured, and compared for gender differences in drawings by young adults who had shared a similar experience as children? Exploration of this…

  18. Neural Mechanisms of Interference Control Underlie the Relationship between Fluid Intelligence and Working Memory Span

    ERIC Educational Resources Information Center

    Burgess, Gregory C.; Gray, Jeremy R.; Conway, Andrew R. A.; Braver, Todd S.

    2011-01-01

    Fluid intelligence (gF) and working memory (WM) span predict success in demanding cognitive situations. Recent studies show that much of the variance in gF and WM span is shared, suggesting common neural mechanisms. This study provides a direct investigation of the degree to which shared variance in gF and WM span can be explained by neural…

  19. Toward Enhancing OpenMP's Work-Sharing Directives

    SciTech Connect

    Chapman, B M; Huang, L; Jin, H; Jost, G; de Supinski, B R

    2006-05-17

    OpenMP provides a portable programming interface for shared memory parallel computers (SMPs). Although this interface has proven successful for small SMPs, it requires greater flexibility in light of the steadily growing size of individual SMPs and the recent advent of multithreaded chips. In this paper, we describe two application development experiences that exposed these expressivity problems in the current OpenMP specification. We then propose mechanisms to overcome these limitations, including thread subteams and thread topologies. Thus, we identify language features that improve OpenMP application performance on emerging and large-scale platforms while preserving ease of programming.

  20. Conscious and Unconscious Memory Systems

    PubMed Central

    Squire, Larry R.; Dede, Adam J.O.

    2015-01-01

    The idea that memory is not a single mental faculty has a long and interesting history but became a topic of experimental and biologic inquiry only in the mid-20th century. It is now clear that there are different kinds of memory, which are supported by different brain systems. One major distinction can be drawn between working memory and long-term memory. Long-term memory can be separated into declarative (explicit) memory and a collection of nondeclarative (implicit) forms of memory that include habits, skills, priming, and simple forms of conditioning. These memory systems depend variously on the hippocampus and related structures in the parahippocampal gyrus, as well as on the amygdala, the striatum, cerebellum, and the neocortex. This work recounts the discovery of declarative and nondeclarative memory and then describes the nature of declarative memory, working memory, nondeclarative memory, and the relationship between memory systems. PMID:25731765