external memory algorithms: Topics by Science.gov

Sample records for external memory algorithms

Generalized enhanced suffix array construction in external memory.

PubMed

Louza, Felipe A; Telles, Guilherme P; Hoffmann, Steve; Ciferri, Cristina D A

2017-01-01

Suffix arrays, augmented by additional data structures, allow many string processing problems to be solved efficiently. The external memory construction of the generalized suffix array for a string collection is a fundamental task when the size of the input collection or the data structure exceeds the available internal memory. In this article we present and analyze [Formula: see text] [introduced in CPM (External memory generalized suffix and [Formula: see text] arrays construction. In: Proceedings of CPM. pp 201-10, 2013)], the first external memory algorithm to construct generalized suffix arrays augmented with the longest common prefix array for a string collection. Our algorithm relies on a combination of buffers, induced sorting and a heap to avoid direct string comparisons. We performed experiments that covered different aspects of our algorithm, including running time, efficiency, external memory access, internal phases and the influence of different optimization strategies. On real datasets of size up to 24 GB and using 2 GB of internal memory, [Formula: see text] showed competitive performance when compared to [Formula: see text] and [Formula: see text], which are efficient algorithms for a single string according to the related literature. We also show the effect of disk caching managed by the operating system on our algorithm. The proposed algorithm was validated through performance tests using real datasets from different domains, in various combinations. Our algorithm can also construct the generalized Burrows-Wheeler transform of a string collection at no additional cost except for the output time.
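To make the output of such a construction concrete, the sketch below builds a generalized suffix array and its longest common prefix (LCP) array for a tiny collection, naively and entirely in main memory. It is not the external memory algorithm of the abstract (no buffers, induced sorting or heap); the function name `build_gsa_lcp` and the tie-breaking rule (equal suffixes ordered by string index) are assumptions made for illustration.

```python
def build_gsa_lcp(strings):
    """Naive in-memory generalized suffix array with LCP (illustration only).

    Returns (gsa, lcp) where gsa[k] = (string_index, suffix_start) and
    lcp[k] is the longest common prefix length of suffixes k-1 and k.
    """
    # Collect every suffix of every string together with its origin.
    suffixes = [(s[j:], i, j) for i, s in enumerate(strings) for j in range(len(s))]
    # Sort by suffix text; equal suffixes are ordered by string index (assumed rule).
    suffixes.sort(key=lambda t: (t[0], t[1]))
    gsa = [(i, j) for _, i, j in suffixes]

    lcp = [0] * len(suffixes)
    for k in range(1, len(suffixes)):
        a, b = suffixes[k - 1][0], suffixes[k][0]
        n = 0
        # Direct character comparison; the external memory algorithm avoids this.
        while n < min(len(a), len(b)) and a[n] == b[n]:
            n += 1
        lcp[k] = n
    return gsa, lcp


gsa, lcp = build_gsa_lcp(["ab", "b"])
```

For the collection `["ab", "b"]` the suffixes are ordered "ab", "b" (from the first string), "b" (from the second string). The quadratic cost of this approach is exactly what makes a dedicated external memory construction necessary for multi-gigabyte inputs.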
FFTs in external or hierarchical memory

NASA Technical Reports Server (NTRS)

Bailey, David H.

1989-01-01

A description is given of advanced techniques for computing an ordered FFT on a computer with external or hierarchical memory. These algorithms (1) require as few as two passes through the external data set, (2) use strictly unit stride, long vector transfers between main memory and external storage, (3) require only a modest amount of scratch space in main memory, and (4) are well suited for vector and parallel computation. Performance figures are included for implementations of some of these algorithms on Cray supercomputers. Of interest is the fact that a main memory version outperforms the current Cray library FFT routines on the Cray-2, the Cray X-MP, and the Cray Y-MP systems. Using all eight processors on the Cray Y-MP, this main memory routine runs at nearly 2 Gflops.
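One decomposition commonly used for out-of-core FFTs is often called the four-step FFT; whether it matches the exact variants in this report is an assumption. The sketch below treats a length-N input as an n1 x n2 matrix, so that only short transforms over contiguous blocks are needed. Everything runs in memory with a naive DFT helper; the names `dft` and `four_step_fft` are hypothetical.

```python
import cmath


def dft(v):
    """Naive O(n^2) discrete Fourier transform, used as the short transform."""
    n = len(v)
    return [sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def four_step_fft(x, n1, n2):
    """Ordered DFT of x (length n1*n2) via the four-step decomposition."""
    n = n1 * n2
    assert len(x) == n
    # Step 1: n2 transforms of length n1 down the columns of the n1 x n2 matrix.
    cols = [dft([x[j1 * n2 + j2] for j1 in range(n1)]) for j2 in range(n2)]
    # Step 2: multiply entry (k1, j2) by the twiddle factor exp(-2*pi*i*j2*k1/N).
    for j2 in range(n2):
        for k1 in range(n1):
            cols[j2][k1] *= cmath.exp(-2j * cmath.pi * j2 * k1 / n)
    # Step 3: n1 transforms of length n2 along the rows.
    rows = [dft([cols[j2][k1] for j2 in range(n2)]) for k1 in range(n1)]
    # Step 4: transpose so that output index k = k1 + n1*k2 is in natural order.
    out = [0j] * n
    for k1 in range(n1):
        for k2 in range(n2):
            out[k1 + n1 * k2] = rows[k1][k2]
    return out


signal = [complex(i % 5, -i) for i in range(12)]
result = four_step_fft(signal, 3, 4)
```

In an external memory setting, steps 1 and 3 would each correspond to one pass over the data set, with the short transforms performed on blocks that fit in main memory.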
Wide-Range Motion Estimation Architecture with Dual Search Windows for High Resolution Video Coding

NASA Astrophysics Data System (ADS)

Dung, Lan-Rong; Lin, Meng-Chun

This paper presents a memory-efficient motion estimation (ME) technique for high-resolution video compression. The main objective is to reduce external memory access, especially when local memory resources are limited. Reducing memory access also lowers power consumption. The key to reducing memory accesses is a center-biased algorithm, which performs the motion vector (MV) search with the minimum amount of search data. To account for data reusability, the proposed dual-search-windowing (DSW) approaches use the secondary window only when the search requires it. This lightens the loading of search windows and hence reduces the required external memory bandwidth. The proposed techniques can save up to 81% of external memory bandwidth and require only 135 MBytes/sec, while the quality degradation is less than 0.2 dB for 720p HDTV clips coded at 8 Mbits/sec.
Application of ant colony optimization in development of models for prediction of anti-HIV-1 activity of HEPT derivatives.

PubMed

Zare-Shahabadi, Vali; Abbasitabar, Fatemeh

2010-09-01

Quantitative structure-activity relationship models were derived for 107 analogs of 1-[(2-hydroxyethoxy) methyl]-6-(phenylthio)thymine, a potent inhibitor of the HIV-1 reverse transcriptase. The activities of these compounds were investigated by means of the multiple linear regression (MLR) technique. An ant colony optimization algorithm, called Memorized_ACS, was applied for selecting relevant descriptors and detecting outliers. This algorithm uses an external memory based upon knowledge incorporation from previous iterations. Initially the memory is empty; it is then filled by running several ACS algorithms: after each ACS run, the elite ant is stored in the memory, and the process continues until the memory is filled. Here, pheromone updating is performed by all elite ants collected in the memory; this results in improvements in both exploration and exploitation behaviors of the ACS algorithm. The memory is then emptied and filled again by performing several ACS algorithms using updated pheromone trails. This process is repeated for several iterations. At the end, the memory contains several top solutions for the problem. The number of appearances of each descriptor in the external memory is a good criterion for its importance. Finally, prediction is performed by the elitist ant, and interpretation is carried out by considering the importance of each descriptor. The best MLR model has a training error of 0.47 log (1/EC(50)) units (R(2) = 0.90) and a prediction error of 0.76 log (1/EC(50)) units (R(2) = 0.88). Copyright 2010 Wiley Periodicals, Inc.
LSG: An External-Memory Tool to Compute String Graphs for Next-Generation Sequencing Data Assembly.

PubMed

Bonizzoni, Paola; Vedova, Gianluca Della; Pirola, Yuri; Previtali, Marco; Rizzi, Raffaella

2016-03-01

The large amount of short read data that has to be assembled in future applications, such as in metagenomics or cancer genomics, strongly motivates the investigation of disk-based approaches to index next-generation sequencing (NGS) data. Positive results in this direction stimulate the investigation of efficient external memory algorithms for de novo assembly from NGS data. Our article is also motivated by the open problem of designing a space-efficient algorithm to compute a string graph using an indexing procedure based on the Burrows-Wheeler transform (BWT). We have developed a disk-based algorithm for computing string graphs in external memory: the light string graph (LSG). LSG relies on a new representation of the FM-index that is exploited to use an amount of main memory that is independent of the size of the data set. Moreover, we have developed a pipeline for genome assembly from NGS data that integrates LSG with the assembly step of SGA (Simpson and Durbin, 2012), a state-of-the-art string graph-based assembler, and uses BEETL for indexing the input data. LSG is open source software and is available online. We have analyzed our implementation on an 875-million read whole-genome dataset, on which LSG has built the string graph using only 1GB of main memory (reducing the memory occupation by a factor of 50 with respect to SGA), while requiring slightly more than twice the time of SGA. The analysis of the entire pipeline shows an important decrease in memory usage, with only a moderate increase in the running time.
Generic Entity Resolution in Relational Databases

NASA Astrophysics Data System (ADS)

Sidló, Csaba István

Entity Resolution (ER) covers the problem of identifying distinct representations of real-world entities in heterogeneous databases. We consider the generic formulation of ER problems (GER) with exact outcome. In practice, input data usually resides in relational databases and can grow to huge volumes. Yet, typical solutions described in the literature employ standalone memory resident algorithms. In this paper we utilize facilities of standard, unmodified relational database management systems (RDBMS) to enhance the efficiency of GER algorithms. We study and revise the problem formulation, and propose practical and efficient algorithms optimized for RDBMS external memory processing. We outline a real-world scenario and demonstrate the advantage of algorithms by performing experiments on insurance customer data.
Design of Belief Propagation Based on FPGA for the Multistereo CAFADIS Camera

PubMed Central

Magdaleno, Eduardo; Lüke, Jonás Philipp; Rodríguez, Manuel; Rodríguez-Ramos, José Manuel

2010-01-01

In this paper we describe a fast, specialized hardware implementation of the belief propagation algorithm for the CAFADIS camera, a new plenoptic sensor patented by the University of La Laguna. This camera captures the lightfield of the scene and can be used to find out at which depth each pixel is in focus. The algorithm has been designed for FPGA devices using VHDL. We propose a parallel and pipelined architecture to implement the algorithm without external memory. Although the BRAM usage of the device increases considerably, we can meet real-time constraints by exploiting extremely high-performance signal processing capability through parallelism and by accessing several memories simultaneously. Results with 16-bit precision show performance very close to that of the original Matlab implementation of the algorithm. PMID:22163404
GPU-based optimal control for RWM feedback in tokamaks

DOE PAGES

Clement, Mitchell; Hanson, Jeremy; Bialek, Jim; ...

2017-08-23

The design and implementation of a Graphics Processing Unit (GPU) based Resistive Wall Mode (RWM) controller to perform feedback control on the RWM using Linear Quadratic Gaussian (LQG) control is reported herein. The control algorithm is based on a simplified DIII-D VALEN model. By using NVIDIA’s GPUDirect RDMA framework, the digitizer and output module are able to write and read directly to and from GPU memory, eliminating memory transfers between host and GPU. The system and algorithm were able to reduce plasma response excited by externally applied fields by 32% during development experiments.
Highly Asynchronous VisitOr Queue Graph Toolkit

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pearce, R.

2012-10-01

HAVOQGT is a C++ framework that can be used to create highly parallel graph traversal algorithms. The framework stores the graph and algorithmic data structures on external memory that is typically mapped to high performance locally attached NAND FLASH arrays. The framework supports a vertex-centered visitor programming model. The framework has been used to implement breadth first search, connected components, and single source shortest path.
LOD-based clustering techniques for efficient large-scale terrain storage and visualization

NASA Astrophysics Data System (ADS)

Bao, Xiaohong; Pajarola, Renato

2003-05-01

Large multi-resolution terrain data sets are usually stored out-of-core. To visualize terrain data at interactive frame rates, the data needs to be organized on disk, loaded into main memory part by part, then rendered efficiently. Many main-memory algorithms have been proposed for efficient vertex selection and mesh construction. Organization of terrain data on disk is quite difficult because the error, the triangulation dependency and the spatial location of each vertex all need to be considered. Previous terrain clustering algorithms did not consider the per-vertex approximation error of individual terrain data sets. Therefore, the vertex sequences on disk are exactly the same for any terrain. In this paper, we propose a novel clustering algorithm which introduces the level-of-detail (LOD) information to terrain data organization to map multi-resolution terrain data to external memory. In our approach the LOD parameters of the terrain elevation points are reflected during clustering. The experiments show that dynamic loading and paging of terrain data at varying LOD is very efficient and minimizes page faults. Additionally, the preprocessing of this algorithm is very fast and works from out-of-core.
Analyze the beta waves of electroencephalogram signals from young musicians and non-musicians in major scale working memory task.

PubMed

Hsu, Chien-Chang; Cheng, Ching-Wen; Chiu, Yi-Shiuan

2017-02-15

Electroencephalograms can record wave variations in any brain activity. Beta waves are produced when an external stimulus induces logical thinking, computation, and reasoning during consciousness. This work uses the beta waves recorded during major scale working memory N-back tasks to analyze the differences between young musicians and non-musicians. After feature analysis applies signal filtering, the Hilbert-Huang transform, and feature extraction methods to identify differences, a k-means clustering algorithm is used to group them into different clusters. The results of feature analysis showed that beta waves differ significantly between young musicians and non-musicians under the low memory load of the working memory task. Copyright © 2017 Elsevier B.V. All rights reserved.
Scalable Motion Estimation Processor Core for Multimedia System-on-Chip Applications

NASA Astrophysics Data System (ADS)

Lai, Yeong-Kang; Hsieh, Tian-En; Chen, Lien-Fei

2007-04-01

In this paper, we describe a high-throughput and scalable motion estimation processor architecture for multimedia system-on-chip applications. The number of processing elements (PEs) is scalable according to the variable algorithm parameters and the performance required for different applications. Using the PE rings efficiently and an intelligent memory-interleaving organization, the efficiency of the architecture can be increased. Moreover, using efficient on-chip memories and a data management technique can effectively decrease the power consumption and memory bandwidth. Techniques for reducing the number of interconnections and external memory accesses are also presented. Our results demonstrate that the proposed scalable PE-ringed architecture is a flexible and high-performance processor core in multimedia system-on-chip applications.
A quantum causal discovery algorithm

NASA Astrophysics Data System (ADS)

Giarmatzi, Christina; Costa, Fabio

2018-03-01

Finding a causal model for a set of classical variables is now a well-established task—but what about the quantum equivalent? Even the notion of a quantum causal model is controversial. Here, we present a causal discovery algorithm for quantum systems. The input to the algorithm is a process matrix describing correlations between quantum events. Its output consists of different levels of information about the underlying causal model. Our algorithm determines whether the process is causally ordered by grouping the events into causally ordered non-signaling sets. It detects if all relevant common causes are included in the process, which we label Markovian, or alternatively if some causal relations are mediated through some external memory. For a Markovian process, it outputs a causal model, namely the causal relations and the corresponding mechanisms, represented as quantum states and channels. Our algorithm opens the route to more general quantum causal discovery methods.
Method for refreshing a non-volatile memory

DOEpatents

Riekels, James E.; Schlesinger, Samuel

2008-11-04

A non-volatile memory and a method of refreshing a memory are described. The method includes allowing an external system to control refreshing operations within the memory. The memory may generate a refresh request signal and transmit the refresh request signal to the external system. When the external system finds an available time to process the refresh request, the external system acknowledges the refresh request and transmits a refresh acknowledge signal to the memory. The memory may also comprise a page register for reading and rewriting a data state back to the memory. The page register may comprise latches in lieu of supplemental non-volatile storage elements, thereby conserving real estate within the memory.
Numerical integration of the extended variable generalized Langevin equation with a positive Prony representable memory kernel.

PubMed

Baczewski, Andrew D; Bond, Stephen D

2013-07-28

Generalized Langevin dynamics (GLD) arise in the modeling of a number of systems, ranging from structured fluids that exhibit a viscoelastic mechanical response, to biological systems, and other media that exhibit anomalous diffusive phenomena. Molecular dynamics (MD) simulations that include GLD in conjunction with external and/or pairwise forces require the development of numerical integrators that are efficient, stable, and have known convergence properties. In this article, we derive a family of extended variable integrators for the Generalized Langevin equation with a positive Prony series memory kernel. Using stability and error analysis, we identify a superlative choice of parameters and implement the corresponding numerical algorithm in the LAMMPS MD software package. Salient features of the algorithm include exact conservation of the first and second moments of the equilibrium velocity distribution in some important cases, stable behavior in the limit of conventional Langevin dynamics, and the use of a convolution-free formalism that obviates the need for explicit storage of the time history of particle velocities. Capability is demonstrated with respect to accuracy in numerous canonical examples, stability in certain limits, and an exemplary application in which the effect of a harmonic confining potential is mapped onto a memory kernel.
Importance of balanced architectures in the design of high-performance imaging systems

NASA Astrophysics Data System (ADS)

Sgro, Joseph A.; Stanton, Paul C.

1999-03-01

Imaging systems employed in demanding military and industrial applications, such as automatic target recognition and computer vision, typically require real-time high-performance computing resources. While high-performance computing systems have traditionally relied on proprietary architectures and custom components, recent advances in high-performance general-purpose microprocessor technology have produced an abundance of low-cost components suitable for use in high-performance computing systems. A common pitfall in the design of high-performance imaging systems, particularly systems employing scalable multiprocessor architectures, is the failure to balance computational and memory bandwidth. The performance of standard cluster designs, for example, in which several processors share a common memory bus, is typically constrained by memory bandwidth. The characteristic symptom of this problem is the failure of system performance to scale as more processors are added. The problem is exacerbated if I/O and memory functions share the same bus. The recent introduction of microprocessors with large internal caches and high-performance external memory interfaces makes it practical to design high-performance imaging systems with balanced computational and memory bandwidth. Real-world examples of such designs will be presented, along with a discussion of adapting algorithm design to best utilize available memory bandwidth.
Multi-Resolution Indexing for Hierarchical Out-of-Core Traversal of Rectilinear Grids

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pascucci, V.

2000-07-10

The real time processing of very large volumetric meshes introduces specific algorithmic challenges due to the impossibility of fitting the input data in the main memory of a computer. The basic assumption (RAM computational model) of uniform-constant-time access to each memory location is not valid because part of the data is stored out-of-core or in external memory. The performance of most algorithms does not scale well in the transition from the in-core to the out-of-core processing conditions. The performance degradation is due to the high frequency of I/O operations that may start dominating the overall running time. Out-of-core computing [28] addresses specifically the issues of algorithm redesign and data layout restructuring to enable data access patterns with minimal performance degradation in out-of-core processing. Results in this area are also valuable in parallel and distributed computing where one has to deal with the similar issue of balancing processing time with data migration time. The solution of the out-of-core processing problem is typically divided into two parts: (i) analysis of a specific algorithm to understand its data access patterns and, when possible, redesign the algorithm to maximize their locality; and (ii) storage of the data in secondary memory with a layout consistent with the access patterns of the algorithm to amortize the cost of each I/O operation over several memory access operations. In the case of hierarchical visualization algorithms for volumetric data, the 3D input hierarchy is traversed to build derived geometric models with adaptive levels of detail. The shape of the output models is then modified dynamically with incremental updates of their level of detail. The parameters that govern this continuous modification of the output geometry depend on the runtime user interaction, making it impossible to determine a priori what levels of detail are going to be constructed.
For example, they can depend on external parameters like the viewpoint of the current display window or on internal parameters like the isovalue of an isocontour or the position of an orthogonal slice. The structure of the access pattern can be summarized in two main points: (i) the input hierarchy is traversed level by level so that the data in the same level of resolution or in adjacent levels is traversed at the same time and (ii) within each level of resolution the data is mostly traversed at the same time in regions that are geometrically close. In this paper I introduce a new static indexing scheme that induces a data layout satisfying both requirements (i) and (ii) for the hierarchical traversal of n-dimensional regular grids. In one particular implementation the scheme exploits in a new way the recursive construction of the Z-order space filling curve. The standard indexing that maps the input nD data onto a 1D sequence for the Z-order curve is based on a simple bit interleaving operation that merges the n input indices into one index n times longer. This helps in grouping the data by geometric proximity but only for a specific level of detail. In this paper I show how this indexing can be transformed into an alternative index that groups the data first by level of resolution and then, within each level, by geometric proximity. This yields a data layout that is appropriate for hierarchical out-of-core processing of large grids.
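The bit interleaving described above, and one possible way to regroup the resulting Z-order index by level of resolution, can be sketched as follows. The interleaving follows the abstract directly. The level-first remapping (use the number of trailing zero bits as the level, then keep Z-order within the level) is an assumption about how such an index can be realized, not a transcription of the paper's scheme; `z_index` and `hierarchical_index` are hypothetical names.

```python
def z_index(coords, bits):
    """Interleave the bits of n coordinates into one index n times longer."""
    n = len(coords)
    z = 0
    for bit in range(bits):
        for axis, c in enumerate(coords):
            # Bit `bit` of coordinate `axis` lands at position bit*n + axis.
            z |= ((c >> bit) & 1) << (bit * n + axis)
    return z


def hierarchical_index(z, total_bits):
    """Remap a Z-order index so coarse-level samples come first (assumed scheme).

    Samples whose Z index has many trailing zeros belong to coarse levels.
    Within one level, the relative Z order is preserved.
    """
    if z == 0:
        return 0
    trailing = (z & -z).bit_length() - 1  # number of trailing zero bits
    # Drop the lowest set bit and everything below it, then tag with a level bit.
    return (1 << (total_bits - trailing - 1)) | (z >> (trailing + 1))


# A 4 x 4 grid: 2 bits per axis, 4 bits per Z index.
layout = sorted(
    ((x, y) for x in range(4) for y in range(4)),
    key=lambda p: hierarchical_index(z_index(p, 2), 4),
)
```

Sorting grid points by the remapped index places the coarsest samples at the start of the sequence and the finest level in the second half, so a traversal that stops at a coarse level only touches a contiguous prefix of the file.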
Spacewire router IP-core with priority adaptive routing

NASA Astrophysics Data System (ADS)

Shakhmatov, A. V.; Chekmarev, S. A.; Vergasov, M. Y.; Khanov, V. Kh

2015-10-01

The design of modern spacecraft focuses on applying network principles to the interaction of on-board equipment, in particular through SpaceWire networks. Routers are an integral part of most SpaceWire networks. The paper presents an adaptive routing algorithm with prioritization, which allows more flexible management of the routing process. The algorithm is designed to transmit SpaceWire packets over a redundant network. A method is also proposed for rapid restoration of working capacity after power by saving the routing table and the router configuration in an external non-volatile memory. The proposed solutions were used to create an IP-core router, which was then tested in an FPGA device. The results illustrate the realizability and rationality of the proposed solutions.

Memory-Scalable GPU Spatial Hierarchy Construction.

PubMed

Qiming Hou; Xin Sun; Kun Zhou; Lauterbach, C; Manocha, D

2011-04-01

Recent GPU algorithms for constructing spatial hierarchies have achieved promising performance for moderately complex models by using the breadth-first search (BFS) construction order. While being able to exploit the massive parallelism on the GPU, the BFS order also consumes excessive GPU memory, which becomes a serious issue for interactive applications involving very complex models with more than a few million triangles. In this paper, we propose to use the partial breadth-first search (PBFS) construction order to control memory consumption while maximizing performance. We apply the PBFS order to two hierarchy construction algorithms. The first algorithm is for kd-trees that automatically balances between the level of parallelism and intermediate memory usage. With PBFS, peak memory consumption during construction can be efficiently controlled without costly CPU-GPU data transfer. We also develop memory allocation strategies to effectively limit memory fragmentation. The resulting algorithm scales well with GPU memory and constructs kd-trees of models with millions of triangles at interactive rates on GPUs with 1 GB memory. Compared with existing algorithms, our algorithm is an order of magnitude more scalable for a given GPU memory bound. The second algorithm is for out-of-core bounding volume hierarchy (BVH) construction for very large scenes based on the PBFS construction order. At each iteration, all constructed nodes are dumped to the CPU memory, and the GPU memory is freed for the next iteration's use. In this way, the algorithm is able to build trees that are too large to be stored in the GPU memory. Experiments show that our algorithm can construct BVHs for scenes with up to 20 M triangles, several times larger than previous GPU algorithms.
Kanerva's sparse distributed memory: An associative memory algorithm well-suited to the Connection Machine

NASA Technical Reports Server (NTRS)

Rogers, David

1988-01-01

The advent of the Connection Machine profoundly changes the world of supercomputers. The highly nontraditional architecture makes possible the exploration of algorithms that were impractical for standard Von Neumann architectures. Sparse distributed memory (SDM) is an example of such an algorithm. Sparse distributed memory is a particularly simple and elegant formulation for an associative memory. The foundations for sparse distributed memory are described, and some simple examples of using the memory are presented. The relationship of sparse distributed memory to three important computational systems is shown: random-access memory, neural networks, and the cerebellum of the brain. Finally, the implementation of the algorithm for sparse distributed memory on the Connection Machine is discussed.
Genetic algorithms with memory- and elitism-based immigrants in dynamic environments.

PubMed

Yang, Shengxiang

2008-01-01

In recent years the genetic algorithm community has shown a growing interest in studying dynamic optimization problems. Several approaches have been devised. The random immigrants and memory schemes are two major ones. The random immigrants scheme addresses dynamic environments by maintaining the population diversity while the memory scheme aims to adapt genetic algorithms quickly to new environments by reusing historical information. This paper investigates a hybrid memory and random immigrants scheme, called memory-based immigrants, and a hybrid elitism and random immigrants scheme, called elitism-based immigrants, for genetic algorithms in dynamic environments. In these schemes, the best individual from memory or the elite from the previous generation is retrieved as the base to create immigrants into the population by mutation. This way, not only can diversity be maintained but it is done more efficiently to adapt genetic algorithms to the current environment. Based on a series of systematically constructed dynamic problems, experiments are carried out to compare genetic algorithms with the memory-based and elitism-based immigrants schemes against genetic algorithms with traditional memory and random immigrants schemes and a hybrid memory and multi-population scheme. The sensitivity analysis regarding some key parameters is also carried out. Experimental results show that the memory-based and elitism-based immigrants schemes efficiently improve the performance of genetic algorithms in dynamic environments.
Functions of external cues in prospective memory.

DOT National Transportation Integrated Search

1995-02-01

A simulation of an air traffic control task was the setting for an investigation of the functions of external cues in prospective memory. External cues can support the triggering of an action or memory for the content of the action. We focused on m...
Optimal operation management of fuel cell/wind/photovoltaic power sources connected to distribution networks

NASA Astrophysics Data System (ADS)

Niknam, Taher; Kavousifard, Abdollah; Tabatabaei, Sajad; Aghaei, Jamshid

2011-10-01

In this paper a new multiobjective modified honey bee mating optimization (MHBMO) algorithm is presented to investigate the distribution feeder reconfiguration (DFR) problem considering renewable energy sources (RESs) (photovoltaics, fuel cell and wind energy) connected to the distribution network. The objective functions of the problem to be minimized are the electrical active power losses, the voltage deviations, the total electrical energy costs and the total emissions of RESs and substations. During the optimization process, the proposed algorithm finds a set of non-dominated (Pareto) optimal solutions which are stored in an external memory called repository. Since the objective functions investigated are not the same, a fuzzy clustering algorithm is utilized to handle the size of the repository in the specified limits. Moreover, a fuzzy-based decision maker is adopted to select the 'best' compromised solution among the non-dominated optimal solutions of multiobjective optimization problem. In order to see the feasibility and effectiveness of the proposed algorithm, two standard distribution test systems are used as case studies.
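The repository of non-dominated solutions mentioned above can be illustrated with a generic Pareto archive update for minimization objectives. This is a minimal sketch, not the MHBMO implementation: the fuzzy clustering step that bounds the repository size is omitted, and the names `dominates` and `update_repository` are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a is no worse than b in every objective
    and strictly better in at least one (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_repository(repository, candidate):
    """Return the repository after offering it one candidate solution."""
    # Reject the candidate if an archived solution dominates or duplicates it.
    if any(dominates(kept, candidate) or kept == candidate for kept in repository):
        return repository
    # Otherwise drop every archived solution the candidate dominates and add it.
    return [kept for kept in repository if not dominates(candidate, kept)] + [candidate]


repo = []
for point in [(3, 3), (1, 4), (4, 1), (2, 2), (5, 5)]:
    repo = update_repository(repo, point)
```

After the loop the repository holds `(1, 4)`, `(4, 1)` and `(2, 2)`: the point `(3, 3)` was evicted when `(2, 2)` arrived, and `(5, 5)` was never admitted.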
Very Large Scale Optimization

NASA Technical Reports Server (NTRS)

Vanderplaats, Garrett; Townsend, James C. (Technical Monitor)

2002-01-01

The purpose of this research under the NASA Small Business Innovative Research program was to develop algorithms and associated software to solve very large nonlinear, constrained optimization tasks. Key issues included efficiency, reliability, memory, and gradient calculation requirements. This report describes the general optimization problem, ten candidate methods, and detailed evaluations of four candidates. The algorithm chosen for final development is a modern recreation of a 1960s external penalty function method that uses very limited computer memory and computational time. Although of lower efficiency, the new method can solve problems orders of magnitude larger than current methods. The resulting BIGDOT software has been demonstrated on problems with 50,000 variables and about 50,000 active constraints. For unconstrained optimization, it has solved a problem in excess of 135,000 variables. The method includes a technique for solving discrete variable problems that finds a "good" design, although a theoretical optimum cannot be guaranteed. It is very scalable in that the number of function and gradient evaluations does not change significantly with increased problem size. Test cases are provided to demonstrate the efficiency and reliability of the methods and software.
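For readers unfamiliar with exterior penalty methods, the following toy sketch shows the general idea on a one-variable problem. It is not the BIGDOT algorithm: the quadratic penalty form, the ternary-search inner solver, and all names and parameter values are assumptions chosen to keep the example small.

```python
def penalized(f, constraints, r):
    """Build f(x) + r * sum(max(0, g(x))**2) for constraints written as g(x) <= 0."""
    return lambda x: f(x) + r * sum(max(0.0, g(x)) ** 2 for g in constraints)

def minimize_1d(phi, lo, hi, iters=200):
    """Ternary search; valid here only because the example objective is convex on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if phi(a) < phi(b):
            hi = b
        else:
            lo = a
    return (lo + hi) / 2

def exterior_penalty(f, constraints, lo, hi, r=1.0, growth=10.0, rounds=6):
    """Solve a sequence of unconstrained problems with a growing penalty weight r."""
    x = None
    for _ in range(rounds):
        x = minimize_1d(penalized(f, constraints, r), lo, hi)
        r *= growth  # tighten the penalty so iterates approach feasibility
    return x

# Example: minimize (x - 3)^2 subject to x <= 1; the constrained optimum is x = 1.
x_star = exterior_penalty(lambda x: (x - 3) ** 2, [lambda x: x - 1], lo=-10.0, hi=10.0)
```

The iterates approach the constrained optimum from the infeasible side, which is the characteristic behaviour of exterior penalties; only the objective and constraint values are stored, which hints at why such methods need little memory.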
Google effects on memory: cognitive consequences of having information at our fingertips.

PubMed

Sparrow, Betsy; Liu, Jenny; Wegner, Daniel M

2011-08-05

The advent of the Internet, with sophisticated algorithmic search engines, has made accessing information as easy as lifting a finger. No longer do we have to make costly efforts to find the things we want. We can "Google" the old classmate, find articles online, or look up the actor who was on the tip of our tongue. The results of four studies suggest that when faced with difficult questions, people are primed to think about computers and that when people expect to have future access to information, they have lower rates of recall of the information itself and enhanced recall instead for where to access it. The Internet has become a primary form of external or transactive memory, where information is stored collectively outside ourselves.
A generalized memory test algorithm

NASA Technical Reports Server (NTRS)

Milner, E. J.

1982-01-01

A general algorithm for testing digital computer memory is presented. The test checks that (1) every bit can be cleared and set in each memory word, and (2) bits are not erroneously cleared and/or set elsewhere in memory at the same time. The algorithm can be applied to any size memory block and any size memory word. It is concise and efficient, requiring very few cycles through memory. For example, a test of 16-bit-word-size memory requires only 384 cycles through memory. Approximately 15 seconds were required to test a 32K block of such memory, using a microcomputer having a cycle time of 133 nanoseconds.
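The two properties checked by the test can be illustrated with a deliberately naive sketch. This is not the report's algorithm, which needs far fewer passes; the function name is hypothetical and a Python list of integers stands in for the memory block.

```python
def check_memory(mem, word_bits):
    """Naive check that each bit can be set and cleared without disturbing other words."""
    n = len(mem)
    for i in range(n):
        mem[i] = 0
    for addr in range(n):
        for bit in range(word_bits):
            mask = 1 << bit
            mem[addr] = mask  # set exactly one bit
            if mem[addr] != mask:
                return False  # the bit could not be set (or another bit came along)
            if any(mem[j] != 0 for j in range(n) if j != addr):
                return False  # a bit was erroneously set elsewhere
            mem[addr] = 0     # clear it again
            if mem[addr] != 0:
                return False  # the bit could not be cleared
    return True
```

This version rescans the whole block for every bit, so its cost grows quadratically with the block size; it is meant only to make the two conditions concrete.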
Programmable architecture for pixel level processing tasks in lightweight strapdown IR seekers

NASA Astrophysics Data System (ADS)

Coates, James L.

1993-06-01

Typical processing tasks associated with missile IR seeker applications are described, and a straw man suite of algorithms is presented. A fully programmable multiprocessor architecture is realized on a multimedia video processor (MVP) developed by Texas Instruments. The MVP combines the elements of RISC, floating point, advanced DSPs, graphics processors, display and acquisition control, RAM, and external memory. Front end pixel level tasks typical of missile interceptor applications, operating on 256 x 256 sensor imagery, can be processed at frame rates exceeding 100 Hz in a single MVP chip.
Noise facilitation in associative memories of exponential capacity.

PubMed

Karbasi, Amin; Salavati, Amir Hesam; Shokrollahi, Amin; Varshney, Lav R

2014-11-01

Recent advances in associative memory design through structured pattern sets and graph-based inference algorithms have allowed reliable learning and recall of an exponential number of patterns that satisfy certain subspace constraints. Although these designs correct external errors in recall, they assume neurons that compute noiselessly, in contrast to the highly variable neurons in brain regions thought to operate associatively, such as hippocampus and olfactory cortex. Here we consider associative memories with boundedly noisy internal computations and analytically characterize performance. As long as the internal noise level is below a specified threshold, the error probability in the recall phase can be made exceedingly small. More surprising, we show that internal noise improves the performance of the recall phase while the pattern retrieval capacity remains intact: the number of stored patterns does not reduce with noise (up to a threshold). Computational experiments lend additional support to our theoretical analysis. This work suggests a functional benefit to noisy neurons in biological neuronal networks.
Three list scheduling temporal partitioning algorithm of time space characteristic analysis and compare for dynamic reconfigurable computing

NASA Astrophysics Data System (ADS)

Chen, Naijin

2013-03-01

This paper designs and implements three temporal partitioning algorithms, Level Based Partitioning (LBP), Cluster Based Partitioning (CBP) and Enhance Static List (ESL), each based on an adjacency matrix and on an adjacency table. The partitioning time and memory occupation of the three algorithms are compared. Experimental results show that the LBP algorithm has the shortest partitioning time and better parallelism; in terms of memory occupation and partitioning time, the adjacency-table implementations take less time and less memory.
Computerized scoring algorithms for the Autobiographical Memory Test.

PubMed

Takano, Keisuke; Gutenbrunner, Charlotte; Martens, Kris; Salmon, Karen; Raes, Filip

2018-02-01

Reduced specificity of autobiographical memories is a hallmark of depressive cognition. Autobiographical memory (AM) specificity is typically measured by the Autobiographical Memory Test (AMT), in which respondents are asked to describe personal memories in response to emotional cue words. Due to this free descriptive responding format, the AMT relies on experts' hand scoring for subsequent statistical analyses. This manual coding potentially impedes research activities in big data analytics such as large epidemiological studies. Here, we propose computerized algorithms to automatically score AM specificity for the Dutch (adult participants) and English (youth participants) versions of the AMT by using natural language processing and machine learning techniques. The algorithms showed reliable performances in discriminating specific and nonspecific (e.g., overgeneralized) autobiographical memories in independent testing data sets (area under the receiver operating characteristic curve > .90). Furthermore, outcome values of the algorithms (i.e., decision values of support vector machines) showed a gradient across similar (e.g., specific and extended memories) and different (e.g., specific memory and semantic associates) categories of AMT responses, suggesting that, for both adults and youth, the algorithms well capture the extent to which a memory has features of specific memories. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Effects of Internal and External Vividness on Hippocampal Connectivity during Memory Retrieval

PubMed Central

Ford, Jaclyn H.; Kensinger, Elizabeth A.

2016-01-01

Successful memory for an image can be supported by retrieval of one’s personal reaction to the image (i.e., internal vividness), as well as retrieval of the specific details of the image itself (i.e., external vividness). Prior research suggests that memory vividness relies on regions within the medial temporal lobe, particularly the hippocampus, but it is unclear whether internal and external vividness are supported by the hippocampus in a similar way. To address this open question, the current study examined hippocampal connectivity associated with enhanced internal and external vividness ratings during retrieval. Participants encoded complex visual images paired with verbal titles. During a scanned retrieval session, they were presented with the titles and asked whether each had been seen with an image during encoding. Following retrieval of each image, participants were asked to rate internal and external vividness. Increased hippocampal activity was associated with higher vividness ratings for both scales, supporting prior evidence implicating the hippocampus in retrieval of memory detail. However, different patterns of hippocampal connectivity related to enhanced external and internal vividness. Further, hippocampal connectivity with medial prefrontal regions was associated with increased ratings of internal vividness, but with decreased ratings of external vividness. These findings suggest that the hippocampus may contribute to increased internal and external vividness via distinct mechanisms and that external and internal vividness of memories should be considered as separable measures. PMID:26778653
Minimizing the Disruptive Effects of Prospective Memory in Simulated Air Traffic Control

PubMed Central

Loft, Shayne; Smith, Rebekah E.; Remington, Roger

2015-01-01

Prospective memory refers to remembering to perform an intended action in the future. Failures of prospective memory can occur in air traffic control. In two experiments, we examined the utility of external aids for facilitating air traffic management in a simulated air traffic control task with prospective memory requirements. Participants accepted and handed-off aircraft and detected aircraft conflicts. The prospective memory task involved remembering to deviate from a routine operating procedure when accepting target aircraft. External aids that contained details of the prospective memory task appeared and flashed when target aircraft needed acceptance. In Experiment 1, external aids presented either adjacent or non-adjacent to each of the 20 target aircraft presented over the 40min test phase reduced prospective memory error by 11% compared to a condition without external aids. In Experiment 2, only a single target aircraft was presented a significant time (39min–42min) after presentation of the prospective memory instruction, and the external aids reduced prospective memory error by 34%. In both experiments, costs to the efficiency of non-prospective memory air traffic management (non-target aircraft acceptance response time, conflict detection response time) were reduced by non-adjacent aids compared to no aids or adjacent aids. In contrast, in both experiments, the efficiency of the prospective memory air traffic management (target aircraft acceptance response time) was facilitated by adjacent aids compared to non-adjacent aids. Together, these findings have potential implications for the design of automated alerting systems to maximize multi-task performance in work settings where operators monitor and control demanding perceptual displays. PMID:24059825
Vector Quantization Algorithm Based on Associative Memories

NASA Astrophysics Data System (ADS)

Guzmán, Enrique; Pogrebnyak, Oleksiy; Yáñez, Cornelio; Manrique, Pablo

This paper presents a vector quantization algorithm for image compression based on extended associative memories. The proposed algorithm is divided into two stages. First, an associative network is generated by applying the learning phase of the extended associative memories between a codebook generated by the LBG algorithm and a training set. This associative network is named EAM-codebook and represents a new codebook which is used in the next stage. The EAM-codebook establishes a relation between the training set and the LBG codebook. Second, the vector quantization process is performed by means of the recalling stage of EAM, using the EAM-codebook as the associative memory. This process generates a set of the class indices to which each input vector belongs. With respect to the LBG algorithm, the main advantages offered by the proposed algorithm are high processing speed and low demand of resources (system memory); results of image compression and quality are presented.
Runtime support for parallelizing data mining algorithms

NASA Astrophysics Data System (ADS)

Jin, Ruoming; Agrawal, Gagan

2002-03-01

With recent technological advances, shared memory parallel machines have become more scalable, and offer large main memories and high bus bandwidths. They are emerging as good platforms for data warehousing and data mining. In this paper, we focus on shared memory parallelization of data mining algorithms. We have developed a series of techniques for parallelization of data mining algorithms, including full replication, full locking, fixed locking, optimized full locking, and cache-sensitive locking. Unlike previous work on shared memory parallelization of specific data mining algorithms, all of our techniques apply to a large number of common data mining algorithms. In addition, we propose a reduction-object based interface for specifying a data mining algorithm. We show how our runtime system can apply any of the techniques we have developed starting from a common specification of the algorithm.
An adaptive replacement algorithm for paged-memory computer systems.

NASA Technical Reports Server (NTRS)

Thorington, J. M., Jr.; Irwin, J. D.

1972-01-01

A general class of adaptive replacement schemes for use in paged memories is developed. One such algorithm, called SIM, is simulated using a probability model that generates memory traces, and the results of the simulation of this adaptive scheme are compared with those obtained using the best nonlookahead algorithms. A technique for implementing this type of adaptive replacement algorithm with state of the art digital hardware is also presented.
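As a point of reference for the comparison mentioned above, a non-lookahead replacement policy can be simulated on a memory trace in a few lines. The sketch below uses least-recently-used (LRU) replacement, which is assumed here as a representative baseline; the abstract does not say which non-lookahead algorithms were used, and the SIM scheme itself is not reproduced.

```python
from collections import OrderedDict

def lru_page_faults(trace, frames):
    """Count page faults for LRU replacement over a sequence of page references."""
    resident = OrderedDict()  # pages ordered from least to most recently used
    faults = 0
    for page in trace:
        if page in resident:
            resident.move_to_end(page)        # refresh recency on a hit
        else:
            faults += 1
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict the least recently used page
            resident[page] = True
    return faults
```

For the trace 1, 2, 3, 1, 4, 2 with three frames, this counts five faults.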
CUDA Optimization Strategies for Compute- and Memory-Bound Neuroimaging Algorithms

PubMed Central

Lee, Daren; Dinov, Ivo; Dong, Bin; Gutman, Boris; Yanovsky, Igor; Toga, Arthur W.

2011-01-01

As neuroimaging algorithms and technology continue to grow faster than CPU performance in complexity and image resolution, data-parallel computing methods will be increasingly important. The high performance, data-parallel architecture of modern graphical processing units (GPUs) can reduce computational times by orders of magnitude. However, its massively threaded architecture introduces challenges when GPU resources are exceeded. This paper presents optimization strategies for compute- and memory-bound algorithms for the CUDA architecture. For compute-bound algorithms, the registers are reduced through variable reuse via shared memory and the data throughput is increased through heavier thread workloads and maximizing the thread configuration for a single thread block per multiprocessor. For memory-bound algorithms, fitting the data into the fast but limited GPU resources is achieved through reorganizing the data into self-contained structures and employing a multi-pass approach. Memory latencies are reduced by selecting memory resources whose cache performance is optimized for the algorithm's access patterns. We demonstrate the strategies on two computationally expensive algorithms and achieve optimized GPU implementations that perform up to 6× faster than unoptimized ones. Compared to CPU implementations, we achieve peak GPU speedups of 129× for the 3D unbiased nonlinear image registration technique and 93× for the non-local means surface denoising algorithm. PMID:21159404
CUDA optimization strategies for compute- and memory-bound neuroimaging algorithms.

PubMed

Lee, Daren; Dinov, Ivo; Dong, Bin; Gutman, Boris; Yanovsky, Igor; Toga, Arthur W

2012-06-01

As neuroimaging algorithms and technology continue to grow faster than CPU performance in complexity and image resolution, data-parallel computing methods will be increasingly important. The high performance, data-parallel architecture of modern graphical processing units (GPUs) can reduce computational times by orders of magnitude. However, its massively threaded architecture introduces challenges when GPU resources are exceeded. This paper presents optimization strategies for compute- and memory-bound algorithms for the CUDA architecture. For compute-bound algorithms, the registers are reduced through variable reuse via shared memory and the data throughput is increased through heavier thread workloads and maximizing the thread configuration for a single thread block per multiprocessor. For memory-bound algorithms, fitting the data into the fast but limited GPU resources is achieved through reorganizing the data into self-contained structures and employing a multi-pass approach. Memory latencies are reduced by selecting memory resources whose cache performance is optimized for the algorithm's access patterns. We demonstrate the strategies on two computationally expensive algorithms and achieve optimized GPU implementations that perform up to 6× faster than unoptimized ones. Compared to CPU implementations, we achieve peak GPU speedups of 129× for the 3D unbiased nonlinear image registration technique and 93× for the non-local means surface denoising algorithm. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Processing device with self-scrubbing logic

DOEpatents

Wojahn, Christopher K.

2016-03-01

An apparatus includes a processing unit including a configuration memory and self-scrubber logic coupled to read the configuration memory to detect compromised data stored in the configuration memory. The apparatus also includes a watchdog unit external to the processing unit and coupled to the self-scrubber logic to detect a failure in the self-scrubber logic. The watchdog unit is coupled to the processing unit to selectively reset the processing unit in response to detecting the failure in the self-scrubber logic. The apparatus also includes an external memory external to the processing unit and coupled to send configuration data to the configuration memory in response to a data feed signal outputted by the self-scrubber logic.

Effects of internal and external vividness on hippocampal connectivity during memory retrieval.

PubMed

Ford, Jaclyn H; Kensinger, Elizabeth A

2016-10-01

Successful memory for an image can be supported by retrieval of one's personal reaction to the image (i.e., internal vividness), as well as retrieval of the specific details of the image itself (i.e., external vividness). Prior research suggests that memory vividness relies on regions within the medial temporal lobe, particularly the hippocampus, but it is unclear whether internal and external vividness are supported by the hippocampus in a similar way. To address this open question, the current study examined hippocampal connectivity associated with enhanced internal and external vividness ratings during retrieval. Participants encoded complex visual images paired with verbal titles. During a scanned retrieval session, they were presented with the titles and asked whether each had been seen with an image during encoding. Following retrieval of each image, participants were asked to rate internal and external vividness. Increased hippocampal activity was associated with higher vividness ratings for both scales, supporting prior evidence implicating the hippocampus in retrieval of memory detail. However, different patterns of hippocampal connectivity related to enhanced external and internal vividness. Further, hippocampal connectivity with medial prefrontal regions was associated with increased ratings of internal vividness, but with decreased ratings of external vividness. These findings suggest that the hippocampus may contribute to increased internal and external vividness via distinct mechanisms and that external and internal vividness of memories should be considered as separable measures. Copyright © 2016 Elsevier Inc. All rights reserved.
High-precision positioning system of four-quadrant detector based on the database query

NASA Astrophysics Data System (ADS)

Zhang, Xin; Deng, Xiao-guo; Su, Xiu-qin; Zheng, Xiao-qiang

2015-02-01

The fine pointing mechanism of the Acquisition, Pointing and Tracking (APT) system in free-space laser communication usually uses a four-quadrant detector (QD) to point at and track the laser beam accurately. The positioning precision of the QD is one of the key factors in the pointing accuracy of the APT system. This paper presents a positioning system based on an FPGA and a DSP that performs the AD sampling, the positioning algorithm and the control of the fast swing mirror. Starting from the working principle of the QD, we analyze the positioning error of the facular center calculated by the universal algorithm when the facular energy follows a Gaussian distribution. A database is built by calculation and simulation in MATLAB, in which each facular center calculated by the universal algorithm is mapped to the corresponding facular center of the Gaussian beam, and the database is stored in two E2PROM chips that serve as the external memory of the DSP. In the DSP, the facular center of the Gaussian beam is looked up in the database using the facular center calculated by the universal algorithm. The experimental results show that the positioning accuracy of the high-precision positioning system is much better than that of the universal algorithm alone.
Real-Time Data Processing in the muon system of the D0 detector.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Neeti Parashar et al.

2001-07-03

This paper presents a real-time application of the 16-bit fixed point Digital Signal Processors (DSPs), in the Muon System of the D0 detector located at the Fermilab Tevatron, presently the world's highest-energy hadron collider. As part of the Upgrade for a run beginning in the year 2000, the system is required to process data at an input event rate of 10 kHz without incurring significant deadtime in readout. The ADSP21csp01 processor has high I/O bandwidth, single cycle instruction execution and fast task switching support to provide efficient multisignal processing. The processor's internal memory consists of 4K words of Program Memory and 4K words of Data Memory. In addition there is an external memory of 32K words for general event buffering and 16K words of Dual port Memory for input data queuing. This DSP fulfills the requirement of the Muon subdetector systems for data readout. All error handling, buffering, formatting and transferring of the data to the various trigger levels of the data acquisition system is done in software. The algorithms developed for the system complete these tasks in about 20 µs per event.
Memory effects for a stochastic fractional oscillator in a magnetic field

NASA Astrophysics Data System (ADS)

Mankin, Romi; Laas, Katrin; Laas, Tõnu; Paekivi, Sander

2018-01-01

The problem of random motion of harmonically trapped charged particles in a constant external magnetic field is studied. A generalized three-dimensional Langevin equation with a power-law memory kernel is used to model the interaction of Brownian particles with the complex structure of viscoelastic media (e.g., dusty plasmas). The influence of a fluctuating environment is modeled by an additive fractional Gaussian noise. In the long-time limit the exact expressions of the first-order and second-order moments of the fluctuating position for the Brownian particle subjected to an external periodic force in the plane perpendicular to the magnetic field have been calculated. Also, the particle's angular momentum is found. It is shown that an interplay of external periodic forcing, memory, and colored noise can generate a variety of cooperation effects, such as memory-induced sign reversals of the angular momentum, multiresonance versus Larmor frequency, and memory-induced particle confinement in the absence of an external trapping field. Particularly in the case without external trapping, if the memory exponent is lower than a critical value, we find a resonancelike behavior of the anisotropy in the particle position distribution versus the driving frequency, implying that it can be efficiently excited by an oscillating electric field. Similarities and differences between the behaviors of the models with internal and external noises are also discussed.
Distributed-Memory Fast Maximal Independent Set

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kanewala Appuhamilage, Thejaka Amila J.; Zalewski, Marcin J.; Lumsdaine, Andrew

The Maximal Independent Set (MIS) graph problem arises in many applications such as computer vision, information theory, molecular biology, and process scheduling. The growing scale of MIS problems suggests the use of distributed-memory hardware as a cost-effective approach to providing necessary compute and memory resources. Luby proposed four randomized algorithms to solve the MIS problem. All those algorithms are designed focusing on shared-memory machines and are analyzed using the PRAM model. These algorithms do not have direct efficient distributed-memory implementations. In this paper, we extend two of Luby’s seminal MIS algorithms, “Luby(A)” and “Luby(B),” to distributed-memory execution, and we evaluate their performance. We compare our results with the “Filtered MIS” implementation in the Combinatorial BLAS library for two types of synthetic graph inputs.
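A sequential simulation of a Luby-style randomized MIS round structure may help make the problem concrete. The sketch below uses the common random-priority formulation; it is not claimed to match the paper's "Luby(A)" or "Luby(B)" variants or their distributed-memory extensions, and the function name and adjacency-dictionary input format are assumptions.

```python
import random

def luby_mis(adj, rng=random):
    """adj: dict mapping each vertex to the set of its neighbours (undirected graph).

    Returns a maximal independent set as a Python set.
    """
    remaining = set(adj)
    mis = set()
    while remaining:
        # Each live vertex draws a random priority; ties are broken by vertex id.
        prio = {v: (rng.random(), v) for v in remaining}
        # A vertex wins the round if it beats every live neighbour.
        winners = {v for v in remaining
                   if all(prio[v] < prio[u] for u in adj[v] if u in remaining)}
        mis |= winners
        # Winners and their neighbours leave the graph before the next round.
        removed = set(winners)
        for v in winners:
            removed |= adj[v]
        remaining -= removed
    return mis
```

In a parallel setting every vertex performs the comparison in the same round independently, which is what makes the approach attractive; the vertex with the globally smallest priority always wins, so each round makes progress.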
A simplified computational memory model from information processing.

PubMed

Zhang, Lanhua; Zhang, Dongsheng; Deng, Yuqin; Ding, Xiaoqian; Wang, Yan; Tang, Yiyuan; Sun, Baoliang

2016-11-23

This paper is intended to propose a computational model for memory from the view of information processing. The model, called simplified memory information retrieval network (SMIRN), is a bi-modular hierarchical functional memory network by abstracting memory function and simulating memory information processing. At first meta-memory is defined to express the neuron or brain cortices based on the biology and graph theories, and we develop an intra-modular network with the modeling algorithm by mapping the node and edge, and then the bi-modular network is delineated with intra-modular and inter-modular. At last a polynomial retrieval algorithm is introduced. In this paper we simulate the memory phenomena and functions of memorization and strengthening by information processing algorithms. The theoretical analysis and the simulation results show that the model is in accordance with the memory phenomena from information processing view.
Processing device with self-scrubbing logic

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wojahn, Christopher K.

An apparatus includes a processing unit including a configuration memory and self-scrubber logic coupled to read the configuration memory to detect compromised data stored in the configuration memory. The apparatus also includes a watchdog unit external to the processing unit and coupled to the self-scrubber logic to detect a failure in the self-scrubber logic. The watchdog unit is coupled to the processing unit to selectively reset the processing unit in response to detecting the failure in the self-scrubber logic. The apparatus also includes an external memory external to the processing unit and coupled to send configuration data to the configuration memory in response to a data feed signal outputted by the self-scrubber logic.
Hänsel, Gretel and the slime mould—how an external spatial memory aids navigation in complex environments

NASA Astrophysics Data System (ADS)

Smith-Ferguson, Jules; Reid, Chris R.; Latty, Tanya; Beekman, Madeleine

2017-10-01

The ability to navigate through an environment is critical to most organisms’ ability to survive and reproduce. The presence of a memory system greatly enhances navigational success. Therefore, natural selection is likely to drive the creation of memory systems, even in non-neuronal organisms, if having such a system is adaptive. Here we examine if the external spatial memory system present in the acellular slime mould, Physarum polycephalum, provides an adaptive advantage for resource acquisition. P. polycephalum lays tracks of extracellular slime as it moves through its environment. Previous work has shown that the presence of extracellular slime allows the organism to escape from a trap in laboratory experiments simply by avoiding areas previously explored. Here we further investigate the benefits of using extracellular slime as an external spatial memory by testing the organism’s ability to navigate through environments of differing complexity with and without the ability to use its external memory. Our results suggest that the external memory has an adaptive advantage in ‘open’ and simple bounded environments. However, in a complex bounded environment, the extracellular slime provides no advantage, and may even negatively affect the organism’s navigational abilities. Our results indicate that the exact experimental set up matters if one wants to fully understand how the presence of extracellular slime affects the slime mould’s search behaviour.
Thinking about thinking: Neural mechanisms and effects on memory.

PubMed

Bonhage, Corinna; Weber, Friederike; Exner, Cornelia; Kanske, Philipp

2016-02-15

It is a well-established finding that memory encoding is impaired if an external secondary task (e.g. tone discrimination) is performed simultaneously. Yet, while studying we are also often engaged in internal secondary tasks such as planning, ruminating, or daydreaming. It remains unclear whether such a secondary internal task has similar effects on memory and what the neural mechanisms underlying such an influence are. We therefore measured participants' blood oxygenation level dependent responses while they learned word-pairs and simultaneously performed different types of secondary tasks (i.e., internal, external, and control). Memory performance decreased in both internal and external secondary tasks compared to the easy control condition. However, while the external task reduced activity in memory-encoding related regions (hippocampus), the internal task increased neural activity in brain regions associated with self-reflection (anterior medial prefrontal cortex), as well as in regions associated with performance monitoring and the perception of salience (anterior insula, dorsal anterior cingulate cortex). Resting-state functional connectivity analyses confirmed that anterior medial prefrontal cortex and anterior insula/dorsal anterior cingulate cortex are part of the default mode network and salience network, respectively. In sum, a secondary internal task impairs memory performance just as a secondary external task, but operates through different neural mechanisms. Copyright © 2015 Elsevier Inc. All rights reserved.
A simplified computational memory model from information processing

PubMed Central

Zhang, Lanhua; Zhang, Dongsheng; Deng, Yuqin; Ding, Xiaoqian; Wang, Yan; Tang, Yiyuan; Sun, Baoliang

2016-01-01

This paper is intended to propose a computational model for memory from the view of information processing. The model, called simplified memory information retrieval network (SMIRN), is a bi-modular hierarchical functional memory network by abstracting memory function and simulating memory information processing. At first meta-memory is defined to express the neuron or brain cortices based on the biology and graph theories, and we develop an intra-modular network with the modeling algorithm by mapping the node and edge, and then the bi-modular network is delineated with intra-modular and inter-modular. At last a polynomial retrieval algorithm is introduced. In this paper we simulate the memory phenomena and functions of memorization and strengthening by information processing algorithms. The theoretical analysis and the simulation results show that the model is in accordance with the memory phenomena from information processing view. PMID:27876847
Optimum location of external markers using feature selection algorithms for real-time tumor tracking in external-beam radiotherapy: a virtual phantom study.

PubMed

Nankali, Saber; Torshabi, Ahmad Esmaili; Miandoab, Payam Samadi; Baghizadeh, Amin

2016-01-08

In external-beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation-based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two "Genetic" and "Ranker" searching procedures. The performance of these algorithms has been evaluated using four-dimensional extended cardiac-torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro-fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F-test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. 
But for liver tumors, a correlation-based feature selection algorithm, in combination with a genetic search algorithm, proved to yield best performance accuracy for selecting optimum markers.
Transactive memory in organizational groups: the effects of content, consensus, specialization, and accuracy on group performance.

PubMed

Austin, John R

2003-10-01

Previous research on transactive memory has found a positive relationship between transactive memory system development and group performance in single project laboratory and ad hoc groups. Closely related research on shared mental models and expertise recognition supports these findings. In this study, the author examined the relationship between transactive memory systems and performance in mature, continuing groups. A group's transactive memory system, measured as a combination of knowledge stock, knowledge specialization, transactive memory consensus, and transactive memory accuracy, is positively related to group goal performance, external group evaluations, and internal group evaluations. The positive relationship with group performance was found to hold for both task and external relationship transactive memory systems.
Positive and negative generation effects in source monitoring.

PubMed

Riefer, David M; Chien, Yuchin; Reimer, Jason F

2007-10-01

Research is mixed as to whether self-generation improves memory for the source of information. We propose the hypothesis that positive generation effects (better source memory for self-generated information) occur in reality-monitoring paradigms, while negative generation effects (better source memory for externally presented information) tend to occur in external source-monitoring paradigms. This hypothesis was tested in an experiment in which participants read or generated words, followed by a memory test for the source of each word (read or generated) and the word's colour. Meiser and Bröder's (2002) multinomial model for crossed source dimensions was used to analyse the data, showing that source memory for generation (reality monitoring) was superior for the generated words, while source memory for word colour (external source monitoring) was superior for the read words. The model also revealed the influence of strong response biases in the data, demonstrating the usefulness of formal modelling when examining generation effects in source monitoring.
Analogical Reasoning in Adolescents with Intellectual Disability: Effects of External Memories and Time Processing

ERIC Educational Resources Information Center

Denaes, Caroline; Berger, Jean-Louis

2014-01-01

Analogical reasoning involves the comparison of pictures as well as the memorisation of relations. Young children (4-7 years old) and students with moderate intellectual disability have a short memory span, which hampers them in succeeding traditional analogical tests. In the present study, we investigated if, by providing external memory hints,…
A divide-and-conquer algorithm for large-scale de novo transcriptome assembly through combining small assemblies from existing algorithms.

PubMed

Sze, Sing-Hoi; Parrott, Jonathan J; Tarone, Aaron M

2017-12-06

While the continued development of high-throughput sequencing has facilitated studies of entire transcriptomes in non-model organisms, the incorporation of an increasing amount of RNA-Seq libraries has made de novo transcriptome assembly difficult. Although algorithms that can assemble a large amount of RNA-Seq data are available, they are generally very memory-intensive and can only be used to construct small assemblies. We develop a divide-and-conquer strategy that allows these algorithms to be utilized, by subdividing a large RNA-Seq data set into small libraries. Each individual library is assembled independently by an existing algorithm, and a merging algorithm is developed to combine these assemblies by picking a subset of high quality transcripts to form a large transcriptome. When compared to existing algorithms that return a single assembly directly, this strategy achieves comparable or increased accuracy as memory-efficient algorithms that can be used to process a large amount of RNA-Seq data, and comparable or decreased accuracy as memory-intensive algorithms that can only be used to construct small assemblies. Our divide-and-conquer strategy allows memory-intensive de novo transcriptome assembly algorithms to be utilized to construct large assemblies.
KungFQ: a simple and powerful approach to compress fastq files.

PubMed

Grassi, Elena; Di Gregorio, Federico; Molineris, Ivan

2012-01-01

Storing data derived from deep sequencing experiments has become pivotal, and standard compression algorithms do not satisfactorily exploit the structure of these data. A number of reference-based compression algorithms have been developed, but they are less adequate for new species without fully sequenced genomes or for nongenomic data. We developed a tool that takes advantage of fastq characteristics and encodes them in a binary format optimized for further compression with standard tools (such as gzip or lzma). The algorithm is straightforward and does not need any external reference file; it scans the fastq only once and has a constant memory requirement. Moreover, we added the possibility to perform lossy compression, losing some of the original information (IDs and/or qualities) but resulting in smaller files; it is also possible to define a quality cutoff under which corresponding base calls are converted to N. We achieve compression ratios of 2.82 to 7.77 on various fastq files without losing information, and of 5.37 to 8.77 when losing IDs, which are often not used in common analysis pipelines. In this paper, we compare the algorithm's performance with known tools, usually obtaining higher compression levels.
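The single-pass, constant-memory scan and the quality cutoff described in this abstract can be illustrated with a short sketch. It is not KungFQ's code: it assumes Phred+33 quality encoding and a strict four-lines-per-record fastq layout, and the names `mask_low_quality` and `stream_fastq` are hypothetical.

```python
import io
from typing import Iterator, TextIO, Tuple

PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger / Illumina 1.8+) quality encoding


def mask_low_quality(seq: str, qual: str, cutoff: int) -> str:
    """Replace every base whose Phred score is below `cutoff` with 'N'."""
    return "".join(
        base if ord(q) - PHRED_OFFSET >= cutoff else "N"
        for base, q in zip(seq, qual)
    )


def stream_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (id, sequence, quality) one record at a time in a single pass.

    Only one record is held in memory, so memory use does not grow
    with the size of the file.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or an empty line) stops the scan
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


if __name__ == "__main__":
    demo = io.StringIO("@read1\nACGT\n+\nII#I\n")
    for read_id, seq, qual in stream_fastq(demo):
        print(read_id, mask_low_quality(seq, qual, cutoff=20))
```

In the demo record the third base has quality character `#` (Phred 2), so it falls under the cutoff of 20 and is masked, while the other three bases (Phred 40) are kept.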
The impact of cognitive control, incentives, and working memory load on the P3 responses of externalizing prisoners.

PubMed

Baskin-Sommers, Arielle R; Krusemark, Elizabeth A; Curtin, John J; Lee, Christopher; Vujnovich, Aleice; Newman, Joseph P

2014-02-01

The P3 amplitude reduction is one of the most common correlates of externalizing. However, few studies have used experimental manipulations designed to challenge different cognitive functions in order to clarify the processes that impact this reduction. To examine factors moderating P3 amplitude in trait externalizing, we administered an n-back task that manipulated cognitive control demands, working memory load, and incentives to a sample of male offenders. Offenders with high trait externalizing scores did not display a global reduction in P3 amplitude. Rather, the negative association between trait externalizing and P3 amplitude was specific to trials involving inhibition of a dominant response during infrequent stimuli, in the context of low working memory load, and incentives for performance. In addition, we discuss the potential implications of these findings for externalizing-related psychopathologies. The results complement and expand previous work on the process-level dysfunction contributing to externalizing-related deficits in P3. Copyright © 2013 Elsevier B.V. All rights reserved.
On the definition of the concepts thinking, consciousness, and conscience.

PubMed Central

Monin, A S

1992-01-01

A complex system (CS) is defined as a set of elements, with connections between them, singled out of the environment, capable of getting information from the environment, capable of making decisions (i.e., of choosing between alternatives), and having purposefulness (i.e., an urge towards preferable states or other goals). Thinking is a process that takes place (or which can take place) in some of the CS and consists of (i) receiving information from the environment (and from itself), (ii) memorizing the information, (iii) the subconscious, and (iv) consciousness. Life is a process that takes place in some CS and consists of functions i and ii, as well as (v) reproduction with passing of hereditary information to progeny, and (vi) oriented energy and matter exchange with the environment sufficient for the maintenance of all life processes. Memory is a complex of processes of placing information in memory banks, keeping it there, and producing it according to prescriptions available in the system or to inquiries arising in it. Consciousness is a process of realization by the thinking CS of some set of algorithms consisting of the comparison of its knowledge, intentions, decisions, and actions with reality--i.e., with accumulated and continuously received internal and external information. Conscience is a realization of an algorithm of good and evil pattern recognition. PMID:1631060
Fault Tolerant Frequent Pattern Mining

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shohdy, Sameh; Vishnu, Abhinav; Agrawal, Gagan

FP-Growth algorithm is a Frequent Pattern Mining (FPM) algorithm that has been extensively used to study correlations and patterns in large scale datasets. While several researchers have designed distributed memory FP-Growth algorithms, it is pivotal to consider fault tolerant FP-Growth, which can address the increasing fault rates in large scale systems. In this work, we propose a novel parallel, algorithm-level fault-tolerant FP-Growth algorithm. We leverage algorithmic properties and MPI advanced features to guarantee an O(1) space complexity, achieved by using the dataset memory space itself for checkpointing. We also propose a recovery algorithm that can use in-memory and disk-based checkpointing, though in many cases the recovery can be completed without any disk access, and incurring no memory overhead for checkpointing. We evaluate our FT algorithm on a large scale InfiniBand cluster with several large datasets using up to 2K cores. Our evaluation demonstrates excellent efficiency for checkpointing and recovery in comparison to the disk-based approach. We have also observed 20x average speed-up in comparison to Spark, establishing that a well designed algorithm can easily outperform a solution based on a general fault-tolerant programming model.
Genetic evolutionary taboo search for optimal marker placement in infrared patient setup

NASA Astrophysics Data System (ADS)

Riboldi, M.; Baroni, G.; Spadea, M. F.; Tagaste, B.; Garibaldi, C.; Cambria, R.; Orecchia, R.; Pedotti, A.

2007-09-01

In infrared patient setup adequate selection of the external fiducial configuration is required for compensating inner target displacements (target registration error, TRE). Genetic algorithms (GA) and taboo search (TS) were applied in a newly designed approach to optimal marker placement: the genetic evolutionary taboo search (GETS) algorithm. In the GETS paradigm, multiple solutions are simultaneously tested in a stochastic evolutionary scheme, where taboo-based decision making and adaptive memory guide the optimization process. The GETS algorithm was tested on a group of ten prostate patients, to be compared to standard optimization and to randomly selected configurations. The changes in the optimal marker configuration, when TRE is minimized for OARs, were specifically examined. Optimal GETS configurations ensured a 26.5% mean decrease in the TRE value, versus 19.4% for conventional quasi-Newton optimization. Common features in GETS marker configurations were highlighted in the dataset of ten patients, even when multiple runs of the stochastic algorithm were performed. Including OARs in TRE minimization did not considerably affect the spatial distribution of GETS marker configurations. In conclusion, the GETS algorithm proved to be highly effective in solving the optimal marker placement problem. Further work is needed to embed site-specific deformation models in the optimization process.

Research on fast Fourier transforms algorithm of huge remote sensing image technology with GPU and partitioning technology.

PubMed

Yang, Xue; Li, Xue-You; Li, Jia-Guo; Ma, Jun; Zhang, Li; Yang, Jan; Du, Quan-Ye

2014-02-01

The fast Fourier transform (FFT) is a basic approach to remote sensing image processing. As remote sensing image capture improves in hyperspectral coverage, spatial resolution and temporal resolution, using FFT technology to efficiently process huge remote sensing images has become a critical step and a research hot spot in image processing. The FFT algorithm, one of the basic algorithms of image processing, can be used for stripe noise removal, image compression, image registration, etc. in processing remote sensing images. The CUFFT function library is a GPU-based FFT library. FFTW is an FFT library developed for the CPU on the PC platform, and is currently the fastest CPU-based FFT function library. However, both share a common problem: once the available memory is smaller than the image, an out-of-memory error or memory overflow occurs when either library is used to compute the image FFT. To address this problem, a GPU and partitioning technology based Huge Remote Fast Fourier Transform (HRFFT) algorithm is proposed in this paper. By improving the FFT algorithm in the CUFFT function library, the problem of running out of memory and memory overflow is solved. Moreover, the method is validated by experiments on CCD images from the HJ-1A satellite. When applied to practical image processing, it improves the processing result and speeds up processing, which saves computation time.
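The partitioning idea in this abstract rests on a standard property: the 2D discrete Fourier transform is separable, so it can be computed as 1D transforms over rows and then over columns, and each pass can be done one tile at a time. The sketch below illustrates only that property in plain Python. It is not the authors' HRFFT and does not use CUFFT or FFTW; the names `fft1d` and `fft2d_partitioned` and the `tile` parameter are hypothetical. The whole array stays in ordinary memory here, and each tile loop stands in for the step where a real implementation would load one partition into limited device or main memory.

```python
import cmath
from typing import List

Matrix = List[List[complex]]


def fft1d(x: List[complex]) -> List[complex]:
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft2d_partitioned(image: Matrix, tile: int) -> Matrix:
    """2D FFT that works on `tile` rows, then `tile` columns, at a time."""
    rows, cols = len(image), len(image[0])
    out = [[complex(v) for v in row] for row in image]
    # Pass 1: transform the rows, one tile of rows per step.
    for start in range(0, rows, tile):
        for r in range(start, min(start + tile, rows)):
            out[r] = fft1d(out[r])
    # Pass 2: transform the columns, one tile of columns per step.
    for start in range(0, cols, tile):
        for c in range(start, min(start + tile, cols)):
            column = fft1d([out[r][c] for r in range(rows)])
            for r in range(rows):
                out[r][c] = column[r]
    return out


if __name__ == "__main__":
    img = [[1, 2, 3, 4], [0, 1, 0, 1], [2, 0, 2, 0], [5, 5, 1, 1]]
    spectrum = fft2d_partitioned(img, tile=2)
    print(spectrum[0][0])  # DC term: the sum of all pixel values
```

The tile size changes only how much data is touched per step, not the result, which is why the partitioned transform can match a whole-image transform.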
Acceleration of block-matching algorithms using a custom instruction-based paradigm on a Nios II microprocessor

NASA Astrophysics Data System (ADS)

González, Diego; Botella, Guillermo; García, Carlos; Prieto, Manuel; Tirado, Francisco

2013-12-01

This contribution focuses on the optimization of matching-based motion estimation algorithms, widely used in video coding standards, using an Altera custom instruction-based paradigm and a combination of synchronous dynamic random access memory (SDRAM) with on-chip memory in Nios II processors. The algorithms are fully profiled before optimization to locate code leaks; a custom instruction set is then created and added to the specific design, enhancing the original system. In addition, every possible memory combination between on-chip memory and SDRAM has been tested to achieve the best performance. The final throughput of the complete designs is shown. This manuscript outlines a low-cost system, mapped using very large scale integration technology, which accelerates software algorithms by converting them into custom hardware logic blocks, and shows the best combination of on-chip memory and SDRAM for the Nios II processor.
A New Local Bipolar Autoassociative Memory Based on External Inputs of Discrete Recurrent Neural Networks With Time Delay.

PubMed

Zhou, Caigen; Zeng, Xiaoqin; Luo, Chaomin; Zhang, Huaguang

In this paper, local bipolar auto-associative memories are presented based on discrete recurrent neural networks with a class of gain type activation function. The weight parameters of neural networks are acquired by a set of inequalities without the learning procedure. The global exponential stability criteria are established to ensure the accuracy of the restored patterns by considering time delays and external inputs. The proposed methodology is capable of effectively overcoming spurious memory patterns and achieving memory capacity. The effectiveness, robustness, and fault-tolerant capability are validated by simulated experiments.
A parallel approximate string matching under Levenshtein distance on graphics processing units using warp-shuffle operations

PubMed Central

Ho, ThienLuan; Oh, Seung-Rohk

2017-01-01

Approximate string matching with k-differences has a number of practical applications, ranging from pattern recognition to computational biology. This paper proposes an efficient memory-access algorithm for parallel approximate string matching with k-differences on Graphics Processing Units (GPUs). In the proposed algorithm, all threads in the same GPU warp share data using warp-shuffle operations instead of accessing the shared memory. Moreover, we implement the proposed algorithm by exploiting the memory structure of GPUs to optimize its performance. Experimental results for real DNA packages revealed that the proposed algorithm and its implementation achieved speedups of up to 122.64 and 1.53 times compared to the sequential algorithm on CPU and the previous parallel approximate string matching algorithm on GPUs, respectively. PMID:29016700
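For orientation, the k-differences problem named in this abstract can be solved sequentially with the classic column-wise dynamic program (often attributed to Sellers). The sketch below is a plain CPU baseline of that kind, not the GPU warp-shuffle algorithm the paper proposes; the function name `k_differences` is hypothetical.

```python
from typing import List


def k_differences(pattern: str, text: str, k: int) -> List[int]:
    """Return 0-based end positions in `text` where some substring ending
    there is within Levenshtein distance k of `pattern`."""
    m = len(pattern)
    prev = list(range(m + 1))  # column for the empty text prefix: D[i] = i
    hits = []
    for j, tc in enumerate(text):
        cur = [0] * (m + 1)  # D[0] = 0: a match may start at any text position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == tc else 1
            cur[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in the text
                cur[i - 1] + 1,      # pattern character missing from the text
            )
        if cur[m] <= k:
            hits.append(j)
        prev = cur
    return hits


if __name__ == "__main__":
    print(k_differences("abc", "xxabcxx", 0))  # exact occurrences only
    print(k_differences("abc", "xxabcxx", 1))  # allow one edit
```

Each column depends only on the previous one, so memory use is proportional to the pattern length.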
Optimum location of external markers using feature selection algorithms for real‐time tumor tracking in external‐beam radiotherapy: a virtual phantom study

PubMed Central

Nankali, Saber; Miandoab, Payam Samadi; Baghizadeh, Amin

2016-01-01

In external‐beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation‐based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two “Genetic” and “Ranker” searching procedures. The performance of these algorithms has been evaluated using four‐dimensional extended cardiac‐torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro‐fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F‐test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. 
But for liver tumors, a correlation‐based feature selection algorithm, in combination with a genetic search algorithm, proved to yield best performance accuracy for selecting optimum markers. PACS numbers: 87.55.km, 87.56.Fc PMID:26894358
Parallel 3D-TLM algorithm for simulation of the Earth-ionosphere cavity

NASA Astrophysics Data System (ADS)

Toledo-Redondo, Sergio; Salinas, Alfonso; Morente-Molinera, Juan Antonio; Méndez, Antonio; Fornieles, Jesús; Portí, Jorge; Morente, Juan Antonio

2013-03-01

A parallel 3D algorithm for solving time-domain electromagnetic problems with arbitrary geometries is presented. The technique employed is the Transmission Line Modeling (TLM) method implemented in Shared Memory (SM) environments. The benchmarking performed reveals that the maximum speedup depends on the memory size of the problem as well as multiple hardware factors, like the disposition of CPUs, cache, or memory. A maximum speedup of 15 has been measured for the largest problem. In certain circumstances of low memory requirements, superlinear speedup is achieved using our algorithm. The model is employed to model the Earth-ionosphere cavity, thus enabling a study of the natural electromagnetic phenomena that occur in it. The algorithm allows complete 3D simulations of the cavity with a resolution of 10 km, within a reasonable timescale.
A matrix-algebraic formulation of distributed-memory maximal cardinality matching algorithms in bipartite graphs

DOE PAGES

Azad, Ariful; Buluç, Aydın

2016-05-16

We describe parallel algorithms for computing maximal cardinality matching in a bipartite graph on distributed-memory systems. Unlike traditional algorithms that match one vertex at a time, our algorithms process many unmatched vertices simultaneously using a matrix-algebraic formulation of maximal matching. This generic matrix-algebraic framework is used to develop three efficient maximal matching algorithms with minimal changes. The newly developed algorithms have two benefits over existing graph-based algorithms. First, unlike existing parallel algorithms, cardinality of matching obtained by the new algorithms stays constant with increasing processor counts, which is important for predictable and reproducible performance. Second, relying on bulk-synchronous matrix operations, these algorithms expose a higher degree of parallelism on distributed-memory platforms than existing graph-based algorithms. We report high-performance implementations of three maximal matching algorithms using hybrid OpenMP-MPI and evaluate the performance of these algorithms using more than 35 real and randomly generated graphs. On real instances, our algorithms achieve up to 200× speedup on 2048 cores of a Cray XC30 supercomputer. Even higher speedups are obtained on larger synthetically generated graphs where our algorithms show good scaling on up to 16,384 cores.
Focus of Attention in Children's Motor Learning: Examining the Role of Age and Working Memory.

PubMed

Brocken, J E A; Kal, E C; van der Kamp, J

2016-01-01

The authors investigated the relative effectiveness of different attentional focus instructions on motor learning in primary school children. In addition, we explored whether the effect of attentional focus on motor learning was influenced by children's age and verbal working memory capacity. Novice 8-9-year-old children (n = 30) and 11-12-year-old children (n = 30) practiced a golf putting task. For each age group, half the participants received instructions to focus (internally) on the swing of their arm, while the other half was instructed to focus (externally) on the swing of the club. Children's verbal working memory capacity was assessed with the Automated Working Memory Assessment. Consistent with many reports on adults' motor learning, children in the external groups demonstrated greater improvements in putting accuracy than children who practiced with an internal focus. This effect was similar across age groups. Verbal working memory capacity was not found to be predictive of motor learning, neither for children in the internal focus groups nor for children in the external focus groups. In conclusion, primary school children's motor learning is enhanced by external focus instructions compared to internal focus instructions. The purported modulatory roles of children's working memory, attentional capacity, or focus preferences require further investigation.
Design and implementation of an optical Gaussian noise generator

NASA Astrophysics Data System (ADS)

Zão, Leonardo; Loss, Gustavo; Coelho, Rosângela

2009-08-01

A design of a fast and accurate optical Gaussian noise generator is proposed and demonstrated. The noise sample generation is based on the Box-Muller algorithm. The functions were implemented on a high-speed Altera Stratix EP1S25 field-programmable gate array (FPGA) development kit, which enabled the generation of 150 million 16-bit noise samples per second. The Gaussian noise generator required only 7.4% of the FPGA logic elements, 1.2% of the RAM memory, 0.04% of the ROM memory, and a laser source. The optical pulses were generated by a laser source externally modulated by the data bit samples using the frequency-shift keying technique. The accuracy of the noise samples was evaluated for different sequence sizes and confidence intervals. The noise sample pattern was validated by the Bhattacharyya distance (Bd) and the autocorrelation function. The results showed that the proposed design of the optical Gaussian noise generator is very promising for evaluating the performance of optical communications channels with very low bit-error-rate values.
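The Box-Muller algorithm named in this abstract maps two independent uniform samples to two independent standard normal samples. The following is a minimal floating-point software sketch of that transform. It says nothing about the authors' 16-bit fixed-point FPGA design, and the function names are hypothetical.

```python
import math
import random
from typing import List, Tuple


def box_muller(u1: float, u2: float) -> Tuple[float, float]:
    """Map uniforms u1 in (0, 1] and u2 in [0, 1) to two N(0, 1) samples."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def gaussian_samples(n: int, seed: int = 0) -> List[float]:
    """Generate n standard normal samples from a seeded uniform source."""
    rng = random.Random(seed)
    out: List[float] = []
    while len(out) < n:
        u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is finite
        out.extend(box_muller(u1, rng.random()))
    return out[:n]


if __name__ == "__main__":
    samples = gaussian_samples(20000, seed=1)
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    print(f"mean={mean:.3f} variance={var:.3f}")
```

With enough samples, the printed mean and variance should be close to 0 and 1, which gives a quick sanity check before stricter validation such as the distance and autocorrelation measures the abstract mentions.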
The Forensic Potential of Flash Memory

DTIC Science & Technology

2009-09-01

limit range of 10 to 100 years before data is lost [12]. 5. Flash Memory Logical Structure The logical structure of flash memory from least to...area is not standardized and is manufacturer specific. This information will be used by the wear leveling algorithms and as such will be proprietary...memory cells, the manufacturers of the flash implement a wear leveling algorithm. In contrast, a magnetic disk in an overwrite operation will reuse the
Numerical simulation of three dimensional transonic flows

NASA Technical Reports Server (NTRS)

Sahu, Jubaraj; Steger, Joseph L.

1987-01-01

The three-dimensional flow over a projectile has been computed using an implicit, approximately factored, partially flux-split algorithm. A simple composite grid scheme has been developed in which a single grid is partitioned into a series of smaller grids for applications which require an external large memory device such as the SSD of the CRAY X-MP/48, or multitasking. The accuracy and stability of the composite grid scheme have been tested by numerically simulating the flow over an ellipsoid at angle of attack and comparing the solution with a single grid solution. The flowfield over a projectile at M = 0.96 and 4 deg angle-of-attack has been computed using a fine grid, and compared with experiment.
A highly efficient 3D level-set grain growth algorithm tailored for ccNUMA architecture

NASA Astrophysics Data System (ADS)

Mießen, C.; Velinov, N.; Gottstein, G.; Barrales-Mora, L. A.

2017-12-01

A highly efficient simulation model for 2D and 3D grain growth was developed based on the level-set method. The model introduces modern computational concepts to achieve excellent performance on parallel computer architectures. Strong scalability was measured on cache-coherent non-uniform memory access (ccNUMA) architectures. To achieve this, the proposed approach considers the application of local level-set functions at the grain level. Ideal and non-ideal grain growth was simulated in 3D with the objective to study the evolution of statistical representative volume elements in polycrystals. In addition, microstructure evolution in an anisotropic magnetic material affected by an external magnetic field was simulated.
SU-F-J-10: Sliding Mode Control of a SMA Actuated Active Flexible Needle for Medical Procedures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Podder, T

Purpose: In medical interventional procedures such as brachytherapy, ablative therapies and biopsy, precise steering and accurate placement of needles are very important for anatomical obstacle avoidance and accurate targeting. This study presents the efficacy of a sliding mode controller for a Shape Memory Alloy (SMA) actuated flexible needle for medical procedures. Methods: Second order system dynamics of the SMA actuated active flexible needle were used for deriving the sliding mode control equations. Both proportional-integral-derivative (PID) and adaptive PID sliding mode control (APIDSMC) algorithms were developed and implemented. The flexible needle was attached at the end of a 6 DOF robotic system. Through the LabView programming environment, the control commands were generated using the PID and APIDSMC algorithms. Experiments with an artificial tissue-mimicking phantom were performed to evaluate the performance of the controller. The actual needle tip position was obtained using an electromagnetic (EM) tracking sensor (Aurora, NDI, Waterloo, Canada) at a sampling period of 1 ms. During the experiments, external disturbances were created by applying force and thermal shock to investigate the robustness of the controllers. Results: The root mean square error (RMSE) values for the APIDSMC and PID controllers were 0.75 mm and 0.92 mm, respectively, for a sinusoidal reference input. In the presence of external disturbances, the APIDSMC controller showed a much smoother response with less overshoot than the PID controller. Conclusion: Performance of the APIDSMC was superior to that of the PID controller. The APIDSMC proved to be a more effective controller in compensating for the SMA uncertainties and external disturbances with clinically acceptable thresholds.
Binary mesh partitioning for cache-efficient visualization.

PubMed

Tchiboukdjian, Marc; Danjean, Vincent; Raffin, Bruno

2010-01-01

One important bottleneck when visualizing large data sets is the data transfer between processor and memory. Cache-aware (CA) and cache-oblivious (CO) algorithms take the memory hierarchy into consideration to achieve cache efficiency. CO approaches have the advantage of adapting to unknown and varying memory hierarchies. Recent CA and CO algorithms developed for 3D mesh layouts significantly improve on the performance of previous approaches, but they lack theoretical performance guarantees. We present in this paper an $O(N \log N)$ algorithm to compute a CO layout for unstructured but well shaped meshes. We prove that a coherent traversal of an N-size mesh in dimension d induces less than $N/B + O(N/M^{1/d})$ cache-misses, where B and M are the block size and the cache size, respectively. Experiments show that our layout computation is faster and significantly less memory consuming than the best known CO algorithm. Performance is comparable to this algorithm for classical visualization algorithm access patterns, or better when the BSP tree produced while computing the layout is used as an acceleration data structure adjusted to the layout. We also show that cache oblivious approaches lead to significant performance increases on recent GPU architectures.
Weather prediction using a genetic memory

NASA Technical Reports Server (NTRS)

Rogers, David

1990-01-01

Kanerva's sparse distributed memory (SDM) is an associative memory model based on the mathematical properties of high dimensional binary address spaces. Holland's genetic algorithms are a search technique for high dimensional spaces inspired by evolutional processes of DNA. Genetic Memory is a hybrid of the above two systems, in which the memory uses a genetic algorithm to dynamically reconfigure its physical storage locations to reflect correlations between the stored addresses and data. This architecture is designed to maximize the ability of the system to scale-up to handle real world problems.
The design of an adaptive predictive coder using a single-chip digital signal processor

NASA Astrophysics Data System (ADS)

Randolph, M. A.

1985-01-01

A speech coding processor architecture design study has been performed in which the Texas Instruments TMS32010 has been selected from among three commercially available digital signal processing integrated circuits and evaluated in an implementation study of real-time Adaptive Predictive Coding (APC). The TMS32010 has been compared with the AT&T Bell Laboratories DSP I and the Nippon Electric Co. µPD7720 and was found to be most suitable for a single chip implementation of APC. A preliminary system design based on the TMS32010 has been performed, and several of the hardware and software design issues are discussed. Particular attention was paid to the design of an external memory controller which permits rapid sequential access of external RAM. As a result, it has been determined that a compact hardware implementation of the APC algorithm is feasible based on the TMS32010. Originator-supplied keywords include: vocoders, speech compression, adaptive predictive coding, digital signal processing microcomputers, speech processor architectures, and special purpose processor.
Dynamic Organization of Hierarchical Memories

PubMed Central

Kurikawa, Tomoki; Kaneko, Kunihiko

2016-01-01

In the brain, external objects are categorized in a hierarchical way. Although it is widely accepted that objects are represented as static attractors in neural state space, this view does not take into account the interaction between intrinsic neural dynamics and external input, which is essential to understand how a neural system responds to inputs. Indeed, structured spontaneous neural activity without external inputs is known to exist, and its relationship with evoked activities is discussed. How categorical representation is embedded into the spontaneous and evoked activities therefore has to be uncovered. To address this question, we studied the bifurcation process with increasing input after hierarchically clustered associative memories are learned. We found a “dynamic categorization”; neural activity without input wanders globally over the state space including all memories. Then, with the increase of input strength, the diffuse representation of a higher category exhibits transitions to focused ones specific to each object. The hierarchy of memories is embedded in the transition probability from one memory to another during the spontaneous dynamics. With increased input strength, neural activity wanders over a narrower state space including a smaller set of memories, showing a more specific category or memory corresponding to the applied input. Moreover, such coarse-to-fine transitions are also observed temporally during the transient process under constant input, which agrees with experimental findings in the temporal cortex. These results suggest that the hierarchy emerging through interaction with an external input underlies the hierarchy during the transient process, as well as in the spontaneous activity. PMID:27618549
Memory Management of Multimedia Services in Smart Homes

NASA Astrophysics Data System (ADS)

Kamel, Ibrahim; Muhaureq, Sanaa A.

Nowadays there is a wide spectrum of applications that run in smart home environments. Consequently, the home gateway, which is a central component in the smart home, must manage many applications despite limited memory resources. OSGi is a middleware standard for home gateways. OSGi models services as dependent components. Moreover, these applications might differ in their importance. Services collaborate and complement each other to achieve the required results. This paper addresses the following problem: given a home gateway that hosts several applications with different priorities and arbitrary dependencies among them, when the gateway runs out of memory, which application or service should be stopped or evicted from memory to start a new service? Note that stopping a given service means that all the services that depend on it will be stopped too. Because of the service dependencies, traditional memory management techniques from the operating systems literature might not be efficient. Our goal is to stop the least important services and the smallest number of services. The paper presents a novel algorithm for home gateway memory management. The proposed algorithm takes into consideration the priority of the application and the dependencies between different services, in addition to the amount of memory occupied by each service. We implemented the proposed algorithm and performed many experiments to evaluate its performance and execution time. The proposed algorithm is implemented as a part of the OSGi framework (Open Service Gateway initiative). We used best fit and worst fit as yardsticks to show the effectiveness of the proposed algorithm.
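The eviction problem described in this abstract can be illustrated with a small Python sketch. It is not the authors' algorithm, which the abstract does not spell out. It is a hypothetical greedy rule under stated assumptions: stopping a service also stops everything that transitively depends on it, and among the candidates that free enough memory we prefer the one whose stopped set has the lowest maximum importance, then the fewest services. All names (`dependents_closure`, `pick_victim`, the sample services) are invented for the example.

```python
from collections import deque


def dependents_closure(service, dependents):
    """Return the service plus every service that transitively depends on it."""
    seen = {service}
    queue = deque([service])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def pick_victim(services, dependents, needed):
    """Choose a set of services to stop so that at least `needed` memory is freed.

    services   -- name -> (importance, memory_used); larger importance = more important
    dependents -- name -> list of services that depend directly on that name
    Returns the set of services to stop, or None if no single stop frees enough.
    """
    best_key, best_set = None, None
    for name in sorted(services):
        closure = dependents_closure(name, dependents)
        freed = sum(services[s][1] for s in closure)
        if freed < needed:
            continue
        # Hypothetical ranking: least important casualties first, then fewest services.
        key = (max(services[s][0] for s in closure), len(closure), freed)
        if best_key is None or key < best_key:
            best_key, best_set = key, closure
    return best_set


services = {"logger": (1, 20), "ui": (3, 30), "media": (5, 60), "alarm": (9, 10)}
dependents = {"logger": ["ui"]}  # "ui" depends on "logger"
to_stop = pick_victim(services, dependents, needed=40)
```

In this toy configuration, stopping `logger` also takes down `ui`, and that pair is preferred over stopping the more important `media` service alone. A fuller treatment would also consider stopping several unrelated services together, which this sketch deliberately leaves out.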
Recovering Faces from Memory: The Distracting Influence of External Facial Features

ERIC Educational Resources Information Center

Frowd, Charlie D.; Skelton, Faye; Atherton, Chris; Pitchford, Melanie; Hepton, Gemma; Holden, Laura; McIntyre, Alex H.; Hancock, Peter J. B.

2012-01-01

Recognition memory for unfamiliar faces is facilitated when contextual cues (e.g., head pose, background environment, hair and clothing) are consistent between study and test. By contrast, inconsistencies in external features, especially hair, promote errors in unfamiliar face-matching tasks. For the construction of facial composites, as carried…
Slime mold uses an externalized spatial "memory" to navigate in complex environments.

PubMed

Reid, Chris R; Latty, Tanya; Dussutour, Audrey; Beekman, Madeleine

2012-10-23

Spatial memory enhances an organism's navigational ability. Memory typically resides within the brain, but what if an organism has no brain? We show that the brainless slime mold Physarum polycephalum constructs a form of spatial memory by avoiding areas it has previously explored. This mechanism allows the slime mold to solve the U-shaped trap problem--a classic test of autonomous navigational ability commonly used in robotics--requiring the slime mold to reach a chemoattractive goal behind a U-shaped barrier. Drawn into the trap, the organism must rely on other methods than gradient-following to escape and reach the goal. Our data show that spatial memory enhances the organism's ability to navigate in complex environments. We provide a unique demonstration of a spatial memory system in a nonneuronal organism, supporting the theory that an externalized spatial memory may be the functional precursor to the internal memory of higher organisms.
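The mechanism in this abstract, escaping a U-shaped trap by avoiding previously explored areas, can be mimicked on a toy grid. The Python sketch below is an illustration only: the 7x7 grid, the wall layout, the Manhattan-distance "gradient" and the visit-count rule are all assumptions, not the authors' experimental setup. One agent follows the gradient alone, and the other prefers the least-visited neighbouring cell before following the gradient.

```python
SIZE = 7
# U-shaped barrier, open towards the bottom of the grid (larger y).
WALLS = {(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (1, 2), (1, 3), (5, 2), (5, 3)}
MOVES = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # up, right, down, left


def distance(cell, goal):
    """Manhattan distance, standing in for the strength of the attractant gradient."""
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])


def neighbours(cell):
    """Yield the free cells adjacent to `cell`."""
    for dx, dy in MOVES:
        nxt = (cell[0] + dx, cell[1] + dy)
        if 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE and nxt not in WALLS:
            yield nxt


def navigate(start, goal, use_memory, max_steps=200):
    """Return the number of steps taken to reach the goal, or None on failure."""
    visits = {}  # externalized "memory": how often each cell has been occupied
    pos = start
    for step in range(max_steps + 1):
        if pos == goal:
            return step
        visits[pos] = visits.get(pos, 0) + 1
        options = list(neighbours(pos))
        if use_memory:
            # Prefer the least-visited neighbour, then the one closest to the goal.
            pos = min(options, key=lambda c: (visits.get(c, 0), distance(c, goal)))
        else:
            # Pure gradient following: move only if some neighbour is strictly closer.
            best = min(options, key=lambda c: distance(c, goal))
            if distance(best, goal) >= distance(pos, goal):
                return None  # stuck at the bottom of the trap
            pos = best
    return None


start, goal = (3, 5), (3, 0)
steps_with_memory = navigate(start, goal, use_memory=True)
steps_without_memory = navigate(start, goal, use_memory=False)
```

On this layout the gradient-only agent walks into the U and stops under the top bar, while the agent that marks visited cells backs out of the trap and goes around the barrier.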

FPGA Accelerated Discrete-SURF for Real-Time Homography Estimation

DTIC Science & Technology

2015-03-26

allows for the sum of a group of pixels to be found with only four memory accesses, and a single calculation...of pixels are retrieved from memory and their Hessian determinant values are compared. If the center pixel of the 3x3 block is greater than the other...processing on the FPGA[5][24][31]. Third, previous approaches rely heavily on external memory and other components external to the FPGA, while a logic
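The first fragment of this excerpt appears to describe an integral image (summed-area table), the structure SURF-style detectors commonly use to sum a rectangle of pixels with four lookups. The excerpt is truncated, so that reading is an assumption. The Python sketch below shows the idea in software; the function names are invented for the example and nothing here reflects the FPGA design itself.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros.

    table[y][x] holds the sum of img over rows 0..y-1 and columns 0..x-1.
    """
    height, width = len(img), len(img[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def box_sum(table, top, left, bottom, right):
    """Sum of img over rows top..bottom-1 and columns left..right-1.

    Uses exactly four table lookups, regardless of the size of the box.
    """
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
lower_right = box_sum(table, 1, 1, 3, 3)  # 5 + 6 + 8 + 9
```

Because the cost of `box_sum` does not grow with the box, box filters of any size cost the same number of memory accesses, which is what makes the structure attractive when memory bandwidth is the constraint.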
Hybrid computing using a neural network with dynamic external memory.

PubMed

Graves, Alex; Wayne, Greg; Reynolds, Malcolm; Harley, Tim; Danihelka, Ivo; Grabska-Barwińska, Agnieszka; Colmenarejo, Sergio Gómez; Grefenstette, Edward; Ramalho, Tiago; Agapiou, John; Badia, Adrià Puigdomènech; Hermann, Karl Moritz; Zwols, Yori; Ostrovski, Georg; Cain, Adam; King, Helen; Summerfield, Christopher; Blunsom, Phil; Kavukcuoglu, Koray; Hassabis, Demis

2016-10-27

Artificial neural networks are remarkably adept at sensory processing, sequence learning and reinforcement learning, but are limited in their ability to represent variables and data structures and to store data over long timescales, owing to the lack of an external memory. Here we introduce a machine learning model called a differentiable neural computer (DNC), which consists of a neural network that can read from and write to an external memory matrix, analogous to the random-access memory in a conventional computer. Like a conventional computer, it can use its memory to represent and manipulate complex data structures, but, like a neural network, it can learn to do so from data. When trained with supervised learning, we demonstrate that a DNC can successfully answer synthetic questions designed to emulate reasoning and inference problems in natural language. We show that it can learn tasks such as finding the shortest path between specified points and inferring the missing links in randomly generated graphs, and then generalize these tasks to specific graphs such as transport networks and family trees. When trained with reinforcement learning, a DNC can complete a moving blocks puzzle in which changing goals are specified by sequences of symbols. Taken together, our results demonstrate that DNCs have the capacity to solve complex, structured tasks that are inaccessible to neural networks without external read-write memory.
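To make "read from an external memory matrix" concrete, here is a minimal Python sketch of a content-based read: each memory row is compared with a query key, the similarities become soft weights, and the read vector is the weighted sum of the rows. This is a simplified illustration under assumptions (cosine similarity, a softmax with a hypothetical `strength` parameter, no writing, no learned controller) and is not the full model described in the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def content_read(memory, key, strength=10.0):
    """Return (weights, read_vector) for a soft, differentiable-style lookup.

    memory   -- list of rows (the external memory matrix)
    key      -- query vector with the same width as a memory row
    strength -- hypothetical sharpness parameter; larger values focus the read
    """
    scores = [strength * cosine(row, key) for row in memory]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    width = len(memory[0])
    read = [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]
    return weights, read


memory = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0]]
weights, read = content_read(memory, key=[0.0, 1.0, 0.0])
```

Because every row receives a nonzero weight, the lookup is a smooth function of the key and the memory contents, which is the property that lets such a read be trained by gradient descent.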
Methods for reducing interference in the Complementary Learning Systems model: oscillating inhibition and autonomous memory rehearsal.

PubMed

Norman, Kenneth A; Newman, Ehren L; Perotte, Adler J

2005-11-01

The stability-plasticity problem (i.e. how the brain incorporates new information into its model of the world, while at the same time preserving existing knowledge) has been at the forefront of computational memory research for several decades. In this paper, we critically evaluate how well the Complementary Learning Systems theory of hippocampo-cortical interactions addresses the stability-plasticity problem. We identify two major challenges for the model: Finding a learning algorithm for cortex and hippocampus that enacts selective strengthening of weak memories, and selective punishment of competing memories; and preventing catastrophic forgetting in the case of non-stationary environments (i.e. when items are temporarily removed from the training set). We then discuss potential solutions to these problems: First, we describe a recently developed learning algorithm that leverages neural oscillations to find weak parts of memories (so they can be strengthened) and strong competitors (so they can be punished), and we show how this algorithm outperforms other learning algorithms (CPCA Hebbian learning and Leabra) at memorizing overlapping patterns. Second, we describe how autonomous re-activation of memories (separately in cortex and hippocampus) during REM sleep, coupled with the oscillating learning algorithm, can reduce the rate of forgetting of input patterns that are no longer present in the environment. We then present a simple demonstration of how this process can prevent catastrophic interference in an AB-AC learning paradigm.
Optimization of memory use of fragment extension-based protein-ligand docking with an original fast minimum cost flow algorithm.

PubMed

Yanagisawa, Keisuke; Komine, Shunta; Kubota, Rikuto; Ohue, Masahito; Akiyama, Yutaka

2018-06-01

The need to accelerate large-scale protein-ligand docking in virtual screening against a huge compound database led researchers to propose a strategy that entails memorizing the evaluation result of the partial structure of a compound and reusing it to evaluate other compounds. However, the previous method required frequent disk accesses, resulting in insufficient acceleration. Thus, more efficient memory usage can be expected to lead to further acceleration, and optimal memory usage could be achieved by solving the minimum cost flow problem. In this research, we propose a fast algorithm for the minimum cost flow problem utilizing the characteristics of the graph generated for this problem as constraints. The proposed algorithm, which optimized memory usage, was approximately seven times faster compared to existing minimum cost flow algorithms. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
Accessing memory

DOEpatents

Yoon, Doe Hyun; Muralimanohar, Naveen; Chang, Jichuan; Ranganathan, Parthasarathy

2017-09-26

A disclosed example method involves performing simultaneous data accesses on at least first and second independently selectable logical sub-ranks to access first data via a wide internal data bus in a memory device. The memory device includes a translation buffer chip, memory chips in independently selectable logical sub-ranks, a narrow external data bus to connect the translation buffer chip to a memory controller, and the wide internal data bus between the translation buffer chip and the memory chips. A data access is performed on only the first independently selectable logical sub-rank to access second data via the wide internal data bus. The example method also involves locating a first portion of the first data, a second portion of the first data, and the second data on the narrow external data bus during separate data transfers.
Destination memory in Alzheimer's Disease: when I imagine telling Ronald Reagan about Paris.

PubMed

El Haj, Mohamad; Postal, Virginie; Allain, Philippe

2013-01-01

Destination memory refers to remembering the destination of information that people output. This present paper establishes a new distinction between external and internal processes within this memory system for both normal aging and Alzheimer's Disease (AD). Young adults, older adults, and mild AD patients were asked either to tell facts (i.e., external destination memory condition) or to imagine telling facts (i.e., internal destination memory condition) to pictures of famous people. The experiment established three major findings. First, the destination memory performance of the AD patients was significantly poorer than that of older adults, which in turn was poorer than that of the young adults. Furthermore, internal destination processes were more prone to being forgotten than external destination memory processes. In other words, participants had more difficulty in remembering whether they had previously imagined telling the facts to the pictures or not (i.e., imagined condition) than in remembering whether they had previously told the facts to the pictures or not (i.e., enacted condition). Second, significant correlations were detected between performances on destination memory and several executive measures such as the Stroop, the Plus-Minus and the Binding tasks. Third, among the executive measures, regression analyses showed that performance on the Stroop task was a main factor in explaining variance in destination memory performance. Our findings reflect the difficulty in remembering the destination of internally generated information. They also demonstrate the involvement of inhibitory processes in destination memory. Copyright © 2011 Elsevier Ltd. All rights reserved.
Storing information in-the-world: Metacognition and cognitive offloading in a short-term memory task.

PubMed

Risko, Evan F; Dunn, Timothy L

2015-11-01

We often store to-be-remembered information externally (e.g., written down on a piece of paper) rather than internally. In the present investigation, we examine factors that influence the decision to store information in-the-world versus in-the-head using a variant of a traditional short term memory task. In Experiments 1a and 1b participants were presented with to-be-remembered items and either had to rely solely on internal memory or had the option to write down the presented information. In Experiments 2a and 2b participants were presented with the same stimuli but made metacognitive judgments about their predicted performance and effort expenditure. The spontaneous use of external storage was related both to the number of items to be remembered and an individual's actual and perceived short-term-memory capacity. Interestingly, individuals often used external storage despite its use affording no observable benefit. Implications for understanding how individuals integrate external resources in pursuing cognitive goals are discussed. Copyright © 2015 Elsevier Inc. All rights reserved.
Maternal depression and trajectories of child internalizing and externalizing problems: the roles of child decision making and working memory.

PubMed

Flouri, E; Ruddy, A; Midouhas, E

2017-04-01

Maternal depression may affect the emotional/behavioural outcomes of children with normal neurocognitive functioning less severely than it does those without. To guide prevention and intervention efforts, research must specify which aspects of a child's cognitive functioning both moderate the effect of maternal depression and are amenable to change. Working memory and decision making may be amenable to change and are so far unexplored as moderators of this effect. Our sample was 17 160 Millennium Cohort Study children. We analysed trajectories of externalizing (conduct and hyperactivity) and internalizing (emotional and peer) problems, measured with the Strengths and Difficulties Questionnaire at the ages 3, 5, 7 and 11 years, using growth curve models. We characterized maternal depression, also time-varying at these ages, by a high score on the K6. Working memory was measured with the Cambridge Neuropsychological Test Automated Battery Spatial Working Memory Task, and decision making (risk taking and quality of decision making) with the Cambridge Gambling Task, both at age 11 years. Maternal depression predicted both the level and the growth of problems. Risk taking and poor-quality decision making were related positively to externalizing and non-significantly to internalizing problems. Poor working memory was related to both problem types. Neither decision making nor working memory explained the effect of maternal depression on child internalizing/externalizing problems. Importantly, risk taking amplified the effect of maternal depression on internalizing problems, and poor working memory that on internalizing and conduct problems. Impaired decision making and working memory in children amplify the adverse effect of maternal depression on, particularly, internalizing problems.
A new scheduling algorithm for parallel sparse LU factorization with static pivoting

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigori, Laura; Li, Xiaoye S.

2002-08-20

In this paper we present a static scheduling algorithm for parallel sparse LU factorization with static pivoting. The algorithm is divided into mapping and scheduling phases, using the symmetric pruned graphs of L' and U to represent dependencies. The scheduling algorithm is designed for driving the parallel execution of the factorization on a distributed-memory architecture. Experimental results and comparisons with SuperLU_DIST are reported after applying this algorithm on real world application matrices on an IBM SP RS/6000 distributed memory machine.
Rapid solution of large-scale systems of equations

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.

1994-01-01

The analysis and design of complex aerospace structures requires the rapid solution of large systems of linear and nonlinear equations, eigenvalue extraction for buckling, vibration and flutter modes, structural optimization and design sensitivity calculation. Computers with multiple processors and vector capabilities can offer substantial computational advantages over traditional scalar computers for these analyses. These computers fall into two categories: shared memory computers and distributed memory computers. This presentation covers general-purpose, highly efficient algorithms for generation/assembly of element matrices, solution of systems of linear and nonlinear equations, eigenvalue and design sensitivity analysis and optimization. All algorithms are coded in FORTRAN for shared memory computers and many are adapted to distributed memory computers. The capability and numerical performance of these algorithms will be addressed.
Prospective memory in an air traffic control simulation: External aids that signal when to act

PubMed Central

Loft, Shayne; Smith, Rebekah E.; Bhaskara, Adella

2011-01-01

At work and in our personal life we often need to remember to perform intended actions at some point in the future, referred to as Prospective Memory. Individuals sometimes forget to perform intentions in safety-critical work contexts. Holding intentions can also interfere with ongoing tasks. We applied theories and methods from the experimental literature to test the effectiveness of external aids in reducing prospective memory error and costs to ongoing tasks in an air traffic control simulation. Participants were trained to accept and hand-off aircraft, and to detect aircraft conflicts. For the prospective memory task participants were required to substitute alternative actions for routine actions when accepting target aircraft. Across two experiments, external display aids were provided that presented the details of target aircraft and associated intended actions. We predicted that aids would only be effective if they provided information that was diagnostic of target occurrence and in this study we examined the utility of aids that directly cued participants when to allocate attention to the prospective memory task. When aids were set to flash when the prospective memory target aircraft needed to be accepted, prospective memory error and costs to ongoing tasks of aircraft acceptance and conflict detection were reduced. In contrast, aids that did not alert participants specifically when the target aircraft were present provided no advantage compared to when no aids were used. These findings have practical implications for the potential relative utility of automated external aids for occupations where individuals monitor multi-item dynamic displays. PMID:21443381
Prospective memory in an air traffic control simulation: external aids that signal when to act.

PubMed

Loft, Shayne; Smith, Rebekah E; Bhaskara, Adella

2011-03-01

At work and in our personal life we often need to remember to perform intended actions at some point in the future, referred to as Prospective Memory. Individuals sometimes forget to perform intentions in safety-critical work contexts. Holding intentions can also interfere with ongoing tasks. We applied theories and methods from the experimental literature to test the effectiveness of external aids in reducing prospective memory error and costs to ongoing tasks in an air traffic control simulation. Participants were trained to accept and hand-off aircraft and to detect aircraft conflicts. For the prospective memory task, participants were required to substitute alternative actions for routine actions when accepting target aircraft. Across two experiments, external display aids were provided that presented the details of target aircraft and associated intended actions. We predicted that aids would only be effective if they provided information that was diagnostic of target occurrence, and in this study, we examined the utility of aids that directly cued participants when to allocate attention to the prospective memory task. When aids were set to flash when the prospective memory target aircraft needed to be accepted, prospective memory error and costs to ongoing tasks of aircraft acceptance and conflict detection were reduced. In contrast, aids that did not alert participants specifically when the target aircraft were present provided no advantage compared to when no aids were used. These findings have practical implications for the potential relative utility of automated external aids for occupations where individuals monitor multi-item dynamic displays.
Fine-grained parallel RNAalifold algorithm for RNA secondary structure prediction on FPGA

PubMed Central

Xia, Fei; Dou, Yong; Zhou, Xingming; Yang, Xuejun; Xu, Jiaqing; Zhang, Yang

2009-01-01

Background: In the field of RNA secondary structure prediction, the RNAalifold algorithm is one of the most popular methods using free energy minimization. However, general-purpose computers including parallel computers or multi-core computers exhibit parallel efficiency of no more than 50%. Field Programmable Gate-Array (FPGA) chips provide a new approach to accelerate RNAalifold by exploiting fine-grained custom design. Results: RNAalifold shows complicated data dependences, in which the dependence distance is variable, and the dependence direction is also across two dimensions. We propose a systolic array structure including one master Processing Element (PE) and multiple slave PEs for fine grain hardware implementation on FPGA. We exploit data reuse schemes to reduce the need to load energy matrices from external memory. We also propose several methods to reduce energy table parameter size by 80%. Conclusion: To our knowledge, our implementation with 16 PEs is the only FPGA accelerator implementing the complete RNAalifold algorithm. The experimental results show a factor of 12.2 speedup over the RNAalifold (ViennaPackage – 1.6.5) software for a group of aligned RNA sequences with 2981-residue running on a Personal Computer (PC) platform with Pentium 4 2.6 GHz CPU. PMID:19208138
Similarities and differences between mind-wandering and external distraction: a latent variable analysis of lapses of attention and their relation to cognitive abilities.

PubMed

Unsworth, Nash; McMillan, Brittany D

2014-07-01

The current study examined the extent to which task-unrelated thoughts represent both vulnerability to mind-wandering and susceptibility to external distraction from an individual difference perspective. Participants performed multiple measures of attention control, working memory capacity, and fluid intelligence. Task-unrelated thoughts were assessed using thought probes during the attention control tasks. Using latent variable techniques, the results suggested that mind-wandering and external distraction reflect distinct, yet correlated constructs, both of which are related to working memory capacity and fluid intelligence. Furthermore, the results suggest that the common variance shared by mind-wandering, external distraction, and attention control is what primarily accounts for their relation with working memory capacity and fluid intelligence. These results support the notion that lapses of attention are strongly related to cognitive abilities. Copyright © 2014 Elsevier B.V. All rights reserved.
Estimation of Attitude and External Acceleration Using Inertial Sensor Measurement During Various Dynamic Conditions

PubMed Central

Lee, Jung Keun; Park, Edward J.; Robinovitch, Stephen N.

2012-01-01

This paper proposes a Kalman filter-based attitude (i.e., roll and pitch) estimation algorithm using an inertial sensor composed of a triaxial accelerometer and a triaxial gyroscope. In particular, the proposed algorithm has been developed for accurate attitude estimation during dynamic conditions, in which external acceleration is present. Although external acceleration is the main source of the attitude estimation error and despite the need for its accurate estimation in many applications, this problem that can be critical for the attitude estimation has not been addressed explicitly in the literature. Accordingly, this paper addresses the combined estimation problem of the attitude and external acceleration. Experimental tests were conducted to verify the performance of the proposed algorithm in various dynamic condition settings and to provide further insight into the variations in the estimation accuracy. Furthermore, two different approaches for dealing with the estimation problem during dynamic conditions were compared, i.e., threshold-based switching approach versus acceleration model-based approach. Based on an external acceleration model, the proposed algorithm was capable of estimating accurate attitudes and external accelerations for short accelerated periods, showing its high effectiveness during short-term fast dynamic conditions. Contrariwise, when the testing condition involved prolonged high external accelerations, the proposed algorithm exhibited gradually increasing errors. However, as soon as the condition returned to static or quasi-static conditions, the algorithm was able to stabilize the estimation error, regaining its high estimation accuracy. PMID:22977288
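The abstract contrasts a threshold-based switching approach with the acceleration-model-based approach that it proposes. The Python sketch below illustrates only the simpler threshold-based idea, under assumptions: roll and pitch are taken from the accelerometer when its magnitude is close to gravity, and gyroscope integration is used alone otherwise. The axis convention (a level sensor at rest reads roughly (0, 0, +g)), the threshold, and the correction gain are all hypothetical choices, and this is not the Kalman filter from the paper.

```python
import math

G = 9.81  # assumed gravitational acceleration, m/s^2


def tilt_from_accel(ax, ay, az):
    """Roll and pitch (radians) from a static accelerometer reading.

    Assumes a level sensor at rest reads roughly (0, 0, +G); other axis
    conventions change the signs.
    """
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.hypot(ay, az))
    return roll, pitch


def accel_is_trustworthy(ax, ay, az, threshold=0.5):
    """True when the measured magnitude is within `threshold` m/s^2 of gravity."""
    magnitude = math.sqrt(ax * ax + ay * ay + az * az)
    return abs(magnitude - G) < threshold


def update_angle(angle, gyro_rate, dt, accel_angle, trusted, gain=0.02):
    """One step of gyro integration with an optional accelerometer correction.

    When external acceleration is suspected (trusted is False) the estimate
    relies on the gyroscope alone, so gyro drift accumulates until the sensor
    returns to quasi-static conditions.
    """
    predicted = angle + gyro_rate * dt
    if not trusted:
        return predicted
    return predicted + gain * (accel_angle - predicted)


# Example: a quasi-static sample followed by one with strong external acceleration.
roll_ref, pitch_ref = tilt_from_accel(0.0, 0.0, 9.81)
quiet = accel_is_trustworthy(0.0, 0.0, 9.81)
shaken = accel_is_trustworthy(5.0, 0.0, 9.81)
```

The weakness of this switching rule is the one the abstract points to: during accelerated periods it simply stops using the accelerometer rather than estimating and removing the external acceleration.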
Slime mold uses an externalized spatial “memory” to navigate in complex environments

PubMed Central

Reid, Chris R.; Latty, Tanya; Dussutour, Audrey; Beekman, Madeleine

2012-01-01

Spatial memory enhances an organism’s navigational ability. Memory typically resides within the brain, but what if an organism has no brain? We show that the brainless slime mold Physarum polycephalum constructs a form of spatial memory by avoiding areas it has previously explored. This mechanism allows the slime mold to solve the U-shaped trap problem—a classic test of autonomous navigational ability commonly used in robotics—requiring the slime mold to reach a chemoattractive goal behind a U-shaped barrier. Drawn into the trap, the organism must rely on other methods than gradient-following to escape and reach the goal. Our data show that spatial memory enhances the organism’s ability to navigate in complex environments. We provide a unique demonstration of a spatial memory system in a nonneuronal organism, supporting the theory that an externalized spatial memory may be the functional precursor to the internal memory of higher organisms. PMID:23045640
Time Frame Affects Vantage Point in Episodic and Semantic Autobiographical Memory: Evidence from Response Latencies

PubMed Central

Karylowski, Jerzy J.; Mrozinski, Blazej

2017-01-01

Previous research suggests that, with the passage of time, representations of self in episodic memory become less dependent on their initial (internal) vantage point and shift toward an external perspective that is normally characteristic of how other people are represented. The present experiment examined this phenomenon in both episodic and semantic autobiographical memory using latency of self-judgments as a measure of accessibility of the internal vs. the external perspective. Results confirmed that in the case of representations of the self retrieved from recent autobiographical memories, trait-judgments regarding unobservable self-aspects (internal perspective) were faster than trait judgments regarding observable self-aspects (external perspective). Yet, in the case of self-representations retrieved from memories of a more distant past, judgments regarding observable self-aspects were faster. Those results occurred for both self-representations retrieved from episodic memory and for representations retrieved from the semantic memory. In addition, regardless of the effect of time, greater accessibility of unobservable (vs. observable) self-aspects was associated with the episodic rather than semantic autobiographical memory. Those results were modified by neither declared trait’s self-descriptiveness (yes vs. no responses) nor by its desirability (highly desirable vs. moderately desirable traits). Implications for compatibility between how self and others are represented and for the role of self in social perception are discussed. PMID:28473793
Time Frame Affects Vantage Point in Episodic and Semantic Autobiographical Memory: Evidence from Response Latencies.

PubMed

Karylowski, Jerzy J; Mrozinski, Blazej

2017-01-01

Previous research suggests that, with the passage of time, representations of self in episodic memory become less dependent on their initial (internal) vantage point and shift toward an external perspective that is normally characteristic of how other people are represented. The present experiment examined this phenomenon in both episodic and semantic autobiographical memory using latency of self-judgments as a measure of accessibility of the internal vs. the external perspective. Results confirmed that in the case of representations of the self retrieved from recent autobiographical memories, trait-judgments regarding unobservable self-aspects (internal perspective) were faster than trait judgments regarding observable self-aspects (external perspective). Yet, in the case of self-representations retrieved from memories of a more distant past, judgments regarding observable self-aspects were faster. Those results occurred for both self-representations retrieved from episodic memory and for representations retrieved from the semantic memory. In addition, regardless of the effect of time, greater accessibility of unobservable (vs. observable) self-aspects was associated with the episodic rather than semantic autobiographical memory. Those results were modified by neither declared trait's self-descriptiveness (yes vs. no responses) nor by its desirability (highly desirable vs. moderately desirable traits). Implications for compatibility between how self and others are represented and for the role of self in social perception are discussed.
Efficient image compression algorithm for computer-animated images

NASA Astrophysics Data System (ADS)

Yfantis, Evangelos A.; Au, Matthew Y.; Miel, G.

1992-10-01

An image compression algorithm is described. The algorithm is an extension of the run-length image compression algorithm and its implementation is relatively easy. This algorithm was implemented and compared with other existing popular compression algorithms and with the Lempel-Ziv (LZ) coding. The Lempel-Ziv algorithm is available as a utility in the UNIX operating system and is also referred to as the UNIX uncompress. Sometimes our algorithm is best in terms of saving memory space, and sometimes one of the competing algorithms is best. The algorithm is lossless, and the intent is for the algorithm to be used in computer graphics animated images. Comparisons made with the LZ algorithm indicate that the decompression time using our algorithm is faster than that using the LZ algorithm. Once the data are in memory, a relatively simple and fast transformation is applied to uncompress the file.
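As a point of reference for the abstract above, here is a minimal sketch of plain run-length coding, the baseline that the described algorithm extends. It is not the authors' extended algorithm, which the abstract does not specify; the function names `rle_encode` and `rle_decode` and the sample pixel row are hypothetical and chosen only for illustration.

```python
from itertools import groupby


def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse each run of identical bytes into a (value, run_length) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(data)]


def rle_decode(pairs: list[tuple[int, int]]) -> bytes:
    """Expand (value, run_length) pairs back into the original byte string."""
    return b"".join(bytes([value]) * count for value, count in pairs)


# Hypothetical scanline: computer-generated frames often contain long flat runs.
row = bytes([0, 0, 0, 0, 255, 255, 7])
encoded = rle_encode(row)

# The scheme is lossless: decoding restores the input exactly.
assert rle_decode(encoded) == row
```

Decoding is a single pass that expands each pair, which is in line with the abstract's remark that decompression is a relatively simple and fast transformation once the data are in memory.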
Research on memory management in embedded systems

NASA Astrophysics Data System (ADS)

Huang, Xian-ying; Yang, Wu

2005-12-01

Memory is a scarce resource in embedded systems due to cost and size. Thus, applications in embedded systems cannot use memory as freely as desktop applications can. However, data and code must be stored in memory in order to run. The purpose of this paper is to save memory in developing embedded applications and to guarantee that they run under limited memory conditions. Embedded systems often have small memory and are required to run for a long time. Thus, a further purpose of this study is to construct an allocator that allocates memory effectively, withstands long-running operation, and reduces memory fragmentation and memory exhaustion. Memory fragmentation and exhaustion are related to the memory allocation algorithm. Static memory allocation cannot produce fragmentation. This paper attempts to find an effective dynamic allocation algorithm that reduces memory fragmentation. Data, which takes up a large amount of memory, is the critical part that ensures an application can run properly. The amount of data that can be stored in a given amount of memory depends on the selected data structure. Techniques for designing application data in mobile phones are also explained and discussed.

Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

PubMed

Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

2011-01-01

The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
Spiking neural network simulation: memory-optimal synaptic event scheduling.

PubMed

Stewart, Robert D; Gurney, Kevin N

2011-06-01

Spiking neural network simulations incorporating variable transmission delays require synaptic events to be scheduled prior to delivery. Conventional methods have memory requirements that scale with the total number of synapses in a network. We introduce novel scheduling algorithms for both discrete and continuous event delivery, where the memory requirement scales instead with the number of neurons. Superior algorithmic performance is demonstrated using large-scale, benchmarking network simulations.
Episodic and semantic components of autobiographical memories and imagined future events in post-traumatic stress disorder.

PubMed

Brown, Adam D; Addis, Donna Rose; Romano, Tracy A; Marmar, Charles R; Bryant, Richard A; Hirst, William; Schacter, Daniel L

2014-01-01

Individuals with post-traumatic stress disorder (PTSD) tend to retrieve autobiographical memories with less episodic specificity, referred to as overgeneralised autobiographical memory. In line with evidence that autobiographical memory overlaps with one's capacity to imagine the future, recent work has also shown that individuals with PTSD also imagine themselves in the future with less episodic specificity. To date most studies quantify episodic specificity by the presence of a distinct event. However, this method does not distinguish between the numbers of internal (episodic) and external (semantic) details, which can provide additional insights into remembering the past and imagining the future. This study employed the Autobiographical Interview (AI) coding scheme to the autobiographical memory and imagined future event narratives generated by combat veterans with and without PTSD. Responses were coded for the number of internal and external details. Compared to combat veterans without PTSD, those with PTSD generated more external than internal details when recalling past or imagining future events, and fewer internal details were associated with greater symptom severity. The potential mechanisms underlying these bidirectional deficits and clinical implications are discussed.
Investigation of the optimum location of external markers for patient setup accuracy enhancement at external beam radiotherapy

PubMed Central

Torshabi, Ahmad Esmaili; Nankali, Saber

2016-01-01

In external beam radiotherapy, one of the most common and reliable methods for patient geometrical setup and/or predicting the tumor location is use of external markers. In this study, the main challenging issue is increasing the accuracy of patient setup by investigating external markers location. Since the location of each external marker may yield different patient setup accuracy, it is important to assess different locations of external markers using appropriate selective algorithms. To do this, two commercially available algorithms entitled a) canonical correlation analysis (CCA) and b) principal component analysis (PCA) were proposed as input selection algorithms. They work on the basis of maximum correlation coefficient and minimum variance between given datasets. The proposed input selection algorithms work in combination with an adaptive neuro‐fuzzy inference system (ANFIS) as a correlation model to give patient positioning information as output. Our proposed algorithms provide input file of ANFIS correlation model accurately. The required dataset for this study was prepared by means of a NURBS‐based 4D XCAT anthropomorphic phantom that can model the shape and structure of complex organs in human body along with motion information of dynamic organs. Moreover, a database of four real patients undergoing radiation therapy for lung cancers was utilized in this study for validation of proposed strategy. Final analyzed results demonstrate that input selection algorithms can reasonably select specific external markers from those areas of the thorax region where root mean square error (RMSE) of ANFIS model has minimum values at that given area. It is also found that the selected marker locations lie closely in those areas where surface point motion has a large amplitude and a high correlation. PACS number(s): 87.55.km, 87.55.N PMID:27929479
Aspects of GPU performance in algorithms with random memory access

NASA Astrophysics Data System (ADS)

Kashkovsky, Alexander V.; Shershnev, Anton A.; Vashchenkov, Pavel V.

2017-10-01

The numerical code for solving the Boltzmann equation on a hybrid computational cluster using the Direct Simulation Monte Carlo (DSMC) method showed that on Tesla K40 accelerators computational performance drops dramatically as the percentage of occupied GPU memory increases. Testing revealed that memory access time increases tens of times after a certain critical percentage of memory is occupied. Moreover, this seems to be a problem common to all NVidia GPUs, arising from their architecture. A few modifications of the numerical algorithm were suggested to overcome this problem. One of them, based on splitting the memory into "virtual" blocks, resulted in a 2.5 times speedup.
Strategic offloading of delayed intentions into the external environment.

PubMed

Gilbert, Sam J

2015-01-01

In everyday life, we often use external artefacts such as diaries to help us remember intended behaviours. In addition, we commonly manipulate our environment, for example by placing reminders in noticeable places. Yet strategic offloading of intentions to the external environment is not typically permitted in laboratory tasks examining memory for delayed intentions. What factors influence our use of such strategies, and what behavioural consequences do they have? This article describes four online experiments (N = 1196) examining a novel web-based task in which participants hold intentions for brief periods, with the option to strategically externalize these intentions by creating a reminder. This task significantly predicted participants' fulfilment of a naturalistic intention embedded within their everyday activities up to one week later (with greater predictive ability than more traditional prospective memory tasks, albeit with weak effect size). Setting external reminders improved performance, and it was more prevalent in older adults. Furthermore, participants set reminders adaptively, based on (a) memory load, and (b) the likelihood of distraction. These results suggest the importance of metacognitive processes in triggering intention offloading, which can increase the probability that intentions are eventually fulfilled.
Functional magnetic resonance imaging study of external source memory and its relation to cognitive insight in non-clinical subjects.

PubMed

Buchy, Lisa; Hawco, Colin; Bodnar, Michael; Izadi, Sarah; Dell'Elce, Jennifer; Messina, Katrina; Lepage, Martin

2014-09-01

Previous research has linked cognitive insight (a measure of self-reflectiveness and self-certainty) in psychosis with neurocognitive and neuroanatomical disturbances in the fronto-hippocampal neural network. The authors' goal was to use functional magnetic resonance imaging (fMRI) to investigate the neural correlates of cognitive insight during an external source memory paradigm in non-clinical subjects. At encoding, 24 non-clinical subjects travelled through a virtual city where they came across 20 separate people, each paired with a unique object in a distinct location. fMRI data were then acquired while participants viewed images of the city, and completed source recognition memory judgments of where and with whom objects were seen, which is known to involve prefrontal cortex. Cognitive insight was assessed with the Beck Cognitive Insight Scale. External source memory was associated with neural activity in a widespread network consisting of frontal cortex, including ventrolateral prefrontal cortex (VLPFC), temporal and occipital cortices. Activation in VLPFC correlated with higher self-reflectiveness and activation in midbrain correlated with lower self-certainty during source memory attributions. Neither self-reflectiveness nor self-certainty significantly correlated with source memory accuracy. By means of virtual reality and in the context of an external source memory paradigm, the study identified a preliminary functional neural basis for cognitive insight in the VLPFC in healthy people that accords with our fronto-hippocampal theoretical model as well as recent neuroimaging data in people with psychosis. The results may facilitate the understanding of the role of neural mechanisms in psychotic disorders associated with cognitive insight distortions. © 2014 The Authors. Psychiatry and Clinical Neurosciences © 2014 Japanese Society of Psychiatry and Neurology.
Recurrent Neural Networks With Auxiliary Memory Units.

PubMed

Wang, Jianyong; Zhang, Lei; Guo, Quan; Yi, Zhang

2018-05-01

Memory is one of the most important mechanisms in recurrent neural networks (RNNs) learning. It plays a crucial role in practical applications, such as sequence learning. With a good memory mechanism, long term history can be fused with current information, and can thus improve RNNs learning. Developing a suitable memory mechanism is always desirable in the field of RNNs. This paper proposes a novel memory mechanism for RNNs. The main contributions of this paper are: 1) an auxiliary memory unit (AMU) is proposed, which results in a new special RNN model (AMU-RNN), separating the memory and output explicitly and 2) an efficient learning algorithm is developed by employing the technique of error flow truncation. The proposed AMU-RNN model, together with the developed learning algorithm, can learn and maintain stable memory over a long time range. This method overcomes both the learning conflict problem and gradient vanishing problem. Unlike the traditional method, which mixes the memory and output with a single neuron in a recurrent unit, the AMU provides an auxiliary memory neuron to maintain memory in particular. By separating the memory and output in a recurrent unit, the problem of learning conflicts can be eliminated easily. Moreover, by using the technique of error flow truncation, each auxiliary memory neuron ensures constant error flow during the learning process. The experiments demonstrate good performance of the proposed AMU-RNNs and the developed learning algorithm. The method exhibits quite efficient learning performance with stable convergence in the AMU-RNN learning and outperforms the state-of-the-art RNN models in sequence generation and sequence classification tasks.
Memory Self-Efficacy and Strategy Use in Successful Elders.

ERIC Educational Resources Information Center

McDougall, Graham J.

1995-01-01

The Metamemory in Adulthood Questionnaire, Memory Self-Efficacy Questionnaire, and measures of depression and health status were completed by 169 adults over 55 in Texas and Louisiana. External memory strategies (lists, notes) were used more often than internal (elaboration, rehearsal). Memory efficacy decreased significantly with age, and anxiety…
A voting-based star identification algorithm utilizing local and global distribution

NASA Astrophysics Data System (ADS)

Fan, Qiaoyun; Zhong, Xuyang; Sun, Junhua

2018-03-01

A novel star identification algorithm based on voting scheme is presented in this paper. In the proposed algorithm, the global distribution and local distribution of sensor stars are fully utilized, and the stratified voting scheme is adopted to obtain the candidates for sensor stars. The database optimization is employed to reduce its memory requirement and improve the robustness of the proposed algorithm. The simulation shows that the proposed algorithm exhibits 99.81% identification rate with 2-pixel standard deviations of positional noises and 0.322-Mv magnitude noises. Compared with two similar algorithms, the proposed algorithm is more robust towards noise, and the average identification time and required memory is less. Furthermore, the real sky test shows that the proposed algorithm performs well on the real star images.
ABINIT: Plane-Wave-Based Density-Functional Theory on High Performance Computers

NASA Astrophysics Data System (ADS)

Torrent, Marc

2014-03-01

For several years, a continuous effort has been made to adapt electronic structure codes based on Density-Functional Theory to future computing architectures. Among these codes, ABINIT is based on a plane-wave description of the wave functions, which allows it to treat systems of any kind. Porting such a code to petascale architectures poses difficulties related to the many-body nature of the DFT equations. To improve the performance of ABINIT, especially for standard LDA/GGA ground-state and response-function calculations, several strategies have been followed. A full multi-level MPI parallelisation scheme has been implemented, exploiting all possible levels and distributing both computation and memory. It allows the number of distributed processes to be increased and could not have been achieved without a strong restructuring of the code. The core algorithm used to solve the eigenproblem ("Locally Optimal Blocked Conjugate Gradient"), a Blocked-Davidson-like algorithm, is based on a distribution of processes combining plane-waves and bands. In addition to the distributed memory parallelization, a full hybrid scheme has been implemented, using standard shared-memory directives (OpenMP/OpenACC) or porting some resource-consuming code sections to Graphics Processing Units (GPU). As no simple performance model exists, the complexity of use has increased; the code efficiency strongly depends on the distribution of processes among the numerous levels. ABINIT is able to predict the performance of several process distributions and automatically choose the most favourable one. On the other hand, a big effort has been made to analyse the performance of the code on petascale architectures, showing which sections of code have to be improved; they are all related to matrix algebra (diagonalisation, orthogonalisation). The different strategies employed to improve the code scalability will be described. They are based on an exploration of new diagonalization algorithms, as well as the use of external optimized libraries. Part of this work has been supported by the European PRACE project (Partnership for Advanced Computing in Europe) in the framework of its work package 8.
Novel memory architecture for video signal processor

NASA Astrophysics Data System (ADS)

Hung, Jen-Sheng; Lin, Chia-Hsing; Jen, Chein-Wei

1993-11-01

An on-chip memory architecture for a video signal processor (VSP) is proposed. This memory structure is a two-level design that addresses the different kinds of data locality in video applications. The upper level, Memory A, provides enough storage capacity to reduce the impact of limited chip I/O bandwidth, and the lower level, Memory B, provides enough data parallelism and flexibility to meet the requirements of multiple reconfigurable pipeline function units in a single VSP chip. The needed memory size is decided by memory usage analysis for video algorithms and by the number of function units. Both levels of memory adopt a dual-port memory scheme to sustain simultaneous read and write operations. In particular, Memory B uses multiple one-read-one-write memory banks to emulate a real multiport memory. Therefore, one can change the configuration of Memory B into several sets of memories with variable read/write ports by adjusting the bus switches. The numbers of read ports and write ports in the proposed memory can then meet the requirements of the data flow patterns in different video coding algorithms. We have finished the design of a prototype memory using 1.2-micrometer SPDM SRAM technology and will fabricate it through TSMC, in Taiwan.
DANoC: An Efficient Algorithm and Hardware Codesign of Deep Neural Networks on Chip.

PubMed

Zhou, Xichuan; Li, Shengli; Tang, Fang; Hu, Shengdong; Lin, Zhi; Zhang, Lei

2017-07-18

Deep neural networks (NNs) are the state-of-the-art models for understanding the content of images and videos. However, implementing deep NNs in embedded systems is a challenging task, e.g., a typical deep belief network could exhaust gigabytes of memory and result in bandwidth and computational bottlenecks. To address this challenge, this paper presents an algorithm and hardware codesign for efficient deep neural computation. A hardware-oriented deep learning algorithm, named the deep adaptive network, is proposed to explore the sparsity of neural connections. By adaptively removing the majority of neural connections and robustly representing the reserved connections using binary integers, the proposed algorithm could save up to 99.9% memory utility and computational resources without undermining classification accuracy. An efficient sparse-mapping-memory-based hardware architecture is proposed to take full advantage of the algorithmic optimization. Unlike the traditional von Neumann architecture, the deep-adaptive network on chip (DANoC) brings communication and computation into close proximity to avoid power-hungry parameter transfers between on-board memory and on-chip computational units. Experiments over different image classification benchmarks show that the DANoC system achieves competitively high accuracy and efficiency compared with the state-of-the-art approaches.
Recovering faces from memory: the distracting influence of external facial features.

PubMed

Frowd, Charlie D; Skelton, Faye; Atherton, Chris; Pitchford, Melanie; Hepton, Gemma; Holden, Laura; McIntyre, Alex H; Hancock, Peter J B

2012-06-01

Recognition memory for unfamiliar faces is facilitated when contextual cues (e.g., head pose, background environment, hair and clothing) are consistent between study and test. By contrast, inconsistencies in external features, especially hair, promote errors in unfamiliar face-matching tasks. For the construction of facial composites, as carried out by witnesses and victims of crime, the role of external features (hair, ears, and neck) is less clear, although research does suggest their involvement. Here, over three experiments, we investigate the impact of external features for recovering facial memories using a modern, recognition-based composite system, EvoFIT. Participant-constructors inspected an unfamiliar target face and, one day later, repeatedly selected items from arrays of whole faces, with "breeding," to "evolve" a composite with EvoFIT; further participants (evaluators) named the resulting composites. In Experiment 1, the important internal-features (eyes, brows, nose, and mouth) were constructed more identifiably when the visual presence of external features was decreased by Gaussian blur during construction: higher blur yielded more identifiable internal-features. In Experiment 2, increasing the visible extent of external features (to match the target's) in the presented face-arrays also improved internal-features quality, although less so than when external features were masked throughout construction. Experiment 3 demonstrated that masking external-features promoted substantially more identifiable images than using the previous method of blurring external-features. Overall, the research indicates that external features are a distractive rather than a beneficial cue for face construction; the results also provide a much better method to construct composites, one that should dramatically increase identification of offenders.
The Quantum Binding Problem in the Context of Associative Memory

PubMed Central

Wichert, Andreas

2016-01-01

We present a method to solve the binding problem by using a quantum algorithm for the retrieval of associations from associative memory during visual scene analysis. The problem is solved by mapping the information representing different objects into superposition by using entanglement and Grover’s amplification algorithm. PMID:27603782
Efficient Queries of Stand-off Annotations for Natural Language Processing on Electronic Medical Records.

PubMed

Luo, Yuan; Szolovits, Peter

2016-01-01

In natural language processing, stand-off annotation uses the starting and ending positions of an annotation to anchor it to the text and stores the annotation content separately from the text. We address the fundamental problem of efficiently storing stand-off annotations when applying natural language processing on narrative clinical notes in electronic medical records (EMRs) and efficiently retrieving such annotations that satisfy position constraints. Efficient storage and retrieval of stand-off annotations can facilitate tasks such as mapping unstructured text to electronic medical record ontologies. We first formulate this problem as the interval query problem, for which the optimal query/update time is in general logarithmic. We next perform a tight time complexity analysis on the basic interval tree query algorithm and show its nonoptimality when applied to a collection of 13 query types from Allen's interval algebra. We then study two closely related state-of-the-art interval query algorithms, proposed query reformulations, and augmentations to the second algorithm. Our proposed algorithm achieves logarithmic time stabbing-max query time complexity and solves the stabbing-interval query tasks on all of Allen's relations in logarithmic time, attaining the theoretic lower bound. Updating time is kept logarithmic and the space requirement is kept linear at the same time. We also discuss interval management in external memory models and higher dimensions.
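To make the terms in the abstract above concrete, here is a minimal sketch of stand-off annotations and a stabbing query (find every annotation whose span covers a given position). This is a naive linear scan, not the logarithmic-time algorithm the authors propose; the `Annotation` type, the labels, and the sample sentence are hypothetical and serve only as illustration.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    start: int  # inclusive character offset into the text
    end: int    # exclusive character offset into the text
    label: str  # annotation content, stored apart from the text itself


def stabbing_query(annotations: list[Annotation], position: int) -> list[Annotation]:
    """Return all annotations whose [start, end) span contains `position`.

    Linear scan: O(n) per query. An interval tree or a similar index would
    be needed to reach the logarithmic bounds discussed in the abstract.
    """
    return [a for a in annotations if a.start <= position < a.end]


text = "Patient denies chest pain."
annotations = [
    Annotation(0, 7, "PERSON_ROLE"),       # "Patient"
    Annotation(15, 25, "SYMPTOM"),         # "chest pain"
    Annotation(8, 25, "NEGATED_FINDING"),  # "denies chest pain"
]

# Position 17 falls inside "chest", so both overlapping spans are returned.
hits = stabbing_query(annotations, 17)
```

Because the annotations carry only offsets, the text itself is never modified, and overlapping or nested spans such as the two returned here can coexist without conflict.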
Efficient Queries of Stand-off Annotations for Natural Language Processing on Electronic Medical Records

PubMed Central

Luo, Yuan; Szolovits, Peter

2016-01-01

In natural language processing, stand-off annotation uses the starting and ending positions of an annotation to anchor it to the text and stores the annotation content separately from the text. We address the fundamental problem of efficiently storing stand-off annotations when applying natural language processing on narrative clinical notes in electronic medical records (EMRs) and efficiently retrieving such annotations that satisfy position constraints. Efficient storage and retrieval of stand-off annotations can facilitate tasks such as mapping unstructured text to electronic medical record ontologies. We first formulate this problem as the interval query problem, for which the optimal query/update time is in general logarithmic. We next perform a tight time complexity analysis on the basic interval tree query algorithm and show its nonoptimality when applied to a collection of 13 query types from Allen’s interval algebra. We then study two closely related state-of-the-art interval query algorithms, proposed query reformulations, and augmentations to the second algorithm. Our proposed algorithm achieves logarithmic time stabbing-max query time complexity and solves the stabbing-interval query tasks on all of Allen’s relations in logarithmic time, attaining the theoretic lower bound. Updating time is kept logarithmic and the space requirement is kept linear at the same time. We also discuss interval management in external memory models and higher dimensions. PMID:27478379
Non-Markovianity-assisted high-fidelity Deutsch-Jozsa algorithm in diamond

NASA Astrophysics Data System (ADS)

Dong, Yang; Zheng, Yu; Li, Shen; Li, Cong-Cong; Chen, Xiang-Dong; Guo, Guang-Can; Sun, Fang-Wen

2018-01-01

The memory effects in non-Markovian quantum dynamics can induce the revival of quantum coherence, which is believed to provide important physical resources for quantum information processing (QIP). However, no real quantum algorithms have been demonstrated with the help of such memory effects. Here, we experimentally implemented a non-Markovianity-assisted high-fidelity refined Deutsch-Jozsa algorithm (RDJA) with a solid spin in diamond. The memory effects can induce pronounced non-monotonic variations in the RDJA results, which were confirmed to follow a non-Markovian quantum process by measuring the non-Markovianity of the spin system. By applying the memory effects as physical resources with the assistance of dynamical decoupling, the probability of success of RDJA was elevated above 97% in the open quantum system. This study not only demonstrates that the non-Markovianity is an important physical resource but also presents a feasible way to employ this physical resource. It will stimulate the application of the memory effects in non-Markovian quantum dynamics to improve the performance of practical QIP.
Visual navigation of the UAVs on the basis of 3D natural landmarks

NASA Astrophysics Data System (ADS)

Karpenko, Simon; Konovalenko, Ivan; Miller, Alexander; Miller, Boris; Nikolaev, Dmitry

2015-12-01

This work considers the tracking of a UAV (unmanned aerial vehicle) on the basis of onboard observations of natural landmarks, including azimuth and elevation angles. It is assumed that the UAV's cameras are able to capture the angular position of reference points and to measure the angles of the sight line. Such measurements involve the real position of the UAV in implicit form, and therefore a nonlinear filter such as the extended Kalman filter (EKF) must be used in order to apply these measurements to UAV control. Recently it was shown that a modified pseudomeasurement method may be used to control a UAV on the basis of the observation of reference points assigned along the UAV path in advance. However, the use of such a set of points requires a cumbersome recognition procedure with a huge volume of on-board memory. Natural landmarks that serve as such reference points and can be determined on-line can significantly reduce the on-board memory and the computational difficulties. The principal difference of this work is the use of 3D reference point coordinates, which permits determining the position of the UAV more precisely and thereby guiding it along the path with higher accuracy, which is extremely important for the successful performance of autonomous missions. The article suggests the new RANSAC for ISOMETRY algorithm and the use of recently developed estimation and control algorithms for tracking a given reference path under external perturbation and noisy angular measurements.
Composite Particle Swarm Optimizer With Historical Memory for Function Optimization.

PubMed

Li, Jie; Zhang, JunQi; Jiang, ChangJun; Zhou, MengChu

2015-10-01

Particle swarm optimization (PSO) algorithm is a population-based stochastic optimization technique. It is characterized by the collaborative search in which each particle is attracted toward the global best position (gbest) in the swarm and its own best position (pbest). However, all of particles' historical promising pbests in PSO are lost except their current pbests. In order to solve this problem, this paper proposes a novel composite PSO algorithm, called historical memory-based PSO (HMPSO), which uses an estimation of distribution algorithm to estimate and preserve the distribution information of particles' historical promising pbests. Each particle has three candidate positions, which are generated from the historical memory, particles' current pbests, and the swarm's gbest. Then the best candidate position is adopted. Experiments on 28 CEC2013 benchmark functions demonstrate the superiority of HMPSO over other algorithms.
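The attraction toward pbest and gbest described above can be illustrated with a minimal sketch of plain global-best PSO. This is the baseline technique, not HMPSO: it has no historical memory and no estimation of distribution step. The function name `pso`, the inertia weight `w`, and the coefficients `c1` and `c2` are illustrative assumptions, not values from the paper.

```python
import random


def pso(f, dim, bounds, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over a box with plain global-best PSO (illustrative only)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers only its current personal best (pbest).
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward pbest + pull toward gbest.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                # The previous pbest is overwritten here; this is the loss
                # of historical information that the abstract refers to.
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    return sum(v * v for v in x)


best, best_val = pso(sphere, dim=2, bounds=(-5.0, 5.0))
```

The overwrite of `pbest[i]` is the point at which earlier promising positions are discarded, which is the gap that a historical memory is meant to fill.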

Real-time tumor motion estimation using respiratory surrogate via memory-based learning

NASA Astrophysics Data System (ADS)

Li, Ruijiang; Lewis, John H.; Berbeco, Ross I.; Xing, Lei

2012-08-01

Respiratory tumor motion is a major challenge in radiation therapy for thoracic and abdominal cancers. Effective motion management requires an accurate knowledge of the real-time tumor motion. External respiration monitoring devices (optical, etc) provide a noninvasive, non-ionizing, low-cost and practical approach to obtain the respiratory signal. Due to the highly complex and nonlinear relations between tumor and surrogate motion, its ultimate success hinges on the ability to accurately infer the tumor motion from respiratory surrogates. Given their widespread use in the clinic, such a method is critically needed. We propose to use a powerful memory-based learning method to find the complex relations between tumor motion and respiratory surrogates. The method first stores the training data in memory and then finds relevant data to answer a particular query. Nearby data points are assigned high relevance (or weights) and conversely distant data are assigned low relevance. By fitting relatively simple models to local patches instead of fitting one single global model, it is able to capture highly nonlinear and complex relations between the internal tumor motion and external surrogates accurately. Due to the local nature of weighting functions, the method is inherently robust to outliers in the training data. Moreover, both training and adapting to new data are performed almost instantaneously with memory-based learning, making it suitable for dynamically following variable internal/external relations. We evaluated the method using respiratory motion data from 11 patients. The data set consists of simultaneous measurement of 3D tumor motion and 1D abdominal surface (used as the surrogate signal in this study). There are a total of 171 respiratory traces, with an average peak-to-peak amplitude of ∼15 mm and average duration of ∼115 s per trace. 
Given only 5 s (roughly one breath) pretreatment training data, the method achieved an average 3D error of 1.5 mm and 95th percentile error of 3.4 mm on unseen test data. The average 3D error was further reduced to 1.4 mm when the model was tuned to its optimal setting for each respiratory trace. In one trace where a few outliers are present in the training data, the proposed method achieved an error reduction of as much as ∼50% compared with the best linear model (1.0 mm versus 2.1 mm). The memory-based learning technique is able to accurately capture the highly complex and nonlinear relations between tumor and surrogate motion in an efficient manner (a few milliseconds per estimate). Furthermore, the algorithm is particularly suitable to handle situations where the training data are contaminated by large errors or outliers. These desirable properties make it an ideal candidate for accurate and robust tumor gating/tracking using respiratory surrogates.
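The idea of storing the training data and fitting a simple local model around each query can be sketched with locally weighted linear regression. This is a generic one-dimensional illustration under assumed choices (a Gaussian weighting function, a hypothetical `bandwidth` parameter, a scalar target), not the authors' implementation, which maps a 1D surrogate to 3D tumor motion.

```python
import math


def lwr_predict(xs, ys, query, bandwidth=0.5):
    """Predict y at `query` by a weighted straight-line fit to stored data."""
    # Nearby samples get weights close to 1, distant samples close to 0.
    w = [math.exp(-0.5 * ((x - query) / bandwidth) ** 2) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    # Fall back to the weighted mean if the local data has no spread in x.
    slope = sxy / sxx if sxx > 1e-12 else 0.0
    return my + slope * (query - mx)


# "Training" is just storing the samples; each query fits its own local line.
surrogate = [i * 0.05 for i in range(63)]
target = [math.sin(x) for x in surrogate]
estimate = lwr_predict(surrogate, target, query=1.0, bandwidth=0.2)
```

Because nothing is fitted ahead of time, adding new samples only means appending to the stored lists, which matches the abstract's remark that training and adaptation are nearly instantaneous.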
Robust Vision-Based Pose Estimation Algorithm for a UAV with Known Gravity Vector

NASA Astrophysics Data System (ADS)

Kniaz, V. V.

2016-06-01

Accurate estimation of camera external orientation with respect to a known object is one of the central problems in photogrammetry and computer vision. In recent years this problem has gained increasing attention in the field of UAV autonomous flight. Such applications require real-time performance and robustness of the external orientation estimation algorithm. The accuracy of the solution is strongly dependent on the number of reference points visible in the given image. The problem only has an analytical solution if 3 or more reference points are visible. However, in limited visibility conditions it is often necessary to perform external orientation with only 2 visible reference points. In such a case the solution can be found if the gravity vector direction in the camera coordinate system is known. A number of algorithms for external orientation estimation for the case of 2 known reference points and a gravity vector have been developed to date. Most of these algorithms provide an analytical solution in the form of a polynomial equation that is subject to large errors in the case of complex reference point configurations. This paper is focused on the development of a new computationally effective and robust algorithm for external orientation based on the positions of 2 known reference points and a gravity vector. The algorithm implementation for guidance of a Parrot AR.Drone 2.0 micro-UAV is discussed. The experimental evaluation of the algorithm proved its computational efficiency and robustness against errors in reference point positions and complex configurations.
Working Memory and Behavioural Problems in Relation to Malay Writing of Primary School Children

ERIC Educational Resources Information Center

Ling, Teo-Sieak; Jiar, Yeo-Kee

2017-01-01

Deficit in working memory is common among young children across multiple abilities. Teachers have pointed to poor memory as one contributing factor to inattentiveness and short attention spans as well as some behavioural problems among students. This study aimed to explore the relationship among working memory, externalizing and internalizing…
"Family Stories" and Their Implications for Preschoolers' Memories of Personal Events

ERIC Educational Resources Information Center

Larkina, Marina; Bauer, Patricia J.

2012-01-01

Most adults experience childhood amnesia: They have very few memories of events prior to 3 to 4 years of age. Nevertheless, some early memories are retained. Multiple factors likely are responsible for the survival of early childhood memories, including external representations such as videos, photographs, and conversations about past experiences,…
External Threat Risk Assessment Algorithm (ExTRAA)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Powell, Troy C.

Two risk assessment algorithms and philosophies have been augmented and combined to form a new algorithm, the External Threat Risk Assessment Algorithm (ExTRAA), that allows for effective and statistically sound analysis of external threat sources in relation to individual attack methods. In addition to the attack method use probability and the attack method employment consequence, the concept of defining threat sources is added to the risk assessment process. Sample data is tabulated and depicted in radar plots and bar graphs for algorithm demonstration purposes. The largest success of ExTRAA is its ability to visualize the kind of risk posed in a given situation using the radar plot method.
Quantum associative memory with linear and non-linear algorithms for the diagnosis of some tropical diseases.

PubMed

Tchapet Njafa, J-P; Nana Engo, S G

2018-01-01

This paper presents the QAMDiagnos, a model of Quantum Associative Memory (QAM) that can be a helpful tool for medical staff without experience or laboratory facilities, for the diagnosis of four tropical diseases (malaria, typhoid fever, yellow fever and dengue) which have several similar signs and symptoms. The memory can distinguish a single infection from a polyinfection. Our model is a combination of the improved versions of the original linear quantum retrieving algorithm proposed by Ventura and the non-linear quantum search algorithm of Abrams and Lloyd. From the given simulation results, it appears that the efficiency of recognition is good when particular signs and symptoms of a disease are inserted given that the linear algorithm is the main algorithm. The non-linear algorithm helps confirm or correct the diagnosis or give some advice to the medical staff for the treatment. So, our QAMDiagnos that has a friendly graphical user interface for desktop and smart-phone is a sensitive and a low-cost diagnostic tool that enables rapid and accurate diagnosis of four tropical diseases. Copyright © 2017 Elsevier Ltd. All rights reserved.
The external-internal loop of interference: Two types of attention and their influence on the learning abilities of mice

PubMed Central

Sauce, Bruno; Wass, Christopher; Smith, Andrew; Kwan, Stephanie; Matzel, Louis D.

2016-01-01

Attention is a component of the working memory system, and as such, is responsible for protecting task-relevant information from interference. Cognitive performance (particularly outside of the laboratory) is often plagued by interference, and the source of this interference, either external or internal, might influence the expression of individual differences in attentional ability. By definition, external attention (also described as “selective attention”) protects working memory against sensorial distractors of all kinds, while internal attention (also called “inhibition”) protects working memory against emotional impulses, irrelevant information from memory, and automatically-generated responses. At present, it is unclear if these two types of attention are expressed independently in non-human animals, and how they might differentially impact performance on other cognitive processes, such as learning. By using a diverse battery of four attention tests (with varying levels of internal and external sources of interference), here we aimed both to explore this issue, and to obtain a robust and general (less task-specific) measure of attention in mice. Exploratory factor analyses revealed two factors (external and internal attention) that in total, accounted for 73% of the variance in attentional performance. Confirmatory factor analyses found an excellent fit with the data of the model of attention that assumed an external and internal distinction (with a resulting correlation of 0.43). In contrast, a model of attention that assumed one source of variance (i.e., “general attention”) exhibited a poor fit with the data. Regarding the relationship between attention and learning, higher resistance against external sources of interference promoted better new learning, but tended to impair performance when cognitive flexibility was required, such as during the reversal of a previously instantiated response. 
The present results suggest that there can be (at least) two types of attention that contribute to the common variance in attentional performance in mice, and that external and internal attentions might have opposing influences on the rate at which animals learn. PMID:25452087
A Tensor Product Formulation of Strassen's Matrix Multiplication Algorithm with Memory Reduction

DOE PAGES

Kumar, B.; Huang, C. -H.; Sadayappan, P.; ...

1995-01-01

In this article, we present a program generation strategy of Strassen's matrix multiplication algorithm using a programming methodology based on tensor product formulas. In this methodology, block recursive programs such as the fast Fourier Transforms and Strassen's matrix multiplication algorithm are expressed as algebraic formulas involving tensor products and other matrix operations. Such formulas can be systematically translated to high-performance parallel/vector codes for various architectures. In this article, we present a nonrecursive implementation of Strassen's algorithm for shared memory vector processors such as the Cray Y-MP. A previous implementation of Strassen's algorithm synthesized from tensor product formulas required working storage of size O(7^n) for multiplying 2^n × 2^n matrices. We present a modified formulation in which the working storage requirement is reduced to O(4^n). The modified formulation exhibits sufficient parallelism for efficient implementation on a shared memory multiprocessor. Performance results on a Cray Y-MP8/64 are presented.
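For orientation, the textbook recursive form of Strassen's algorithm is sketched below. It is not the nonrecursive tensor product formulation the abstract describes; the helper names are illustrative, and the input size is assumed to be a power of two. Recursing down to 1 × 1 blocks performs 7^n scalar multiplications for 2^n × 2^n operands, while each operand holds 4^n entries, which is presumably where the two storage bounds quoted above come from.

```python
def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def split(M):
    """Split a square matrix of even size into four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B):
    """Multiply two 2^n x 2^n matrices using seven recursive products."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Seven half-size products replace the eight of the naive block scheme.
    M1 = strassen(mat_add(A11, A22), mat_add(B11, B22))
    M2 = strassen(mat_add(A21, A22), B11)
    M3 = strassen(A11, mat_sub(B12, B22))
    M4 = strassen(A22, mat_sub(B21, B11))
    M5 = strassen(mat_add(A11, A12), B22)
    M6 = strassen(mat_sub(A21, A11), mat_add(B11, B12))
    M7 = strassen(mat_sub(A12, A22), mat_add(B21, B22))
    C11 = mat_add(mat_sub(mat_add(M1, M4), M5), M7)
    C12 = mat_add(M3, M5)
    C21 = mat_add(M2, M4)
    C22 = mat_add(mat_add(mat_sub(M1, M2), M3), M6)
    # Stitch the quadrants back together row by row.
    top = [l + r for l, r in zip(C11, C12)]
    bottom = [l + r for l, r in zip(C21, C22)]
    return top + bottom


def naive(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[2, 0, 1, 3], [1, 1, 0, 2], [0, 4, 2, 1], [3, 2, 1, 0]]
C = strassen(A, B)
```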
Adaptive mesh refinement for characteristic grids

NASA Astrophysics Data System (ADS)

Thornburg, Jonathan

2011-05-01

I consider techniques for Berger-Oliger adaptive mesh refinement (AMR) when numerically solving partial differential equations with wave-like solutions, using characteristic (double-null) grids. Such AMR algorithms are naturally recursive, and the best-known past Berger-Oliger characteristic AMR algorithm, that of Pretorius and Lehner (J Comp Phys 198:10, 2004), recurses on individual "diamond" characteristic grid cells. This leads to the use of fine-grained memory management, with individual grid cells kept in two-dimensional linked lists at each refinement level. This complicates the implementation and adds overhead in both space and time. Here I describe a Berger-Oliger characteristic AMR algorithm which instead recurses on null slices. This algorithm is very similar to the usual Cauchy Berger-Oliger algorithm, and uses relatively coarse-grained memory management, allowing entire null slices to be stored in contiguous arrays in memory. The algorithm is very efficient in both space and time. I describe discretizations yielding both second and fourth order global accuracy. My code implementing the algorithm described here is included in the electronic supplementary materials accompanying this paper, and is freely available to other researchers under the terms of the GNU general public license.
Set-Membership Identification for Robust Control Design

DTIC Science & Technology

1993-04-28

system G can be regarded as having no memory in (18) in terms of G and 0, we get of events prior to t = 1, the initial time. Roughly, this means all...algorithm in [1]. Also in our application, the size of the matrices involved is quite large and special attention should be paid to the memory ...management and algorithmic implementation; otherwise huge amounts of memory will be required to perform the optimization even for modest values of M and N
Non-tables look-up search algorithm for efficient H.264/AVC context-based adaptive variable length coding decoding

NASA Astrophysics Data System (ADS)

Han, Yishi; Luo, Zhixiao; Wang, Jianhua; Min, Zhixuan; Qin, Xinyu; Sun, Yunlong

2014-09-01

In general, context-based adaptive variable length coding (CAVLC) decoding in H.264/AVC standard requires frequent access to the unstructured variable length coding tables (VLCTs) and significant memory accesses are consumed. Heavy memory accesses will cause high power consumption and time delays, which are serious problems for applications in portable multimedia devices. We propose a method for high-efficiency CAVLC decoding by using a program instead of all the VLCTs. The decoded codeword from VLCTs can be obtained without any table look-up and memory access. The experimental results show that the proposed algorithm achieves 100% memory access saving and 40% decoding time saving without degrading video quality. Additionally, the proposed algorithm shows a better performance compared with conventional CAVLC decoding, such as table look-up by sequential search, table look-up by binary search, Moon's method, and Kim's method.
On the impact of communication complexity in the design of parallel numerical algorithms

NASA Technical Reports Server (NTRS)

Gannon, D.; Vanrosendale, J.

1984-01-01

This paper describes two models of the cost of data movement in parallel numerical algorithms. One model is a generalization of an approach due to Hockney, and is suitable for shared memory multiprocessors where each processor has vector capabilities. The other model is applicable to highly parallel nonshared memory MIMD systems. In the second model, algorithm performance is characterized in terms of the communication network design. Techniques used in VLSI complexity theory are also brought in, and algorithm independent upper bounds on system performance are derived for several problems that are important to scientific computation.
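As background for the first model, Hockney's basic two-parameter characterization of a vector operation is sketched below. This is the standard (r_inf, n_half) form, not the generalization the paper develops; the function names and the sample values are illustrative assumptions.

```python
def hockney_time(n, r_inf, n_half):
    """Time for a vector operation of length n: t = (n + n_half) / r_inf."""
    return (n + n_half) / r_inf


def hockney_rate(n, r_inf, n_half):
    """Achieved rate n / t, which approaches r_inf as n grows."""
    return r_inf / (1.0 + n_half / n)


# Hypothetical machine: asymptotic rate 100 Mflop/s, half-performance length 50.
r_inf, n_half = 100e6, 50
for n in (10, 50, 1000):
    print(n, hockney_rate(n, r_inf, n_half) / r_inf)
```

At n equal to n_half the achieved rate is exactly half of the asymptotic rate, which is what gives the second parameter its name.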
Memory-efficient table look-up optimized algorithm for context-based adaptive variable length decoding in H.264/advanced video coding

NASA Astrophysics Data System (ADS)

Wang, Jianhua; Cheng, Lianglun; Wang, Tao; Peng, Xiaodong

2016-03-01

Table look-up plays a very important role in context-based adaptive variable length decoding (CAVLD) in H.264/advanced video coding (AVC). However, frequent table look-up operations result in heavy table memory access, which in turn leads to high power consumption. To reduce the heavy table memory access of current methods, and thereby the power consumption, a memory-efficient table look-up optimized algorithm is presented for CAVLD. The contribution of this paper is the introduction of index search technology to reduce memory access during table look-up. Specifically, our scheme uses index search to reduce the searching and matching operations for code_word by taking advantage of the internal relationship among the length of zeros in code_prefix, the value of code_suffix, and code_length, thus saving the power consumed by table look-up. The experimental results show that the proposed table look-up algorithm based on index search reduces memory access by about 60% compared with table look-up by sequential search, and thus saves considerable power for CAVLD in H.264/AVC.
Think globally and solve locally: secondary memory-based network learning for automated multi-species function prediction

PubMed Central

2014-01-01

Background Network-based learning algorithms for automated function prediction (AFP) are negatively affected by the limited coverage of experimental data and limited a priori known functional annotations. As a consequence their application to model organisms is often restricted to well characterized biological processes and pathways, and their effectiveness with poorly annotated species is relatively limited. A possible solution to this problem might consist in the construction of big networks including multiple species, but this in turn poses challenging computational problems, due to the scalability limitations of existing algorithms and the main memory requirements induced by the construction of big networks. Distributed computation or the usage of big computers could in principle respond to these issues, but raises further algorithmic problems and require resources not satisfiable with simple off-the-shelf computers. Results We propose a novel framework for scalable network-based learning of multi-species protein functions based on both a local implementation of existing algorithms and the adoption of innovative technologies: we solve “locally” the AFP problem, by designing “vertex-centric” implementations of network-based algorithms, but we do not give up thinking “globally” by exploiting the overall topology of the network. This is made possible by the adoption of secondary memory-based technologies that allow the efficient use of the large memory available on disks, thus overcoming the main memory limitations of modern off-the-shelf computers. This approach has been applied to the analysis of a large multi-species network including more than 300 species of bacteria and to a network with more than 200,000 proteins belonging to 13 Eukaryotic species. To our knowledge this is the first work where secondary-memory based network analysis has been applied to multi-species function prediction using biological networks with hundreds of thousands of proteins. 
Conclusions The combination of these algorithmic and technological approaches makes feasible the analysis of large multi-species networks using ordinary computers with limited speed and primary memory, and in perspective could enable the analysis of huge networks (e.g. the whole proteomes available in SwissProt), using well-equipped stand-alone machines. PMID:24843788
A low complexity reweighted proportionate affine projection algorithm with memory and row action projection

NASA Astrophysics Data System (ADS)

Liu, Jianming; Grant, Steven L.; Benesty, Jacob

2015-12-01

A new reweighted proportionate affine projection algorithm (RPAPA) with memory and row action projection (MRAP) is proposed in this paper. The reweighted PAPA is derived from a family of sparseness measures, which demonstrate performance similar to mu-law and the l0-norm PAPA but with lower computational complexity. The sparseness of the channel is taken into account to improve the performance for dispersive system identification. Meanwhile, the memory of the filter's coefficients is combined with row action projections (RAP) to significantly reduce computational complexity. Simulation results demonstrate that the proposed RPAPA MRAP algorithm outperforms both the affine projection algorithm (APA) and PAPA, and has performance similar to l0 PAPA and mu-law PAPA, in terms of convergence speed and tracking ability. Meanwhile, the proposed RPAPA MRAP has much lower computational complexity than PAPA, mu-law PAPA, and l0 PAPA, which makes it very appealing for real-time implementation.
Time and Memory Efficient Online Piecewise Linear Approximation of Sensor Signals.

PubMed

Grützmacher, Florian; Beichler, Benjamin; Hein, Albert; Kirste, Thomas; Haubelt, Christian

2018-05-23

Piecewise linear approximation of sensor signals is a well-known technique in the fields of Data Mining and Activity Recognition. In this context, several algorithms have been developed, some of them with the purpose to be performed on resource constrained microcontroller architectures of wireless sensor nodes. While microcontrollers are usually constrained in computational power and memory resources, all state-of-the-art piecewise linear approximation techniques either need to buffer sensor data or have an execution time depending on the segment’s length. In the paper at hand, we propose a novel piecewise linear approximation algorithm, with a constant computational complexity as well as a constant memory complexity. Our proposed algorithm’s worst-case execution time is one to three orders of magnitude smaller and its average execution time is three to seventy times smaller compared to the state-of-the-art Piecewise Linear Approximation (PLA) algorithms in our experiments. In our evaluations, we show that our algorithm is time and memory efficient without sacrificing the approximation quality compared to other state-of-the-art piecewise linear approximation techniques, while providing a maximum error guarantee per segment, a small parameter space of only one parameter, and a maximum latency of one sample period plus its worst-case execution time.
A Limited-Memory BFGS Algorithm Based on a Trust-Region Quadratic Model for Large-Scale Nonlinear Equations.

PubMed

Li, Yong; Yuan, Gonglin; Wei, Zengxin

2015-01-01

In this paper, a trust-region algorithm is proposed for large-scale nonlinear equations, where the limited-memory BFGS (L-M-BFGS) update matrix is used in the trust-region subproblem to improve the effectiveness of the algorithm for large-scale problems. The global convergence of the presented method is established under suitable conditions. The numerical results of the test problems show that the method is competitive with the norm method.
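The limited-memory idea can be illustrated with the standard L-BFGS two-loop recursion, which applies an implicit inverse-Hessian approximation built from a few stored step/gradient-difference pairs. This is the generic textbook recursion, not the trust-region scheme for nonlinear equations proposed in the paper; the function and variable names are illustrative.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lbfgs_direction(grad, s_list, y_list):
    """Return -H*grad, where H is the implicit L-BFGS inverse-Hessian estimate.

    s_list holds recent steps x_{k+1} - x_k and y_list the matching gradient
    differences, oldest first. Only these vectors are stored, never a matrix.
    """
    q = list(grad)
    history = []
    # First loop: walk from the newest pair back to the oldest.
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        history.append((alpha, rho, s, y))
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
    # Initial scaling taken from the most recent pair (a common choice).
    if s_list:
        gamma = dot(s_list[-1], y_list[-1]) / dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]
    # Second loop: walk from the oldest pair forward to the newest.
    for alpha, rho, s, y in reversed(history):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return [-ri for ri in r]


# With no stored pairs the result is plain steepest descent.
direction = lbfgs_direction([1.0, -2.0], [], [])
```

Storage grows with the number of retained pairs times the problem dimension rather than with the square of the dimension, which is what makes the update attractive for large-scale problems.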
Parallel Clustering Algorithm for Large-Scale Biological Data Sets

PubMed Central

Wang, Minchao; Zhang, Wu; Ding, Wang; Dai, Dongbo; Zhang, Huiran; Xie, Hao; Chen, Luonan; Guo, Yike; Xie, Jiang

2014-01-01

Background The recent explosion of biological data poses a great challenge for traditional clustering algorithms. With the increasing scale of data sets, much larger memory and longer runtime are required for cluster identification problems. The affinity propagation algorithm outperforms many other classical clustering algorithms and is widely applied in biological research. However, its time and space complexity become a great bottleneck when handling large-scale data sets. Moreover, the similarity matrix, whose construction takes a long runtime, is required before running the affinity propagation algorithm, since the algorithm clusters data sets based on the similarities between data pairs. Methods Two types of parallel architectures are proposed in this paper to accelerate the similarity matrix construction and the affinity propagation algorithm. The shared-memory architecture is used to construct the similarity matrix, and the distributed system is used for the affinity propagation algorithm, because of its large memory size and great computing capacity. An appropriate way of data partition and reduction is designed in our method, in order to minimize the global communication cost among processes. Result A speedup of 100 is gained with 128 cores. The runtime is reduced from several hours to a few seconds, which indicates that the parallel algorithm is capable of handling large-scale data sets effectively. The parallel affinity propagation also achieves good performance when clustering large-scale gene data (microarray) and detecting families in large protein superfamilies. PMID:24705246
Does the Acquisition of Spatial Skill Involve a Shift from Algorithm to Memory Retrieval?

ERIC Educational Resources Information Center

Frank, David J.; Macnamara, Brooke N.

2017-01-01

Performance on verbal and mathematical tasks is enhanced when participants shift from using algorithms to retrieving information directly from memory (Siegler, 1988a). However, it is unknown whether a shift to retrieval is involved in dynamic spatial skill acquisition. For example, do athletes mentally extrapolate the trajectory of the ball, or do…
Reed-Solomon decoder

NASA Technical Reports Server (NTRS)

Lahmeyer, Charles R. (Inventor)

1987-01-01

A Reed-Solomon decoder with dedicated hardware for five sequential algorithms was designed with overall pipelining by memory swapping between input, processing and output memories, and internal pipelining through the five algorithms. The code definition used in decoding is specified by a keyword received with each block of data so that a number of different code formats may be decoded by the same hardware.

Kinetic memory based on the enzyme-limited competition.

PubMed

Hatakeyama, Tetsuhiro S; Kaneko, Kunihiko

2014-08-01

Cellular memory, which allows cells to retain information from their environment, is important for a variety of cellular functions, such as adaptation to external stimuli, cell differentiation, and synaptic plasticity. Although posttranslational modifications have received much attention as a source of cellular memory, the mechanisms directing such alterations have not been fully uncovered. It may be possible to embed memory in multiple stable states in dynamical systems governing modifications. However, several experiments on modifications of proteins suggest long-term relaxation depending on experienced external conditions, without explicit switches over multi-stable states. As an alternative to a multistability memory scheme, we propose "kinetic memory" for epigenetic cellular memory, in which memory is stored as a slow-relaxation process far from a stable fixed state. Information from previous environmental exposure is retained as the long-term maintenance of a cellular state, rather than switches over fixed states. To demonstrate this kinetic memory, we study several models in which multimeric proteins undergo catalytic modifications (e.g., phosphorylation and methylation), and find that a slow relaxation process of the modification state, logarithmic in time, appears when the concentration of a catalyst (enzyme) involved in the modification reactions is lower than that of the substrates. Sharp transitions from a normal fast-relaxation phase into this slow-relaxation phase are revealed, and explained by enzyme-limited competition among modification reactions. The slow-relaxation process is confirmed by simulations of several models of catalytic reactions of protein modifications, and it enables the memorization of external stimuli, as its time course depends crucially on the history of the stimuli. This kinetic memory provides novel insight into a broad class of cellular memory and functions. 
In particular, applications for long-term potentiation are discussed, including dynamic modifications of calcium-calmodulin kinase II and cAMP-response element-binding protein essential for synaptic plasticity.
A memory-efficient staining algorithm in 3D seismic modelling and imaging

NASA Astrophysics Data System (ADS)

Jia, Xiaofeng; Yang, Lu

2017-08-01

The staining algorithm has been proven to generate high signal-to-noise ratio (S/N) images in poorly illuminated areas in two-dimensional cases. In the staining algorithm, the stained wavefield relevant to the target area and the regular source wavefield forward propagate synchronously. Cross-correlating these two wavefields with the backward propagated receiver wavefield separately, we obtain two images: the local image of the target area and the conventional reverse time migration (RTM) image. This imaging process costs massive computer memory for wavefield storage, especially in large scale three-dimensional cases. To make the staining algorithm applicable to three-dimensional RTM, we develop a method to implement the staining algorithm in three-dimensional acoustic modelling in a standard staggered grid finite difference (FD) scheme. The implementation is adaptive to the order of spatial accuracy of the FD operator. The method can be applied to elastic, electromagnetic, and other wave equations. Taking the memory requirement into account, we adopt a random boundary condition (RBC) to backward extrapolate the receiver wavefield and reconstruct it by reverse propagation using the final wavefield snapshot only. Meanwhile, we forward simulate the stained wavefield and source wavefield simultaneously using the nearly perfectly matched layer (NPML) boundary condition. Experiments on a complex geologic model indicate that the RBC-NPML collaborative strategy not only minimizes the memory consumption but also guarantees high quality imaging results. We apply the staining algorithm to three-dimensional RTM via the proposed strategy. Numerical results show that our staining algorithm can produce high S/N images in the target areas with other structures effectively muted.
The working memory stroop effect: when internal representations clash with external stimuli.

PubMed

Kiyonaga, Anastasia; Egner, Tobias

2014-08-01

Working memory (WM) has recently been described as internally directed attention, which implies that WM content should affect behavior exactly like an externally perceived and attended stimulus. We tested whether holding a color word in WM, rather than attending to it in the external environment, can produce interference in a color-discrimination task, which would mimic the classic Stroop effect. Over three experiments, the WM Stroop effect recapitulated core properties of the classic attentional Stroop effect, displaying equivalent congruency effects, additive contributions from stimulus- and response-level congruency, and susceptibility to modulation by the percentage of congruent and incongruent trials. Moreover, WM maintenance was inversely related to attentional demands during the WM delay between stimulus presentation and recall, with poorer memory performance following incongruent than congruent trials. Together, these results suggest that WM and attention rely on the same resources and operate over the same representations. © The Author(s) 2014.
An algorithm of discovering signatures from DNA databases on a computer cluster.

PubMed

Lee, Hsiao Ping; Sheu, Tzu-Fang

2014-10-05

Signatures are short sequences that are unique and not similar to any other sequence in a database that can be used as the basis to identify different species. Even though several signature discovery algorithms have been proposed in the past, these algorithms require the entirety of databases to be loaded in the memory, thus restricting the amount of data that they can process. It makes those algorithms unable to process databases with large amounts of data. Also, those algorithms use sequential models and have slower discovery speeds, meaning that the efficiency can be improved. In this research, we are debuting the utilization of a divide-and-conquer strategy in signature discovery and have proposed a parallel signature discovery algorithm on a computer cluster. The algorithm applies the divide-and-conquer strategy to solve the problem posed to the existing algorithms where they are unable to process large databases and uses a parallel computing mechanism to effectively improve the efficiency of signature discovery. Even when run with just the memory of regular personal computers, the algorithm can still process large databases such as the human whole-genome EST database which were previously unable to be processed by the existing algorithms. The algorithm proposed in this research is not limited by the amount of usable memory and can rapidly find signatures in large databases, making it useful in applications such as Next Generation Sequencing and other large database analysis and processing. The implementation of the proposed algorithm is available at http://www.cs.pu.edu.tw/~fang/DDCSDPrograms/DDCSD.htm.
The BlueGene/L supercomputer

NASA Astrophysics Data System (ADS)

Bhanota, Gyan; Chen, Dong; Gara, Alan; Vranas, Pavlos

2003-05-01

The architecture of the BlueGene/L massively parallel supercomputer is described. Each computing node consists of a single compute ASIC plus 256 MB of external memory. The compute ASIC integrates two 700 MHz PowerPC 440 integer CPU cores, two 2.8 Gflops floating point units, 4 MB of embedded DRAM as cache, a memory controller for external memory, six 1.4 Gbit/s bi-directional ports for a 3-dimensional torus network connection, three 2.8 Gbit/s bi-directional ports for connecting to a global tree network and a Gigabit Ethernet for I/O. 65,536 of such nodes are connected into a 3-d torus with a geometry of 32×32×64. The total peak performance of the system is 360 Teraflops and the total amount of memory is 16 TeraBytes.
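The system totals quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that a node's peak rate is the sum of its two 2.8 Gflops floating point units and that 1 TB is counted as 1024 × 1024 MB; both are reading assumptions, not statements from the abstract.

```python
# Cross-check of the BlueGene/L figures quoted in the abstract.
nodes = 32 * 32 * 64              # 3-d torus geometry
mem_per_node_mb = 256             # external memory per node
gflops_per_node = 2 * 2.8         # assumption: two FPUs at 2.8 Gflops each

total_mem_tb = nodes * mem_per_node_mb / (1024 * 1024)
peak_tflops = nodes * gflops_per_node / 1000

print(nodes, total_mem_tb, round(peak_tflops))
```

Under these assumptions the node count and the 16 TB memory total match exactly, and the computed peak of roughly 367 Tflops agrees with the quoted 360 Teraflops to within rounding.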
Effects of Rehearsal on Perceived and Imagined Autobiographical Memories.

ERIC Educational Resources Information Center

Suengas, Aurora G.; Johnson, Marcia K.

It has been shown that internally generated (thought or imagination) and externally generated (events, things, or people encountered in the past) autobiographical memories differ in characteristic ways. To examine the consequences of rehearsal on simulated perceived and imagined autobiographical memories, 36 undergraduate students participated in…
A Self-Regulatory Model of Behavioral Disinhibition in Late Adolescence: Integrating Personality Traits, Externalizing Psychopathology, and Cognitive Capacity

PubMed Central

Bogg, Tim; Finn, Peter R.

2011-01-01

Two samples with heterogeneous prevalence of externalizing psychopathology were used to investigate the structure of self-regulatory models of behavioral disinhibition and cognitive capacity. Consistent with expectations, structural equation modeling in the first sample (N = 541) showed that a hierarchical model with three lower-order factors (impulsive sensation-seeking, anti-sociality/unconventionality, and lifetime externalizing problem counts) and a behavioral disinhibition superfactor best accounted for the pattern of covariation among six disinhibited personality trait indicators and four externalizing problem indicators. The structure was replicated in a second sample (N = 463), which showed that the behavioral disinhibition superfactor, and not the lower-order impulsive sensation-seeking, anti-sociality/unconventionality, and externalizing problem factors, was associated with lower IQ, reduced short-term memory capacity, and reduced working memory capacity. The results provide a systemic and meaningful integration of major self-regulatory influences during a developmentally important stage of life. PMID:20433626
Fast analysis of molecular dynamics trajectories with graphics processing units-Radial distribution function histogramming

DOE Office of Scientific and Technical Information (OSTI.GOV)

Levine, Benjamin G., E-mail: ben.levine@temple.ed; Stone, John E., E-mail: johns@ks.uiuc.ed; Kohlmeyer, Axel, E-mail: akohlmey@temple.ed

2011-05-01

The calculation of radial distribution functions (RDFs) from molecular dynamics trajectory data is a common and computationally expensive analysis task. The rate limiting step in the calculation of the RDF is building a histogram of the distance between atom pairs in each trajectory frame. Here we present an implementation of this histogramming scheme for multiple graphics processing units (GPUs). The algorithm features a tiling scheme to maximize the reuse of data at the fastest levels of the GPU's memory hierarchy and dynamic load balancing to allow high performance on heterogeneous configurations of GPUs. Several versions of the RDF algorithm are presented, utilizing the specific hardware features found on different generations of GPUs. We take advantage of larger shared memory and atomic memory operations available on state-of-the-art GPUs to accelerate the code significantly. The use of atomic memory operations allows the fast, limited-capacity on-chip memory to be used much more efficiently, resulting in a fivefold increase in performance compared to the version of the algorithm without atomic operations. The ultimate version of the algorithm running in parallel on four NVIDIA GeForce GTX 480 (Fermi) GPUs was found to be 92 times faster than a multithreaded implementation running on an Intel Xeon 5550 CPU. On this multi-GPU hardware, the RDF between two selections of 1,000,000 atoms each can be calculated in 26.9 s per frame. The multi-GPU RDF algorithms described here are implemented in VMD, a widely used and freely available software package for molecular dynamics visualization and analysis.
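The rate limiting step named in this abstract, histogramming pair distances between two atom selections, can be sketched on the CPU as follows. This is a minimal illustration, not the VMD or GPU implementation: the function name `rdf_histogram`, its parameters, and the `tile` size are hypothetical, and periodic boundaries and RDF normalization are deliberately left out. The tiling loop only mimics the data-reuse idea; it has no performance effect in plain Python.

```python
import math

def rdf_histogram(sel_a, sel_b, r_max, n_bins, tile=64):
    """Count (a, b) atom pairs by distance into n_bins equal bins on [0, r_max)."""
    hist = [0] * n_bins
    bin_width = r_max / n_bins
    # Walk sel_b one tile at a time so each tile is reused against all of sel_a,
    # loosely mirroring the tiling idea described for the GPU version.
    for start in range(0, len(sel_b), tile):
        block = sel_b[start:start + tile]
        for ax, ay, az in sel_a:
            for bx, by, bz in block:
                d = math.sqrt((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2)
                if d < r_max:
                    # min() guards against floating-point rounding at the top edge.
                    hist[min(int(d / bin_width), n_bins - 1)] += 1
    return hist

# Hypothetical toy data: one reference atom and three neighbours.
print(rdf_histogram([(0, 0, 0)], [(1, 0, 0), (0, 2.5, 0), (0, 0, 10)], 5.0, 5))
```

Each increment of `hist` is the operation that, on a GPU, many threads would perform concurrently, which is where the atomic memory operations mentioned in the abstract come in.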
Fast Analysis of Molecular Dynamics Trajectories with Graphics Processing Units—Radial Distribution Function Histogramming

PubMed Central

Stone, John E.; Kohlmeyer, Axel

2011-01-01

The calculation of radial distribution functions (RDFs) from molecular dynamics trajectory data is a common and computationally expensive analysis task. The rate limiting step in the calculation of the RDF is building a histogram of the distance between atom pairs in each trajectory frame. Here we present an implementation of this histogramming scheme for multiple graphics processing units (GPUs). The algorithm features a tiling scheme to maximize the reuse of data at the fastest levels of the GPU’s memory hierarchy and dynamic load balancing to allow high performance on heterogeneous configurations of GPUs. Several versions of the RDF algorithm are presented, utilizing the specific hardware features found on different generations of GPUs. We take advantage of larger shared memory and atomic memory operations available on state-of-the-art GPUs to accelerate the code significantly. The use of atomic memory operations allows the fast, limited-capacity on-chip memory to be used much more efficiently, resulting in a fivefold increase in performance compared to the version of the algorithm without atomic operations. The ultimate version of the algorithm running in parallel on four NVIDIA GeForce GTX 480 (Fermi) GPUs was found to be 92 times faster than a multithreaded implementation running on an Intel Xeon 5550 CPU. On this multi-GPU hardware, the RDF between two selections of 1,000,000 atoms each can be calculated in 26.9 seconds per frame. The multi-GPU RDF algorithms described here are implemented in VMD, a widely used and freely available software package for molecular dynamics visualization and analysis. PMID:21547007
SUBJECTIVE MEMORY IN OLDER AFRICAN AMERICANS

PubMed Central

Sims, Regina C.; Whitfield, Keith E.; Ayotte, Brian J.; Gamaldo, Alyssa A.; Edwards, Christopher L.; Allaire, Jason C.

2013-01-01

The current analysis examined (a) if measures of psychological well-being predict subjective memory, and (b) if subjective memory is consistent with actual memory. Five hundred seventy-nine older African Americans from the Baltimore Study of Black Aging completed measures assessing subjective memory, depressive symptomatology, perceived stress, locus of control, and verbal and working memory. Higher levels of perceived stress and greater externalized locus of control predicted poorer subjective memory, but subjective memory did not predict objective verbal or working memory. Results suggest that subjective memory is influenced by aspects of psychological well-being but is unrelated to objective memory in older African Americans. PMID:21424958
Pattern Discovery and Change Detection of Online Music Query Streams

NASA Astrophysics Data System (ADS)

Li, Hua-Fu

In this paper, an efficient stream mining algorithm, called FTP-stream (Frequent Temporal Pattern mining of streams), is proposed to find the frequent temporal patterns over melody sequence streams. In the framework of our proposed algorithm, an effective bit-sequence representation is used to reduce the time and memory needed to slide the windows. The FTP-stream algorithm can calculate the support threshold in only a single pass based on the concept of bit-sequence representation. It takes advantage of the "left" and "and" operations of the representation. Experiments show that the proposed algorithm scans the music query stream only once, runs significantly faster, and consumes less memory than existing algorithms such as SWFI-stream and Moment.
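The role of the "left" and "and" operations can be illustrated with a generic bit-sequence sliding window. The sketch below is not FTP-stream itself: the class `BitWindow` and its methods are hypothetical, and it assumes one bit per window slot per item, with a left shift sliding the window and a bitwise AND giving the slots in which all items of a pattern co-occur.

```python
class BitWindow:
    """Sliding window over a transaction stream, one bit sequence per item."""

    def __init__(self, width):
        self.width = width
        self.mask = (1 << width) - 1   # keeps only the newest `width` slots
        self.bits = {}                 # item -> int used as a bit sequence

    def push(self, transaction):
        items = set(transaction)
        for item in set(self.bits) | items:
            # "left" operation: shift out the oldest slot, then record the new one.
            shifted = (self.bits.get(item, 0) << 1) & self.mask
            self.bits[item] = shifted | (1 if item in items else 0)

    def support(self, itemset):
        # "and" operation: slots where every item of the itemset is present.
        acc = self.mask
        for item in itemset:
            acc &= self.bits.get(item, 0)
        return bin(acc).count("1")

w = BitWindow(3)
for t in (["a", "b"], ["a"], ["a", "b"], ["b"]):
    w.push(t)
print(w.support(["a"]), w.support(["b"]), w.support(["a", "b"]))
```

After four pushes the window of width 3 holds only the last three transactions, so the first one no longer contributes to any support count.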
An Improved Recovery Algorithm for Decayed AES Key Schedule Images

NASA Astrophysics Data System (ADS)

Tsow, Alex

A practical algorithm that recovers AES key schedules from decayed memory images is presented. Halderman et al. [1] established this recovery capability, dubbed the cold-boot attack, as a serious vulnerability for several widespread software-based encryption packages. Our algorithm recovers AES-128 key schedules tens of millions of times faster than the original proof-of-concept release. In practice, it enables reliable recovery of key schedules at 70% decay, well over twice the decay capacity of previous methods. The algorithm is generalized to AES-256 and is empirically shown to recover 256-bit key schedules that have suffered 65% decay. When solutions are unique, the algorithm efficiently validates this property and outputs the solution for memory images decayed up to 60%.
An efficient tensor transpose algorithm for multicore CPU, Intel Xeon Phi, and NVidia Tesla GPU

NASA Astrophysics Data System (ADS)

Lyakh, Dmitry I.

2015-04-01

An efficient parallel tensor transpose algorithm is suggested for shared-memory computing units, namely, multicore CPU, Intel Xeon Phi, and NVidia GPU. The algorithm operates on dense tensors (multidimensional arrays) and is based on the optimization of cache utilization on x86 CPU and the use of shared memory on NVidia GPU. From the applied side, the ultimate goal is to minimize the overhead encountered in the transformation of tensor contractions into matrix multiplications in computer implementations of advanced methods of quantum many-body theory (e.g., in electronic structure theory and nuclear physics). A particular accent is made on higher-dimensional tensors that typically appear in the so-called multireference correlated methods of electronic structure theory. Depending on tensor dimensionality, the presented optimized algorithms can achieve an order of magnitude speedup on x86 CPUs and 2-3 times speedup on NVidia Tesla K20X GPU with respect to the naïve scattering algorithm (no memory access optimization). The tensor transpose routines developed in this work have been incorporated into a general-purpose tensor algebra library (TAL-SH).
Out-of-Core Streamline Visualization on Large Unstructured Meshes

NASA Technical Reports Server (NTRS)

Ueng, Shyh-Kuang; Sikorski, K.; Ma, Kwan-Liu

1997-01-01

It's advantageous for computational scientists to have the capability to perform interactive visualization on their desktop workstations. For data on large unstructured meshes, this capability is not generally available. In particular, particle tracing on unstructured grids can result in a high percentage of non-contiguous memory accesses and therefore may perform very poorly with virtual memory paging schemes. The alternative of visualizing a lower resolution of the data degrades the original high-resolution calculations. This paper presents an out-of-core approach for interactive streamline construction on large unstructured tetrahedral meshes containing millions of elements. The out-of-core algorithm uses an octree to partition and restructure the raw data into subsets stored into disk files for fast data retrieval. A memory management policy tailored to the streamline calculations is used such that during the streamline construction only a very small amount of data are brought into the main memory on demand. By carefully scheduling computation and data fetching, the overhead of reading data from the disk is significantly reduced and good memory performance results. This out-of-core algorithm makes possible interactive streamline visualization of large unstructured-grid data sets on a single mid-range workstation with relatively low main-memory capacity: 5-20 megabytes. Our test results also show that this approach is much more efficient than relying on virtual memory and operating system's paging algorithms.
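The idea of bringing only a small amount of data into main memory on demand can be sketched with a bounded block cache. This is an assumption-laden illustration: the abstract describes a policy tailored to streamline calculations, whereas the sketch substitutes plain least-recently-used eviction, and `BlockCache` and `load_block` are hypothetical names standing in for octree blocks read from disk files.

```python
from collections import OrderedDict

class BlockCache:
    """Keep at most `capacity` data blocks in memory, evicting the least recently used."""

    def __init__(self, load_block, capacity):
        self.load_block = load_block   # callable: block_id -> block data (e.g. read from disk)
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.loads = 0                 # number of simulated disk reads

    def get(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)   # mark as most recently used
            return self.blocks[block_id]
        data = self.load_block(block_id)
        self.loads += 1
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # drop the least recently used block
        return data

# A streamline that wanders through octree blocks 0, 1, 0, 2, 1.
cache = BlockCache(lambda block_id: f"cells-of-{block_id}", capacity=2)
for block_id in (0, 1, 0, 2, 1):
    cache.get(block_id)
print(cache.loads, list(cache.blocks))
```

Counting `loads` makes the cost of a given access order visible, which is the quantity a scheduling policy for computation and data fetching would try to reduce.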
Block-Based Connected-Component Labeling Algorithm Using Binary Decision Trees

PubMed Central

Chang, Wan-Yu; Chiu, Chung-Cheng; Yang, Jia-Horng

2015-01-01

In this paper, we propose a fast labeling algorithm based on block-based concepts. Because the number of memory access points directly affects the time consumption of the labeling algorithms, the aim of the proposed algorithm is to minimize neighborhood operations. Our algorithm utilizes a block-based view and correlates a raster scan to select the necessary pixels generated by a block-based scan mask. We analyze the advantages of a sequential raster scan for the block-based scan mask, and integrate the block-connected relationships using two different procedures with binary decision trees to reduce unnecessary memory access. This greatly simplifies the pixel locations of the block-based scan mask. Furthermore, our algorithm significantly reduces the number of leaf nodes and depth levels required in the binary decision tree. We analyze the labeling performance of the proposed algorithm alongside that of other labeling algorithms using high-resolution images and foreground images. The experimental results from synthetic and real image datasets demonstrate that the proposed algorithm is faster than other methods. PMID:26393597
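For context, the kind of baseline that block-based methods improve on can be sketched as a classical two-pass labeling with union-find. This is not the proposed block-based algorithm and uses no decision trees; the function name `label_components` and the choice of 4-connectivity are assumptions made for brevity.

```python
def label_components(image):
    """Two-pass 4-connected labeling of a binary image given as a list of 0/1 rows."""
    rows = len(image)
    cols = len(image[0]) if image else 0
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i] is the union-find parent of provisional label i

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # First pass: raster scan, looking only at the upper and left neighbours.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if up and left:
                root_up, root_left = find(up), find(left)
                root = min(root_up, root_left)
                parent[max(root_up, root_left)] = root  # record the equivalence
                labels[r][c] = root
            elif up or left:
                labels[r][c] = up or left
            else:
                parent.append(len(parent))  # new provisional label
                labels[r][c] = len(parent) - 1

    # Second pass: replace provisional labels by consecutive final labels.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels

print(label_components([[1, 0, 1], [1, 0, 1], [1, 1, 1]]))
```

Every foreground pixel here triggers up to two neighbour reads; reducing such memory accesses is the stated aim of the block-based scan mask.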
Interfacing External Quantum Devices to a Universal Quantum Computer

PubMed Central

Lagana, Antonio A.; Lohe, Max A.; von Smekal, Lorenz

2011-01-01

We present a scheme to use external quantum devices using the universal quantum computer previously constructed. We thereby show how the universal quantum computer can utilize networked quantum information resources to carry out local computations. Such information may come from specialized quantum devices or even from remote universal quantum computers. We show how to accomplish this by devising universal quantum computer programs that implement well known oracle based quantum algorithms, namely the Deutsch, Deutsch-Jozsa, and the Grover algorithms using external black-box quantum oracle devices. In the process, we demonstrate a method to map existing quantum algorithms onto the universal quantum computer. PMID:22216276
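Of the oracle-based algorithms named here, Deutsch's is small enough to simulate classically. The sketch below is only a state-vector simulation of the textbook circuit, not the authors' universal quantum computer program; the function `deutsch` and the amplitude layout (index `2*x + y` for basis state |x, y>) are assumptions. The oracle is treated as a black box `f` from one bit to one bit.

```python
import math

def deutsch(f):
    """Decide with one oracle call whether f: {0,1} -> {0,1} is constant or balanced."""
    h = 1 / math.sqrt(2)
    # State after applying a Hadamard to each qubit of |0>|1>.
    state = [0.5 * (1 if y == 0 else -1) for x in (0, 1) for y in (0, 1)]
    # Black-box oracle: |x, y> -> |x, y XOR f(x)>.
    after_oracle = [0.0] * 4
    for x in (0, 1):
        for y in (0, 1):
            after_oracle[2 * x + (y ^ f(x))] += state[2 * x + y]
    # Hadamard on the first qubit.
    out = [0.0] * 4
    for y in (0, 1):
        a0, a1 = after_oracle[y], after_oracle[2 + y]
        out[y] = h * (a0 + a1)
        out[2 + y] = h * (a0 - a1)
    # Probability of measuring 1 on the first qubit.
    p_one = out[2] ** 2 + out[3] ** 2
    return "balanced" if p_one > 0.5 else "constant"

print(deutsch(lambda x: 0), deutsch(lambda x: x))
```

The oracle is consulted once per basis state of a single superposed input, which is the simulated counterpart of a single query to an external quantum oracle device.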
Interfacing external quantum devices to a universal quantum computer.

PubMed

Lagana, Antonio A; Lohe, Max A; von Smekal, Lorenz

2011-01-01

We present a scheme to use external quantum devices using the universal quantum computer previously constructed. We thereby show how the universal quantum computer can utilize networked quantum information resources to carry out local computations. Such information may come from specialized quantum devices or even from remote universal quantum computers. We show how to accomplish this by devising universal quantum computer programs that implement well known oracle based quantum algorithms, namely the Deutsch, Deutsch-Jozsa, and the Grover algorithms using external black-box quantum oracle devices. In the process, we demonstrate a method to map existing quantum algorithms onto the universal quantum computer. © 2011 Lagana et al.
Algorithm for optimizing bipolar interconnection weights with applications in associative memories and multitarget classification.

PubMed

Chang, S; Wong, K W; Zhang, W; Zhang, Y

1999-08-10

An algorithm for optimizing a bipolar interconnection weight matrix with the Hopfield network is proposed. The effectiveness of this algorithm is demonstrated by computer simulation and optical implementation. In the optical implementation of the neural network the interconnection weights are biased to yield a nonnegative weight matrix. Moreover, a threshold subchannel is added so that the system can realize, in real time, the bipolar weighted summation in a single channel. Preliminary experimental results obtained from the applications in associative memories and multitarget classification with rotation invariance are shown.
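One way to read the biasing step is that signed weights are shifted to be nonnegative and the shift is then removed by a threshold term, since sum_j (w_ij + b) x_j - b sum_j x_j = sum_j w_ij x_j. The sketch below illustrates that identity with a Hopfield-style recall step. It is an interpretation, not the authors' optimization algorithm or optical setup: the Hebbian outer-product rule, the function names, and the choice of `bias` are all assumptions.

```python
def hebbian_weights(patterns):
    """Outer-product (Hebbian) weights with zero diagonal for +1/-1 patterns."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def recall_step_biased(w, x):
    """One synchronous update computed from nonnegative weights plus a threshold term."""
    bias = max(0, -min(min(row) for row in w))  # shift making every weight >= 0
    threshold = bias * sum(x)                   # same correction for every neuron
    out = []
    for row in w:
        nonneg_sum = sum((wij + bias) * xj for wij, xj in zip(row, x))
        s = nonneg_sum - threshold              # equals the signed weighted sum
        out.append(1 if s >= 0 else -1)
    return out

stored = [1, -1, 1, -1]
w = hebbian_weights([stored])
print(recall_step_biased(w, [1, -1, 1, 1]))   # input with the last element flipped
```

In this toy case a single update restores the stored pattern from an input with one flipped element, while every weight actually multiplied is nonnegative.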
Algorithm for Optimizing Bipolar Interconnection Weights with Applications in Associative Memories and Multitarget Classification

NASA Astrophysics Data System (ADS)

Chang, Shengjiang; Wong, Kwok-Wo; Zhang, Wenwei; Zhang, Yanxin

1999-08-01

An algorithm for optimizing a bipolar interconnection weight matrix with the Hopfield network is proposed. The effectiveness of this algorithm is demonstrated by computer simulation and optical implementation. In the optical implementation of the neural network the interconnection weights are biased to yield a nonnegative weight matrix. Moreover, a threshold subchannel is added so that the system can realize, in real time, the bipolar weighted summation in a single channel. Preliminary experimental results obtained from the applications in associative memories and multitarget classification with rotation invariance are shown.
Scalable Triadic Analysis of Large-Scale Graphs: Multi-Core vs. Multi-Processor vs. Multi-Threaded Shared Memory Architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chin, George; Marquez, Andres; Choudhury, Sutanay

2012-09-01

Triadic analysis encompasses a useful set of graph mining methods that is centered on the concept of a triad, which is a subgraph of three nodes and the configuration of directed edges across the nodes. Such methods are often applied in the social sciences as well as many other diverse fields. Triadic methods commonly operate on a triad census that counts the number of triads of every possible edge configuration in a graph. Like other graph algorithms, triadic census algorithms do not scale well when graphs reach tens of millions to billions of nodes. To enable the triadic analysis of large-scale graphs, we developed and optimized a triad census algorithm to efficiently execute on shared memory architectures. We will retrace the development and evolution of a parallel triad census algorithm. Over the course of several versions, we continually adapted the code’s data structures and program logic to expose more opportunities to exploit parallelism on shared memory that would translate into improved computational performance. We will recall the critical steps and modifications that occurred during code development and optimization. Furthermore, we will compare the performances of triad census algorithm versions on three specific systems: Cray XMT, HP Superdome, and AMD multi-core NUMA machine. These three systems have shared memory architectures but with markedly different hardware capabilities to manage parallelism.

Cache and energy efficient algorithms for Nussinov's RNA Folding.

PubMed

Zhao, Chunchun; Sahni, Sartaj

2017-12-06

An RNA folding/RNA secondary structure prediction algorithm determines the non-nested/pseudoknot-free structure by maximizing the number of complementary base pairs and minimizing the energy. Several implementations of Nussinov's classical RNA folding algorithm have been proposed. Our focus is to obtain run time and energy efficiency by reducing the number of cache misses. Three cache-efficient algorithms, ByRow, ByRowSegment and ByBox, for Nussinov's RNA folding are developed. Using a simple LRU cache model, we show that the Classical algorithm of Nussinov has the highest number of cache misses followed by the algorithms Transpose (Li et al.), ByRow, ByRowSegment, and ByBox (in this order). Extensive experiments conducted on four computational platforms (Xeon E5, AMD Athlon 64 X2, Intel I7 and PowerPC A2) using two programming languages (C and Java) show that our cache efficient algorithms are also efficient in terms of run time and energy. Our benchmarking shows that, depending on the computational platform and programming language, either ByRow or ByBox gives the best run time and energy performance. The C versions of these algorithms reduce run time by as much as 97.2% and energy consumption by as much as 88.8% relative to Classical and by as much as 56.3% and 57.8% relative to Transpose. The Java versions reduce run time by as much as 98.3% relative to Classical and by as much as 75.2% relative to Transpose. Transpose achieves run time and energy efficiency at the expense of memory as it takes twice the memory required by Classical. The memory required by ByRow, ByRowSegment, and ByBox is the same as that of Classical. As a result, using the same amount of memory, the algorithms proposed by us can solve problems up to 40% larger than those solvable by Transpose.
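The classical dynamic program that the cache-efficient variants start from can be sketched as below. This is a textbook form of Nussinov's recurrence, not the ByRow, ByRowSegment, or ByBox code; the function name `nussinov`, the `min_loop` parameter, and the restriction to A-U and G-C pairs (no G-U wobble pairs, no energy term) are assumptions.

```python
def nussinov(seq, min_loop=0):
    """Maximum number of nested complementary base pairs in seq (classical O(n^3) DP)."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j], inclusive.
    dp = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i + 1][j]  # case: base i stays unpaired
            for k in range(i + 1 + min_loop, j + 1):
                if (seq[i], seq[k]) in pairs:  # case: base i pairs with base k
                    inside = dp[i + 1][k - 1] if k - 1 >= i + 1 else 0
                    outside = dp[k + 1][j] if k + 1 <= j else 0
                    best = max(best, 1 + inside + outside)
            dp[i][j] = best
    return dp[0][n - 1]

print(nussinov("GGGAAACCC"), nussinov("GCGC"), nussinov("AAAA"))
```

In the inner loop, `dp[k + 1][j]` walks down a column of a row-major table as `k` varies; strided accesses of this kind are a likely source of the cache misses that the abstract attributes to the Classical algorithm.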
A Parallel Rendering Algorithm for MIMD Architectures

NASA Technical Reports Server (NTRS)

Crockett, Thomas W.; Orloff, Tobias

1991-01-01

Applications such as animation and scientific visualization demand high performance rendering of complex three dimensional scenes. To deliver the necessary rendering rates, highly parallel hardware architectures are required. The challenge is then to design algorithms and software which effectively use the hardware parallelism. A rendering algorithm targeted to distributed memory MIMD architectures is described. For maximum performance, the algorithm exploits both object-level and pixel-level parallelism. The behavior of the algorithm is examined both analytically and experimentally. Its performance for large numbers of processors is found to be limited primarily by communication overheads. An experimental implementation for the Intel iPSC/860 shows increasing performance from 1 to 128 processors across a wide range of scene complexities. It is shown that minimal modifications to the algorithm will adapt it for use on shared memory architectures as well.
An implicit higher-order spatially accurate scheme for solving time dependent flows on unstructured meshes

NASA Astrophysics Data System (ADS)

Tomaro, Robert F.

1998-07-01

The present research is aimed at developing a higher-order, spatially accurate scheme for both steady and unsteady flow simulations using unstructured meshes. The resulting scheme must work on a variety of general problems to ensure the creation of a flexible, reliable and accurate aerodynamic analysis tool. To calculate the flow around complex configurations, unstructured grids and the associated flow solvers have been developed. Efficient simulations require the minimum use of computer memory and computational times. Unstructured flow solvers typically require more computer memory than a structured flow solver due to the indirect addressing of the cells. The approach taken in the present research was to modify an existing three-dimensional unstructured flow solver to first decrease the computational time required for a solution and then to increase the spatial accuracy. The terms required to simulate flow involving non-stationary grids were also implemented. First, an implicit solution algorithm was implemented to replace the existing explicit procedure. Several test cases, including internal and external, inviscid and viscous, two-dimensional, three-dimensional and axi-symmetric problems, were simulated for comparison between the explicit and implicit solution procedures. The increased efficiency and robustness of modified code due to the implicit algorithm was demonstrated. Two unsteady test cases, a plunging airfoil and a wing undergoing bending and torsion, were simulated using the implicit algorithm modified to include the terms required for a moving and/or deforming grid. Secondly, a higher than second-order spatially accurate scheme was developed and implemented into the baseline code. Third- and fourth-order spatially accurate schemes were implemented and tested. The original dissipation was modified to include higher-order terms and modified near shock waves to limit pre- and post-shock oscillations. 
The unsteady cases were repeated using the higher-order spatially accurate code. The new solutions were compared with those obtained using the second-order spatially accurate scheme. Finally, the increased efficiency of using an implicit solution algorithm in a production Computational Fluid Dynamics flow solver was demonstrated for steady and unsteady flows. A third- and fourth-order spatially accurate scheme has been implemented creating a basis for a state-of-the-art aerodynamic analysis tool.
WriteShield: A Pseudo Thin Client for Prevention of Information Leakage

NASA Astrophysics Data System (ADS)

Kirihata, Yasuhiro; Sameshima, Yoshiki; Onoyama, Takashi; Komoda, Norihisa

While thin-client systems are spreading as an effective security measure in enterprises and organizations, a new approach called the pseudo thin-client system has emerged. In this system, the local disks of clients are write-protected and user data must be saved on the central file server, achieving the same security effect as conventional thin-client systems. Because it takes a simple, purely software-based approach, it requires no hardware enhancement of the network or servers, which reduces the installation cost. However, there are several problems, such as no write control for external media, possible memory depletion, and lower security because of the exceptional write permission granted to system processes. In this paper, we propose WriteShield, a pseudo thin-client system that solves these issues. In this system, the local disks are write-protected with a volume filter driver, and a virtual cache mechanism extends the memory cache size for the write protection. This paper presents the design and implementation details of WriteShield. We also describe the security analysis and a simulation evaluation of paging algorithms for the virtual cache mechanism, and measure disk I/O performance to verify feasibility in an actual environment.
The external-internal loop of interference: two types of attention and their influence on the learning abilities of mice.

PubMed

Sauce, Bruno; Wass, Christopher; Smith, Andrew; Kwan, Stephanie; Matzel, Louis D

2014-12-01

Attention is a component of the working memory system, and is responsible for protecting task-relevant information from interference. Cognitive performance (particularly outside of the laboratory) is often plagued by interference, and the source of this interference, either external or internal, might influence the expression of individual differences in attentional ability. By definition, external attention (also described as "selective attention") protects working memory against sensorial distractors of all kinds, while internal attention (also called "inhibition") protects working memory against emotional impulses, irrelevant information from memory, and automatically-generated responses. At present, it is unclear if these two types of attention are expressed independently in non-human animals, and how they might differentially impact performance on other cognitive processes, such as learning. By using a diverse battery of four attention tests (with varying levels of internal and external sources of interference), here we aimed both to explore this issue, and to obtain a robust and general (less task-specific) measure of attention in mice. Exploratory factor analyses revealed two factors (external and internal attention) that in total, accounted for 73% of the variance in attentional performance. Confirmatory factor analyses found an excellent fit with the data of the model of attention that assumed an external and internal distinction (with a resulting correlation of 0.43). In contrast, a model of attention that assumed one source of variance (i.e., "general attention") exhibited a poor fit with the data. Regarding the relationship between attention and learning, higher resistance against external sources of interference promoted better new learning, but tended to impair performance when cognitive flexibility was required, such as during the reversal of a previously instantiated response. 
The present results suggest that there can be (at least) two types of attention that contribute to the common variance in attentional performance in mice, and that external and internal attentions might have opposing influences on the rate at which animals learn. Copyright © 2014 Elsevier Inc. All rights reserved.
External details revisited - A new taxonomy for coding 'non-episodic' content during autobiographical memory retrieval.

PubMed

Strikwerda-Brown, Cherie; Mothakunnel, Annu; Hodges, John R; Piguet, Olivier; Irish, Muireann

2018-04-24

Autobiographical memory (ABM) is typically held to comprise episodic and semantic elements, with the vast majority of studies to date focusing on profiles of episodic details in health and disease. In this context, 'non-episodic' elements are often considered to reflect semantic processing or are discounted from analyses entirely. Mounting evidence suggests that rather than reflecting one unitary entity, semantic autobiographical information may contain discrete subcomponents, which vary in their relative degree of semantic or episodic content. This study aimed to (1) review the existing literature to formally characterize the variability in analysis of 'non-episodic' content (i.e., external details) on the Autobiographical Interview and (2) use these findings to create a theoretically grounded framework for coding external details. Our review exposed discrepancies in the reporting and interpretation of external details across studies, reinforcing the need for a new, consistent approach. We validated our new external details scoring protocol (the 'NExt' taxonomy) in patients with Alzheimer's disease (n = 18) and semantic dementia (n = 13), and 20 healthy older Control participants and compared profiles of the NExt subcategories across groups and time periods. Our results revealed increased sensitivity of the NExt taxonomy in discriminating between ABM profiles of patient groups, when compared to traditionally used internal and external detail metrics. Further, remote and recent autobiographical memories displayed distinct compositions of the NExt detail types. This study is the first to provide a fine-grained and comprehensive taxonomy to parse external details into intuitive subcategories and to validate this protocol in neurodegenerative disorders. © 2018 The British Psychological Society.
General purpose programmable accelerator board

DOEpatents

Robertson, Perry J.; Witzke, Edward L.

2001-01-01

A general purpose accelerator board and acceleration method comprising use of: one or more programmable logic devices; a plurality of memory blocks; bus interface for communicating data between the memory blocks and devices external to the board; and dynamic programming capabilities for providing logic to the programmable logic device to be executed on data in the memory blocks.
Autobiographical Thinking Interferes with Episodic Memory Consolidation

PubMed Central

Craig, Michael; Della Sala, Sergio; Dewar, Michaela

2014-01-01

New episodic memories are retained better if learning is followed by a few minutes of wakeful rest than by the encoding of novel external information. Novel encoding is said to interfere with the consolidation of recently acquired episodic memories. Here we report four experiments in which we examined whether autobiographical thinking, i.e. an ‘internal’ memory activity, also interferes with episodic memory consolidation. Participants were presented with three wordlists consisting of common nouns; one list was followed by wakeful rest, one by novel picture encoding and one by autobiographical retrieval/future imagination, cued by concrete sounds. Both novel encoding and autobiographical retrieval/future imagination lowered wordlist retention significantly. Follow-up experiments demonstrated that the interference by our cued autobiographical retrieval/future imagination delay condition could not be accounted for by the sound cues alone or by executive retrieval processes. Moreover, our results demonstrated evidence of a temporal gradient of interference across experiments. Thus, we propose that rich autobiographical retrieval/future imagination hampers the consolidation of recently acquired episodic memories and that such interference is particularly likely in the presence of external concrete cues. PMID:24736665
External Memory Aid Preferences of Individuals with Mild Memory Impairments.

PubMed

Lanzi, Alyssa; Wallace, Sarah E; Bourgeois, Michelle S

2018-07-01

Individuals with mild memory impairments often rely on external memory aids (EMAs) to compensate for impaired cognitive abilities and to support independent completion of activities of daily living. These strategies are evidence based; however, professionals have limited knowledge regarding individual preferences and guidance on how to incorporate a person-centered approach into the EMA development phase. The purpose of the current study was to qualitatively investigate individuals' preferences and experiences as they relate to EMAs. Data analysis included (1) evaluation of a posttreatment questionnaire to explore individual strategy preferences following intervention and (2) evaluation of group intervention videos using thematic coding to investigate individuals' experiences with strategies during intervention. Results suggest that older adults with mild memory impairments have unique preferences and experiences, despite limited variability in demographic characteristics. Some themes that emerged included memory ability awareness and attitudes toward technology. Within a person-centered approach, skilled professionals must consider individuals' unique needs, preferences, and experiences when developing strategies throughout the continuum of care to promote sustained EMA use within everyday settings. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
A sparse matrix algorithm on the Boolean vector machine

NASA Technical Reports Server (NTRS)

Wagner, Robert A.; Patrick, Merrell L.

1988-01-01

VLSI technology is being used to implement a prototype Boolean Vector Machine (BVM), which is a large network of very small processors with equally small memories that operate in SIMD mode; these use bit-serial arithmetic and communicate via a cube-connected cycles network. The BVM's bit-serial arithmetic and the small memories of individual processors are noted to compromise the system's effectiveness in large numerical problem applications. Attention is presently given to the implementation of a basic matrix-vector iteration algorithm for sparse matrices on the BVM, in order to generate over 1 billion useful floating-point operations/sec for this iteration algorithm. The algorithm is expressed in a novel language designated 'BVM'.
Efficient frequent pattern mining algorithm based on node sets in cloud computing environment

NASA Astrophysics Data System (ADS)

Billa, V. N. Vinay Kumar; Lakshmanna, K.; Rajesh, K.; Reddy, M. Praveen Kumar; Nagaraja, G.; Sudheer, K.

2017-11-01

The ultimate goal of data mining is to uncover hidden information, useful for decision making, in the large databases collected by an organization. Data mining involves many tasks, and mining frequent itemsets is one of the most important of them for transactional databases. These databases hold data at very large scale, so mining them consumes physical memory and time in proportion to the size of the database. A frequent pattern mining algorithm is considered efficient only if it consumes little memory and time when mining the frequent itemsets of a large database. With these points in mind, in this thesis we propose a system that mines frequent itemsets in a way optimized for memory and time, using cloud computing to parallelize the process and providing the application as a service. The complete framework uses a proven efficient algorithm, the FIN algorithm, which works on Nodesets and a POC (pre-order coding) tree. To evaluate the performance of the system, we conduct experiments comparing the efficiency of the same algorithm run standalone and in a cloud computing environment on a real-time data set, namely the traffic accidents data set. The results show that the memory consumption and execution time of the proposed system are much lower than those of the standalone system.
Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space.

PubMed

Loewenstein, Yaniv; Portugaly, Elon; Fromer, Menachem; Linial, Michal

2008-07-01

UPGMA (average linking) is probably the most popular algorithm for hierarchical data clustering, especially in computational biology. However, UPGMA requires the entire dissimilarity matrix in memory. Due to this prohibitive requirement, UPGMA is not scalable to very large datasets. We present a novel class of memory-constrained UPGMA (MC-UPGMA) algorithms. Given any practical memory size constraint, this framework guarantees the correct clustering solution without explicitly requiring all dissimilarities in memory. The algorithms are general and are applicable to any dataset. We present a data-dependent characterization of hardness and clustering efficiency. The presented concepts are applicable to any agglomerative clustering formulation. We apply our algorithm to the entire collection of protein sequences, to automatically build a comprehensive evolutionary-driven hierarchy of proteins from sequence alone. The newly created tree captures protein families better than state-of-the-art large-scale methods such as CluSTr, ProtoNet4 or single-linkage clustering. We demonstrate that leveraging the entire mass embodied in all sequence similarities allows to significantly improve on current protein family clusterings which are unable to directly tackle the sheer mass of this data. Furthermore, we argue that non-metric constraints are an inherent complexity of the sequence space and should not be overlooked. The robustness of UPGMA allows significant improvement, especially for multidomain proteins, and for large or divergent families. A comprehensive tree built from all UniProt sequence similarities, together with navigation and classification tools will be made available as part of the ProtoNet service. A C++ implementation of the algorithm is available on request.
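For orientation, the baseline that the abstract contrasts against can be sketched as follows. This is a minimal in-memory UPGMA, not the MC-UPGMA algorithm of the paper; the function name `upgma` and the dictionary-based input format are assumptions made for illustration. It shows why the classic formulation needs every pairwise dissimilarity available at once.

```python
def upgma(dist, n):
    """Naive UPGMA over leaves 0..n-1 (illustrative baseline, not MC-UPGMA).

    dist maps frozenset({i, j}) -> dissimilarity and must contain every pair,
    which is exactly the all-in-memory requirement the abstract describes.
    Returns a list of merges (cluster_a, cluster_b, height).
    """
    size = {i: 1 for i in range(n)}   # number of leaves in each live cluster
    d = dict(dist)                    # working copy of all pairwise distances
    merges = []
    next_id = n                       # identifiers for newly created clusters
    while len(size) > 1:
        pair = min(d, key=d.get)      # closest pair of live clusters
        a, b = sorted(pair)
        merges.append((a, b, d[pair]))
        for c in size:
            if c in (a, b):
                continue
            # Size-weighted average of the distances from a and b to c.
            d[frozenset((next_id, c))] = (
                size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
            ) / (size[a] + size[b])
        # Drop every entry that still refers to the merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        size[next_id] = size.pop(a) + size.pop(b)
        next_id += 1
    return merges


toy = {frozenset((0, 1)): 2, frozenset((0, 2)): 6, frozenset((1, 2)): 8}
print(upgma(toy, 3))
```

The dictionary `d` grows quadratically with the number of sequences, which is the scaling limit a memory-constrained variant has to avoid.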
Brain Activity and Network Interactions in the Impact of Internal Emotional Distraction.

PubMed

Iordan, A D; Dolcos, S; Dolcos, F

2018-06-14

Emotional distraction may come from the external world and from our mind, as internal distraction. Although external emotional distraction has been extensively investigated, less is known about the mechanisms associated with the impact of internal emotional distraction on cognitive performance, and those involved in coping with such distraction. These issues were investigated using a working memory task with emotional distraction, where recollected unpleasant autobiographical memories served as internal emotional distraction. Emotion regulation was manipulated by instructing participants to focus their attention either on or away from the emotional aspects of their memories. Behaviorally, focusing away from emotion was associated with better working memory performance than focusing on the recollected emotions. Functional MRI data showed reduced response in brain regions associated with the salience network, coupled with greater recruitment of executive prefrontal and memory-related temporoparietal regions, and with increased frontoparietal connectivity, when subjects focused on nonemotional contextual details of their memories. Finally, temporal dissociations were also identified between regions involved in self-referential (showing faster responses) versus context-related processing (showing delayed responses). These findings demonstrate that focused attention is an effective regulation strategy in coping with internal distraction, and are relevant for understanding clinical conditions where coping with distressing memories is dysfunctional.
Two-way shape memory behavior of semi-crystalline elastomer under stress-free condition

NASA Astrophysics Data System (ADS)

Qian, Chen; Dong, Yubing; Zhu, Yaofeng; Fu, Yaqin

2016-08-01

Semi-crystalline shape memory polymers exhibit a two-way shape memory effect (2W-SME) under constant stresses through crystallization-induced elongation upon cooling and melting-induced constriction upon heating. The applied constant stress influenced the prediction and usability of 2W-SME in practical applications without any external force. Here the reversible shape transition in EVA shape memory polymer was quantitatively analyzed under a suitable temperature range and external stress-free condition. The fraction of reversible strain increased with increasing upper temperature (T_high) within the temperature range and reached the maximum value of 13.62% at 70 °C. However, reversible strain transition was almost lost when T_high exceeded 80 °C because of complete melting of the crystalline scaffold, known as the latent recrystallization template. The non-isothermal annealing of EVA 2W-SMP under changing circulating temperatures was confirmed. Moreover, the orientation of crystallization was retained at high temperatures. These findings may contribute to designing an appropriate shape memory protocol based on application-specific requirements.
Shared Memory Parallelization of an Implicit ADI-type CFD Code

NASA Technical Reports Server (NTRS)

Hauser, Th.; Huang, P. G.

1999-01-01

A parallelization study designed for ADI-type algorithms is presented using the OpenMP specification for shared-memory multiprocessor programming. Details of optimizations specifically addressed to cache-based computer architectures are described and performance measurements for the single and multiprocessor implementation are summarized. The paper demonstrates that optimization of memory access on a cache-based computer architecture controls the performance of the computational algorithm. A hybrid MPI/OpenMP approach is proposed for clusters of shared memory machines to further enhance the parallel performance. The method is applied to develop a new LES/DNS code, named LESTool. A preliminary DNS calculation of a fully developed channel flow at a Reynolds number of Re_τ = 180 has shown good agreement with existing data.
Medial prefrontal cortex supports source memory accuracy for self-referenced items.

PubMed

Leshikar, Eric D; Duarte, Audrey

2012-01-01

Previous behavioral work suggests that processing information in relation to the self enhances subsequent item recognition. Neuroimaging evidence further suggests that regions along the cortical midline, particularly those of the medial prefrontal cortex (PFC), underlie this benefit. There has been little work to date, however, on the effects of self-referential encoding on source memory accuracy or whether the medial PFC might contribute to source memory for self-referenced materials. In the current study, we used fMRI to measure neural activity while participants studied and subsequently retrieved pictures of common objects superimposed on one of two background scenes (sources) under either self-reference or self-external encoding instructions. Both item recognition and source recognition were better for objects encoded self-referentially than self-externally. Neural activity predictive of source accuracy was observed in the medial PFC (Brodmann area 10) at the time of study for self-referentially but not self-externally encoded objects. The results of this experiment suggest that processing information in relation to the self leads to a mnemonic benefit for source level features, and that activity in the medial PFC contributes to this source memory benefit. This evidence expands the purported role that the medial PFC plays in self-referencing.
External locus of control contributes to racial disparities in memory and reasoning training gains in ACTIVE

PubMed Central

Zahodne, Laura B.; Meyer, Oanh L.; Choi, Eunhee; Thomas, Michael L.; Willis, Sherry L.; Marsiske, Michael; Gross, Alden L.; Rebok, George W.; Parisi, Jeanine M.

2015-01-01

Racial disparities in cognitive outcomes may be partly explained by differences in locus of control. African Americans report more external locus of control than non-Hispanic Whites, and external locus of control is associated with poorer health and cognition. The aims of this study were to compare cognitive training gains between African American and non-Hispanic White participants in the Advanced Cognitive Training for Independent and Vital Elderly (ACTIVE) study and determine whether racial differences in training gains are mediated by locus of control. The sample comprised 2,062 (26% African American) adults aged 65 and older who participated in memory, reasoning, or speed training. Latent growth curve models evaluated predictors of 10-year cognitive trajectories separately by training group. Multiple group modeling examined associations between training gains and locus of control across racial groups. Compared to non-Hispanic Whites, African Americans evidenced less improvement in memory and reasoning performance after training. These effects were partially mediated by locus of control, controlling for age, sex, education, health, depression, testing site, and initial cognitive ability. African Americans reported more external locus of control, which was associated with smaller training gains. External locus of control also had a stronger negative association with reasoning training gain for African Americans than for Whites. No racial difference in training gain was identified for speed training. Future intervention research with African Americans should test whether explicitly targeting external locus of control leads to greater cognitive improvement following cognitive training. PMID:26237116
Human Memory Organization for Computer Programs.

ERIC Educational Resources Information Center

Norcio, A. F.; Kerst, Stephen M.

1983-01-01

Results of study investigating human memory organization in processing of computer programming languages indicate that algorithmic logic segments form a cognitive organizational structure in memory for programs. Statement indentation and internal program documentation did not enhance organizational process of recall of statements in five Fortran…
Effects of external intermittency and mean shear on the spectral inertial-range exponent in a turbulent square jet

NASA Astrophysics Data System (ADS)

Zhang, J.; Xu, M.; Pollard, A.; Mi, J.

2013-05-01

This study investigates by experiment the dependence of the inertial-range exponent m of the streamwise velocity spectrum on the external intermittency factor γ (≡ the fraction of time the flow is fully turbulent) and the mean shear S in a turbulent square jet. Velocity measurements were made using hot-wire anemometry in the jet at 15 < x/D_e < 40, where D_e denotes the exit equivalent diameter, and for an exit Reynolds number of Re = 50 000. The Taylor microscale Reynolds number R_λ varies from about 70 to 450 in the present study. The TERA (turbulent energy recognition algorithm) method proposed by Falco and Gendrich [in Near-Wall Turbulence: 1988 Zoran Zariç Memorial Conference, edited by S. J. Kline and N. H. Afgan (Hemisphere Publishing Corp., Washington, DC, 1990), pp. 911-931] is discussed and applied to estimate the intermittency factor from velocity signals. It is shown that m depends strongly on γ but negligibly on S. More specifically, m varies with γ following m = m_t + (ln γ^(-0.0173))^(1/2), where m_t denotes the spectral exponent found in fully turbulent regions.
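The fitted relation can be evaluated numerically as a quick check. The sketch below reads it as m = m_t + sqrt(ln(γ^(-0.0173))) = m_t + sqrt(0.0173 · ln(1/γ)), which is real-valued for 0 < γ ≤ 1; that reading is an assumption about the flattened formula. The function name `spectral_exponent` and the sample value m_t = 5/3 (the classical Kolmogorov exponent) are likewise illustrative and not taken from the abstract.

```python
import math


def spectral_exponent(gamma, m_t):
    """Evaluate m = m_t + sqrt(0.0173 * ln(1 / gamma)) for 0 < gamma <= 1.

    gamma: external intermittency factor (fraction of time the flow is turbulent).
    m_t:   spectral exponent in fully turbulent regions (caller-supplied assumption).
    """
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    return m_t + math.sqrt(0.0173 * math.log(1.0 / gamma))


# Hypothetical m_t; the abstract does not report its value.
for gamma in (1.0, 0.8, 0.5, 0.2):
    print(gamma, round(spectral_exponent(gamma, 5.0 / 3.0), 4))
```

Under this reading the exponent equals m_t when the flow is always turbulent (γ = 1) and grows as γ decreases.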

Detecting Anomalies in Process Control Networks

NASA Astrophysics Data System (ADS)

Rrushi, Julian; Kang, Kyoung-Don

This paper presents the estimation-inspection algorithm, a statistical algorithm for anomaly detection in process control networks. The algorithm determines if the payload of a network packet that is about to be processed by a control system is normal or abnormal based on the effect that the packet will have on a variable stored in control system memory. The estimation part of the algorithm uses logistic regression integrated with maximum likelihood estimation in an inductive machine learning process to estimate a series of statistical parameters; these parameters are used in conjunction with logistic regression formulas to form a probability mass function for each variable stored in control system memory. The inspection part of the algorithm uses the probability mass functions to estimate the normalcy probability of a specific value that a network packet writes to a variable. Experimental results demonstrate that the algorithm is very effective at detecting anomalies in process control networks.
A sample implementation for parallelizing Divide-and-Conquer algorithms on the GPU.

PubMed

Mei, Gang; Zhang, Jiayin; Xu, Nengxiong; Zhao, Kunyang

2018-01-01

The strategy of Divide-and-Conquer (D&C) is one of the frequently used programming patterns to design efficient algorithms in computer science, which has been parallelized on shared memory systems and distributed memory systems. Tzeng and Owens specifically developed a generic paradigm for parallelizing D&C algorithms on modern Graphics Processing Units (GPUs). In this paper, by following the generic paradigm proposed by Tzeng and Owens, we provide a new and publicly available GPU implementation of the famous D&C algorithm, QuickHull, to give a sample and guide for parallelizing D&C algorithms on the GPU. The experimental results demonstrate the practicality of our sample GPU implementation. Our research objective in this paper is to present a sample GPU implementation of a classical D&C algorithm to help interested readers develop their own efficient GPU implementations with less effort.
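To make the divide-and-conquer structure of QuickHull concrete, here is a minimal sequential sketch in Python. It is not the authors' GPU implementation and does not follow the Tzeng and Owens paradigm; the function names are illustrative. Each recursive call handles an independent subset of points, which is the property a parallel version exploits.

```python
def cross(o, a, b):
    """Cross product of (a - o) and (b - o); positive when b lies left of the ray o -> a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def _hull_side(points, a, b):
    """Hull vertices strictly left of segment a -> b, ordered from a towards b."""
    left = [p for p in points if cross(a, b, p) > 0]
    if not left:
        return []
    # Divide: the point farthest from a -> b is certainly on the hull.
    far = max(left, key=lambda p: cross(a, b, p))
    # Conquer: the two sub-problems share no candidate hull points.
    return _hull_side(left, a, far) + [far] + _hull_side(left, far, b)


def quickhull(points):
    """Convex hull of 2D points, returned in clockwise order from the leftmost point."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    lo, hi = pts[0], pts[-1]          # extreme points in lexicographic order
    upper = _hull_side(pts, lo, hi)   # hull vertices above the line lo -> hi
    lower = _hull_side(pts, hi, lo)   # hull vertices below it
    return [lo] + upper + [hi] + lower


square = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
print(quickhull(square))
```

Interior points and points lying on a hull edge are discarded because only strictly positive cross products are kept.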
Investigation of fast initialization of spacecraft bubble memory systems

NASA Technical Reports Server (NTRS)

Looney, K. T.; Nichols, C. D.; Hayes, P. J.

1984-01-01

Bubble domain technology offers significant improvement in reliability and functionality for spacecraft onboard memory applications. In considering potential memory systems organizations, minimization of power in high capacity bubble memory systems necessitates the activation of only the desired portions of the memory. In power strobing arbitrary memory segments, a capability of fast turn on is required. Bubble device architectures, which provide redundant loop coding in the bubble devices, limit the initialization speed. Alternate initialization techniques are investigated to overcome this design limitation. An initialization technique using a small amount of external storage is demonstrated.
Compacting de Bruijn graphs from sequencing data quickly and in low memory.

PubMed

Chikhi, Rayan; Limasset, Antoine; Medvedev, Paul

2016-06-15

As the quantity of data per sequencing experiment increases, the challenges of fragment assembly are becoming increasingly computational. The de Bruijn graph is a widely used data structure in fragment assembly algorithms, used to represent the information from a set of reads. Compaction is an important data reduction step in most de Bruijn graph based algorithms where long simple paths are compacted into single vertices. Compaction has recently become the bottleneck in assembly pipelines, and improving its running time and memory usage is an important problem. We present an algorithm and a tool bcalm 2 for the compaction of de Bruijn graphs. bcalm 2 is a parallel algorithm that distributes the input based on a minimizer hashing technique, allowing for good balance of memory usage throughout its execution. For human sequencing data, bcalm 2 reduces the computational burden of compacting the de Bruijn graph to roughly an hour and 3 GB of memory. We also applied bcalm 2 to the 22 Gbp loblolly pine and 20 Gbp white spruce sequencing datasets. Compacted graphs were constructed from raw reads in less than 2 days and 40 GB of memory on a single machine. Hence, bcalm 2 is at least an order of magnitude more efficient than other available methods. Source code of bcalm 2 is freely available at: https://github.com/GATB/bcalm. Contact: rayan.chikhi@univ-lille1.fr. © The Author 2016. Published by Oxford University Press.
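The idea of distributing k-mers by minimizer can be illustrated with a short sketch. This is not bcalm 2's code: the lexicographic minimizer ordering, the omission of reverse complements, and the names `minimizer` and `partition_kmers` are all simplifying assumptions. It only shows how overlapping k-mers tend to land in the same bucket, so each bucket can be processed with bounded memory.

```python
from collections import defaultdict


def minimizer(kmer, m):
    """Lexicographically smallest length-m substring of kmer (assumed ordering)."""
    return min(kmer[i:i + m] for i in range(len(kmer) - m + 1))


def partition_kmers(reads, k, m):
    """Group the distinct k-mers of the reads into buckets keyed by their minimizer."""
    buckets = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            buckets[minimizer(kmer, m)].add(kmer)
    return buckets


for key, kmers in sorted(partition_kmers(["ACGTAC"], k=4, m=2).items()):
    print(key, sorted(kmers))
```

Consecutive k-mers of a read often share a minimizer, which is why a partition of this kind keeps most of a simple path inside one bucket.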
Exploring the effect of sleep and reduced interference on different forms of declarative memory.

PubMed

Schönauer, Monika; Pawlizki, Annedore; Köck, Corinna; Gais, Steffen

2014-12-01

Many studies have found that sleep benefits declarative memory consolidation. However, fundamental questions on the specifics of this effect remain topics of discussion. It is not clear which forms of memory are affected by sleep and whether this beneficial effect is partly mediated by passive protection against interference. Moreover, a putative correlation between the structure of sleep and its memory-enhancing effects is still being discussed. In three experiments, we tested whether sleep differentially affects various forms of declarative memory. We varied verbal content (verbal/nonverbal), item type (single/associate), and recall mode (recall/recognition, cued/free recall) to examine the effect of sleep on specific memory subtypes. We compared within-subject differences in memory consolidation between intervals including sleep, active wakefulness, or quiet meditation, which reduced external as well as internal interference and rehearsal. Forty healthy adults aged 18-30 y, and 17 healthy adults aged 24-55 y with extensive meditation experience participated in the experiments. All types of memory were enhanced by sleep if the sample size provided sufficient statistical power. Smaller sample sizes showed an effect of sleep if a combined measure of different declarative memory scales was used. In a condition with reduced external and internal interference, performance was equal to one with high interference. Here, memory consolidation was significantly lower than in a sleep condition. We found no correlation between sleep structure and memory consolidation. Sleep does not preferentially consolidate a specific kind of declarative memory, but consistently promotes overall declarative memory formation. This effect is not mediated by reduced interference. © 2014 Associated Professional Sleep Societies, LLC.
Limited-memory adaptive snapshot selection for proper orthogonal decomposition

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oxberry, Geoffrey M.; Kostova-Vassilevska, Tanya; Arrighi, Bill

2015-04-02

Reduced order models are useful for accelerating simulations in many-query contexts, such as optimization, uncertainty quantification, and sensitivity analysis. However, offline training of reduced order models can have prohibitively expensive memory and floating-point operation costs in high-performance computing applications, where memory per core is limited. To overcome this limitation for proper orthogonal decomposition, we propose a novel adaptive selection method for snapshots in time that limits offline training costs by selecting snapshots according to an error control mechanism similar to that found in adaptive time-stepping ordinary differential equation solvers. The error estimator used in this work is related to theory bounding the approximation error in time of proper orthogonal decomposition-based reduced order models, and memory usage is minimized by computing the singular value decomposition using a single-pass incremental algorithm. Results for a viscous Burgers’ test problem demonstrate convergence in the limit as the algorithm error tolerances go to zero; in this limit, the full order model is recovered to within discretization error. The resulting method can be used on supercomputers to generate proper orthogonal decomposition-based reduced order models, or as a subroutine within hyperreduction algorithms that require taking snapshots in time, or within greedy algorithms for sampling parameter space.
Speeding up local correlation methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kats, Daniel

2014-12-28

We present two techniques that can substantially speed up the local correlation methods. The first one allows one to avoid the expensive transformation of the electron-repulsion integrals from atomic orbitals to virtual space. The second one introduces an algorithm for the residual equations in the local perturbative treatment that, in contrast to the standard scheme, does not require holding the amplitudes or residuals in memory. It is shown that even an interpreter-based implementation of the proposed algorithm in the context of local MP2 method is faster and requires less memory than the highly optimized variants of conventional algorithms.
On the impact of communication complexity on the design of parallel numerical algorithms

NASA Technical Reports Server (NTRS)

Gannon, D. B.; Van Rosendale, J.

1984-01-01

This paper describes two models of the cost of data movement in parallel numerical algorithms. One model is a generalization of an approach due to Hockney, and is suitable for shared memory multiprocessors where each processor has vector capabilities. The other model is applicable to highly parallel nonshared memory MIMD systems. In this second model, algorithm performance is characterized in terms of the communication network design. Techniques used in VLSI complexity theory are also brought in, and algorithm-independent upper bounds on system performance are derived for several problems that are important to scientific computation.
Tiled architecture of a CNN-mostly IP system

NASA Astrophysics Data System (ADS)

Spaanenburg, Lambert; Malki, Suleyman

2009-05-01

Multi-core architectures have been popularized with the advent of the IBM CELL. On a finer grain the problems in scheduling multi-cores have already existed in the tiled architectures, such as the EPIC and Da Vinci. It is not easy to evaluate the performance of a schedule on such an architecture as historical data are not available. One solution is to compile algorithms for which an optimal schedule is known by analysis. A typical example is an algorithm that is already defined in terms of many collaborating simple nodes, such as a Cellular Neural Network (CNN). A simple node with a local register stack together with a 'rotating wheel' internal communication mechanism has been proposed. Though the basic CNN allows for a tiled implementation of a tiled algorithm on a tiled structure, a practical CNN system will have to disturb this regularity by the additional need for arithmetical and logical operations. Arithmetic operations are needed for instance to accommodate low-level image processing, while logical operations are needed to fork and merge different data streams without use of the external memory. It is found that the 'rotating wheel' internal communication mechanism still handles such mechanisms without the need for global control. Overall the CNN system provides for a practical network size as implemented on an FPGA, can be easily used as embedded IP and provides a clear benchmark for a multi-core compiler.
Outflow monitoring of a pneumatic ventricular assist device using external pressure sensors.

PubMed

Kang, Seong Min; Her, Keun; Choi, Seong Wook

2016-08-25

In this study, a new algorithm was developed for estimating the pump outflow of a pneumatic ventricular assist device (p-VAD). The pump outflow estimation algorithm was derived from the ideal gas equation and determined the change in blood-sac volume of a p-VAD using two external pressure sensors. Based on in vitro experiments, the algorithm was revised to consider the effects of structural compliance caused by volume changes in an implanted unit, an air driveline, and the pressure difference between the sensors and the implanted unit. In animal experiments, p-VADs were connected to the left ventricles and the descending aorta of three calves (70-100 kg). Their outflows were estimated using the new algorithm and compared to the results obtained using an ultrasonic blood flow meter (UBF) (TS-410, Transonic Systems Inc., Ithaca, NY, USA). The estimated and measured values had a Pearson's correlation coefficient of 0.864. The pressure sensors were installed at the external controller and connected to the air driveline on the same side as the external actuator, which made the sensors easy to manage.
An External Archive-Guided Multiobjective Particle Swarm Optimization Algorithm.

PubMed

Zhu, Qingling; Lin, Qiuzhen; Chen, Weineng; Wong, Ka-Chun; Coello Coello, Carlos A; Li, Jianqiang; Chen, Jianyong; Zhang, Jun

2017-09-01

The selection of swarm leaders (i.e., the personal best and global best) is important in the design of a multiobjective particle swarm optimization (MOPSO) algorithm. Such leaders are expected to effectively guide the swarm to approach the true Pareto optimal front. In this paper, we present a novel external archive-guided MOPSO algorithm (AgMOPSO), where the leaders for velocity update are all selected from the external archive. In our algorithm, multiobjective optimization problems (MOPs) are transformed into a set of subproblems using a decomposition approach, and then each particle is assigned accordingly to optimize each subproblem. A novel archive-guided velocity update method is designed to guide the swarm for exploration, and the external archive is also evolved using an immune-based evolutionary strategy. These proposed approaches speed up the convergence of AgMOPSO. The experimental results fully demonstrate the superiority of our proposed AgMOPSO in solving most of the test problems adopted, in terms of two commonly used performance measures. Moreover, the effectiveness of our proposed archive-guided velocity update method and immune-based evolutionary strategy is also experimentally validated on more than 30 test MOPs.
Broadband multiresonator quantum memory-interface.

PubMed

Moiseev, S A; Gerasimov, K I; Latypov, R R; Perminov, N S; Petrovnin, K V; Sherstyukov, O N

2018-03-05

In this paper we experimentally demonstrate a broadband scheme of the multiresonator quantum memory-interface. The microwave photonic scheme consists of a system of mini-resonators strongly interacting with a common broadband resonator coupled with the external waveguide. We have implemented impedance-matched quantum storage in this scheme via controllable tuning of the mini-resonator frequencies and coupling of the common resonator with the external waveguide. A proof-of-principle experiment has been demonstrated for broadband microwave pulses, in which a quantum efficiency of 16.3% was achieved at room temperature. By using the obtained experimental spectroscopic data, the dynamics of the signal retrieval has been simulated and promising results were found for high-Q mini-resonators in microwave and optical frequency ranges. The results pave the way for the experimental implementation of a broadband quantum memory-interface with quite high efficiency η > 0.99 on the basis of modern technologies, including optical quantum memory at room temperature.
An efficient tensor transpose algorithm for multicore CPU, Intel Xeon Phi, and NVidia Tesla GPU

DOE PAGES

Lyakh, Dmitry I.

2015-01-05

An efficient parallel tensor transpose algorithm is suggested for shared-memory computing units, namely, multicore CPU, Intel Xeon Phi, and NVidia GPU. The algorithm operates on dense tensors (multidimensional arrays) and is based on the optimization of cache utilization on x86 CPU and the use of shared memory on NVidia GPU. From the applied side, the ultimate goal is to minimize the overhead encountered in the transformation of tensor contractions into matrix multiplications in computer implementations of advanced methods of quantum many-body theory (e.g., in electronic structure theory and nuclear physics). A particular emphasis is placed on higher-dimensional tensors that typically appear in the so-called multireference correlated methods of electronic structure theory. Depending on tensor dimensionality, the presented optimized algorithms can achieve an order of magnitude speedup on x86 CPUs and 2-3 times speedup on NVidia Tesla K20X GPU with respect to the naive scattering algorithm (no memory access optimization). Furthermore, the tensor transpose routines developed in this work have been incorporated into a general-purpose tensor algebra library (TAL-SH).
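The cache-utilization idea in this abstract can be illustrated in two dimensions with a blocked (tiled) transpose. The following is a minimal sketch, assuming a row-major matrix stored in a flat list; the function names and the block size are hypothetical, and this is not the TAL-SH routine. Pure Python will not show the speedup; the point is only the access pattern, which keeps both reads and writes inside a small tile.

```python
def transpose_naive(src, rows, cols):
    """Scattering transpose: contiguous reads, strided writes."""
    dst = [0] * (rows * cols)
    for i in range(rows):
        for j in range(cols):
            dst[j * rows + i] = src[i * cols + j]
    return dst


def transpose_blocked(src, rows, cols, block=32):
    """Same result, but visits the matrix tile by tile (block x block)."""
    dst = [0] * (rows * cols)
    for i0 in range(0, rows, block):
        for j0 in range(0, cols, block):
            # Work inside one tile so reads and writes stay close together.
            for i in range(i0, min(i0 + block, rows)):
                for j in range(j0, min(j0 + block, cols)):
                    dst[j * rows + i] = src[i * cols + j]
    return dst
```

In a compiled language the tile would be sized so that a source tile and a destination tile both fit in cache.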
Minimal-memory realization of pearl-necklace encoders of general quantum convolutional codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Houshmand, Monireh; Hosseini-Khayat, Saied

2011-02-15

Quantum convolutional codes, like their classical counterparts, promise to offer higher error correction performance than block codes of equivalent encoding complexity, and are expected to find important applications in reliable quantum communication where a continuous stream of qubits is transmitted. Grassl and Roetteler devised an algorithm to encode a quantum convolutional code with a "pearl-necklace" encoder. Despite their algorithm's theoretical significance as a neat way of representing quantum convolutional codes, it is not well suited to practical realization. In fact, there is no straightforward way to implement any given pearl-necklace structure. This paper closes the gap between theoretical representation and practical implementation. In our previous work, we presented an efficient algorithm to find a minimal-memory realization of a pearl-necklace encoder for Calderbank-Shor-Steane (CSS) convolutional codes. This work is an extension of our previous work and presents an algorithm for turning a pearl-necklace encoder for a general (non-CSS) quantum convolutional code into a realizable quantum convolutional encoder. We show that a minimal-memory realization depends on the commutativity relations between the gate strings in the pearl-necklace encoder. We find a realization by means of a weighted graph which details the noncommutative paths through the pearl necklace. The weight of the longest path in this graph is equal to the minimal amount of memory needed to implement the encoder. The algorithm has a polynomial-time complexity in the number of gate strings in the pearl-necklace encoder.
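The final step described here, reading the memory requirement off the longest path of a weighted graph, can be sketched generically. The code below assumes the graph is a directed acyclic graph with non-negative edge weights given as (u, v, weight) triples; how the paper actually builds its graph from the gate strings is not reproduced, and the function name is hypothetical.

```python
from collections import defaultdict, deque


def longest_path_weight(num_nodes, edges):
    """Maximum total weight over all paths in a weighted DAG.

    Assumption: nodes are 0..num_nodes-1 and weights are non-negative.
    """
    adjacency = defaultdict(list)
    indegree = [0] * num_nodes
    for u, v, w in edges:
        adjacency[u].append((v, w))
        indegree[v] += 1

    # Kahn's topological order; best[v] = heaviest path ending at v.
    queue = deque(n for n in range(num_nodes) if indegree[n] == 0)
    best = [0] * num_nodes
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v, w in adjacency[u]:
            best[v] = max(best[v], best[u] + w)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if visited != num_nodes:
        raise ValueError("graph contains a cycle")
    return max(best, default=0)
```

The running time is linear in the number of nodes and edges, which is consistent with the polynomial-time claim, although the construction of the graph itself may dominate.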
Checkpoint-Restart in User Space

DOE Office of Scientific and Technical Information (OSTI.GOV)

CRUISE implements a user-space file system that stores data in main memory and transparently spills over to other storage, like local flash memory or the parallel file system, as needed. CRUISE also exposes file contents for remote direct memory access, allowing external tools to copy files to the parallel file system in the background with reduced CPU interruption.
Impact of Noise and Working Memory on Speech Processing in Adults with and without ADHD

ERIC Educational Resources Information Center

Michalek, Anne M. P.

2012-01-01

Auditory processing of speech is influenced by internal (i.e., attention, working memory) and external factors (i.e., background noise, visual information). This study examined the interplay among these factors in individuals with and without ADHD. All participants completed a listening in noise task, two working memory capacity tasks, and two…
A Proposal of 3-dimensional Self-organizing Memory and Its Application to Knowledge Extraction from Natural Language

NASA Astrophysics Data System (ADS)

Sakakibara, Kai; Hagiwara, Masafumi

In this paper, we propose a 3-dimensional self-organizing memory and describe its application to knowledge extraction from natural language. First, the proposed system extracts a relation between words by JUMAN (morpheme analysis system) and KNP (syntax analysis system), and stores it in short-term memory. In the short-term memory, the relations are attenuated with the passage of processing. However, the relations with high frequency of appearance are stored in the long-term memory without attenuation. The relations in the long-term memory are placed in the proposed 3-dimensional self-organizing memory. We used a new learning algorithm called "Potential Firing" in the learning phase. In the recall phase, the proposed system recalls relational knowledge from the learned knowledge based on the input sentence. We used a new recall algorithm called "Waterfall Recall" in the recall phase. We added a function to respond to questions in natural language with "yes/no" in order to confirm the validity of the proposed system by evaluating the quantity of correct answers.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Katti, Amogh; Di Fatta, Giuseppe; Naughton, Thomas

Future extreme-scale high-performance computing systems will be required to work under frequent component failures. The MPI Forum's User Level Failure Mitigation proposal has introduced an operation, MPI_Comm_shrink, to synchronize the alive processes on the list of failed processes, so that applications can continue to execute even in the presence of failures by adopting algorithm-based fault tolerance techniques. This MPI_Comm_shrink operation requires a failure detection and consensus algorithm. This paper presents three novel failure detection and consensus algorithms using Gossiping. The proposed algorithms were implemented and tested using the Extreme-scale Simulator. The results show that in all algorithms the number of Gossip cycles to achieve global consensus scales logarithmically with system size. The second algorithm also shows better scalability in terms of memory and network bandwidth usage and a perfect synchronization in achieving global consensus. The third approach is a three-phase distributed failure detection and consensus algorithm and provides consistency guarantees even in very large and extreme-scale systems while at the same time being memory and bandwidth efficient.
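The logarithmic scaling of gossip cycles reported here can be made plausible with a toy simulation of plain push gossip. This is only a sketch under stated assumptions: one process starts with the information, every informed process pushes it to one uniformly random process per cycle, and nothing fails during dissemination. It is not one of the three algorithms from the paper, and the function name is hypothetical.

```python
import random


def gossip_cycles(num_procs, seed=0):
    """Count push-gossip cycles until every process is informed."""
    rng = random.Random(seed)
    informed = {0}          # process 0 starts with the information
    cycles = 0
    while len(informed) < num_procs:
        # Each informed process contacts one random peer this cycle.
        targets = {rng.randrange(num_procs) for _ in informed}
        informed |= targets
        cycles += 1
    return cycles
```

Because the informed set can at most double per cycle, at least log2(n) cycles are needed; in practice the count stays within a small multiple of that, which is the behaviour the abstract describes as logarithmic.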
[A quick algorithm of dynamic spectrum photoelectric pulse wave detection based on LabVIEW].

PubMed

Lin, Ling; Li, Na; Li, Gang

2010-02-01

Dynamic spectrum (DS) detection is attractive among the numerous noninvasive blood component detection methods because it eliminates the main interference from individual discrepancies and measurement conditions. DS is a kind of spectrum extracted from the photoelectric pulse wave and closely related to arterial blood. It can be used in noninvasive examination of blood component concentration. The key issues in DS detection are high detection precision and high operation speed. The measurement precision can be improved by applying over-sampling and lock-in amplification to the pick-up of the photoelectric pulse wave in DS detection. In the present paper, the theoretical expression of the over-sampling and lock-in amplification method is first derived. Then, to overcome the problems of large data volume and excessive computation brought on by this technique, a quick algorithm based on LabVIEW and a method of using external C code for the pick-up of the photoelectric pulse wave are presented. Experimental verification was conducted in the LabVIEW environment. The results show that with the presented method, the operation speed was greatly increased and the data memory was greatly reduced.
A CCTV system with SMS alert (CMDSA): An implementation of pixel processing algorithm for motion detection

NASA Astrophysics Data System (ADS)

Rahman, Nurul Hidayah Ab; Abdullah, Nurul Azma; Hamid, Isredza Rahmi A.; Wen, Chuah Chai; Jelani, Mohamad Shafiqur Rahman Mohd

2017-10-01

A Closed-Circuit TV (CCTV) system is one of the technologies in the surveillance field that addresses detection and monitoring by providing extra features such as email alerts or motion detection. However, detecting events and alerting the admin on a CCTV system can be complicated by the complexity of integrating the main program with an external Application Programming Interface (API). In this study, a pixel processing algorithm is applied because of its efficiency, and an SMS alert is added as an alternative for users who have opted out of the email alert system or have no Internet connection. A CCTV system with SMS alert (CMDSA) was developed using the evolutionary prototyping methodology. The system interface was implemented using Microsoft Visual Studio, while the backend components, the database and the code, were implemented with an SQLite database and the C# programming language, respectively. The main modules of CMDSA are motion detection, capturing and saving video, image processing and Short Message Service (SMS) alert functions. As a result, the system is able to reduce the processing time, making detection faster, reduce the space and memory used to run the program, and alert the system admin instantly.
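The abstract does not say which pixel processing algorithm CMDSA uses. One common pixel-level approach to motion detection is frame differencing, sketched below purely as an illustration. The frames are assumed to be equal-sized grayscale images given as lists of rows of 0-255 integers, and the function name and both thresholds are hypothetical.

```python
def motion_detected(prev_frame, curr_frame,
                    pixel_threshold=25, changed_fraction=0.01):
    """Report motion when enough pixels change between two frames."""
    changed = 0
    total = 0
    for row_prev, row_curr in zip(prev_frame, curr_frame):
        for a, b in zip(row_prev, row_curr):
            total += 1
            # A pixel counts as changed if its intensity moved noticeably.
            if abs(a - b) > pixel_threshold:
                changed += 1
    return total > 0 and changed / total >= changed_fraction
```

A system like the one described would presumably trigger its SMS alert when such a check returns true; that wiring is outside this sketch.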

Is random access memory random?

NASA Technical Reports Server (NTRS)

Denning, P. J.

1986-01-01

Most software is constructed on the assumption that the programs and data are stored in random access memory (RAM). Physical limitations on the relative speeds of processor and memory elements lead to a variety of memory organizations that match processor addressing rate with memory service rate. These include interleaved and cached memory. A very high fraction of a processor's address requests can be satisfied from the cache without reference to the main memory. The cache requests information from main memory in blocks that can be transferred at the full memory speed. Programmers who organize algorithms for locality can realize the highest performance from these computers.
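The effect of locality on a cached memory can be shown with a toy model. The sketch below assumes a direct-mapped cache with 16 lines of 8 words each (both numbers are arbitrary) and counts misses for two traversal orders of the same row-major 64 x 64 array; all names are hypothetical.

```python
def count_misses(addresses, block_words=8, num_lines=16):
    """Count misses of a direct-mapped cache for a sequence of word addresses."""
    lines = [None] * num_lines
    misses = 0
    for addr in addresses:
        block = addr // block_words      # memory block holding this word
        slot = block % num_lines         # the only line that block may use
        if lines[slot] != block:
            lines[slot] = block          # fetch the whole block on a miss
            misses += 1
    return misses


def row_major_order(n):
    """Addresses touched when an n x n row-major array is walked row by row."""
    return [i * n + j for i in range(n) for j in range(n)]


def column_major_order(n):
    """Addresses touched when the same array is walked column by column."""
    return [i * n + j for j in range(n) for i in range(n)]


row_misses = count_misses(row_major_order(64))        # 512: one miss per block
column_misses = count_misses(column_major_order(64))  # 4096: every access misses
```

The row-wise walk uses each fetched block eight times before moving on, while the column-wise walk evicts each block before it is reused, which is the kind of difference the abstract attributes to organizing algorithms for locality.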
Low-Light Image Enhancement Using Adaptive Digital Pixel Binning

PubMed Central

Yoo, Yoonjong; Im, Jaehyun; Paik, Joonki

2015-01-01

This paper presents an image enhancement algorithm for low-light scenes in an environment with insufficient illumination. Simple amplification of intensity exhibits various undesired artifacts: noise amplification, intensity saturation, and loss of resolution. In order to enhance low-light images without undesired artifacts, a novel digital binning algorithm is proposed that considers brightness, context, noise level, and anti-saturation of a local region in the image. The proposed algorithm does not require any modification of the image sensor or additional frame-memory; it needs only two line-memories in the image signal processor (ISP). Since the proposed algorithm does not use an iterative computation, it can be easily embedded in an existing digital camera ISP pipeline containing a high-resolution image sensor. PMID:26121609
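For orientation, plain (non-adaptive) digital binning can be sketched in a few lines. The version below sums each 2 x 2 neighbourhood and clamps the result to avoid saturation; it processes two image rows at a time, which matches the two line-memories mentioned above only loosely. The adaptive weighting by brightness, context and noise level that the paper proposes is not modelled, and the function name is hypothetical.

```python
def bin_2x2(image, max_value=255):
    """Sum non-overlapping 2 x 2 pixel groups, clamped to max_value.

    Assumption: image is a list of equal-length rows of integers.
    Odd trailing rows or columns are ignored.
    """
    height = len(image) - len(image) % 2
    width = (len(image[0]) - len(image[0]) % 2) if image else 0
    binned = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            total = (image[y][x] + image[y][x + 1]
                     + image[y + 1][x] + image[y + 1][x + 1])
            row.append(min(total, max_value))   # simple anti-saturation clamp
        binned.append(row)
    return binned
```

Fixed binning of this kind trades resolution for brightness everywhere, which is one of the artifacts the adaptive algorithm is designed to avoid.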
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Chao; Pouransari, Hadi; Rajamanickam, Sivasankaran

We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.
Dynamic programming on a shared-memory multiprocessor

NASA Technical Reports Server (NTRS)

Edmonds, Phil; Chu, Eleanor; George, Alan

1993-01-01

Three new algorithms for solving dynamic programming problems on a shared-memory parallel computer are described. All three algorithms attempt to balance work load, while keeping synchronization cost low. In particular, for a multiprocessor having p processors, an analysis of the best algorithm shows that the arithmetic cost is $O(n^3/(6p))$ and that the synchronization cost is $O(|\log_C n|)$ if $p \ll n$, where $C = (2p-1)/(2p+1)$ and n is the size of the problem. The low synchronization cost is important for machines where synchronization is expensive. Analysis and experiments show that the best algorithm is effective in balancing the work load and producing high efficiency.
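To get a feel for the synchronization bound, the expression inside the O(...) can simply be evaluated. The sketch below does only that: it ignores the hidden constant, assumes n > 1 and p >= 1, and uses a hypothetical function name.

```python
import math


def sync_cost_estimate(n, p):
    """Evaluate |log_C n| with C = (2p - 1) / (2p + 1)."""
    c = (2 * p - 1) / (2 * p + 1)        # 0 < C < 1, so log_C n is negative
    return abs(math.log(n) / math.log(c))
```

Since ln C is close to -1/p for large p, this quantity behaves roughly like p ln n: it grows slowly with the problem size and about linearly with the number of processors.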
GRIM-Filter: Fast seed location filtering in DNA read mapping using processing-in-memory technologies.

PubMed

Kim, Jeremie S; Senol Cali, Damla; Xin, Hongyi; Lee, Donghyuk; Ghose, Saugata; Alser, Mohammed; Hassan, Hasan; Ergin, Oguz; Alkan, Can; Mutlu, Onur

2018-05-09

Seed location filtering is critical in DNA read mapping, a process where billions of DNA fragments (reads) sampled from a donor are mapped onto a reference genome to identify genomic variants of the donor. State-of-the-art read mappers 1) quickly generate possible mapping locations for seeds (i.e., smaller segments) within each read, 2) extract reference sequences at each of the mapping locations, and 3) check similarity between each read and its associated reference sequences with a computationally-expensive algorithm (i.e., sequence alignment) to determine the origin of the read. A seed location filter comes into play before alignment, discarding seed locations that alignment would deem a poor match. The ideal seed location filter would discard all poor match locations prior to alignment such that there is no wasted computation on unnecessary alignments. We propose a novel seed location filtering algorithm, GRIM-Filter, optimized to exploit 3D-stacked memory systems that integrate computation within a logic layer stacked under memory layers, to perform processing-in-memory (PIM). GRIM-Filter quickly filters seed locations by 1) introducing a new representation of coarse-grained segments of the reference genome, and 2) using massively-parallel in-memory operations to identify read presence within each coarse-grained segment. Our evaluations show that for a sequence alignment error tolerance of 0.05, GRIM-Filter 1) reduces the false negative rate of filtering by 5.59x-6.41x, and 2) provides an end-to-end read mapper speedup of 1.81x-3.65x, compared to a state-of-the-art read mapper employing the best previous seed location filtering algorithm. GRIM-Filter exploits 3D-stacked memory, which enables the efficient use of processing-in-memory, to overcome the memory bandwidth bottleneck in seed location filtering. We show that GRIM-Filter significantly improves the performance of a state-of-the-art read mapper. 
GRIM-Filter is a universal seed location filter that can be applied to any read mapper. We hope that our results provide inspiration for new works to design other bioinformatics algorithms that take advantage of emerging technologies and new processing paradigms, such as processing-in-memory using 3D-stacked memory devices.
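The filtering idea, a compact per-segment summary of the reference that is checked before any alignment, can be sketched in ordinary software. The version below is an illustration under stated assumptions, not the authors' design: the reference is cut into fixed-size bins, each bin gets a bitvector (a Python integer) marking which length-k tokens occur in it, and a seed location passes if enough of the read's tokens are present in its bin. All names and parameters are hypothetical, reads are assumed to fall within one bin, and the in-memory parallelism is not modelled.

```python
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def token_id(token):
    """Encode a DNA token as an integer, two bits per base."""
    value = 0
    for base in token:
        value = value * 4 + BASE_CODE[base]
    return value


def build_bitvectors(reference, bin_size, k):
    """One bitvector per bin; bit t is set if token t occurs in that bin."""
    bitvectors = []
    for start in range(0, len(reference), bin_size):
        # Extend by k - 1 so tokens straddling the bin boundary are included.
        segment = reference[start:start + bin_size + k - 1]
        bits = 0
        for i in range(len(segment) - k + 1):
            bits |= 1 << token_id(segment[i:i + k])
        bitvectors.append(bits)
    return bitvectors


def passes_filter(read, location, bitvectors, bin_size, k, min_hits):
    """Keep a seed location only if enough read tokens exist in its bin."""
    bits = bitvectors[location // bin_size]
    hits = sum(1 for i in range(len(read) - k + 1)
               if (bits >> token_id(read[i:i + k])) & 1)
    return hits >= min_hits
```

Locations that fail the check are discarded before the expensive alignment step; the choice of min_hits would have to reflect the error tolerance of the aligner.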
Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures

DTIC Science & Technology

2017-10-04

Report: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures The views, opinions and/or findings contained in this...Chapel Hill Title: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures Report Term: 0-Other Email: dm...algorithms for scientific and geometric computing by exploiting the power and performance efficiency of heterogeneous shared memory architectures . These
CaLRS: A Critical-Aware Shared LLC Request Scheduling Algorithm on GPGPU

PubMed Central

Ma, Jianliang; Meng, Jinglei; Chen, Tianzhou; Wu, Minghui

2015-01-01

Ultra-high thread-level parallelism in modern GPUs usually produces numerous simultaneous memory requests. So there are always plenty of memory requests waiting at each bank of the shared LLC (L2 in this paper) and global memory. For global memory, various schedulers have already been developed to adjust the request sequence. But we find that little work has focused on the service sequence on the shared LLC. We measured that a large number of GPU applications always queue at the LLC banks for service, which provides an opportunity to optimize the service order on the LLC. By adjusting the service order of GPU memory requests, we can improve the schedulability of the SM. So we propose a critical-aware shared LLC request scheduling algorithm (CaLRS) in this paper. The priority representation of memory requests is critical for CaLRS. We use the number of memory requests that originate from the same warp but have not been serviced when they arrive at the shared LLC bank to represent the criticality of each warp. Experiments show that the proposed scheme can boost SM schedulability effectively by promoting the scheduling priority of memory requests with high criticality, and indirectly improves GPU performance. PMID:25729772
Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space

PubMed Central

Loewenstein, Yaniv; Portugaly, Elon; Fromer, Menachem; Linial, Michal

2008-01-01

Motivation: UPGMA (average linking) is probably the most popular algorithm for hierarchical data clustering, especially in computational biology. However, UPGMA requires the entire dissimilarity matrix in memory. Due to this prohibitive requirement, UPGMA is not scalable to very large datasets. Application: We present a novel class of memory-constrained UPGMA (MC-UPGMA) algorithms. Given any practical memory size constraint, this framework guarantees the correct clustering solution without explicitly requiring all dissimilarities in memory. The algorithms are general and are applicable to any dataset. We present a data-dependent characterization of hardness and clustering efficiency. The presented concepts are applicable to any agglomerative clustering formulation. Results: We apply our algorithm to the entire collection of protein sequences, to automatically build a comprehensive evolutionary-driven hierarchy of proteins from sequence alone. The newly created tree captures protein families better than state-of-the-art large-scale methods such as CluSTr, ProtoNet4 or single-linkage clustering. We demonstrate that leveraging the entire mass embodied in all sequence similarities allows to significantly improve on current protein family clusterings which are unable to directly tackle the sheer mass of this data. Furthermore, we argue that non-metric constraints are an inherent complexity of the sequence space and should not be overlooked. The robustness of UPGMA allows significant improvement, especially for multidomain proteins, and for large or divergent families. Availability: A comprehensive tree built from all UniProt sequence similarities, together with navigation and classification tools will be made available as part of the ProtoNet service. A C++ implementation of the algorithm is available on request. Contact: lonshy@cs.huji.ac.il PMID:18586742
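For reference, the textbook in-memory UPGMA that this abstract starts from can be sketched as follows. It keeps every pairwise dissimilarity in a dictionary, which is exactly the memory requirement MC-UPGMA is designed to avoid; the memory-constrained variant itself is not shown. The function name and input layout are hypothetical.

```python
def upgma(labels, dissimilarity):
    """Naive UPGMA; returns the merges as (cluster_a, cluster_b, height).

    Assumption: dissimilarity maps every unordered pair of labels,
    given as a 2-tuple, to a number.
    """
    sizes = {label: 1 for label in labels}
    dist = {frozenset(pair): d for pair, d in dissimilarity.items()}
    merges = []
    while len(sizes) > 1:
        pair = min(dist, key=dist.get)           # closest pair of clusters
        a, b = sorted(pair, key=str)
        height = dist[pair]
        merged = (a, b)
        updated = {}
        for c in sizes:
            if c == a or c == b:
                continue
            d_ac = dist[frozenset((a, c))]
            d_bc = dist[frozenset((b, c))]
            # Size-weighted average keeps the result equal to the mean
            # dissimilarity over all leaf pairs of the two clusters.
            updated[frozenset((merged, c))] = (
                (sizes[a] * d_ac + sizes[b] * d_bc) / (sizes[a] + sizes[b]))
        dist = {p: d for p, d in dist.items() if a not in p and b not in p}
        dist.update(updated)
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
        merges.append((a, b, height))
    return merges
```

With n items the dictionary starts with n(n-1)/2 entries, which is what becomes prohibitive at the scale of the full protein space.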
Multiprocessor and memory architecture of the neurocomputer SYNAPSE-1.

PubMed

Ramacher, U; Raab, W; Anlauf, J; Hachmann, U; Beichter, J; Brüls, N; Wesseling, M; Sicheneder, E; Männer, R; Glass, J

1993-12-01

A general purpose neurocomputer, SYNAPSE-1, which exhibits a multiprocessor and memory architecture is presented. It offers wide flexibility with respect to neural algorithms and a speed-up factor of several orders of magnitude--including learning. The computational power is provided by a 2-dimensional systolic array of neural signal processors. Since the weights are stored outside these NSPs, memory size and processing power can be adapted individually to the application needs. A neural algorithms programming language, embedded in C++, has been defined for the user to cope with the neurocomputer. In a benchmark test, the prototype of SYNAPSE-1 was 8000 times as fast as a standard workstation.
An efficient and portable SIMD algorithm for charge/current deposition in Particle-In-Cell codes

DOE PAGES

Vincenti, H.; Lobet, M.; Lehe, R.; ...

2016-09-19

In current computer architectures, data movement (from die to network) is by far the most energy consuming part of an algorithm (≈20pJ/word on-die to ≈10,000 pJ/word on the network). To increase memory locality at the hardware level and reduce energy consumption related to data movement, future exascale computers tend to use many-core processors on each compute node that will have a reduced clock speed to allow for efficient cooling. To compensate for frequency decrease, machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD register length is expected to double every four years. As a consequence, Particle-In-Cell (PIC) codes will have to achieve good vectorization to fully take advantage of these upcoming architectures. In this paper, we present a new algorithm that allows for efficient and portable SIMD vectorization of current/charge deposition routines that are, along with the field gathering routines, among the most time consuming parts of the PIC algorithm. Our new algorithm uses a particular data structure that takes into account memory alignment constraints and avoids gather/scatter instructions that can significantly affect vectorization performances on current CPUs. The new algorithm was successfully implemented in the 3D skeleton PIC code PICSAR and tested on Haswell Xeon processors (AVX2-256 bits wide data registers). Results show a factor of ×2 to ×2.5 speed-up in double precision for particle shape factor of orders 1–3. The new algorithm can be applied as is on future KNL (Knights Landing) architectures that will include AVX-512 instruction sets with 512 bits register lengths (8 doubles/16 singles).
Program summary Program Title: vec_deposition Program Files doi:http://dx.doi.org/10.17632/nh77fv9k8c.1 Licensing provisions: BSD 3-Clause Programming language: Fortran 90 External routines/libraries: OpenMP > 4.0 Nature of problem: Exascale architectures will have many-core processors per node with long vector data registers capable of performing one single instruction on multiple data during one clock cycle. Data register lengths are expected to double every four years and this pushes for new portable solutions for efficiently vectorizing Particle-In-Cell codes on these future many-core architectures. One of the main hotspot routines of the PIC algorithm is the current/charge deposition for which there is no efficient and portable vector algorithm. Solution method: Here we provide an efficient and portable vector algorithm of current/charge deposition routines that uses a new data structure, which significantly reduces gather/scatter operations. Vectorization is controlled using OpenMP 4.0 compiler directives for vectorization which ensures portability across different architectures. Restrictions: Here we do not provide the full PIC algorithm with an executable but only vector routines for current/charge deposition. These scalar/vector routines can be used as library routines in your 3D Particle-In-Cell code. However, to get the best performances out of vector routines you have to satisfy the two following requirements: (1) Your code should implement particle tiling (as explained in the manuscript) to allow for maximized cache reuse and reduce memory accesses that can hinder vector performances. The routines can be used directly on each particle tile. (2) You should compile your code with a Fortran 90 compiler (e.g., Intel, GNU or Cray) and provide proper alignment flags and compiler alignment directives (more details in README file).
[Multi-Target Recognition of Internal and External Defects of Potato by Semi-Transmission Hyperspectral Imaging and Manifold Learning Algorithm].

PubMed

Huang, Tao; Li, Xiao-yu; Jin, Rui; Ku, Jing; Xu, Sen-miao; Xu, Meng-ling; Wu, Zhen-zhong; Kong, De-guo

2015-04-01

The present paper puts forward a non-destructive detection method that combines semi-transmission hyperspectral imaging with a manifold learning dimension reduction algorithm and a least squares support vector machine (LSSVM) to recognize internal and external defects in potatoes simultaneously. Three hundred fifteen potatoes were bought at a farmers' market as research objects, and a semi-transmission hyperspectral image acquisition system was constructed to acquire hyperspectral images of normal potatoes, potatoes with external defects (bud and green rind), and potatoes with an internal defect (hollow heart). To reflect actual production conditions, the defective part was randomly oriented facing, sideways to, or away from the acquisition probe when the hyperspectral images of potatoes with external defects were acquired. The average spectra (390-1,040 nm) were extracted from the regions of interest for spectral preprocessing. Then three manifold learning algorithms were each used to reduce the dimension of the spectral data: supervised locally linear embedding (SLLE), locally linear embedding (LLE) and isometric mapping (ISOMAP). The low-dimensional data obtained by the manifold learning algorithms were used as model input, and Error Correcting Output Code (ECOC) and LSSVM were combined to develop the multi-target classification model. By comparing and analyzing the results of the three models, we concluded that SLLE is the optimal manifold learning dimension reduction algorithm, and that the SLLE-LSSVM model achieves the best recognition rate for potatoes with internal and external defects. For the test set, the individual recognition rates for normal, bud, green rind and hollow heart potatoes reached 96.83%, 86.96%, 86.96% and 95% respectively, and the hybrid recognition rate was 93.02%.
The results indicate that combining semi-transmission hyperspectral imaging with SLLE-LSSVM is a feasible qualitative analytical method that can simultaneously recognize potatoes with internal and external defects, and that it also provides a technical reference for rapid on-line non-destructive detection of such potatoes.
Distributed-Memory Breadth-First Search on Massive Graphs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Buluc, Aydin; Beamer, Scott; Madduri, Kamesh

This chapter studies the problem of traversing large graphs using the breadth-first search order on distributed-memory supercomputers. We consider both the traditional level-synchronous top-down algorithm as well as the recently discovered direction optimizing algorithm. We analyze the performance and scalability trade-offs in using different local data structures such as CSR and DCSC, enabling in-node multithreading, and graph decompositions such as 1D and 2D decomposition.
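As a rough single-node illustration of the two traversal styles named in this abstract, the sketch below runs a level-synchronous BFS over a CSR graph and can switch to a bottom-up step when the frontier is large. The function name `bfs_levels`, the `bottom_up_fraction` threshold, and the tiny test graph are assumptions made for illustration; the switching rule here is a simplified stand-in, not the heuristic or the distributed implementation analyzed in the chapter.

```python
def bfs_levels(row_ptr, col_idx, source, bottom_up_fraction=None):
    """Level-synchronous BFS on an undirected CSR graph; returns depth per vertex (-1 = unreached)."""
    n = len(row_ptr) - 1
    depth = [-1] * n
    depth[source] = 0
    frontier = [source]
    level = 0
    while frontier:
        level += 1
        next_frontier = []
        use_bottom_up = (bottom_up_fraction is not None
                         and len(frontier) > bottom_up_fraction * n)
        if use_bottom_up:
            # Bottom-up step: every unvisited vertex looks for a parent in the frontier.
            in_frontier = set(frontier)
            for v in range(n):
                if depth[v] != -1:
                    continue
                for k in range(row_ptr[v], row_ptr[v + 1]):
                    if col_idx[k] in in_frontier:
                        depth[v] = level
                        next_frontier.append(v)
                        break
        else:
            # Top-down step: every frontier vertex expands its outgoing edges.
            for u in frontier:
                for k in range(row_ptr[u], row_ptr[u + 1]):
                    v = col_idx[k]
                    if depth[v] == -1:
                        depth[v] = level
                        next_frontier.append(v)
        frontier = next_frontier
    return depth


# Hypothetical graph: edges 0-1, 0-2, 1-3, 2-3, 3-4; vertex 5 is isolated.
row_ptr = [0, 2, 4, 6, 9, 10, 10]
col_idx = [1, 2, 0, 3, 0, 3, 1, 2, 4, 3]
depths = bfs_levels(row_ptr, col_idx, 0)
mixed = bfs_levels(row_ptr, col_idx, 0, bottom_up_fraction=0.25)
```

Both calls visit the same vertices at the same depths; only the direction in which edges are examined differs, which is where the reported performance trade-offs come from.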
A discrete Fourier transform for virtual memory machines

NASA Technical Reports Server (NTRS)

Galant, David C.

1992-01-01

An algebraic theory of the Discrete Fourier Transform is developed in great detail. Examination of the details of the theory leads to a computationally efficient fast Fourier transform for use on computers with virtual memory. Such an algorithm is of great use on modern desktop machines. A FORTRAN-coded version of the algorithm is given for the case when the length of the sequence to be transformed is a power of two.
Supercomputing '91; Proceedings of the 4th Annual Conference on High Performance Computing, Albuquerque, NM, Nov. 18-22, 1991

NASA Technical Reports Server (NTRS)

1991-01-01

Various papers on supercomputing are presented. The general topics addressed include: program analysis/data dependence, memory access, distributed memory code generation, numerical algorithms, supercomputer benchmarks, latency tolerance, parallel programming, applications, processor design, networks, performance tools, mapping and scheduling, characterization affecting performance, parallelism packaging, computing climate change, combinatorial algorithms, hardware and software performance issues, system issues. (No individual items are abstracted in this volume)
High-Performance Computation of Distributed-Memory Parallel 3D Voronoi and Delaunay Tessellation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peterka, Tom; Morozov, Dmitriy; Phillips, Carolyn

2014-11-14

Computing a Voronoi or Delaunay tessellation from a set of points is a core part of the analysis of many simulated and measured datasets: N-body simulations, molecular dynamics codes, and LIDAR point clouds are just a few examples. Such computational geometry methods are common in data analysis and visualization; but as the scale of simulations and observations surpasses billions of particles, the existing serial and shared-memory algorithms no longer suffice. A distributed-memory scalable parallel algorithm is the only feasible approach. The primary contribution of this paper is a new parallel Delaunay and Voronoi tessellation algorithm that automatically determines which neighbor points need to be exchanged among the subdomains of a spatial decomposition. Other contributions include periodic and wall boundary conditions, comparison of our method using two popular serial libraries, and application to numerous science datasets.
Exploration of depth modeling mode one lossless wedgelets storage strategies for 3D-high efficiency video coding

NASA Astrophysics Data System (ADS)

Sanchez, Gustavo; Marcon, César; Agostini, Luciano Volcan

2018-01-01

The 3D-high efficiency video coding standard has introduced tools to obtain higher efficiency in 3-D video coding, and most of them are related to depth map coding. Among these tools, the depth modeling mode-1 (DMM-1) focuses on better encoding the edge regions of depth maps. The large memory required for storing all wedgelet patterns is one of the bottlenecks in the DMM-1 hardware design of both encoder and decoder, since many patterns must be stored. Three algorithms to reduce the DMM-1 memory requirements and a hardware design targeting the most efficient among these algorithms are presented. Experimental results demonstrate that the proposed solutions surpass related works, reducing the wedgelet memory by up to 78.8% without degrading the encoding efficiency. Synthesis results demonstrate that the proposed algorithm reduces power dissipation by almost 75% when compared to the standard approach.
A GPU-Accelerated Approach for Feature Tracking in Time-Varying Imagery Datasets.

PubMed

Peng, Chao; Sahani, Sandip; Rushing, John

2017-10-01

We propose a novel parallel connected component labeling (CCL) algorithm along with efficient out-of-core data management to detect and track feature regions of large time-varying imagery datasets. Our approach contributes to the big data field with parallel algorithms tailored for GPU architectures. We remove the data dependency between frames and achieve pixel-level parallelism. Due to the large size, the entire dataset cannot fit into cached memory. Frames have to be streamed through the memory hierarchy (disk to CPU main memory and then to GPU memory), partitioned, and processed as batches, where each batch is small enough to fit into the GPU. To reconnect the feature regions that are separated due to data partitioning, we present a novel batch merging algorithm to extract the region connection information across multiple batches in a parallel fashion. The information is organized in a memory-efficient structure and supports fast indexing on the GPU. Our experiment uses a commodity workstation equipped with a single GPU. The results show that our approach can efficiently process a weather dataset composed of terabytes of time-varying radar images. The advantages of our approach are demonstrated by comparing to the performance of an efficient CPU cluster implementation which is being used by the weather scientists.
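The partition-then-merge idea in this abstract can be illustrated with a small sequential sketch: each batch of rows is labeled on its own with union-find, and a separate merge pass reconnects regions cut by batch boundaries. Everything here (the names `label_batch`, `merge_batches`, `count_components`, 4-connectivity, and the sample image) is an assumption for illustration; it runs on the CPU in plain Python and is not the authors' GPU algorithm or data structure.

```python
def find(parent, i):
    """Return the root of i, halving paths along the way."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def union(parent, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def label_batch(img, parent, r0, r1):
    """Union 4-connected foreground pixels using only rows r0..r1-1 (one batch)."""
    w = len(img[0])
    for r in range(r0, r1):
        for c in range(w):
            if not img[r][c]:
                continue
            if c > 0 and img[r][c - 1]:
                union(parent, r * w + c, r * w + c - 1)
            if r > r0 and img[r - 1][c]:  # never looks above the batch
                union(parent, r * w + c, (r - 1) * w + c)


def merge_batches(img, parent, boundary_rows):
    """Reconnect regions that batching split: join pixels facing each other across a boundary."""
    w = len(img[0])
    for r in boundary_rows:  # r is the first row of a batch, r > 0
        for c in range(w):
            if img[r][c] and img[r - 1][c]:
                union(parent, r * w + c, (r - 1) * w + c)


def count_components(img, batch_rows):
    h, w = len(img), len(img[0])
    parent = list(range(h * w))
    starts = list(range(0, h, batch_rows))
    for r0 in starts:
        label_batch(img, parent, r0, min(r0 + batch_rows, h))
    merge_batches(img, parent, starts[1:])
    return len({find(parent, r * w + c)
                for r in range(h) for c in range(w) if img[r][c]})


# Hypothetical binary frame with four 4-connected regions.
img = [
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
]
regions = count_components(img, batch_rows=2)
```

The region count does not depend on the batch size, because the merge pass restores exactly the vertical links that the per-batch pass was not allowed to see.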
Efficient Algorithms for Estimating the Absorption Spectrum within Linear Response TDDFT

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brabec, Jiri; Lin, Lin; Shao, Meiyue

We present a special symmetric Lanczos algorithm and a kernel polynomial method (KPM) for approximating the absorption spectrum of molecules within the linear response time-dependent density functional theory (TDDFT) framework in the product form. In contrast to existing algorithms, the new algorithms are based on reformulating the original non-Hermitian eigenvalue problem as a product eigenvalue problem and the observation that the product eigenvalue problem is self-adjoint with respect to an appropriately chosen inner product. This allows a simple symmetric Lanczos algorithm to be used to compute the desired absorption spectrum. The use of a symmetric Lanczos algorithm only requires half of the memory compared with the nonsymmetric variant of the Lanczos algorithm. The symmetric Lanczos algorithm is also numerically more stable than the nonsymmetric version. The KPM algorithm is also presented as a low-memory alternative to the Lanczos approach, but the algorithm may require more matrix-vector multiplications in practice. We discuss the pros and cons of these methods in terms of their accuracy as well as their computational and storage cost. Applications to a set of small and medium-sized molecules are also presented.
Efficient Algorithms for Estimating the Absorption Spectrum within Linear Response TDDFT

DOE PAGES

Brabec, Jiri; Lin, Lin; Shao, Meiyue; ...

2015-10-06

We present a special symmetric Lanczos algorithm and a kernel polynomial method (KPM) for approximating the absorption spectrum of molecules within the linear response time-dependent density functional theory (TDDFT) framework in the product form. In contrast to existing algorithms, the new algorithms are based on reformulating the original non-Hermitian eigenvalue problem as a product eigenvalue problem and the observation that the product eigenvalue problem is self-adjoint with respect to an appropriately chosen inner product. This allows a simple symmetric Lanczos algorithm to be used to compute the desired absorption spectrum. The use of a symmetric Lanczos algorithm only requires half of the memory compared with the nonsymmetric variant of the Lanczos algorithm. The symmetric Lanczos algorithm is also numerically more stable than the nonsymmetric version. The KPM algorithm is also presented as a low-memory alternative to the Lanczos approach, but the algorithm may require more matrix-vector multiplications in practice. We discuss the pros and cons of these methods in terms of their accuracy as well as their computational and storage cost. Applications to a set of small and medium-sized molecules are also presented.

Scalable Parallel Density-based Clustering and Applications

NASA Astrophysics Data System (ADS)

Patwary, Mostofa Ali

2014-04-01

Recently, density-based clustering algorithms (DBSCAN and OPTICS) have attracted significant attention from the scientific community due to their unique capability of discovering arbitrarily shaped clusters and eliminating noise data. These algorithms have several applications that require high performance computing, including finding halos and subhalos (clusters) in massive cosmology data in astrophysics, analyzing satellite images, X-ray crystallography, and anomaly detection. However, parallelizing these algorithms is extremely challenging because they exhibit an inherently sequential data access order and an unbalanced workload, resulting in low parallel efficiency. To break the data access sequentiality and to achieve high parallelism, we develop new parallel algorithms, both for DBSCAN and OPTICS, designed using graph algorithmic techniques. For example, our parallel DBSCAN algorithm exploits the similarities between DBSCAN and computing connected components. Using datasets containing up to a billion floating point numbers, we show that our parallel density-based clustering algorithms significantly outperform the existing algorithms, achieving speedups up to 27.5 on 40 cores on shared memory architecture and speedups up to 5,765 using 8,192 cores on distributed memory architecture. In our experiments, we found that, while achieving this scalability, our algorithms produce clustering results of comparable quality to the classical algorithms.
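The connection between DBSCAN and connected components mentioned in this abstract can be made concrete with a short sequential sketch: core points within distance eps of each other are merged with union-find, and border points then adopt the cluster of a neighboring core point. The function name `dbscan_union_find`, the brute-force neighbor search, and the sample points are assumptions for illustration; because unions can be applied in any order, this formulation removes the sequential visiting order of classical DBSCAN, but the code below is not the parallel algorithm developed in the thesis.

```python
def dbscan_union_find(points, eps, min_pts):
    """Cluster labels per point (-1 = noise); clusters are union-find components of core points."""
    n = len(points)

    def close(i, j):
        return sum((a - b) ** 2 for a, b in zip(points[i], points[j])) <= eps * eps

    # Brute-force eps-neighborhoods; each point counts as its own neighbor.
    neighbors = [[j for j in range(n) if close(i, j)] for i in range(n)]
    core = [len(nb) >= min_pts for nb in neighbors]

    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Core points within eps of each other belong to the same component.
    for i in range(n):
        if core[i]:
            for j in neighbors[i]:
                if core[j]:
                    ri, rj = find(i), find(j)
                    if ri != rj:
                        parent[ri] = rj

    labels = [-1] * n
    for i in range(n):
        if core[i]:
            labels[i] = find(i)
    # Border points join the cluster of the first core point in their neighborhood.
    for i in range(n):
        if not core[i]:
            for j in neighbors[i]:
                if core[j]:
                    labels[i] = find(j)
                    break
    return labels


# Hypothetical 1-D data: two dense groups and one outlier.
points = [(0.0,), (0.5,), (1.0,), (5.0,), (5.4,), (5.8,), (20.0,)]
labels = dbscan_union_find(points, eps=0.6, min_pts=3)
```

Here the middle point of each group is a core point, its two neighbors are border points, and the outlier is labeled as noise.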
DOE Office of Scientific and Technical Information (OSTI.GOV)

Lyakh, Dmitry I.

An efficient parallel tensor transpose algorithm is suggested for shared-memory computing units, namely, multicore CPU, Intel Xeon Phi, and NVidia GPU. The algorithm operates on dense tensors (multidimensional arrays) and is based on the optimization of cache utilization on x86 CPU and the use of shared memory on NVidia GPU. From the applied side, the ultimate goal is to minimize the overhead encountered in the transformation of tensor contractions into matrix multiplications in computer implementations of advanced methods of quantum many-body theory (e.g., in electronic structure theory and nuclear physics). Particular emphasis is placed on higher-dimensional tensors that typically appear in the so-called multireference correlated methods of electronic structure theory. Depending on tensor dimensionality, the presented optimized algorithms can achieve an order of magnitude speedup on x86 CPUs and 2-3 times speedup on NVidia Tesla K20X GPU with respect to the naive scattering algorithm (no memory access optimization). Furthermore, the tensor transpose routines developed in this work have been incorporated into a general-purpose tensor algebra library (TAL-SH).
Implementation of real-time digital signal processing systems

NASA Technical Reports Server (NTRS)

Narasimha, M.; Peterson, A.; Narayan, S.

1978-01-01

Special purpose hardware implementation of DFT computers and digital filters is considered in the light of newly introduced algorithms and IC devices. Recent work by Winograd on high-speed convolution techniques for computing short-length DFTs has motivated the development of algorithms that are more efficient than the FFT for evaluating the transform of longer sequences. Among these, prime factor algorithms appear suitable for special purpose hardware implementations. Architectural considerations in designing DFT computers based on these algorithms are discussed. With the availability of monolithic multiplier-accumulators, a direct implementation of IIR and FIR filters, using random access memories in place of shift registers, appears attractive. The memory addressing scheme involved in such implementations is discussed. A simple counter set-up to address the data memory in the realization of FIR filters is also described. The combination of a set of simple filters (weighting network) and a DFT computer is shown to realize a bank of uniform bandpass filters. The usefulness of this concept in arriving at a modular design for a million channel spectrum analyzer, based on microprocessors, is discussed.
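The idea of replacing a shift register with a random access memory addressed by a counter can be sketched in software as a circular buffer. The class name `FirFilter`, the coefficient values, and the exact addressing order are assumptions for illustration; the abstract does not give the hardware addressing scheme in detail.

```python
class FirFilter:
    """FIR filter whose delay line is a RAM addressed by a modulo-N counter."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.ram = [0.0] * len(self.coeffs)  # data memory instead of a shift register
        self.counter = 0                     # write address, incremented modulo N

    def step(self, x):
        n = len(self.coeffs)
        self.ram[self.counter] = x           # overwrite the oldest sample; nothing is shifted
        acc = 0.0
        addr = self.counter
        for h in self.coeffs:                # coeffs[0] pairs with the newest sample
            acc += h * self.ram[addr]        # one multiply-accumulate per tap
            addr = (addr - 1) % n            # step back to the next older sample
        self.counter = (self.counter + 1) % n
        return acc


# Hypothetical 3-tap filter driven by a unit impulse.
fir = FirFilter([0.5, 0.25, 0.25])
impulse_response = [fir.step(x) for x in [1.0, 0.0, 0.0, 0.0]]
```

Feeding a unit impulse reproduces the coefficients in order, which is a quick check that the counter-based addressing reads the samples from newest to oldest.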
An optimized treatment for algorithmic differentiation of an important glaciological fixed-point problem

DOE PAGES

Goldberg, Daniel N.; Narayanan, Sri Hari Krishna; Hascoet, Laurent; ...

2016-05-20

We apply an optimized method to the adjoint generation of a time-evolving land ice model through algorithmic differentiation (AD). The optimization involves a special treatment of the fixed-point iteration required to solve the nonlinear stress balance, which differs from a straightforward application of AD software, and leads to smaller memory requirements and in some cases shorter computation times of the adjoint. The optimization is done via implementation of the algorithm of Christianson (1994) for reverse accumulation of fixed-point problems, with the AD tool OpenAD. For test problems, the optimized adjoint is shown to have far lower memory requirements, potentially enabling larger problem sizes on memory-limited machines. In the case of the land ice model, implementation of the algorithm allows further optimization by having the adjoint model solve a sequence of linear systems with identical (as opposed to varying) matrices, greatly improving performance. Finally, the methods introduced here will be of value to other efforts applying AD tools to ice models, particularly ones which solve a hybrid shallow ice/shallow shelf approximation to the Stokes equations.
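The memory saving comes from differentiating only the converged state rather than storing every forward iteration. A scalar sketch of reverse accumulation for a fixed point x = phi(x, p) follows: the adjoint is itself found by a fixed-point iteration w = x_bar + (dphi/dx) w evaluated at the converged x, and the parameter sensitivity is then w times dphi/dp. The map `phi`, the function names, and the tolerances are assumptions chosen so the example is contractive and easy to check; this is not the ice model or OpenAD code.

```python
import math


def phi(x, p):
    """Hypothetical contractive map; |dphi/dx| <= 0.5 everywhere."""
    return 0.5 * math.cos(x) + p


def solve_fixed_point(p, tol=1e-14, max_iter=1000):
    """Forward iteration x <- phi(x, p); only the final x is kept."""
    x = 0.0
    for _ in range(max_iter):
        x_new = phi(x, p)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


def adjoint_fixed_point(x_star, x_bar=1.0, tol=1e-14, max_iter=1000):
    """Reverse accumulation: iterate w <- x_bar + dphi/dx * w at the converged state."""
    dphi_dx = -0.5 * math.sin(x_star)  # partial derivative of phi with respect to x
    dphi_dp = 1.0                      # partial derivative of phi with respect to p
    w = 0.0
    for _ in range(max_iter):
        w_new = x_bar + dphi_dx * w
        if abs(w_new - w) < tol:
            w = w_new
            break
        w = w_new
    return w * dphi_dp                 # sensitivity of the converged x to p


p = 0.3
x_star = solve_fixed_point(p)
grad = adjoint_fixed_point(x_star)
h = 1e-6
finite_diff = (solve_fixed_point(p + h) - solve_fixed_point(p - h)) / (2 * h)
```

Because the adjoint loop uses the same Jacobian at every step, its storage cost is independent of how many forward iterations were needed, which mirrors the "identical matrices" remark in the abstract.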
Some Improvements on Signed Window Algorithms for Scalar Multiplications in Elliptic Curve Cryptosystems

NASA Technical Reports Server (NTRS)

Vo, San C.; Biegel, Bryan (Technical Monitor)

2001-01-01

Scalar multiplication is an essential operation in elliptic curve cryptosystems because its implementation determines the speed and the memory storage requirements. This paper discusses some improvements on two popular signed window algorithms for implementing scalar multiplications of an elliptic curve point: Morain-Olivos's algorithm and Koyama-Tsuruoka's algorithm.
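To show what a signed window recoding looks like in general, the sketch below uses the textbook width-w non-adjacent form (wNAF), which is a different and simpler recoding than the Morain-Olivos or Koyama-Tsuruoka algorithms discussed in the paper. The function names `wnaf` and `scalar_mult` are assumptions, and the additive group of integers modulo 1009 stands in for an elliptic curve group so the example stays self-contained.

```python
def wnaf(k, w=2):
    """Width-w non-adjacent form of k >= 0, least significant digit first."""
    digits = []
    while k > 0:
        if k & 1:
            d = k % (1 << w)
            if d >= (1 << (w - 1)):
                d -= 1 << w          # map to the signed odd range
            k -= d                   # k is now divisible by 2**w
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits


def scalar_mult(k, P, add, neg, zero, w=2):
    """Compute k*P by double-and-add over signed digits, with precomputed odd multiples of P."""
    two_p = add(P, P)
    table = {1: P}
    for i in range(3, 1 << (w - 1), 2):   # 3P, 5P, ..., (2**(w-1) - 1)P
        table[i] = add(table[i - 2], two_p)
    q = zero
    for d in reversed(wnaf(k, w)):
        q = add(q, q)                     # doubling
        if d > 0:
            q = add(q, table[d])
        elif d < 0:
            q = add(q, neg(table[-d]))    # negation is cheap on elliptic curves
    return q


# Stand-in group (an assumption): integers modulo 1009 under addition.
M = 1009
add = lambda a, b: (a + b) % M
neg = lambda a: (-a) % M
digits_of_7 = wnaf(7)                     # 7 = 8 - 1
result = scalar_mult(123, 5, add, neg, 0, w=4)
```

Larger windows need more stored multiples but leave fewer nonzero digits, which is the speed versus memory trade-off the abstract refers to.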
External locus of control contributes to racial disparities in memory and reasoning training gains in ACTIVE.

PubMed

Zahodne, Laura B; Meyer, Oanh L; Choi, Eunhee; Thomas, Michael L; Willis, Sherry L; Marsiske, Michael; Gross, Alden L; Rebok, George W; Parisi, Jeanine M

2015-09-01

Racial disparities in cognitive outcomes may be partly explained by differences in locus of control. African Americans report more external locus of control than non-Hispanic Whites, and external locus of control is associated with poorer health and cognition. The aims of this study were to compare cognitive training gains between African American and non-Hispanic White participants in the Advanced Cognitive Training for Independent and Vital Elderly (ACTIVE) study and determine whether racial differences in training gains are mediated by locus of control. The sample comprised 2,062 (26% African American) adults aged 65 and older who participated in memory, reasoning, or speed training. Latent growth curve models evaluated predictors of 10-year cognitive trajectories separately by training group. Multiple group modeling examined associations between training gains and locus of control across racial groups. Compared to non-Hispanic Whites, African Americans evidenced less improvement in memory and reasoning performance after training. These effects were partially mediated by locus of control, controlling for age, sex, education, health, depression, testing site, and initial cognitive ability. African Americans reported more external locus of control, which was associated with smaller training gains. External locus of control also had a stronger negative association with reasoning training gain for African Americans than for Whites. No racial difference in training gain was identified for speed training. Future intervention research with African Americans should test whether explicitly targeting external locus of control leads to greater cognitive improvement following cognitive training. (c) 2015 APA, all rights reserved.
Solitonic Josephson-based meminductive systems

NASA Astrophysics Data System (ADS)

Guarcello, Claudio; Solinas, Paolo; di Ventra, Massimiliano; Giazotto, Francesco

2017-04-01

Memristors, memcapacitors, and meminductors represent an innovative generation of circuit elements whose properties depend on the state and history of the system. The hysteretic behavior of one of their constituent variables is their distinctive fingerprint. This feature endows them with the ability to store and process information in the same physical location, a property that is expected to benefit many applications ranging from unconventional computing to adaptive electronics to robotics. Therefore, it is important to find appropriate memory elements that combine a wide range of memory states, long memory retention times, and protection against unavoidable noise. Although several physical systems belong to the general class of memelements, few of them combine these important physical features in a single component. Here, we demonstrate theoretically a superconducting memory based on solitonic long Josephson junctions. Moreover, since solitons are at the core of its operation, this system provides an intrinsic topological protection against external perturbations. We show that the Josephson critical current behaves hysteretically as an external magnetic field is properly swept. Accordingly, long Josephson junctions can be used as multi-state memories, with a controllable number of available states, and in other emerging areas such as memcomputing, i.e., computing directly in/by the memory.
Mobile phones as a new memory aid: a preliminary investigation using case studies.

PubMed

Wade, T K; Troy, J C

2001-04-01

Memory impairment is one of the most common concerns following a brain injury of any severity. The use of effective external memory aids can help minimize the devastating effects such memory impairment can have on an individual's everyday life. Reviewed in this report are case studies of five individuals suffering significant everyday memory problems that were given a new memory aid that utilizes standard mobile phones. Measurements included diary-format observations and qualitative feedback. The results of the study show promising outcomes for all of the cases, and have led to recent adaptations to allow for wider and more effective use of this memory aid.
$$\mathscr{H}_2$$ optimal control techniques for resistive wall mode feedback in tokamaks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Clement, Mitchell; Hanson, Jeremy; Bialek, Jim

DIII-D experiments show that a new, advanced algorithm improves resistive wall mode (RWM) stability control in high performance discharges using external coils. DIII-D can excite strong, locked or nearly locked external kink modes whose rotation frequencies and growth rates are on the order of the magnetic flux diffusion time of the vacuum vessel wall. The VALEN RWM model has been used to gauge the effectiveness of RWM control algorithms in tokamaks. Simulations and experiments have shown that modern control techniques like Linear Quadratic Gaussian (LQG) control will perform better, using 77% less current, than classical techniques when using control coils external to DIII-D's vacuum vessel. Experiments were conducted to develop control of a rotating n = 1 perturbation using an LQG controller derived from VALEN and external coils. Feedback using this LQG algorithm outperformed a proportional gain only controller in these perturbation experiments over a range of frequencies. Results from high βN experiments also show that advanced feedback techniques using external control coils may be as effective as internal control coil feedback using classical control techniques.
$$\mathscr{H}_2$$ optimal control techniques for resistive wall mode feedback in tokamaks

DOE PAGES

Clement, Mitchell; Hanson, Jeremy; Bialek, Jim; ...

2018-02-28

DIII-D experiments show that a new, advanced algorithm improves resistive wall mode (RWM) stability control in high performance discharges using external coils. DIII-D can excite strong, locked or nearly locked external kink modes whose rotation frequencies and growth rates are on the order of the magnetic flux diffusion time of the vacuum vessel wall. The VALEN RWM model has been used to gauge the effectiveness of RWM control algorithms in tokamaks. Simulations and experiments have shown that modern control techniques like Linear Quadratic Gaussian (LQG) control will perform better, using 77% less current, than classical techniques when using control coils external to DIII-D's vacuum vessel. Experiments were conducted to develop control of a rotating n = 1 perturbation using an LQG controller derived from VALEN and external coils. Feedback using this LQG algorithm outperformed a proportional gain only controller in these perturbation experiments over a range of frequencies. Results from high βN experiments also show that advanced feedback techniques using external control coils may be as effective as internal control coil feedback using classical control techniques.
Memory, metamemory, and social cues: Between conformity and resistance.

PubMed

Zawadzka, Katarzyna; Krogulska, Aleksandra; Button, Roberta; Higham, Philip A; Hanczakowski, Maciej

2016-02-01

When presented with responses of another person, people incorporate these responses into memory reports: a finding termed memory conformity. Research on memory conformity in recognition reveals that people rely on external social cues to guide their memory responses when their own ability to respond is at chance. In this way, conforming to a reliable source boosts recognition performance but conforming to a random source does not impair it. In the present study we assessed whether people would conform indiscriminately to reliable and unreliable (random) sources when they are given the opportunity to exercise metamemory control over their responding by withholding answers in a recognition test. In Experiments 1 and 2, we found the pattern of memory conformity to reliable and unreliable sources in 2 variants of a free-report recognition test, yet at the same time the provision of external cues did not affect the rate of response withholding. In Experiment 3, we provided participants with initial feedback on their recognition decisions, facilitating the discrimination between the reliable and unreliable source. This led to the reduction of memory conformity to the unreliable source, and at the same time modulated metamemory decisions concerning response withholding: participants displayed metamemory conformity to the reliable source, volunteering more responses in their memory report, and metamemory resistance to the random source, withholding more responses from the memory report. Together, the results show how metamemory decisions dissociate various types of memory conformity and that memory and metamemory decisions can be independent of each other. PsycINFO Database Record (c) 2016 APA, all rights reserved.
Natural Memory Beyond the Storage Model: Repression, Trauma, and the Construction of a Personal Past

PubMed Central

Axmacher, Nikolai; Do Lam, Anne T. A.; Kessler, Henrik; Fell, Juergen

2010-01-01

Naturally occurring memory processes show features which are difficult to investigate by conventional cognitive neuroscience paradigms. Distortions of memory for problematic contents are described both by psychoanalysis (internal conflicts) and research on post-traumatic stress disorder (PTSD; external traumata). Typically, declarative memory for these contents is impaired – possibly due to repression in the case of internal conflicts or due to dissociation in the case of external traumata – but they continue to exert an unconscious pathological influence: neurotic symptoms or psychosomatic disorders after repression or flashbacks and intrusions in PTSD after dissociation. Several experimental paradigms aim at investigating repression in healthy control subjects. We argue that these paradigms do not adequately operationalize the clinical process of repression, because they rely on an intentional inhibition of random stimuli (suppression). Furthermore, these paradigms ignore that memory distortions due to repression or dissociation are most accurately characterized by a lack of self-referential processing, resulting in an impaired integration of these contents into the self. This aspect of repression and dissociation cannot be captured by the concept of memory as a storage device which is usually employed in the cognitive neurosciences. It can only be assessed within the framework of a constructivist memory concept, according to which successful memory involves a reconstruction of experiences such that they fit into a representation of the self. We suggest several experimental paradigms that allow for the investigation of the neural correlates of repressed memories and trauma-induced memory distortions based on a constructivist memory concept. PMID:21151366
The impact of early shame memories in Binge Eating Disorder: The mediator effect of current body image shame and cognitive fusion.

PubMed

Duarte, Cristiana; Pinto-Gouveia, José

2017-12-01

This study examined the phenomenology of shame experiences from childhood and adolescence in a sample of women with Binge Eating Disorder. A path analysis also tested whether the association between shame-related memories that are traumatic and central to identity and the severity of binge eating symptoms is mediated by current external shame, body image shame, and body image cognitive fusion. Participants in this study were 114 patients, who were assessed through the Eating Disorder Examination and the Shame Experiences Interview, and through self-report measures of external shame, body image shame, body image cognitive fusion and binge eating symptoms. Shame experiences in which physical appearance was negatively commented on or criticized by others were the most frequently recalled. The path analysis showed a good fit between the hypothesised mediational model and the data. The traumatic and centrality qualities of shame-related memories predicted current external shame, especially body image shame. Current shame feelings were associated with body image cognitive fusion, which, in turn, predicted levels of binge eating symptomatology. Findings support the relevance of addressing early shame-related memories and negative affective and self-evaluative experiences, namely related to body image, in the understanding and management of binge eating. Copyright © 2017 Elsevier B.V. All rights reserved.
Medial prefrontal cortex supports source memory accuracy for self-referenced items

PubMed Central

Leshikar, Eric D.; Duarte, Audrey

2013-01-01

Previous behavioral work suggests that processing information in relation to the self enhances subsequent item recognition. Neuroimaging evidence further suggests that regions along the cortical midline, particularly those of the medial prefrontal cortex, underlie this benefit. There has been little work to date, however, on the effects of self-referential encoding on source memory accuracy or whether the medial prefrontal cortex might contribute to source memory for self-referenced materials. In the current study, we used fMRI to measure neural activity while participants studied and subsequently retrieved pictures of common objects superimposed on one of two background scenes (sources) under either self-reference or self-external encoding instructions. Both item recognition and source recognition were better for objects encoded self-referentially than self-externally. Neural activity predictive of source accuracy was observed in the medial prefrontal cortex (BA 10) at the time of study for self-referentially but not self-externally encoded objects. The results of this experiment suggest that processing information in relation to the self leads to a mnemonic benefit for source level features, and that activity in the medial prefrontal cortex contributes to this source memory benefit. This evidence expands the purported role that the medial prefrontal cortex plays in self-referencing. PMID:21936739
Stream-based Hebbian eigenfilter for real-time neuronal spike discrimination

PubMed Central

2012-01-01

Background: Principal component analysis (PCA) has been widely employed for automatic neuronal spike sorting. Calculating principal components (PCs) is computationally expensive and requires complex numerical operations and large memory resources. Substantial hardware resources are therefore needed for hardware implementations of PCA. The generalized Hebbian algorithm (GHA) was proposed for calculating PCs of neuronal spikes in our previous work; it eliminates the computationally expensive covariance analysis and eigenvalue decomposition of conventional PCA algorithms. However, large memory resources are still inherently required for storing a large volume of aligned spikes for training PCs. This large memory consumes substantial hardware resources and contributes significant power dissipation, which makes GHA difficult to implement in portable or implantable multi-channel recording micro-systems. Method: In this paper, we present a new algorithm for PCA-based spike sorting based on GHA, namely the stream-based Hebbian eigenfilter, which eliminates the inherent memory requirements of GHA while keeping the accuracy of spike sorting by utilizing the pseudo-stationarity of neuronal spikes. Because it removes the large hardware storage requirements, the proposed algorithm can lead to hardware implementations with ultra-low hardware resources and power consumption, which is critical for future multi-channel micro-systems. Both clinical and synthetic neural recording data sets were employed for evaluating the accuracy of the stream-based Hebbian eigenfilter. The performance of spike sorting using the stream-based eigenfilter and the computational complexity of the eigenfilter were rigorously evaluated and compared with conventional PCA algorithms.
Field programmable gate arrays (FPGAs) were employed to implement the proposed algorithm, evaluate the hardware implementations, and demonstrate the reduction in both power consumption and hardware memory achieved by the streaming computation. Results and discussion: Results demonstrate that the stream-based eigenfilter achieves the same accuracy and is 10 times more computationally efficient than conventional PCA algorithms. Hardware evaluations show that the stream-based eigenfilter reduces logic resources by 90.3%, power consumption by 95.1%, and computing latency by 86.8% when compared with PCA hardware. By utilizing the streaming method, 92% of memory resources and 67% of power consumption can be saved when compared with the direct implementation of GHA. Conclusion: The stream-based Hebbian eigenfilter presents a novel approach to enable real-time spike sorting with reduced computational complexity and hardware costs. This new design can be further utilized for multi-channel neuro-physiological experiments or chronic implants. PMID:22490725
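The generalized Hebbian algorithm lends itself to streaming because each sample updates the weight vectors once and can then be discarded, so no covariance matrix or stored batch of spikes is needed. The sketch below applies the standard GHA update (Sanger's rule) to a stream of synthetic two-dimensional samples; the function name `gha_update`, the learning rate, and the synthetic data are assumptions for illustration, and the code does not reproduce the paper's eigenfilter or its hardware design.

```python
import math
import random


def gha_update(weights, x, lr):
    """One GHA (Sanger's rule) step on a single sample; weights is updated in place."""
    y = [sum(wi * xi for wi, xi in zip(w, x)) for w in weights]  # projections
    residual = list(x)
    for i, w in enumerate(weights):
        # Remove what components 0..i already explain, then apply the Hebbian update.
        residual = [r - y[i] * wi for r, wi in zip(residual, w)]
        weights[i] = [wi + lr * y[i] * r for wi, r in zip(w, residual)]
    return y


random.seed(0)
weights = [[1.0, 0.0], [0.0, 1.0]]  # two components, arbitrary starting directions
for _ in range(5000):
    a = random.uniform(-1.0, 1.0)   # large variance along direction (1, 1)
    b = random.uniform(-0.1, 0.1)   # small variance along direction (1, -1)
    sample = [a + b, a - b]         # consumed once, never stored
    gha_update(weights, sample, lr=0.02)

first = weights[0]
norm = math.sqrt(first[0] ** 2 + first[1] ** 2)
cosine = abs(first[0] + first[1]) / (math.sqrt(2.0) * norm)  # alignment with (1, 1)/sqrt(2)
```

The only state carried between samples is the weight matrix itself, which is the property that allows the memory reduction described in the abstract.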
A direct comparison of short-term audiomotor and visuomotor memory.

PubMed

Ward, Amanda M; Loucks, Torrey M; Ofori, Edward; Sosnoff, Jacob J

2014-04-01

Audiomotor and visuomotor short-term memory are required for an important variety of skilled movements but have not been compared in a direct manner previously. Audiomotor memory capacity might be greater to accommodate auditory goals that are less directly related to movement outcome than for visually guided tasks. Subjects produced continuous isometric force with the right index finger under auditory and visual feedback. During the first 10 s of each trial, subjects received continuous auditory or visual feedback. For the following 15 s, feedback was removed but the force had to be maintained accurately. An internal effort condition was included to test memory capacity in the same manner but without external feedback. Similar decay times of ~5-6 s were found for vision and audition but the decay time for internal effort was ~4 s. External feedback thus provides an advantage in maintaining a force level after feedback removal, but may not exclude some contribution from a sense of effort. Short-term memory capacity appears longer than certain previous reports but there may not be strong distinctions in capacity across different sensory modalities, at least for isometric force.
High-order hydrodynamic algorithms for exascale computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morgan, Nathaniel Ray

Hydrodynamic algorithms are at the core of many laboratory missions ranging from simulating ICF implosions to climate modeling. The hydrodynamic algorithms commonly employed at the laboratory and in industry (1) typically lack requisite accuracy for complex multi-material vortical flows and (2) are not well suited for exascale computing due to poor data locality and poor FLOP/memory ratios. Exascale computing requires advances in both computer science and numerical algorithms. We propose to research the second requirement and create a new high-order hydrodynamic algorithm that has superior accuracy, excellent data locality, and excellent FLOP/memory ratios. This proposal will impact a broad range of research areas including numerical theory, discrete mathematics, vorticity evolution, gas dynamics, interface instability evolution, turbulent flows, fluid dynamics and shock driven flows. If successful, the proposed research has the potential to radically transform simulation capabilities and help position the laboratory for computing at the exascale.
A comparison of select image-compression algorithms for an electronic still camera

NASA Technical Reports Server (NTRS)

Nerheim, Rosalee

1989-01-01

This effort is a study of image-compression algorithms for an electronic still camera. An electronic still camera can record and transmit high-quality images without the use of film, because images are stored digitally in computer memory. However, high-resolution images contain an enormous amount of information, and will strain the camera's data-storage system. Image compression will allow more images to be stored in the camera's memory. For the electronic still camera, a compression algorithm that produces a reconstructed image of high fidelity is most important. Efficiency of the algorithm is the second priority. High fidelity and efficiency are more important than a high compression ratio. Several algorithms were chosen for this study and judged on fidelity, efficiency and compression ratio. The transform method appears to be the best choice. At present, the method is compressing images to a ratio of 5.3:1 and producing high-fidelity reconstructed images.
FPGA implementation of Santos-Victor optical flow algorithm for real-time image processing: an useful attempt

NASA Astrophysics Data System (ADS)

Cobos Arribas, Pedro; Monasterio Huelin Macia, Felix

2003-04-01

An FPGA-based hardware implementation of the Santos-Victor optical flow algorithm, useful in robot guidance applications, is described in this paper. The system contains an ALTERA FPGA (20K100), an interface with a digital camera, three VRAM memories to hold the input data, and output memories (a VRAM and an EDO) to hold the results. The system has been used previously to develop and test other vision algorithms, such as image compression and optical flow calculation with differential and correlation methods. The designed system allows the digital camera, or the FPGA output (the results of the algorithms), to be connected to a PC through its FireWire or USB port. The problems encountered on this occasion have motivated the adoption of another hardware structure for certain vision algorithms with special requirements that need very intensive processing.
Hierarchical Traces for Reduced NSM Memory Requirements

NASA Astrophysics Data System (ADS)

Dahl, Torbjørn S.

This paper presents work on using hierarchical long term memory to reduce the memory requirements of nearest sequence memory (NSM) learning, a previously published, instance-based reinforcement learning algorithm. A hierarchical memory representation reduces the memory requirements by allowing traces to share common sub-sequences. We present moderated mechanisms for estimating discounted future rewards and for dealing with hidden state using hierarchical memory. We also present an experimental analysis of how the sub-sequence length affects the memory compression achieved and show that the reduced memory requirements do not affect the speed of learning. Finally, we analyse and discuss the persistence of the sub-sequences independent of specific trace instances.

RF assisted switching in magnetic Josephson junctions

NASA Astrophysics Data System (ADS)

Caruso, R.; Massarotti, D.; Bolginov, V. V.; Ben Hamida, A.; Karelina, L. N.; Miano, A.; Vernik, I. V.; Tafuri, F.; Ryazanov, V. V.; Mukhanov, O. A.; Pepe, G. P.

2018-04-01

We test the effect of an external RF field on the switching processes of magnetic Josephson junctions (MJJs) suitable for the realization of fast, scalable cryogenic memories compatible with Single Flux Quantum logic. We show that the combined application of microwaves and magnetic field pulses can improve the performance of the device, increasing the separation between the critical current levels corresponding to logical "0" and "1." The enhancement of the current level separation can be as high as 80% using an optimal set of parameters. We demonstrate that external RF fields can be used as an additional tool to manipulate the memory states, and we expect that this approach may lead to the development of new methods of selecting MJJs and manipulating their states in memory arrays for various applications.
A fully automated non-external marker 4D-CT sorting algorithm using a serial cine scanning protocol.

PubMed

Carnes, Greg; Gaede, Stewart; Yu, Edward; Van Dyk, Jake; Battista, Jerry; Lee, Ting-Yim

2009-04-07

Current 4D-CT methods require external marker data to retrospectively sort image data and generate CT volumes. In this work we develop an automated 4D-CT sorting algorithm that performs without the aid of data collected from an external respiratory surrogate. The sorting algorithm requires an overlapping cine scan protocol. The overlapping protocol provides a spatial link between couch positions. Beginning with a starting scan position, images from the adjacent scan position (which spatially match the starting scan position) are selected by maximizing the normalized cross correlation (NCC) of the images at the overlapping slice position. The process is continued by 'daisy chaining' all couch positions using the selected images until an entire 3D volume is produced. The algorithm produced 16 phase volumes to complete a 4D-CT dataset. Additional 4D-CT datasets were also produced using external marker amplitude and phase angle sorting methods. The image quality of the volumes produced by the different methods was quantified by calculating the mean difference of the sorted overlapping slices from adjacent couch positions. The NCC sorted images showed a significant decrease in the mean difference (p < 0.01) for the five patients.
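The selection step described above, choosing the candidate image that maximizes the normalized cross correlation with a reference slice, can be sketched as follows. This is a minimal illustration rather than the authors' code: images are assumed to be flattened into equal-length lists of pixel values, and the names `ncc` and `best_match` are hypothetical.

```python
import math


def ncc(a, b):
    """Normalized cross correlation of two equal-length pixel sequences, in [-1, 1]."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        # A constant image has no structure to correlate; treat it as uncorrelated.
        return 0.0
    return sum(p * q for p, q in zip(da, db)) / denom


def best_match(reference, candidates):
    """Index of the candidate image with the highest NCC against the reference."""
    return max(range(len(candidates)), key=lambda i: ncc(reference, candidates[i]))
```

Subtracting the means and dividing by the norms makes the score insensitive to overall brightness and contrast, so only the spatial pattern of the overlapping slice drives the match.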
Evolution of cellular automata with memory: The Density Classification Task.

PubMed

Stone, Christopher; Bull, Larry

2009-08-01

The Density Classification Task is a well known test problem for two-state discrete dynamical systems. For many years researchers have used a variety of evolutionary computation approaches to evolve solutions to this problem. In this paper, we investigate the evolvability of solutions when the underlying Cellular Automaton is augmented with a type of memory based on the Least Mean Square algorithm. To obtain high performance solutions using a simple non-hybrid genetic algorithm, we design a novel representation based on the ternary representation used for Learning Classifier Systems. The new representation is found able to produce superior performance to the bit string traditionally used for representing Cellular automata. Moreover, memory is shown to improve evolvability of solutions and appropriate memory settings are able to be evolved as a component part of these solutions.
Multiprocessor architecture: Synthesis and evaluation

NASA Technical Reports Server (NTRS)

Standley, Hilda M.

1990-01-01

Multiprocessor computer architecture evaluation for structural computations is the focus of the research effort described. Results obtained are expected to lead to more efficient use of existing architectures and to suggest designs for new, application specific, architectures. The brief descriptions given outline a number of related efforts directed toward this purpose. The difficulty in analyzing an existing architecture or in designing a new computer architecture lies in the fact that the performance of a particular architecture, within the context of a given application, is determined by a number of factors. These include, but are not limited to, the efficiency of the computation algorithm, the programming language and support environment, the quality of the program written in the programming language, the multiplicity of the processing elements, the characteristics of the individual processing elements, the interconnection network connecting processors and non-local memories, and the shared memory organization covering the spectrum from no shared memory (all local memory) to one global access memory. These performance determiners may be loosely classified as being software or hardware related. This distinction is not clear or even appropriate in many cases. The effect of the choice of algorithm is ignored by assuming that the algorithm is specified as given. Effort directed toward the removal of the effect of the programming language and program resulted in the design of a high-level parallel programming language. Two characteristics of the fundamental structure of the architecture (memory organization and interconnection network) are examined.
A fast Fourier transform on multipoles (FFTM) algorithm for solving Helmholtz equation in acoustics analysis.

PubMed

Ong, Eng Teo; Lee, Heow Pueh; Lim, Kian Meng

2004-09-01

This article presents a fast algorithm for the efficient solution of the Helmholtz equation. The method is based on the translation theory of the multipole expansions. Here, the speedup comes from the convolution nature of the translation operators, which can be evaluated rapidly using fast Fourier transform algorithms. Also, the computations of the translation operators are accelerated by using the recursive formulas developed recently by Gumerov and Duraiswami [SIAM J. Sci. Comput. 25, 1344-1381 (2003)]. It is demonstrated that the algorithm can produce good accuracy with a relatively low order of expansion. Efficiency analyses of the algorithm reveal that it has computational complexities of O(N^a), where a ranges from 1.05 to 1.24. However, this method requires substantially more memory to store the translation operators as compared to the fast multipole method. Hence, despite its simplicity in implementation, this memory requirement issue may limit the application of this algorithm to solving very large-scale problems.
Source Monitoring in Alzheimer's Disease

ERIC Educational Resources Information Center

El Haj, Mohamad; Fasotti, Luciano; Allain, Philippe

2012-01-01

Source monitoring is the process of making judgments about the origin of memories. There are three categories of source monitoring: reality monitoring (discrimination between self- versus other-generated sources), external monitoring (discrimination between several external sources), and internal monitoring (discrimination between two types of…
TH-E-17A-01: Internal Respiratory Surrogate for 4D CT Using Fourier Transform and Anatomical Features

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hui, C; Suh, Y; Robertson, D

Purpose: To develop a novel algorithm to generate internal respiratory signals for sorting of four-dimensional (4D) computed tomography (CT) images. Methods: The proposed algorithm extracted multiple time resolved features as potential respiratory signals. These features were taken from the 4D CT images and their Fourier transformed space. Several low-frequency locations in the Fourier space and selected anatomical features from the images were used as potential respiratory signals. A clustering algorithm was then used to search for the group of appropriate potential respiratory signals. The chosen signals were then normalized and averaged to form the final internal respiratory signal. Performance of the algorithm was tested in 50 4D CT data sets and results were compared with external signals from the real-time position management (RPM) system. Results: In almost all cases, the proposed algorithm generated internal respiratory signals that visibly matched the external respiratory signals from the RPM system. On average, the end inspiration times calculated by the proposed algorithm were within 0.1 s of those given by the RPM system. Less than 3% of the calculated end inspiration times were more than one time frame away from those given by the RPM system. In 3 out of the 50 cases, the proposed algorithm generated internal respiratory signals that were significantly smoother than the RPM signals. In these cases, images sorted using the internal respiratory signals showed fewer artifacts in locations corresponding to the discrepancy in the internal and external respiratory signals. Conclusion: We developed a robust algorithm that generates internal respiratory signals from 4D CT images. In some cases, it even showed the potential to outperform the RPM system. The proposed algorithm is completely automatic and generally takes less than 2 min to process. It can be easily implemented into the clinic and can potentially replace the use of external surrogates.
Towards robust algorithms for current deposition and dynamic load-balancing in a GPU particle in cell code

NASA Astrophysics Data System (ADS)

Rossi, Francesco; Londrillo, Pasquale; Sgattoni, Andrea; Sinigardi, Stefano; Turchetti, Giorgio

2012-12-01

We present 'jasmine', an implementation of a fully relativistic, 3D, electromagnetic Particle-In-Cell (PIC) code, capable of running simulations in various laser plasma acceleration regimes on Graphics Processing Unit (GPU) HPC clusters. Standard energy/charge preserving FDTD-based algorithms have been implemented using double precision and quadratic (or arbitrary sized) shape functions for the particle weighting. When porting a PIC scheme to the GPU architecture (or, in general, a shared memory environment), the particle-to-grid operations (e.g. the evaluation of the current density) require special care to avoid memory inconsistencies and conflicts. Here we present a robust implementation of this operation that is efficient for any number of particles per cell and particle shape function order. Our algorithm exploits the exposed GPU memory hierarchy and avoids the use of atomic operations, which can hurt performance especially when many particles lie in the same cell. We show the code's multi-GPU scalability results and present a dynamic load-balancing algorithm. The code is written using a python-based C++ meta-programming technique which translates into a high level of modularity and allows for easy performance tuning and simple extension of the core algorithms to various simulation schemes.
A depth-first search algorithm to compute elementary flux modes by linear programming.

PubMed

Quek, Lake-Ee; Nielsen, Lars K

2014-07-30

The decomposition of complex metabolic networks into elementary flux modes (EFMs) provides a useful framework for exploring reaction interactions systematically. Generating a complete set of EFMs for large-scale models, however, is near impossible. Even for moderately-sized models (<400 reactions), existing approaches based on the Double Description method must iterate through a large number of combinatorial candidates, thus imposing an immense processor and memory demand. Based on an alternative elementarity test, we developed a depth-first search algorithm using linear programming (LP) to enumerate EFMs in an exhaustive fashion. Constraints can be introduced to directly generate a subset of EFMs satisfying the set of constraints. The depth-first search algorithm has a constant memory overhead. Using flux constraints, a large LP problem can be massively divided and parallelized into independent sub-jobs for deployment into computing clusters. Since the sub-jobs do not overlap, the approach scales to utilize all available computing nodes with minimal coordination overhead or memory limitations. The speed of the algorithm was comparable to efmtool, a mainstream Double Description method, when enumerating all EFMs; the attrition power gained from performing flux feasibility tests offsets the increased computational demand of running an LP solver. Unlike the Double Description method, the algorithm enables accelerated enumeration of all EFMs satisfying a set of constraints.
A Cognitive Paradigm to Investigate Interference in Working Memory by Distractions and Interruptions

PubMed Central

Janowich, Jacki; Mishra, Jyoti; Gazzaley, Adam

2015-01-01

Goal-directed behavior is often impaired by interference from the external environment, either in the form of distraction by irrelevant information that one attempts to ignore, or by interrupting information that demands attention as part of another (secondary) task goal. Both forms of external interference have been shown to detrimentally impact the ability to maintain information in working memory (WM). Emerging evidence suggests that these different types of external interference exert different effects on behavior and may be mediated by distinct neural mechanisms. Better characterizing the distinct neuro-behavioral impact of irrelevant distractions versus attended interruptions is essential for advancing an understanding of top-down attention, resolution of external interference, and how these abilities become degraded in healthy aging and in neuropsychiatric conditions. This manuscript describes a novel cognitive paradigm developed in the Gazzaley lab that has now been modified into several distinct versions used to elucidate behavioral and neural correlates of interference by to-be-ignored distractors versus to-be-attended interruptors. Details are provided on variants of this paradigm for investigating interference in visual and auditory modalities, at multiple levels of stimulus complexity, and with experimental timing optimized for electroencephalography (EEG) or functional magnetic resonance imaging (fMRI) studies. In addition, data from younger and older adult participants obtained using this paradigm are reviewed and discussed in the context of their relationship with the broader literatures on external interference and age-related neuro-behavioral changes in resolving interference in working memory. PMID:26273742
Hardware Acceleration of Sparse Cognitive Algorithms

DTIC Science & Technology

2016-05-01

Processor in Memory (PiM) extensions and a 65 nm ASIC version. They were compared against a 28 nm GPU baseline using the KTH video action recognition...30 Table 17. Memory Requirement of Proposed ASIC...for improvement of performance per unit of power for customized implementations of the Sparsey and Numenta Hierarchical Temporal Memory (HTM
NAS Applications and Advanced Algorithms

NASA Technical Reports Server (NTRS)

Bailey, David H.; Biswas, Rupak; VanDerWijngaart, Rob; Kutler, Paul (Technical Monitor)

1997-01-01

This paper examines the applications most commonly run on the supercomputers at the Numerical Aerospace Simulation (NAS) facility. It analyzes the extent to which such applications are fundamentally oriented to vector computers, and whether or not they can be efficiently implemented on hierarchical memory machines, such as systems with cache memories and highly parallel, distributed memory systems.
Mental Capacity and Working Memory in Chemistry: Algorithmic "versus" Open-Ended Problem Solving

ERIC Educational Resources Information Center

St Clair-Thompson, Helen; Overton, Tina; Bugler, Myfanwy

2012-01-01

Previous research has revealed that problem solving and attainment in chemistry are constrained by mental capacity and working memory. However, the terms mental capacity and working memory come from different theories of cognitive resources, and are assessed using different tasks. The current study examined the relationships between mental…
Memory-Efficient Onboard Rock Segmentation

NASA Technical Reports Server (NTRS)

Burl, Michael C.; Thompson, David R.; Bornstein, Benjamin J.; deGranville, Charles K.

2013-01-01

Rockster-MER is an autonomous perception capability that was uploaded to the Mars Exploration Rover Opportunity in December 2009. This software provides the vision front end for a larger software system known as AEGIS (Autonomous Exploration for Gathering Increased Science), which was recently named 2011 NASA Software of the Year. As the first step in AEGIS, Rockster-MER analyzes an image captured by the rover, and detects and automatically identifies the boundary contours of rocks and regions of outcrop present in the scene. This initial segmentation step reduces the data volume from millions of pixels into hundreds (or fewer) of rock contours. Subsequent stages of AEGIS then prioritize the best rocks according to scientist-defined preferences and take high-resolution, follow-up observations. Rockster-MER has performed robustly from the outset on the Mars surface under challenging conditions. Rockster-MER is a specially adapted, embedded version of the original Rockster algorithm ("Rock Segmentation Through Edge Regrouping," (NPO-44417) Software Tech Briefs, September 2008, p. 25). Although the new version performs the same basic task as the original code, the software has been (1) significantly upgraded to overcome the severe onboard resource limitations (CPU, memory, power, time) and (2) "bulletproofed" through code reviews and extensive testing and profiling to avoid the occurrence of faults. Because of the limited computational power of the RAD6000 flight processor on Opportunity (roughly two orders of magnitude slower than a modern workstation), the algorithm was heavily tuned to improve its speed. Several functional elements of the original algorithm were removed as a result of an extensive cost/benefit analysis conducted on a large set of archived rover images. The algorithm was also required to operate below a stringent 4MB high-water memory ceiling; hence, numerous tricks and strategies were introduced to reduce the memory footprint. 
Local filtering operations were re-coded to operate on horizontal data stripes across the image. Data types were reduced to smaller sizes where possible. Binary-valued intermediate results were squeezed into a more compact, one-bit-per-pixel representation through bit packing and bit manipulation macros. An estimated 16-fold reduction in memory footprint relative to the original Rockster algorithm was achieved. The resulting memory footprint is less than four times the base image size. Also, memory allocation calls were modified to draw from a static pool and consolidated to reduce memory management overhead and fragmentation. Rockster-MER has now been run onboard Opportunity numerous times as part of AEGIS with exceptional performance. Sample results are available on the AEGIS website at http://aegis.jpl.nasa.gov.
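The one-bit-per-pixel representation mentioned above can be illustrated with a small bit-packing sketch. The flight software is not shown in this record, so the layout (least significant bit first within each byte) and the helper names `pack_bits` and `get_bit` are assumptions chosen for illustration.

```python
def pack_bits(mask):
    """Pack a flat sequence of 0/1 pixel values into a bytearray, 8 pixels per byte."""
    packed = bytearray((len(mask) + 7) // 8)
    for i, value in enumerate(mask):
        if value:
            # Pixel i lives in byte i // 8, at bit position i % 8.
            packed[i >> 3] |= 1 << (i & 7)
    return packed


def get_bit(packed, i):
    """Read pixel i back from the packed representation."""
    return (packed[i >> 3] >> (i & 7)) & 1
```

Compared with storing each binary pixel in its own byte, this layout needs one eighth of the storage for that intermediate result; the overall reduction quoted in the abstract also depends on the other measures it lists.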
A hardware-oriented algorithm for floating-point function generation

NASA Technical Reports Server (NTRS)

O'Grady, E. Pearse; Young, Baek-Kyu

1991-01-01

An algorithm is presented for performing accurate, high-speed, floating-point function generation for univariate functions defined at arbitrary breakpoints. Rapid identification of the breakpoint interval, which includes the input argument, is shown to be the key operation in the algorithm. A hardware implementation which makes extensive use of read/write memories is used to illustrate the algorithm.
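The key operation named above, finding the breakpoint interval that contains the input argument, can be sketched in software with a binary search. The abstract does not say how the function value is formed once the interval is known, so the linear interpolation below is an assumption, as are the function name and the sample breakpoint table.

```python
from bisect import bisect_right


def interpolate(breakpoints, values, x):
    """Evaluate a tabulated univariate function at x by linear interpolation.

    breakpoints must be strictly increasing and have the same length as values.
    """
    if not breakpoints[0] <= x <= breakpoints[-1]:
        raise ValueError("x lies outside the tabulated range")
    # Index of the interval [breakpoints[i], breakpoints[i + 1]] containing x;
    # the min() keeps x == breakpoints[-1] inside the last interval.
    i = min(bisect_right(breakpoints, x) - 1, len(breakpoints) - 2)
    x0, x1 = breakpoints[i], breakpoints[i + 1]
    t = (x - x0) / (x1 - x0)
    return values[i] + t * (values[i + 1] - values[i])
```

The binary search takes a number of comparisons logarithmic in the table size; a hardware design built around read/write memories, as in the abstract, would aim to make this lookup faster still.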
Efficient Approximation Algorithms for Weighted $b$-Matching

DOE Office of Scientific and Technical Information (OSTI.GOV)

Khan, Arif; Pothen, Alex; Mostofa Ali Patwary, Md.

2016-01-01

We describe a half-approximation algorithm, b-Suitor, for computing a b-Matching of maximum weight in a graph with weights on the edges. b-Matching is a generalization of the well-known Matching problem in graphs, where the objective is to choose a subset of M edges in the graph such that at most a specified number b(v) of edges in M are incident on each vertex v. Subject to this restriction we maximize the sum of the weights of the edges in M. We prove that the b-Suitor algorithm computes the same b-Matching as the one obtained by the greedy algorithm for the problem. We implement the algorithm on serial and shared-memory parallel processors, and compare its performance against a collection of approximation algorithms that have been proposed for the Matching problem. Our results show that the b-Suitor algorithm outperforms the Greedy and Locally Dominant edge algorithms by one to two orders of magnitude on a serial processor. The b-Suitor algorithm has a high degree of concurrency, and it scales well up to 240 threads on a shared memory multiprocessor. The b-Suitor algorithm outperforms the Locally Dominant edge algorithm by a factor of fourteen on 16 cores of an Intel Xeon multiprocessor.
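For reference, the greedy algorithm that the abstract uses as its point of comparison can be sketched in a few lines. This is the sequential greedy baseline, not b-Suitor itself; the edge-list input format and the function name are assumptions made for illustration.

```python
def greedy_b_matching(edges, b):
    """Greedy b-Matching: scan edges by decreasing weight, keep those that fit.

    edges: iterable of (u, v, weight) tuples.
    b: dict mapping each vertex to the maximum number of matched edges it may have.
    """
    remaining = dict(b)  # capacity still available at each vertex
    matching = []
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        if u != v and remaining.get(u, 0) > 0 and remaining.get(v, 0) > 0:
            matching.append((u, v, w))
            remaining[u] -= 1
            remaining[v] -= 1
    return matching
```

On the path a-b-c-d with weights 2, 3, 2 and b(v) = 1 everywhere, the greedy choice is the single middle edge (weight 3), while the optimum takes the two outer edges (weight 4), which illustrates the half-approximation guarantee: the greedy weight is at least half the optimum.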
Is a Responsive Default Mode Network Required for Successful Working Memory Task Performance?

PubMed Central

Čeko, Marta; Gracely, John L.; Fitzcharles, Mary-Ann; Seminowicz, David A.; Schweinhardt, Petra

2015-01-01

In studies of cognitive processing using tasks with externally directed attention, regions showing increased (external-task-positive) and decreased or “negative” [default-mode network (DMN)] fMRI responses during task performance are dynamically responsive to increasing task difficulty. Responsiveness (modulation of fMRI signal by increasing load) has been linked directly to successful cognitive task performance in external-task-positive regions but not in DMN regions. To investigate whether a responsive DMN is required for successful cognitive performance, we compared healthy human subjects (n = 23) with individuals shown to have decreased DMN engagement (chronic pain patients, n = 28). Subjects performed a multilevel working-memory task (N-back) during fMRI. If a responsive DMN is required for successful performance, patients having reduced DMN responsiveness should show worsened performance; if performance is not reduced, their brains should show compensatory activation in external-task-positive regions or elsewhere. All subjects showed decreased accuracy and increased reaction times with increasing task level, with no significant group differences on either measure at any level. Patients had significantly reduced negative fMRI response (deactivation) of DMN regions (posterior cingulate/precuneus, medial prefrontal cortex). Controls showed expected modulation of DMN deactivation with increasing task difficulty. Patients showed significantly reduced modulation of DMN deactivation by task difficulty, despite their successful task performance. We found no evidence of compensatory neural recruitment in external-task-positive regions or elsewhere. Individual responsiveness of the external-task-positive ventrolateral prefrontal cortex, but not of DMN regions, correlated with task accuracy. These findings suggest that a responsive DMN may not be required for successful cognitive performance; a responsive external-task-positive network may be sufficient. 
SIGNIFICANCE STATEMENT We studied the relationship between responsiveness of the brain to increasing task demand and successful cognitive performance, using chronic pain patients as a probe. fMRI working memory studies show that two main cognitive networks [“external-task positive” and “default-mode network” (DMN)] are responsive to increasing task difficulty. The responsiveness of both of these brain networks is suggested to be required for successful task performance. The responsiveness of external-task-positive regions has been linked directly to successful cognitive task performance, as we also show here. However, pain patients show decreased engagement and responsiveness of the DMN but can perform a working memory task as well as healthy subjects, without demonstrable compensatory neural recruitment. Therefore, a responsive DMN might not be needed for successful cognitive performance. PMID:26290236
Is a Responsive Default Mode Network Required for Successful Working Memory Task Performance?

PubMed

Čeko, Marta; Gracely, John L; Fitzcharles, Mary-Ann; Seminowicz, David A; Schweinhardt, Petra; Bushnell, M Catherine

2015-08-19

In studies of cognitive processing using tasks with externally directed attention, regions showing increased (external-task-positive) and decreased or "negative" [default-mode network (DMN)] fMRI responses during task performance are dynamically responsive to increasing task difficulty. Responsiveness (modulation of fMRI signal by increasing load) has been linked directly to successful cognitive task performance in external-task-positive regions but not in DMN regions. To investigate whether a responsive DMN is required for successful cognitive performance, we compared healthy human subjects (n = 23) with individuals shown to have decreased DMN engagement (chronic pain patients, n = 28). Subjects performed a multilevel working-memory task (N-back) during fMRI. If a responsive DMN is required for successful performance, patients having reduced DMN responsiveness should show worsened performance; if performance is not reduced, their brains should show compensatory activation in external-task-positive regions or elsewhere. All subjects showed decreased accuracy and increased reaction times with increasing task level, with no significant group differences on either measure at any level. Patients had significantly reduced negative fMRI response (deactivation) of DMN regions (posterior cingulate/precuneus, medial prefrontal cortex). Controls showed expected modulation of DMN deactivation with increasing task difficulty. Patients showed significantly reduced modulation of DMN deactivation by task difficulty, despite their successful task performance. We found no evidence of compensatory neural recruitment in external-task-positive regions or elsewhere. Individual responsiveness of the external-task-positive ventrolateral prefrontal cortex, but not of DMN regions, correlated with task accuracy. These findings suggest that a responsive DMN may not be required for successful cognitive performance; a responsive external-task-positive network may be sufficient. 
We studied the relationship between responsiveness of the brain to increasing task demand and successful cognitive performance, using chronic pain patients as a probe. fMRI working memory studies show that two main cognitive networks ["external-task positive" and "default-mode network" (DMN)] are responsive to increasing task difficulty. The responsiveness of both of these brain networks is suggested to be required for successful task performance. The responsiveness of external-task-positive regions has been linked directly to successful cognitive task performance, as we also show here. However, pain patients show decreased engagement and responsiveness of the DMN but can perform a working memory task as well as healthy subjects, without demonstrable compensatory neural recruitment. Therefore, a responsive DMN might not be needed for successful cognitive performance.
GPU color space conversion

NASA Astrophysics Data System (ADS)

Chase, Patrick; Vondran, Gary

2011-01-01

Tetrahedral interpolation is commonly used to implement continuous color space conversions from sparse 3D and 4D lookup tables. We investigate the implementation and optimization of tetrahedral interpolation algorithms for GPUs, and compare to the best known CPU implementations as well as to a well-known GPU-based trilinear implementation. We show that a $500 NVIDIA GTX-580 GPU is 3x faster than a $1000 Intel Core i7 980X CPU for 3D interpolation, and 9x faster for 4D interpolation. Performance-relevant GPU attributes are explored including thread scheduling, local memory characteristics, global memory hierarchy, and cache behaviors. We consider existing tetrahedral interpolation algorithms and tune based on the structure and branching capabilities of current GPUs. Global memory performance is improved by reordering and expanding the lookup table to ensure optimal access behaviors. Per-multiprocessor local memory is exploited to implement optimally coalesced global memory accesses, and local memory addressing is optimized to minimize bank conflicts. We explore the impacts of lookup table density upon computation and memory access costs. Also presented are CPU-based 3D and 4D interpolators, using SSE vector operations that are faster than any previously published solution.
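For readers unfamiliar with the technique, the following is a minimal CPU sketch of 3D tetrahedral interpolation on a regular lookup table. It is not the authors' GPU implementation; the function name `tetra_interp`, the unit grid spacing, and the scalar-valued table are assumptions made for illustration.

```python
def tetra_interp(lut, x, y, z):
    """Tetrahedral interpolation in a cubic table lut[i][j][k] (hypothetical layout).

    Assumes unit grid spacing, scalar entries, and 0 <= x, y, z <= n - 1.
    """
    n = len(lut)

    def split(v):
        # Integer cell index (clamped so the cell stays inside the table)
        # and fractional offset within that cell.
        i = min(int(v), n - 2)
        return i, v - i

    (i, fx), (j, fy), (k, fz) = split(x), split(y), split(z)

    # Sorting the fractional offsets in descending order selects one of the
    # six tetrahedra that partition the cell.
    order = sorted([(fx, 0), (fy, 1), (fz, 2)], key=lambda t: -t[0])

    corner = [i, j, k]
    prev = lut[i][j][k]
    result = prev
    # Walk from the cell origin to the opposite corner one axis at a time,
    # weighting each step by the fractional offset along that axis.
    for frac, axis in order:
        corner[axis] += 1
        cur = lut[corner[0]][corner[1]][corner[2]]
        result += frac * (cur - prev)
        prev = cur
    return result


# A table sampled from a linear function is reproduced exactly.
table = [[[2 * a + 3 * b + 5 * c + 1 for c in range(3)]
          for b in range(3)] for a in range(3)]
value = tetra_interp(table, 0.25, 1.5, 0.75)
```

Only four table entries are read per lookup, compared with eight for trilinear interpolation, which is one reason the method is attractive when memory access dominates the cost.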
Execution time supports for adaptive scientific algorithms on distributed memory machines

NASA Technical Reports Server (NTRS)

Berryman, Harry; Saltz, Joel; Scroggs, Jeffrey

1990-01-01

Optimizations are considered that are required for efficient execution of code segments that consist of loops over distributed data structures. The PARTI (Parallel Automated Runtime Toolkit at ICASE) execution time primitives are designed to carry out these optimizations and can be used to implement a wide range of scientific algorithms on distributed memory machines. These primitives allow the user to control array mappings in a way that gives an appearance of shared memory. Computations can be based on a global index set. Primitives are used to carry out gather and scatter operations on distributed arrays. Communication patterns are derived at runtime, and the appropriate send and receive messages are automatically generated.

Content addressable memory project

NASA Technical Reports Server (NTRS)

Hall, Josh; Levy, Saul; Smith, D.; Wei, S.; Miyake, K.; Murdocca, M.

1991-01-01

The progress on the Rutgers CAM (Content Addressable Memory) Project is described. The overall design of the system is completed at the architectural level and described. The machine is composed of two kinds of cells: (1) the CAM cells, which include both memory and processor and support local processing within each cell; and (2) the tree cells, which have a smaller instruction set and provide global processing over the CAM cells. A parameterized design of the basic CAM cell is completed. Progress was made on the final specification of the CPS. The machine architecture was driven by the design of algorithms whose requirements are reflected in the resulting instruction set(s). A few of these algorithms are described.
Control of Finite-State, Finite Memory Stochastic Systems

NASA Technical Reports Server (NTRS)

Sandell, Nils R.

1974-01-01

A generalized problem of stochastic control is discussed in which multiple controllers with different data bases are present. The vehicle for the investigation is the finite state, finite memory (FSFM) stochastic control problem. Optimality conditions are obtained by deriving an equivalent deterministic optimal control problem. A FSFM minimum principle is obtained via the equivalent deterministic problem. The minimum principle suggests the development of a numerical optimization algorithm, the min-H algorithm. The relationship between the sufficiency of the minimum principle and the informational properties of the problem is investigated. A problem of hypothesis testing with 1-bit memory is investigated to illustrate the application of control theoretic techniques to information processing problems.
Vascular system modeling in parallel environment - distributed and shared memory approaches

PubMed Central

Jurczuk, Krzysztof; Kretowski, Marek; Bezy-Wendling, Johanne

2011-01-01

The paper presents two approaches in parallel modeling of vascular system development in internal organs. In the first approach, new parts of tissue are distributed among processors and each processor is responsible for perfusing its assigned parts of tissue to all vascular trees. Communication between processors is accomplished by passing messages and therefore this algorithm is perfectly suited for distributed memory architectures. The second approach is designed for shared memory machines. It parallelizes the perfusion process during which individual processing units perform calculations concerning different vascular trees. The experimental results, performed on a computing cluster and multi-core machines, show that both algorithms provide a significant speedup. PMID:21550891
On the Suitability of Suffix Arrays for Lempel-Ziv Data Compression

NASA Astrophysics Data System (ADS)

Ferreira, Artur J.; Oliveira, Arlindo L.; Figueiredo, Mário A. T.

Lossless compression algorithms of the Lempel-Ziv (LZ) family are widely used nowadays. Regarding time and memory requirements, LZ encoding is much more demanding than decoding. In order to speed up the encoding process, efficient data structures, like suffix trees, have been used. In this paper, we explore the use of suffix arrays to hold the dictionary of the LZ encoder, and propose an algorithm to search over it. We show that the resulting encoder attains roughly the same compression ratios as those based on suffix trees. However, the amount of memory required by the suffix array is fixed, and much lower than the variable amount of memory used by encoders based on suffix trees (which depends on the text to encode). We conclude that suffix arrays, when compared to suffix trees in terms of the trade-off among time, memory, and compression ratio, may be preferable in scenarios (e.g., embedded systems) where memory is at a premium and high speed is not critical.
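As a rough illustration of how a suffix array can serve as an LZ dictionary, the sketch below builds a suffix array with a naive sort and finds the longest dictionary match for a lookahead buffer by binary search. This is not the search procedure proposed in the paper; the function names and the naive construction are assumptions chosen for brevity.

```python
def build_suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix text.
    # Fine for a sketch; real encoders would use a faster construction.
    return sorted(range(len(text)), key=lambda i: text[i:])


def common_prefix_len(a, b):
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def longest_match(dictionary, sa, lookahead):
    """Return (position, length) of the longest prefix of `lookahead`
    that occurs in `dictionary`, or (-1, 0) if nothing matches."""
    # Binary search for where `lookahead` would be inserted among the
    # lexicographically sorted suffixes.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if dictionary[sa[mid]:] < lookahead:
            lo = mid + 1
        else:
            hi = mid
    # The longest common prefix is attained by a neighbor of that point.
    best_pos, best_len = -1, 0
    for idx in (lo - 1, lo):
        if 0 <= idx < len(sa):
            length = common_prefix_len(dictionary[sa[idx]:], lookahead)
            if length > best_len:
                best_pos, best_len = sa[idx], length
    return best_pos, best_len


dictionary = "abracadabra"
sa = build_suffix_array(dictionary)
pos, length = longest_match(dictionary, sa, "abrax")
```

The suffix array itself is one integer per dictionary symbol, which is consistent with the fixed memory footprint the abstract describes.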
An Implicit Algorithm for the Numerical Simulation of Shape-Memory Alloys

DOE Office of Scientific and Technical Information (OSTI.GOV)

Becker, R; Stolken, J; Jannetti, C

Shape-memory alloys (SMA) have the potential to be used in a variety of interesting applications due to their unique properties of pseudoelasticity and the shape-memory effect. However, in order to design SMA devices efficiently, a physics-based constitutive model is required to accurately simulate the behavior of shape-memory alloys. The scope of this work is to extend the numerical capabilities of the SMA constitutive model developed by Jannetti et al. (2003) to handle large-scale polycrystalline simulations. The constitutive model is implemented within the finite-element software ABAQUS/Standard using a user defined material subroutine, or UMAT. To improve the efficiency of the numerical simulations, so that polycrystalline specimens of shape-memory alloys can be modeled, a fully implicit algorithm has been implemented to integrate the constitutive equations. Using an implicit integration scheme increases the efficiency of the UMAT over the previously implemented explicit integration method by a factor of more than 100 for single crystal simulations.
Distributed Saturation

NASA Technical Reports Server (NTRS)

Chung, Ming-Ying; Ciardo, Gianfranco; Siminiceanu, Radu I.

2007-01-01

The Saturation algorithm for symbolic state-space generation has been a recent breakthrough in the exhaustive verification of complex systems, in particular globally-asynchronous/locally-synchronous systems. The algorithm uses a very compact Multiway Decision Diagram (MDD) encoding for states and the fastest symbolic exploration algorithm to date. The distributed version of Saturation uses the overall memory available on a network of workstations (NOW) to efficiently spread the memory load during the highly irregular exploration. A crucial factor in limiting the memory consumption during the symbolic state-space generation is the ability to perform garbage collection to free up the memory occupied by dead nodes. However, garbage collection over a NOW requires a nontrivial communication overhead. In addition, operation cache policies become critical while analyzing large-scale systems using the symbolic approach. In this technical report, we develop a garbage collection scheme and several operation cache policies to help solve extremely complex systems. Experiments show that our schemes improve the performance of the original distributed implementation, SmArTNow, in terms of time and memory efficiency.
Workflow of the Grover algorithm simulation incorporating CUDA and GPGPU

NASA Astrophysics Data System (ADS)

Lu, Xiangwen; Yuan, Jiabin; Zhang, Weiwei

2013-09-01

The Grover quantum search algorithm, one of only a few representative quantum algorithms, can speed up many classical algorithms that use search heuristics. No true quantum computer has yet been developed. For the present, simulation is one effective means of verifying the search algorithm. In this work, we focus on the simulation workflow using a compute unified device architecture (CUDA). Two simulation workflow schemes are proposed. These schemes combine the characteristics of the Grover algorithm and the parallelism of general-purpose computing on graphics processing units (GPGPU). We also analyzed the optimization of memory space and memory access from this perspective. We implemented four programs on CUDA to evaluate the performance of schemes and optimization. Through experimentation, we analyzed the organization of threads suited to Grover algorithm simulations, compared the storage costs of the four programs, and validated the effectiveness of optimization. Experimental results also showed that the distinguished program on CUDA outperformed the serial program of libquantum on a CPU with a speedup of up to 23 times (12 times on average), depending on the scale of the simulation.
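To make concrete what such a simulation computes, here is a minimal serial state-vector sketch of Grover search in plain Python. It is not one of the CUDA programs evaluated in the paper, and the function name `grover_probabilities` is hypothetical; the oracle and diffusion steps shown are the element-wise operations that a GPU version would spread across threads.

```python
import math


def grover_probabilities(n_qubits, marked):
    """Simulate Grover search for a single marked basis state.

    Returns the measurement probability of every basis state after
    floor(pi/4 * sqrt(N)) iterations, where N = 2**n_qubits.
    """
    size = 1 << n_qubits
    # Start in the uniform superposition (all amplitudes real here).
    amp = [1.0 / math.sqrt(size)] * size
    iterations = int(math.floor(math.pi / 4 * math.sqrt(size)))
    for _ in range(iterations):
        # Oracle: flip the sign of the marked state's amplitude.
        amp[marked] = -amp[marked]
        # Diffusion: reflect every amplitude about the mean amplitude.
        mean = sum(amp) / size
        amp = [2 * mean - a for a in amp]
    return [a * a for a in amp]


probs = grover_probabilities(3, marked=5)
```

The state vector holds 2^n amplitudes, so memory grows exponentially with the number of qubits, which is why memory space and access patterns matter in this kind of simulation.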
Epidemic failure detection and consensus for extreme parallelism

DOE PAGES

Katti, Amogh; Di Fatta, Giuseppe; Naughton, Thomas; ...

2017-02-01

Future extreme-scale high-performance computing systems will be required to work under frequent component failures. The MPI Forum's User Level Failure Mitigation proposal has introduced an operation, MPI_Comm_shrink, to synchronize the alive processes on the list of failed processes, so that applications can continue to execute even in the presence of failures by adopting algorithm-based fault tolerance techniques. This MPI_Comm_shrink operation requires a failure detection and consensus algorithm. This paper presents three novel failure detection and consensus algorithms using Gossiping. The proposed algorithms were implemented and tested using the Extreme-scale Simulator. The results show that in all algorithms the number of Gossip cycles to achieve global consensus scales logarithmically with system size. The second algorithm also shows better scalability in terms of memory and network bandwidth usage and a perfect synchronization in achieving global consensus. The third approach is a three-phase distributed failure detection and consensus algorithm and provides consistency guarantees even in very large and extreme-scale systems while at the same time being memory and bandwidth efficient.
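The logarithmic scaling of gossip cycles can be illustrated with a toy simulation. The sketch below is a generic push-gossip model, not any of the three algorithms from the paper, and it does not model failures or consensus; `gossip_cycles` is a hypothetical name and the uniform random peer selection is an assumption.

```python
import random


def gossip_cycles(n, rng):
    """Count push-gossip cycles until a rumor started at node 0 reaches all n nodes.

    In each cycle, every node that already holds the rumor sends it to one
    peer chosen uniformly at random.
    """
    informed = [False] * n
    informed[0] = True
    count, cycles = 1, 0
    while count < n:
        # Choose all targets first so that nodes informed during this
        # cycle do not start sending until the next cycle.
        targets = [rng.randrange(n) for node in range(n) if informed[node]]
        for t in targets:
            if not informed[t]:
                informed[t] = True
                count += 1
        cycles += 1
    return cycles


rng = random.Random(0)
cycles_small = gossip_cycles(64, rng)
cycles_large = gossip_cycles(1024, rng)
```

Because the informed set can at most double per cycle, at least log2(n) cycles are needed; in this simple model the typical count stays within a small constant factor of that bound.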
A comparison of PCA/ICA for data preprocessing in remote sensing imagery classification

NASA Astrophysics Data System (ADS)

He, Hui; Yu, Xianchuan

2005-10-01

In this paper a performance comparison of a variety of data preprocessing algorithms in remote sensing image classification is presented. The selected algorithms are principal component analysis (PCA) and three different independent component analysis (ICA) algorithms: Fast-ICA (Aapo Hyvarinen, 1999), Kernel-ICA (KCCA and KGV; Bach & Jordan, 2002), and EFFICA (Aiyou Chen & Peter Bickel, 2003). These algorithms were applied to a remote sensing image (1600×1197) obtained from Shunyi, Beijing. For classification, an MLC method is used for the raw and preprocessed data. The results show that classification with the preprocessed data gives more confident results than classification with raw data and that, among the preprocessing algorithms, the ICA algorithms improve on PCA and EFFICA performs better than the others. The convergence of these ICA algorithms (for more than a million data points) is also studied; the results show that EFFICA converges much faster than the others. Furthermore, because EFFICA is a one-step maximum likelihood estimate (MLE) that reaches asymptotic Fisher efficiency (EFFICA), its computation is quite small and its memory demand is greatly reduced, which resolved the "out of memory" problem that occurred with the other algorithms.
MOSFET analog memory circuit achieves long duration signal storage

NASA Technical Reports Server (NTRS)

1966-01-01

Memory circuit maintains the signal voltage at the output of an analog signal amplifier when the input signal is interrupted or removed. The circuit uses MOSFET /Metal Oxide Semiconductor Field Effect Transistor/ devices as voltage-controlled switches, triggered by an external voltage-sensing device.
What–where–when memory and encoding strategies in healthy aging

PubMed Central

2016-01-01

Older adults exhibit disproportionate impairments in memory for item-associations. These impairments may stem from an inability to self-initiate deep encoding strategies. The present study investigates this using the “treasure-hunt task”; a what–where–when style episodic memory test that requires individuals to “hide” items around complex scenes. This task separately assesses memory for item, location, and temporal order, as well as bound what–where–when information. The results suggest that older adults are able to ameliorate integration memory deficits by using self-initiated encoding strategies when these are externally located and therefore place reduced demands on working memory and attentional resources. PMID:26884230
A finite-state, finite-memory minimum principle, part 2

NASA Technical Reports Server (NTRS)

Sandell, N. R., Jr.; Athans, M.

1975-01-01

In part 1 of this paper, a minimum principle was found for the finite-state, finite-memory (FSFM) stochastic control problem. In part 2, conditions for the sufficiency of the minimum principle are stated in terms of the informational properties of the problem. This is accomplished by introducing the notion of a signaling strategy. Then a min-H algorithm based on the FSFM minimum principle is presented. This algorithm converges, after a finite number of steps, to a person-by-person extremal solution.
MCMAC-cVT: a novel on-line associative memory based CVT transmission control system.

PubMed

Ang, K K; Quek, C; Wahab, A

2002-03-01

This paper describes a novel application of an associative memory called the Modified Cerebellar Articulation Controller (MCMAC) (Int. J. Artif. Intell. Engng, 10 (1996) 135) in a continuous variable transmission (CVT) control system. It allows the on-line tuning of the associative memory and produces an effective gain-schedule for the automatic selection of the CVT gear ratio. Various control algorithms are investigated to control the CVT gear ratio to maintain the engine speed within a narrow range of efficient operating speed independently of the vehicle velocity. Extensive simulation results are presented to evaluate the control performance of a direct digital PID control algorithm with auto-tuning (Trans. ASME, 64 (1942)) and anti-windup mechanism. In particular, these results are contrasted against the control performance produced using the MCMAC (Int. J. Artif. Intell. Engng, 10 (1996) 135) with momentum, neighborhood learning and Averaged Trapezoidal Output (MCMAC-ATO) as the neural control algorithm for controlling the CVT. Simulation results are presented that show the reduced control fluctuations and improved learning capability of the MCMAC-ATO without incurring greater memory requirement. In particular, MCMAC-ATO is able to learn and control the CVT simultaneously while still maintaining acceptable control performance.
I/O efficient algorithms and applications in geographic information systems

NASA Astrophysics Data System (ADS)

Danner, Andrew

Modern remote sensing methods such as laser altimetry (lidar) and Interferometric Synthetic Aperture Radar (IfSAR) produce georeferenced elevation data at unprecedented rates. Many Geographic Information System (GIS) algorithms designed for terrain modelling applications cannot process these massive data sets. The primary problem is that these data sets are too large to fit in the main internal memory of modern computers and must therefore reside on larger, but considerably slower disks. In these applications, the transfer of data between disk and main memory, or I/O, becomes the primary bottleneck. Working in a theoretical model that more accurately represents this two-level memory hierarchy, we can develop algorithms that are I/O-efficient and reduce the amount of disk I/O needed to solve a problem. In this thesis we aim to modernize GIS algorithms and develop a number of I/O-efficient algorithms for processing geographic data derived from massive elevation data sets. For each application, we convert a geographic question to an algorithmic question, develop an I/O-efficient algorithm that is theoretically efficient, implement our approach and verify its performance using real-world data. The applications we consider include constructing a gridded digital elevation model (DEM) from an irregularly spaced point cloud, removing topological noise from a DEM, modeling surface water flow over a terrain, extracting river networks and watershed hierarchies from the terrain, and locating polygons containing query points in a planar subdivision. We initially developed solutions to each of these applications individually. However, we also show how to combine individual solutions to form a scalable geo-processing pipeline that seamlessly solves a sequence of sub-problems with little or no manual intervention. We present experimental results that demonstrate orders of magnitude improvement over previously known algorithms.
Usefulness of a single item in a mail survey to identify persons with possible dementia: a new strategy for finding high-risk elders.

PubMed

Brody, Kathleen K; Maslow, Katie; Perrin, Nancy A; Crooks, Valerie; DellaPenna, Richard; Kuang, Daniel

2005-04-01

The objective of this study was to examine the characteristics of elderly persons who responded positively to a question about "severe memory problems" on a mailed health questionnaire yet were missed by the existing health risk algorithm to identify vulnerable elderly persons. A total of 324,471 respondents aged 65 and older completed a primary care health status questionnaire that gathered clinical information to quickly identify members with functional impairment, multiple chronic diseases, and higher medical care needs. The respondents were part of a large, integrated, not-for-profit managed care organization that implemented a model of care for elders using a uniform risk identification method across eight regions. Respondents with severe memory problems were compared to general respondents by morbidity, geriatric syndromes, functional impairments, service utilization, sensory impairments, sociodemographic characteristics, and activities of daily living. Of the respondents, 13,902 persons (4.3%) reported severe memory problems; the existing health risk algorithm missed 47.1% of these. When severe memory problems were included in the risk algorithm, identification increased from 11% to 13%, and risk prevalence by age groups ranged from 4.4% to 40.5%; one third had severe memory problems, a finding that was fairly consistent within age groups (28.4% to 36.5%). A question about severe memory problems should be incorporated into population risk-identification techniques. While false-negative rates are unknown, the false-positive rate of a self-report mail survey appears to be minimal. Persons reporting severe memory problems clearly have multiple comorbidities, higher prevalence of geriatric syndromes, and greater functional and sensory impairments.
Efficient L1 regularization-based reconstruction for fluorescent molecular tomography using restarted nonlinear conjugate gradient.

PubMed

Shi, Junwei; Zhang, Bin; Liu, Fei; Luo, Jianwen; Bai, Jing

2013-09-15

For the ill-posed fluorescent molecular tomography (FMT) inverse problem, L1 regularization can preserve high-frequency information such as edges while effectively reducing image noise. However, the state-of-the-art L1 regularization-based algorithms for FMT reconstruction are expensive in memory, especially for large-scale problems. An efficient L1 regularization-based reconstruction algorithm based on nonlinear conjugate gradient with a restart strategy is proposed to increase the computational speed with low memory consumption. The reconstruction results from phantom experiments demonstrate that the proposed algorithm can obtain high spatial resolution and high signal-to-noise ratio, as well as high localization accuracy for fluorescence targets.
Parallel k-means++ for Multiple Shared-Memory Architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mackey, Patrick S.; Lewis, Robert R.

2016-09-22

In recent years k-means++ has become a popular initialization technique for improved k-means clustering. To date, most of the work done to improve its performance has involved parallelizing algorithms that are only approximations of k-means++. In this paper we present a parallelization of the exact k-means++ algorithm, with a proof of its correctness. We develop implementations for three distinct shared-memory architectures: multicore CPU, high performance GPU, and the massively multithreaded Cray XMT platform. We demonstrate the scalability of the algorithm on each platform. In addition we present a visual approach for showing which platform performed k-means++ the fastest for varying data sizes.
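For reference, the exact k-means++ seeding rule can be sketched sequentially as follows. This is not the parallel implementation described in the paper; the names `kmeanspp_seed` and `sq_dist` are hypothetical. The per-point distance update is the naturally data-parallel step, while the weighted selection requires a prefix sum or reduction.

```python
import random


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng):
    """Choose k initial centers with exact k-means++ (D^2) sampling."""
    centers = [points[rng.randrange(len(points))]]
    # d2[i] is the squared distance from point i to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        total = sum(d2)
        r = rng.random() * total
        # Fallback guards against floating-point edge cases.
        chosen = max(range(len(points)), key=d2.__getitem__)
        acc = 0.0
        for idx, w in enumerate(d2):
            acc += w
            if acc > r:  # point idx is selected with probability w / total
                chosen = idx
                break
        centers.append(points[chosen])
        # Data-parallel step: refresh nearest-center distances.
        d2 = [min(old, sq_dist(p, centers[-1])) for old, p in zip(d2, points)]
    return centers


points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
centers = kmeanspp_seed(points, 3, random.Random(42))
```

Points that already coincide with a center have zero weight and are never selected again, so the returned centers are distinct whenever the data contain at least k distinct points.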
Parallel language constructs for tensor product computations on loosely coupled architectures

NASA Technical Reports Server (NTRS)

Mehrotra, Piyush; Vanrosendale, John

1989-01-01

Distributed memory architectures offer high levels of performance and flexibility, but have proven awkward to program. Current languages for nonshared memory architectures provide a relatively low level programming environment, and are poorly suited to modular programming, and to the construction of libraries. A set of language primitives designed to allow the specification of parallel numerical algorithms at a higher level is described. Tensor product array computations are focused on along with a simple but important class of numerical algorithms. The problem of programming 1-D kernel routines is focused on first, such as parallel tridiagonal solvers, and then how such parallel kernels can be combined to form parallel tensor product algorithms is examined.
Artificial intelligence tools for pattern recognition

NASA Astrophysics Data System (ADS)

Acevedo, Elena; Acevedo, Antonio; Felipe, Federico; Avilés, Pedro

2017-06-01

In this work, we present a system for pattern recognition that combines the power of genetic algorithms for solving problems and the efficiency of morphological associative memories. We use a set of 48 tire prints divided into 8 brands of tires. The images have dimensions of 200 x 200 pixels. We applied the Hough transform to obtain lines as main features. The number of lines obtained is 449. The genetic algorithm reduces the number of features to ten suitable lines that yield 100% recognition. Morphological associative memories were used as the evaluation function. The selection algorithms were Tournament and Roulette wheel. For reproduction, we applied one-point, two-point and uniform crossover.
An improved algorithm for evaluating trellis phase codes

NASA Technical Reports Server (NTRS)

Mulligan, M. G.; Wilson, S. G.

1982-01-01

A method is described for evaluating the minimum distance parameters of trellis phase codes, including CPFSK, partial response FM, and more importantly, coded CPM (continuous phase modulation) schemes. The algorithm provides dramatically faster execution times and lesser memory requirements than previous algorithms. Results of sample calculations and timing comparisons are included.

An improved algorithm for evaluating trellis phase codes

NASA Technical Reports Server (NTRS)

Mulligan, M. G.; Wilson, S. G.

1984-01-01

A method is described for evaluating the minimum distance parameters of trellis phase codes, including CPFSK, partial response FM, and more importantly, coded CPM (continuous phase modulation) schemes. The algorithm provides dramatically faster execution times and lesser memory requirements than previous algorithms. Results of sample calculations and timing comparisons are included.
A Neural Network Model of Retrieval-Induced Forgetting

ERIC Educational Resources Information Center

Norman, Kenneth A.; Newman, Ehren L.; Detre, Greg

2007-01-01

Retrieval-induced forgetting (RIF) refers to the finding that retrieving a memory can impair subsequent recall of related memories. Here, the authors present a new model of how the brain gives rise to RIF in both semantic and episodic memory. The core of the model is a recently developed neural network learning algorithm that leverages regular…
Effects of cacheing on multitasking efficiency and programming strategy on an ELXSI 6400

DOE Office of Scientific and Technical Information (OSTI.GOV)

Montry, G.R.; Benner, R.E.

1985-12-01

The impact of a cache/shared memory architecture, and, in particular, the cache coherency problem, upon concurrent algorithm and program development is discussed. In this context, a simple set of programming strategies is proposed that streamlines code development and improves code performance when multitasking in a cache/shared memory or distributed memory environment.
Multiple Memory Stores and Operant Conditioning: A Rationale for Memory's Complexity

ERIC Educational Resources Information Center

Meeter, Martijn; Veldkamp, Rob; Jin, Yaochu

2009-01-01

Why does the brain contain more than one memory system? Genetic algorithms can play a role in elucidating this question. Here, model animals were constructed containing a dorsal striatal layer that controlled actions, and a ventral striatal layer that controlled a dopaminergic learning signal. Both layers could gain access to three modeled memory…
A distributed-memory approximation algorithm for maximum weight perfect bipartite matching

DOE Office of Scientific and Technical Information (OSTI.GOV)

Azad, Ariful; Buluc, Aydin; Li, Xiaoye S.

We design and implement an efficient parallel approximation algorithm for the problem of maximum weight perfect matching in bipartite graphs, i.e. the problem of finding a set of non-adjacent edges that covers all vertices and has maximum weight. This problem differs from the maximum weight matching problem, for which scalable approximation algorithms are known. It is primarily motivated by finding good pivots in scalable sparse direct solvers before factorization where sequential implementations of maximum weight perfect matching algorithms, such as those available in MC64, are widely used due to the lack of scalable alternatives. To overcome this limitation, we propose a fully parallel distributed memory algorithm that first generates a perfect matching and then searches for weight-augmenting cycles of length four in parallel and iteratively augments the matching with a vertex disjoint set of such cycles. For most practical problems the weights of the perfect matchings generated by our algorithm are very close to the optimum. An efficient implementation of the algorithm scales up to 256 nodes (17,408 cores) on a Cray XC40 supercomputer and can solve instances that are too large to be handled by a single node using the sequential algorithm.
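The idea of a weight-augmenting cycle of length four can be shown with a small sequential sketch. This is not the authors' distributed algorithm: the function name `augment_with_4cycles`, the dense weight matrix with `None` for missing edges, and the greedy one-cycle-at-a-time order are all assumptions made for illustration.

```python
def augment_with_4cycles(w, mate):
    """Improve a perfect matching by swapping partners along 4-cycles.

    w[i][j] is the weight of edge (row i, column j), or None if absent.
    mate[i] is the column currently matched to row i.
    """
    n = len(mate)
    mate = list(mate)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            for k in range(i + 1, n):
                ji, jk = mate[i], mate[k]
                cross_i, cross_k = w[i][jk], w[k][ji]
                if cross_i is None or cross_k is None:
                    continue  # the 4-cycle does not exist in the graph
                # Gain from replacing edges (i, ji), (k, jk)
                # with edges (i, jk), (k, ji).
                gain = cross_i + cross_k - w[i][ji] - w[k][jk]
                if gain > 0:
                    mate[i], mate[k] = jk, ji
                    improved = True
    return mate


weights = [[1, 2, 3],
           [3, 1, 2],
           [2, 3, 1]]
result = augment_with_4cycles(weights, [0, 1, 2])
total = sum(weights[i][result[i]] for i in range(3))
```

Each swap keeps the matching perfect and strictly increases its weight, so the loop terminates; the result is a local optimum with respect to 4-cycles, which need not be the global optimum in general.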
Benchmarking Memory Performance with the Data Cube Operator

NASA Technical Reports Server (NTRS)

Frumkin, Michael A.; Shabanov, Leonid V.

2004-01-01

Data movement across a computer memory hierarchy and across computational grids is known to be a limiting factor for applications processing large data sets. We use the Data Cube Operator on an Arithmetic Data Set, called ADC, to benchmark capabilities of computers and of computational grids to handle large distributed data sets. We present a prototype implementation of a parallel algorithm for computation of the operator. The algorithm follows a known approach for computing views from the smallest parent. The ADC stresses all levels of grid memory and storage by producing some of the 2^d views of an Arithmetic Data Set of d-tuples described by a small number of integers. We control data intensity of the ADC by selecting the tuple parameters, the sizes of the views, and the number of realized views. Benchmarking results of memory performance of a number of computer architectures and of a small computational grid are presented.
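The "smallest parent" approach can be sketched in a few lines: each view (a group-by over a subset of the d dimensions) is aggregated from the smallest already-computed view that has exactly one more dimension. This is a serial illustration, not the benchmark's parallel implementation; the function name `data_cube`, the sum aggregate, and the input layout are assumptions.

```python
from itertools import combinations


def data_cube(tuples, d):
    """Compute all 2**d group-by views with a sum aggregate.

    tuples: iterable of (key, measure), where key is a tuple of length d.
    Returns {dims: {sub_key: total}}, with dims a sorted tuple of
    dimension indices.
    """
    full = tuple(range(d))
    base = {}
    for key, measure in tuples:
        base[key] = base.get(key, 0) + measure
    views = {full: base}
    # Work from views with many dimensions down to the empty grouping.
    for size in range(d - 1, -1, -1):
        for dims in combinations(range(d), size):
            parents = [p for p in views
                       if len(p) == size + 1 and set(dims) <= set(p)]
            # Aggregate from the parent with the fewest rows.
            parent = min(parents, key=lambda p: len(views[p]))
            keep = [parent.index(axis) for axis in dims]
            view = {}
            for key, measure in views[parent].items():
                sub = tuple(key[i] for i in keep)
                view[sub] = view.get(sub, 0) + measure
            views[dims] = view
    return views


rows = [((1, 1), 10), ((1, 2), 5), ((2, 1), 7)]
cube = data_cube(rows, 2)
```

Choosing the smallest parent reduces the number of rows scanned per view, which matters when the views are too large to stay in fast memory.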
Inductive reasoning and implicit memory: evidence from intact and impaired memory systems.

PubMed

Girelli, Luisa; Semenza, Carlo; Delazer, Margarete

2004-01-01

In this study, we modified a classic problem solving task, number series completion, in order to explore the contribution of implicit memory to inductive reasoning. Participants were required to complete number series sharing the same underlying algorithm (e.g., +2), differing in both constituent elements (e.g., 2, 4, 6, 8 versus 5, 7, 9, 11) and correct answers (e.g., 10 versus 13). In Experiment 1, reliable priming effects emerged, whether primes and targets were separated by four or ten fillers. Experiment 2 provided direct evidence that the observed facilitation arises at central stages of problem solving, namely the identification of the algorithm and its subsequent extrapolation. The observation of analogous priming effects in a severely amnesic patient strongly supports the hypothesis that the facilitation in number series completion was largely determined by implicit memory processes. These findings demonstrate that the influence of implicit processes extends to higher-level cognitive domains such as inductive reasoning.
Reed Solomon codes for error control in byte organized computer memory systems

NASA Technical Reports Server (NTRS)

Lin, S.; Costello, D. J., Jr.

1984-01-01

A problem in designing semiconductor memories is to provide some measure of error control without requiring excessive coding overhead or decoding time. In LSI and VLSI technology, memories are often organized on a multiple bit (or byte) per chip basis. For example, some 256K-bit DRAMs are organized in 32Kx8 bit-bytes. Byte-oriented codes such as Reed Solomon (RS) codes can provide efficient low-overhead error control for such memories. However, the standard iterative algorithm for decoding RS codes is too slow for these applications. Some special decoding techniques for extended single-and-double-error-correcting RS codes which are capable of high speed operation are presented. These techniques are designed to find the error locations and the error values directly from the syndrome without having to use the iterative algorithm to find the error locator polynomial.
On improving linear solver performance: a block variant of GMRES

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, A H; Dennis, J M; Jessup, E R

2004-05-10

The increasing gap between processor performance and memory access time warrants the re-examination of data movement in iterative linear solver algorithms. For this reason, we explore and establish the feasibility of modifying a standard iterative linear solver algorithm in a manner that reduces the movement of data through memory. In particular, we present an alternative to the restarted GMRES algorithm for solving a single right-hand side linear system Ax = b based on solving the block linear system AX = B. Algorithm performance, i.e. time to solution, is improved by using the matrix A in operations on groups of vectors. Experimental results demonstrate the importance of implementation choices on data movement as well as the effectiveness of the new method on a variety of problems from different application areas.
A generalized LSTM-like training algorithm for second-order recurrent neural networks

PubMed Central

Monner, Derek; Reggia, James A.

2011-01-01

The Long Short Term Memory (LSTM) is a second-order recurrent neural network architecture that excels at storing sequential short-term memories and retrieving them many time-steps later. LSTM’s original training algorithm provides the important properties of spatial and temporal locality, which are missing from other training approaches, at the cost of limiting its applicability to a small set of network architectures. Here we introduce the Generalized Long Short-Term Memory (LSTM-g) training algorithm, which provides LSTM-like locality while being applicable without modification to a much wider range of second-order network architectures. With LSTM-g, all units have an identical set of operating instructions for both activation and learning, subject only to the configuration of their local environment in the network; this is in contrast to the original LSTM training algorithm, where each type of unit has its own activation and training instructions. When applied to LSTM architectures with peephole connections, LSTM-g takes advantage of an additional source of back-propagated error which can enable better performance than the original algorithm. Enabled by the broad architectural applicability of LSTM-g, we demonstrate that training recurrent networks engineered for specific tasks can produce better results than single-layer networks. We conclude that LSTM-g has the potential to both improve the performance and broaden the applicability of spatially and temporally local gradient-based training algorithms for recurrent neural networks. PMID:21803542
A Damping Grid Strapdown Inertial Navigation System Based on a Kalman Filter for Ships in Polar Regions.

PubMed

Huang, Weiquan; Fang, Tao; Luo, Li; Zhao, Lin; Che, Fengzhu

2017-07-03

The grid strapdown inertial navigation system (SINS) used in polar navigation exhibits the same three kinds of periodic oscillation errors as a common SINS based on a geographic coordinate system. For ships that have external information available to reset the system regularly, suppressing the Schuler periodic oscillation is an effective way to enhance navigation accuracy. A Kalman filter based on the grid SINS error model applicable to ships is established in this paper. The errors of the grid-level attitude angles can be accurately estimated when the external velocity contains a constant error, and correcting these errors through feedback can effectively dampen the Schuler periodic oscillation. The simulation results show that with the aid of external reference velocity, the proposed external level damping algorithm based on the Kalman filter can suppress the Schuler periodic oscillation effectively. Compared with the traditional external level damping algorithm based on the damping network, the algorithm proposed in this paper can reduce the overshoot errors when the state of grid SINS is switched from the non-damping state to the damping state, and this effectively improves the navigation accuracy of the system.
Solitonic Josephson-based meminductive systems

DOE PAGES

Guarcello, Claudio; Solinas, Paolo; Di Ventra, Massimiliano; ...

2017-04-24

Memristors, memcapacitors, and meminductors represent an innovative generation of circuit elements whose properties depend on the state and history of the system. The hysteretic behavior of one of their constituent variables is their distinctive fingerprint. This feature endows them with the ability to store and process information on the same physical location, a property that is expected to benefit many applications ranging from unconventional computing to adaptive electronics to robotics. Therefore, it is important to find appropriate memory elements that combine a wide range of memory states, long memory retention times, and protection against unavoidable noise. Although several physical systems belong to the general class of memelements, few of them combine these important physical features in a single component. Here in this paper, we demonstrate theoretically a superconducting memory based on solitonic long Josephson junctions. Moreover, since solitons are at the core of its operation, this system provides an intrinsic topological protection against external perturbations. We show that the Josephson critical current behaves hysteretically as an external magnetic field is properly swept. Accordingly, long Josephson junctions can be used as multi-state memories, with a controllable number of available states, and in other emerging areas such as memcomputing, i.e., computing directly in/by the memory.
Solitonic Josephson-based meminductive systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Guarcello, Claudio; Solinas, Paolo; Di Ventra, Massimiliano

Memristors, memcapacitors, and meminductors represent an innovative generation of circuit elements whose properties depend on the state and history of the system. The hysteretic behavior of one of their constituent variables is their distinctive fingerprint. This feature endows them with the ability to store and process information on the same physical location, a property that is expected to benefit many applications ranging from unconventional computing to adaptive electronics to robotics. Therefore, it is important to find appropriate memory elements that combine a wide range of memory states, long memory retention times, and protection against unavoidable noise. Although several physical systems belong to the general class of memelements, few of them combine these important physical features in a single component. Here in this paper, we demonstrate theoretically a superconducting memory based on solitonic long Josephson junctions. Moreover, since solitons are at the core of its operation, this system provides an intrinsic topological protection against external perturbations. We show that the Josephson critical current behaves hysteretically as an external magnetic field is properly swept. Accordingly, long Josephson junctions can be used as multi-state memories, with a controllable number of available states, and in other emerging areas such as memcomputing, i.e., computing directly in/by the memory.
Parametric dense stereovision implementation on a system-on chip (SoC).

PubMed

Gardel, Alfredo; Montejo, Pablo; García, Jorge; Bravo, Ignacio; Lázaro, José L

2012-01-01

This paper proposes a novel hardware implementation of a dense recovery of stereovision 3D measurements. Traditionally 3D stereo systems have imposed the maximum number of stereo correspondences, introducing a large restriction on artificial vision algorithms. The proposed system-on-chip (SoC) provides great performance and efficiency, with a scalable architecture available for many different situations, addressing real time processing of stereo image flow. Using double buffering techniques properly combined with pipelined processing, the use of reconfigurable hardware achieves a parametrisable SoC which gives the designer the opportunity to decide its right dimension and features. The proposed architecture does not need any external memory because the processing is done as image flow arrives. Our SoC provides 3D data directly without the storage of whole stereo images. Our goal is to obtain high processing speed while maintaining the accuracy of 3D data using minimum resources. Configurable parameters may be controlled by later/parallel stages of the vision algorithm executed on an embedded processor. Considering hardware FPGA clock of 100 MHz, image flows up to 50 frames per second (fps) of dense stereo maps of more than 30,000 depth points could be obtained considering 2 Mpix images, with a minimum initial latency. The implementation of computer vision algorithms on reconfigurable hardware, explicitly low level processing, opens up the prospect of its use in autonomous systems, and they can act as a coprocessor to reconstruct 3D images with high density information in real time.
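The quoted figures can be cross-checked with a line of arithmetic. The sketch below assumes that "2 Mpix" means 2,000,000 pixels per image and that the pipeline consumes the image flow one pixel at a time; neither assumption is stated in the abstract, and the function name is hypothetical.

```python
def cycles_per_pixel(clock_hz, pixels_per_frame, frames_per_second):
    """Clock cycles available per incoming pixel at a given frame rate."""
    pixel_rate = pixels_per_frame * frames_per_second  # pixels per second
    return clock_hz / pixel_rate


budget = cycles_per_pixel(100_000_000, 2_000_000, 50)
```

Under these assumptions the budget works out to one clock cycle per pixel, which is consistent with a fully pipelined design that processes the image flow as it arrives and needs no external frame memory.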
Exploring the Effect of Sleep and Reduced Interference on Different Forms of Declarative Memory

PubMed Central

Schönauer, Monika; Pawlizki, Annedore; Köck, Corinna; Gais, Steffen

2014-01-01

Study Objectives: Many studies have found that sleep benefits declarative memory consolidation. However, fundamental questions on the specifics of this effect remain topics of discussion. It is not clear which forms of memory are affected by sleep and whether this beneficial effect is partly mediated by passive protection against interference. Moreover, a putative correlation between the structure of sleep and its memory-enhancing effects is still being discussed. Design: In three experiments, we tested whether sleep differentially affects various forms of declarative memory. We varied verbal content (verbal/nonverbal), item type (single/associate), and recall mode (recall/recognition, cued/free recall) to examine the effect of sleep on specific memory subtypes. We compared within-subject differences in memory consolidation between intervals including sleep, active wakefulness, or quiet meditation, which reduced external as well as internal interference and rehearsal. Participants: Forty healthy adults aged 18–30 y, and 17 healthy adults aged 24–55 y with extensive meditation experience participated in the experiments. Results: All types of memory were enhanced by sleep if the sample size provided sufficient statistical power. Smaller sample sizes showed an effect of sleep if a combined measure of different declarative memory scales was used. In a condition with reduced external and internal interference, performance was equal to one with high interference. Here, memory consolidation was significantly lower than in a sleep condition. We found no correlation between sleep structure and memory consolidation. Conclusions: Sleep does not preferentially consolidate a specific kind of declarative memory, but consistently promotes overall declarative memory formation. This effect is not mediated by reduced interference. Citation: Schönauer M, Pawlizki A, Köck C, Gais S. Exploring the effect of sleep and reduced interference on different forms of declarative memory. SLEEP 2014;37(12):1995-2007. PMID:25325490
Internalism, Active Externalism, and Nonconceptual Content: The Ins and Outs of Cognition

ERIC Educational Resources Information Center

Dartnall, Terry

2007-01-01

Active externalism (also known as the extended mind hypothesis) says that we use objects and situations in the world as external memory stores that we consult as needs dictate. This gives us economies of storage: We do not need to remember that Bill has blue eyes and wavy hair if we can acquire this information by looking at Bill. I argue for a…
Towards representation of a perceptual color manifold using associative memory for color constancy.

PubMed

Seow, Ming-Jung; Asari, Vijayan K

2009-01-01

In this paper, we propose the concept of a manifold of color perception through empirical observation that the center-surround properties of images in a perceptually similar environment define a manifold in the high dimensional space. Such a manifold representation can be learned using a novel recurrent neural network based learning algorithm. Unlike the conventional recurrent neural network model in which the memory is stored in an attractive fixed point at discrete locations in the state space, the dynamics of the proposed learning algorithm represent memory as a nonlinear line of attraction. The region of convergence around the nonlinear line is defined by the statistical characteristics of the training data. This learned manifold can then be used as a basis for color correction of images whose color perception differs from the learned color perception. Experimental results show that the proposed recurrent neural network learning algorithm can successfully color-balance lighting variations in images captured in different environments.
A Screen Space GPGPU Surface LIC Algorithm for Distributed Memory Data Parallel Sort Last Rendering Infrastructures

NASA Astrophysics Data System (ADS)

Loring, B.; Karimabadi, H.; Rortershteyn, V.

2015-10-01

The surface line integral convolution (LIC) visualization technique produces dense visualization of vector fields on arbitrary surfaces. We present a screen space surface LIC algorithm for use in distributed memory data parallel sort last rendering infrastructures. The motivations for our work are to support analysis of datasets that are too large to fit in the main memory of a single computer and compatibility with prevalent parallel scientific visualization tools such as ParaView and VisIt. By working in screen space using OpenGL we can leverage the computational power of GPUs when they are available and run without them when they are not. We address efficiency and performance issues that arise from the transformation of data from physical to screen space by selecting an alternate screen space domain decomposition. We analyze the algorithm's scaling behavior with and without GPUs on two high performance computing systems using data from turbulent plasma simulations.
Parallel computing of a digital hologram and particle searching for microdigital-holographic particle-tracking velocimetry

DOE Office of Scientific and Technical Information (OSTI.GOV)

Satake, Shin-ichi; Kanamori, Hiroyuki; Kunugi, Tomoaki

2007-02-01

We have developed a parallel algorithm for microdigital-holographic particle-tracking velocimetry. The algorithm is used in (1) numerical reconstruction of a particle image computer using a digital hologram, and (2) searching for particles. The numerical reconstruction from the digital hologram makes use of the Fresnel diffraction equation and the FFT (fast Fourier transform), whereas the particle search algorithm looks for local maximum graduation in a reconstruction field represented by a 3D matrix. To achieve high performance computing for both calculations (reconstruction and particle search), two memory partitions are allocated to the 3D matrix. In this matrix, the reconstruction part consists of horizontally placed 2D memory partitions on the x-y plane for the FFT, whereas the particle search part consists of vertically placed 2D memory partitions set along the z axes. Consequently, the scalability can be obtained for the proportion of processor elements, where the benchmarks are carried out for parallel computation by a SGI Altix machine.
A Screen Space GPGPU Surface LIC Algorithm for Distributed Memory Data Parallel Sort Last Rendering Infrastructures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Loring, Burlen; Karimabadi, Homa; Rortershteyn, Vadim

2014-07-01

The surface line integral convolution (LIC) visualization technique produces dense visualization of vector fields on arbitrary surfaces. We present a screen space surface LIC algorithm for use in distributed memory data parallel sort last rendering infrastructures. The motivations for our work are to support analysis of datasets that are too large to fit in the main memory of a single computer and compatibility with prevalent parallel scientific visualization tools such as ParaView and VisIt. By working in screen space using OpenGL we can leverage the computational power of GPUs when they are available and run without them when they are not. We address efficiency and performance issues that arise from the transformation of data from physical to screen space by selecting an alternate screen space domain decomposition. We analyze the algorithm's scaling behavior with and without GPUs on two high performance computing systems using data from turbulent plasma simulations.

Stochastic quasi-Newton molecular simulations

NASA Astrophysics Data System (ADS)

Chau, C. D.; Sevink, G. J. A.; Fraaije, J. G. E. M.

2010-08-01

We report a new and efficient factorized algorithm for the determination of the adaptive compound mobility matrix B in a stochastic quasi-Newton method (S-QN) that does not require additional potential evaluations. For one-dimensional and two-dimensional test systems, we previously showed that S-QN gives rise to efficient configurational space sampling with good thermodynamic consistency [C. D. Chau, G. J. A. Sevink, and J. G. E. M. Fraaije, J. Chem. Phys. 128, 244110 (2008), doi:10.1063/1.2943313]. Potential applications of S-QN are quite ambitious, and include structure optimization, analysis of correlations and automated extraction of cooperative modes. However, the potential can only be fully exploited if the computational and memory requirements of the original algorithm are significantly reduced. In this paper, we consider a factorized mobility matrix B = JJ^T and focus on the nontrivial fundamentals of an efficient algorithm for updating the noise multiplier J. The new algorithm requires O(n^2) multiplications per time step instead of the O(n^3) multiplications in the original scheme due to Choleski decomposition. In a recursive form, the update scheme circumvents matrix storage and enables limited-memory implementation, in the spirit of the well-known limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method, allowing for a further reduction of the computational effort to O(n). We analyze in detail the performance of the factorized (FSU) and limited-memory (L-FSU) algorithms in terms of convergence and (multiscale) sampling, for an elementary but relevant system that involves multiple time and length scales. Finally, we use this analysis to formulate conditions for the simulation of the complex high-dimensional potential energy landscapes of interest.
The effects of aging on ERP correlates of source memory retrieval for self-referential information.

PubMed

Dulas, Michael R; Newsome, Rachel N; Duarte, Audrey

2011-03-04

Numerous behavioral studies have suggested that normal aging negatively affects source memory accuracy for various kinds of associations. Neuroimaging evidence suggests that less efficient retrieval processing (temporally delayed and attenuated) may contribute to these impairments. Previous aging studies have not compared source memory accuracy and corresponding neural activity for different kinds of source details; namely, those that have been encoded via a more or less effective strategy. Thus, it is not yet known whether encoding source details in a self-referential manner, a strategy suggested to promote successful memory in the young and old, may enhance source memory accuracy and reduce the commonly observed age-related changes in neural activity associated with source memory retrieval. Here, we investigated these issues by using event-related potentials (ERPs) to measure the effects of aging on the neural correlates of successful source memory retrieval ("old-new effects") for objects encoded either self-referentially or self-externally. Behavioral results showed that both young and older adults demonstrated better source memory accuracy for objects encoded self-referentially. ERP results showed that old-new effects onsetted earlier for self-referentially encoded items in both groups and that age-related differences in the onset latency of these effects were reduced for self-referentially, compared to self-externally, encoded items. These results suggest that the implementation of an effective encoding strategy, like self-referential processing, may lead to more efficient retrieval, which in turn may improve source memory accuracy in both young and older adults. Published by Elsevier B.V.
An algorithm for 4D CT image sorting using spatial continuity.

PubMed

Li, Chen; Liu, Jie

2013-01-01

4D CT, which can locate the tumor throughout the entire respiratory cycle and effectively reduce image artifacts, has been widely used in radiation therapy of tumors. Current 4D CT methods require external surrogates of respiratory motion obtained from extra instruments. However, respiratory signals recorded by these external markers may not always accurately represent internal tumor and organ movements, especially when breathing patterns are irregular. In this paper we propose a novel automatic 4D CT sorting algorithm that works without these external surrogates. The sorting algorithm requires collecting the image data with a cine scan protocol. Beginning with the first couch position, images from the adjacent couch position are selected according to spatial continuity. The process is continued until images from all couch positions are sorted and the entire 3D volume is produced. The algorithm is verified with respiratory phantom image data and clinical image data. The preliminary test results show that the 4D CT images created by our algorithm effectively eliminate motion artifacts and clearly show the movement of tumor and organ over the breathing period.
Learning STEM through Integrative Visual Representations

ERIC Educational Resources Information Center

Virk, Satyugjit Singh

2013-01-01

Previous cognitive models of memory have not comprehensively taken into account the internal cognitive load of chunking isolated information and have emphasized the external cognitive load of visual presentation only. Under the Virk Long Term Working Memory Multimedia Model of cognitive load, drawing from the Cowan model, students presented with…
Data systems and computer science space data systems: Onboard memory and storage

NASA Technical Reports Server (NTRS)

Shull, Tom

1991-01-01

The topics are presented in viewgraph form and include the following: technical objectives; technology challenges; state-of-the-art assessment; mass storage comparison; SODR drive and system concepts; program description; vertical Bloch line (VBL) device concept; relationship to external programs; and backup charts for memory and storage.
The Construction of Semantic Memory: Grammar-Based Representations Learned from Relational Episodic Information

PubMed Central

Battaglia, Francesco P.; Pennartz, Cyriel M. A.

2011-01-01

After acquisition, memories undergo a process of consolidation, making them more resistant to interference and brain injury. Memory consolidation involves systems-level interactions, most importantly between the hippocampus and associated structures, which take part in the initial encoding of memory, and the neocortex, which supports long-term storage. This dichotomy parallels the contrast between episodic memory (tied to the hippocampal formation), collecting an autobiographical stream of experiences, and semantic memory, a repertoire of facts and statistical regularities about the world, involving the neocortex at large. Experimental evidence points to a gradual transformation of memories, following encoding, from an episodic to a semantic character. This may require an exchange of information between different memory modules during inactive periods. We propose a theory for such interactions and for the formation of semantic memory, in which episodic memory is encoded as relational data. Semantic memory is modeled as a modified stochastic grammar, which learns to parse episodic configurations expressed as an association matrix. The grammar produces tree-like representations of episodes, describing the relationships between its main constituents at multiple levels of categorization, based on its current knowledge of world regularities. These regularities are learned by the grammar from episodic memory information, through an expectation-maximization procedure, analogous to the inside–outside algorithm for stochastic context-free grammars. We propose that a Monte-Carlo sampling version of this algorithm can be mapped on the dynamics of “sleep replay” of previously acquired information in the hippocampus and neocortex. We propose that the model can reproduce several properties of semantic memory such as decontextualization, top-down processing, and creation of schemata. PMID:21887143
Quantum random access memory.

PubMed

Giovannetti, Vittorio; Lloyd, Seth; Maccone, Lorenzo

2008-04-25

A random access memory (RAM) uses n bits to randomly address N = 2^n distinct memory cells. A quantum random access memory (QRAM) uses n qubits to address any quantum superposition of N memory cells. We present an architecture that exponentially reduces the requirements for a memory call: O(log N) switches need be thrown instead of the N used in conventional (classical or quantum) RAM designs. This yields a more robust QRAM algorithm, as it in general requires entanglement among exponentially fewer gates, and leads to an exponential decrease in the power needed for addressing. A quantum optical implementation is presented.
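The O(log N) count can be seen in a purely classical toy model: if address bits steer a signal down a binary routing tree, only the switches on one root-to-leaf path are touched. The sketch below assumes such a tree and says nothing about the quantum implementation; the function name is hypothetical.

```python
def route(address_bits):
    """Follow the address bits down a binary tree of routing switches.

    Returns the index of the memory cell reached and the number of
    switches set along the way (one per address bit).
    """
    cell = 0
    switches_set = 0
    for bit in address_bits:
        cell = 2 * cell + bit  # go to the left (0) or right (1) child
        switches_set += 1
    return cell, switches_set


n = 3
N = 2 ** n                      # number of addressable cells
cell, switches_set = route([1, 0, 1])
```

For n = 3 address bits the walk sets 3 switches to reach cell 5 out of N = 8, so the count grows as log2(N) rather than N.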
The default mode network and the working memory network are not anti-correlated during all phases of a working memory task.

PubMed

Piccoli, Tommaso; Valente, Giancarlo; Linden, David E J; Re, Marta; Esposito, Fabrizio; Sack, Alexander T; Di Salle, Francesco

2015-01-01

The default mode network and the working memory network are known to be anti-correlated during sustained cognitive processing, in a load-dependent manner. We hypothesized that functional connectivity among nodes of the two networks could be dynamically modulated by task phases across time. To address the dynamic links between the default mode network and the working memory network, we used a delayed visuo-spatial working memory paradigm, which allowed us to separate three different phases of working memory (encoding, maintenance, and retrieval), and analyzed the functional connectivity during each phase within and between the default mode network and the working memory network. We found that the two networks are anti-correlated only during the maintenance phase of working memory, i.e. when attention is focused on a memorized stimulus in the absence of external input. Conversely, during the encoding and retrieval phases, when the external stimulation is present, the default mode network is positively coupled with the working memory network, suggesting the existence of a dynamic switching of functional connectivity between "task-positive" and "task-negative" brain networks. Our results demonstrate that the well-established dichotomy of the human brain (anti-correlated networks during rest and balanced activation-deactivation during cognition) has a more nuanced organization than previously thought and engages in different patterns of correlation and anti-correlation during specific sub-phases of a cognitive task. This nuanced organization reinforces the hypothesis of a direct involvement of the default mode network in cognitive functions, as represented by a dynamic rather than static interaction with specific task-positive networks, such as the working memory network.
The Default Mode Network and the Working Memory Network Are Not Anti-Correlated during All Phases of a Working Memory Task

PubMed Central

Piccoli, Tommaso; Valente, Giancarlo; Linden, David E. J.; Re, Marta; Esposito, Fabrizio; Sack, Alexander T.; Salle, Francesco Di

2015-01-01

Introduction The default mode network and the working memory network are known to be anti-correlated during sustained cognitive processing, in a load-dependent manner. We hypothesized that functional connectivity among nodes of the two networks could be dynamically modulated by task phases across time. Methods To address the dynamic links between the default mode network and the working memory network, we used a delayed visuo-spatial working memory paradigm, which allowed us to separate three different phases of working memory (encoding, maintenance, and retrieval), and analyzed the functional connectivity during each phase within and between the default mode network and the working memory network. Results We found that the two networks are anti-correlated only during the maintenance phase of working memory, i.e. when attention is focused on a memorized stimulus in the absence of external input. Conversely, during the encoding and retrieval phases, when the external stimulation is present, the default mode network is positively coupled with the working memory network, suggesting the existence of a dynamic switching of functional connectivity between “task-positive” and “task-negative” brain networks. Conclusions Our results demonstrate that the well-established dichotomy of the human brain (anti-correlated networks during rest and balanced activation-deactivation during cognition) has a more nuanced organization than previously thought and engages in different patterns of correlation and anti-correlation during specific sub-phases of a cognitive task. This nuanced organization reinforces the hypothesis of a direct involvement of the default mode network in cognitive functions, as represented by a dynamic rather than static interaction with specific task-positive networks, such as the working memory network. PMID:25848951
A depth-first search algorithm to compute elementary flux modes by linear programming

PubMed Central

2014-01-01

Background The decomposition of complex metabolic networks into elementary flux modes (EFMs) provides a useful framework for exploring reaction interactions systematically. Generating a complete set of EFMs for large-scale models, however, is near impossible. Even for moderately-sized models (<400 reactions), existing approaches based on the Double Description method must iterate through a large number of combinatorial candidates, thus imposing an immense processor and memory demand. Results Based on an alternative elementarity test, we developed a depth-first search algorithm using linear programming (LP) to enumerate EFMs in an exhaustive fashion. Constraints can be introduced to directly generate a subset of EFMs satisfying the set of constraints. The depth-first search algorithm has a constant memory overhead. Using flux constraints, a large LP problem can be massively divided and parallelized into independent sub-jobs for deployment into computing clusters. Since the sub-jobs do not overlap, the approach scales to utilize all available computing nodes with minimal coordination overhead or memory limitations. Conclusions The speed of the algorithm was comparable to efmtool, a mainstream Double Description method, when enumerating all EFMs; the attrition power gained from performing flux feasibility tests offsets the increased computational demand of running an LP solver. Unlike the Double Description method, the algorithm enables accelerated enumeration of all EFMs satisfying a set of constraints. PMID:25074068
Efficiently measuring dimensions of the externalizing spectrum model: Development of the Externalizing Spectrum Inventory-Computerized Adaptive Test (ESI-CAT).

PubMed

Sunderland, Matthew; Slade, Tim; Krueger, Robert F; Markon, Kristian E; Patrick, Christopher J; Kramer, Mark D

2017-07-01

The development of the Externalizing Spectrum Inventory (ESI) was motivated by the need to comprehensively assess the interrelated nature of externalizing psychopathology and personality using an empirically driven framework. The ESI measures 23 theoretically distinct yet related unidimensional facets of externalizing, which are structured under 3 superordinate factors representing general externalizing, callous aggression, and substance abuse. One limitation of the ESI is its length at 415 items. To facilitate the use of the ESI in busy clinical and research settings, the current study sought to examine the efficiency and accuracy of a computerized adaptive version of the ESI. Data were collected over 3 waves and totaled 1,787 participants recruited from undergraduate psychology courses as well as male and female state prisons. A series of 6 algorithms with different termination rules were simulated to determine the efficiency and accuracy of each test under 3 different assumed distributions. Scores generated using an optimal adaptive algorithm evidenced high correlations (r > .9) with scores generated using the full ESI, brief ESI item-based factor scales, and the 23 facet scales. The adaptive algorithms for each facet administered a combined average of 115 items, a 72% decrease in comparison to the full ESI. Similarly, scores on the item-based factor scales of the ESI-brief form (57 items) were generated using an average of 17 items, a 70% decrease. The current study successfully demonstrates that an adaptive algorithm can generate similar scores for the ESI and the 3 item-based factor scales using a fraction of the total item pool. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
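The two reported reductions follow from the item counts given in the abstract. A short check, using a hypothetical helper name:

```python
def percent_decrease(full_length, adaptive_length):
    """Reduction in administered items, as a percentage of the full form."""
    return 100.0 * (1.0 - adaptive_length / full_length)


full_esi = percent_decrease(415, 115)   # full ESI versus adaptive facet tests
brief_esi = percent_decrease(57, 17)    # ESI-brief factor scales versus adaptive
```

These evaluate to roughly 72.3% and 70.2%, which round to the 72% and 70% quoted above.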
An NN-Based SRD Decomposition Algorithm and Its Application in Nonlinear Compensation

PubMed Central

Yan, Honghang; Deng, Fang; Sun, Jian; Chen, Jie

2014-01-01

In this study, a neural network-based square root of descending (SRD) order decomposition algorithm for compensating for nonlinear data generated by sensors is presented. The study aims at exploring the optimized decomposition of data and minimizing the computational complexity and memory space of the training process. A linear decomposition algorithm, which automatically finds the optimal decomposition of N subparts and reduces the training time to 1N and memory cost to 1N, has been implemented on nonlinear data obtained from an encoder. Particular focus is given to the theoretical access of estimating the numbers of hidden nodes and the precision of varying the decomposition method. Numerical experiments are designed to evaluate the effect of this algorithm. Moreover, a designed device for angular sensor calibration is presented. We conduct an experiment that samples the data of an encoder and compensates for the nonlinearity of the encoder to verify this novel algorithm. PMID:25232912
Compact, high-speed algorithm for laying out printed circuit board runs

NASA Astrophysics Data System (ADS)

Zapolotskiy, D. Y.

1985-09-01

A high-speed printed circuit connection layout algorithm is described which was developed within the framework of an interactive system for designing two-sided printed circuit boards. For this reason, algorithm speed was considered, a priori, as a requirement equally as important as the inherent demand for minimizing circuit run lengths and the number of junction openings. This resulted from the fact that, in order to provide psychological man/machine compatibility in the design process, real-time dialog during the layout phase is possible only within limited time frames (on the order of several seconds) for each circuit run. The work was carried out for use on an ARM-R automated work site complex based on an SM-4 minicomputer with a 32K-word memory. This limited memory capacity heightened the demand for algorithm speed and also tightened data file structure and size requirements. The layout algorithm's design logic is analyzed. The structure and organization of the data files are described.
HPC-NMF: A High-Performance Parallel Algorithm for Nonnegative Matrix Factorization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kannan, Ramakrishnan; Sukumar, Sreenivas R.; Ballard, Grey M.

NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient distributed algorithms to solve the problem for big data sets. We propose a high-performance distributed-memory parallel algorithm that computes the factorization by iteratively solving alternating non-negative least squares (NLS) subproblems for W and H. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). As opposed to previous implementations, our algorithm is also flexible: it performs well for both dense and sparse matrices, and allows the user to choose any one of the multiple algorithms for solving the updates to low rank factors W and H within the alternating iterations.
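For orientation, the alternating scheme named in the abstract can be written in its generic textbook form. The symbols A (the data matrix), m, n and k (the target rank) are assumptions introduced here for illustration; the abstract itself names only the factors W and H and does not give the exact objective used by HPC-NMF.

```latex
% Generic alternating non-negative least squares (ANLS) for NMF.
% Assumed notation: A is m x n and non-negative, W is m x k, H is k x n.
\begin{aligned}
  &\min_{W \ge 0,\; H \ge 0} \; \lVert A - W H \rVert_F^2 \\[4pt]
  &\text{repeat until convergence:} \\
  &\qquad H \leftarrow \arg\min_{H \ge 0} \; \lVert A - W H \rVert_F^2
      \quad (W \text{ held fixed}) \\
  &\qquad W \leftarrow \arg\min_{W \ge 0} \; \lVert A - W H \rVert_F^2
      \quad (H \text{ held fixed})
\end{aligned}
```

Each subproblem is convex on its own, which is why any non-negative least squares solver can be plugged into either update.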
A Node Linkage Approach for Sequential Pattern Mining

PubMed Central

Navarro, Osvaldo; Cumplido, René; Villaseñor-Pineda, Luis; Feregrino-Uribe, Claudia; Carrasco-Ochoa, Jesús Ariel

2014-01-01

Sequential Pattern Mining is a widely addressed problem in data mining, with applications such as analyzing Web usage, examining purchase behavior, and text mining, among others. Nevertheless, with the dramatic increase in data volume, the current approaches prove inefficient when dealing with large input datasets, a large number of different symbols and low minimum supports. In this paper, we propose a new sequential pattern mining algorithm, which follows a pattern-growth scheme to discover sequential patterns. Unlike most pattern growth algorithms, our approach does not build a data structure to represent the input dataset, but instead accesses the required sequences through pseudo-projection databases, achieving better runtime and reducing memory requirements. Our algorithm traverses the search space in a depth-first fashion and only preserves in memory a pattern node linkage and the pseudo-projections required for the branch being explored at the time. Experimental results show that our new approach, the Node Linkage Depth-First Traversal algorithm (NLDFT), has better performance and scalability in comparison with state of the art algorithms. PMID:24933123
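The idea of pseudo-projection can be illustrated with a short sketch. This is a generic PrefixSpan-style depth-first miner for sequences of single items, not the NLDFT algorithm of the paper; the function name `mine_patterns` and the toy database are hypothetical. A projected database is represented only by (sequence index, offset) pairs, so no sequence data is copied.

```python
def mine_patterns(db, min_support):
    """Depth-first sequential pattern mining with pseudo-projections.

    db is a list of sequences (strings or lists of items). Returns a list of
    (pattern, support) pairs in depth-first order.
    """
    results = []

    def mine(prefix, projection):
        # projection: list of (sequence index, offset) pairs; each pair
        # stands for the suffix db[sid][offset:] without copying it.
        extensions = {}
        for sid, start in projection:
            seen = set()
            for pos in range(start, len(db[sid])):
                item = db[sid][pos]
                if item not in seen:  # count each sequence once per item
                    seen.add(item)
                    extensions.setdefault(item, []).append((sid, pos + 1))
        for item in sorted(extensions):
            occurrences = extensions[item]
            if len(occurrences) >= min_support:
                pattern = prefix + [item]
                results.append((tuple(pattern), len(occurrences)))
                # Only the current branch's projection is kept alive.
                mine(pattern, occurrences)

    mine([], [(sid, 0) for sid in range(len(db))])
    return results


toy_db = ["abc", "acb", "ab"]  # hypothetical toy input
patterns = mine_patterns(toy_db, min_support=2)
```

Because recursion only holds the projections along the branch being explored, peak memory grows with pattern depth rather than with the number of patterns found.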
A novel global Harmony Search method based on Ant Colony Optimisation algorithm

NASA Astrophysics Data System (ADS)

Fouad, Allouani; Boukhetala, Djamel; Boudjema, Fares; Zenger, Kai; Gao, Xiao-Zhi

2016-03-01

The Global-best Harmony Search (GHS) is a stochastic optimisation algorithm recently developed, which hybridises the Harmony Search (HS) method with the concept of swarm intelligence in the particle swarm optimisation (PSO) to enhance its performance. In this article, a new optimisation algorithm called GHSACO is developed by incorporating the GHS with the Ant Colony Optimisation algorithm (ACO). Our method introduces a novel improvisation process, which is different from that of the GHS in the following aspects. (i) A modified harmony memory (HM) representation and conception. (ii) The use of a global random switching mechanism to monitor the choice between the ACO and GHS. (iii) An additional memory consideration selection rule using the ACO random proportional transition rule with a pheromone trail update mechanism. The proposed GHSACO algorithm has been applied to various benchmark functions and constrained optimisation problems. Simulation results demonstrate that it can find significantly better solutions when compared with the original HS and some of its variants.
Hybrid-optimization strategy for the communication of large-scale Kinetic Monte Carlo simulation

NASA Astrophysics Data System (ADS)

Wu, Baodong; Li, Shigang; Zhang, Yunquan; Nie, Ningming

2017-02-01

The parallel Kinetic Monte Carlo (KMC) algorithm based on domain decomposition has been widely used in large-scale physical simulations. However, the communication overhead of the parallel KMC algorithm is critical, and severely degrades the overall performance and scalability. In this paper, we present a hybrid optimization strategy to reduce the communication overhead for the parallel KMC simulations. We first propose a communication aggregation algorithm to reduce the total number of messages and eliminate the communication redundancy. Then, we utilize the shared memory to reduce the memory copy overhead of the intra-node communication. Finally, we optimize the communication scheduling using the neighborhood collective operations. We demonstrate the scalability and high performance of our hybrid optimization strategy by both theoretical and experimental analysis. Results show that the optimized KMC algorithm exhibits better performance and scalability than the well-known open-source library-SPPARKS. On 32-node Xeon E5-2680 cluster (total 640 cores), the optimized algorithm reduces the communication time by 24.8% compared with SPPARKS.
How generation affects source memory.

PubMed

Geghman, Kindiya D; Multhaup, Kristi S

2004-07-01

Generation effects (better memory for self-produced items than for provided items) typically occur in item memory. Jurica and Shimamura (1999) reported a negative generation effect in source memory, but their procedure did not test participants on the items they had generated. In Experiment 1, participants answered questions and read statements made by a face on a computer screen. The target word was unscrambled, or letters were filled in. Generation effects were found for target recall and source recognition (which person did which task). Experiment 2 extended these findings to a condition in which the external sources were two different faces. Generation had a positive effect on source memory, supporting an overlap in the underlying mechanisms of item and source memory.
Injecting Errors for Testing Built-In Test Software

NASA Technical Reports Server (NTRS)

Gender, Thomas K.; Chow, James

2010-01-01

Two algorithms have been conceived to enable automated, thorough testing of built-in test (BIT) software. The first algorithm applies to BIT routines that define pass/fail criteria based on values of data read from such hardware devices as memories, input ports, or registers. This algorithm simulates effects of errors in a device under test by (1) intercepting data from the device and (2) performing AND operations between the data and the data mask specific to the device. This operation yields values not expected by the BIT routine. This algorithm entails very small, permanent instrumentation of the software under test (SUT) for performing the AND operations. The second algorithm applies to BIT programs that provide services to user application programs via commands or callable interfaces and requires a capability for test-driver software to read and write the memory used in execution of the SUT. This algorithm identifies all SUT code execution addresses where errors are to be injected, then temporarily replaces the code at those addresses with small test code sequences to inject latent severe errors, then determines whether, as desired, the SUT detects the errors and recovers
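The first algorithm (intercept the device data, then AND it with a device-specific mask) can be sketched in a few lines. Everything named below is hypothetical: the register value 0xA5, the mask 0xF0, and the helper names are illustrative assumptions, not details from the report.

```python
def read_status_register():
    """Stand-in for a hardware read; a healthy device returns 0xA5 here."""
    return 0xA5


def with_injected_error(read_fn, mask):
    """Wrap a device read so that its data is ANDed with a device mask."""
    def faulty_read():
        return read_fn() & mask  # clears the bits that are 0 in the mask
    return faulty_read


def bit_routine(read_fn, expected=0xA5):
    """Toy BIT routine: passes only if the register holds the expected value."""
    return read_fn() == expected


healthy_result = bit_routine(read_status_register)
faulty_result = bit_routine(with_injected_error(read_status_register, 0xF0))
```

With the mask applied, the BIT routine sees 0xA0 instead of 0xA5 and should report a failure, which is the behavior the test harness wants to confirm.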
Evolutionary algorithm optimization of biological learning parameters in a biomimetic neuroprosthesis

PubMed Central

Dura-Bernal, S.; Neymotin, S. A.; Kerr, C. C.; Sivagnanam, S.; Majumdar, A.; Francis, J. T.; Lytton, W. W.

2017-01-01

Biomimetic simulation permits neuroscientists to better understand the complex neuronal dynamics of the brain. Embedding a biomimetic simulation in a closed-loop neuroprosthesis, which can read and write signals from the brain, will permit applications for amelioration of motor, psychiatric, and memory-related brain disorders. Biomimetic neuroprostheses require real-time adaptation to changes in the external environment, thus constituting an example of a dynamic data-driven application system. As model fidelity increases, so does the number of parameters and the complexity of finding appropriate parameter configurations. Instead of adapting synaptic weights via machine learning, we employed major biological learning methods: spike-timing dependent plasticity and reinforcement learning. We optimized the learning metaparameters using evolutionary algorithms, which were implemented in parallel and which used an island model approach to obtain sufficient speed. We employed these methods to train a cortical spiking model to utilize macaque brain activity, indicating a selected target, to drive a virtual musculoskeletal arm with realistic anatomical and biomechanical properties to reach to that target. The optimized system was able to reproduce macaque data from a comparable experimental motor task. These techniques can be used to efficiently tune the parameters of multiscale systems, linking realistic neuronal dynamics to behavior, and thus providing a useful tool for neuroscience and neuroprosthetics. PMID:29200477

Real-time high-level video understanding using data warehouse

NASA Astrophysics Data System (ADS)

Lienard, Bruno; Desurmont, Xavier; Barrie, Bertrand; Delaigle, Jean-Francois

2006-02-01

High-level video content analysis such as video surveillance is often limited by computational aspects of automatic image understanding, i.e. it requires huge computing resources for reasoning processes like categorization and a huge amount of data to represent knowledge of objects, scenarios and other models. This article explains how to design and develop a "near real-time adaptive image datamart", used first as a decision support system for vision algorithms and then as a mass storage system. Using the RDF specification as the storage format for vision algorithm metadata, we can optimise the data warehouse concepts for video analysis and add processes able to adapt the current model and pre-process data to speed up queries. In this way, when new data is sent from a sensor to the data warehouse for long-term storage, using remote procedure calls embedded in object-oriented interfaces to simplify queries, the data is processed and the in-memory data model is updated. After some processing, possible interpretations of this data can be returned to the sensor. To demonstrate this new approach, we present typical scenarios applied to this architecture, such as people tracking and event detection in a multi-camera network. Finally, we show how this system becomes a high-semantic data container for external data mining.
Including Memory Friction in Single- and Two-State Quantum Dynamics Simulations.

PubMed

Brown, Paul A; Messina, Michael

2016-03-03

We present a simple computational algorithm that allows for the inclusion of memory friction in a quantum dynamics simulation of a small, quantum, primary system coupled to many atoms in the surroundings. We show how including a memory friction operator, F̂, in the primary quantum system's Hamiltonian operator builds memory friction into the dynamics of the primary quantum system. We show that, in the harmonic, semi-classical limit, this friction operator causes the classical phase-space centers of a wavepacket to evolve exactly as if it were a classical particle experiencing memory friction. We also show that this friction operator can be used to include memory friction in the quantum dynamics of an anharmonic primary system. We then generalize the algorithm so that it can be used to treat a primary quantum system that is evolving, non-adiabatically on two coupled potential energy surfaces, i.e., a model that can be used to model H atom transfer, for example. We demonstrate this approach's computational ease and flexibility by showing numerical results for both harmonic and anharmonic primary quantum systems in the single surface case. Finally, we present numerical results for a model of non-adiabatic H atom transfer between a reactant and product state that includes memory friction on one or both of the non-adiabatic potential energy surfaces and uncover some interesting dynamical effects of non-memory friction on the H atom transfer process.
PELEC

DOE Office of Scientific and Technical Information (OSTI.GOV)

2017-05-17

PeleC is an adaptive-mesh compressible hydrodynamics code for reacting flows. It solves the compressible Navier-Stokes equations with multispecies transport in a block-structured framework. The resulting algorithm is well suited for flows with localized resolution requirements and robust to discontinuities. User-controllable refinement criteria have the potential to result in extremely small numerical dissipation and dispersion, making this code appropriate for both research and applied usage. The code is built on the AMReX library, which facilitates hierarchical parallelism and manages distributed-memory parallelism. PeleC algorithms are implemented to express shared-memory parallelism.
IoT security with one-time pad secure algorithm based on the double memory technique

NASA Astrophysics Data System (ADS)

Wiśniewski, Remigiusz; Grobelny, Michał; Grobelna, Iwona; Bazydło, Grzegorz

2017-11-01

Secure encryption of data in the Internet of Things is especially important, as a large amount of information is exchanged every day and the number of attack vectors on IoT elements keeps increasing. In the paper, a novel symmetric encryption method is proposed. The idea is based on the one-time pad technique. The proposed solution applies a double memory concept to secure transmitted data. The presented algorithm is considered as part of a communication protocol and has been initially validated against known security issues.
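As background, the classic one-time pad on which the method is based can be sketched as a byte-wise XOR with a random pad of the same length. This shows only the textbook technique; the paper's double memory concept is not described in the abstract and is not modeled here. The function name `otp_apply` and the sample message are hypothetical.

```python
import secrets


def otp_apply(data: bytes, pad: bytes) -> bytes:
    """XOR data with the pad; the same call encrypts and decrypts."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


message = b"sensor reading: 21.5 C"           # hypothetical IoT payload
pad = secrets.token_bytes(len(message))       # fresh random pad, used once
ciphertext = otp_apply(message, pad)
recovered = otp_apply(ciphertext, pad)
```

The security of the scheme rests on the pad being truly random, kept secret, and never reused for a second message.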
TESS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dmitriy Morozov, Tom Peterka

2014-07-29

Computing a Voronoi or Delaunay tessellation from a set of points is a core part of the analysis of many simulated and measured datasets. As the scale of simulations and observations surpasses billions of particles, a distributed-memory scalable parallel algorithm is the only feasible approach. The primary contribution of this software is a distributed-memory parallel Delaunay and Voronoi tessellation algorithm based on existing serial computational geometry libraries that automatically determines which neighbor points need to be exchanged among the subdomains of a spatial decomposition. Other contributions include the addition of periodic and wall boundary conditions.
Modeling and Bayesian parameter estimation for shape memory alloy bending actuators

NASA Astrophysics Data System (ADS)

Crews, John H.; Smith, Ralph C.

2012-04-01

In this paper, we employ a homogenized energy model (HEM) for shape memory alloy (SMA) bending actuators. Additionally, we utilize a Bayesian method for quantifying parameter uncertainty. The system consists of a SMA wire attached to a flexible beam. As the actuator is heated, the beam bends, providing endoscopic motion. The model parameters are fit to experimental data using an ordinary least-squares approach. The uncertainty in the fit model parameters is then quantified using Markov Chain Monte Carlo (MCMC) methods. The MCMC algorithm provides bounds on the parameters, which will ultimately be used in robust control algorithms. One purpose of the paper is to test the feasibility of the Random Walk Metropolis algorithm, the MCMC method used here.
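The Random Walk Metropolis algorithm mentioned above can be sketched for a one-dimensional parameter as follows. This is a generic textbook version under stated assumptions: the target here is a standard normal log-density chosen purely for illustration, not the SMA model posterior, and the function and variable names are hypothetical.

```python
import math
import random


def random_walk_metropolis(log_post, x0, step, n_samples, rng):
    """Sample from exp(log_post) using Gaussian random-walk proposals."""
    x, lp = x0, log_post(x0)
    samples, accepted = [], 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)   # symmetric proposal
        lp_prop = log_post(proposal)
        delta = lp_prop - lp
        # Accept with probability min(1, exp(delta)).
        if delta >= 0 or rng.random() < math.exp(delta):
            x, lp = proposal, lp_prop
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples


def log_standard_normal(x):
    return -0.5 * x * x  # log-density up to an additive constant


rng = random.Random(0)
chain, acceptance_rate = random_walk_metropolis(
    log_standard_normal, x0=0.0, step=1.0, n_samples=20000, rng=rng
)
chain_mean = sum(chain) / len(chain)
```

Quantiles of the resulting chain are one common way to obtain the kind of parameter bounds the abstract refers to.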
Managing coherence via put/get windows

DOEpatents

Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton on Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Yorktown Heights, NY

2011-01-11

A method and apparatus for managing coherence between two processors of a two-processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activities required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.
Managing coherence via put/get windows

DOEpatents

Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton on Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Yorktown Heights, NY

2012-02-21

A method and apparatus for managing coherence between two processors of a two-processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activities required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.
SLIC superpixels compared to state-of-the-art superpixel methods.

PubMed

Achanta, Radhakrishna; Shaji, Appu; Smith, Kevin; Lucchi, Aurelien; Fua, Pascal; Süsstrunk, Sabine

2012-11-01

Computer vision applications have come to rely increasingly on superpixels in recent years, but it is not always clear what constitutes a good superpixel algorithm. In an effort to understand the benefits and drawbacks of existing methods, we empirically compare five state-of-the-art superpixel algorithms for their ability to adhere to image boundaries, speed, memory efficiency, and their impact on segmentation performance. We then introduce a new superpixel algorithm, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels. Despite its simplicity, SLIC adheres to boundaries as well as or better than previous methods. At the same time, it is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.
A Gross Error Elimination Method for Point Cloud Data Based on Kd-Tree

NASA Astrophysics Data System (ADS)

Kang, Q.; Huang, G.; Yang, S.

2018-04-01

Point cloud data has become a widely used data source in the field of remote sensing. Key steps in point cloud pre-processing are gross error elimination and quality control. Owing to the volume of point cloud data, existing gross error elimination methods are costly in both memory and time. This paper employs a new method that uses a Kd-tree to organize the points, a k-nearest-neighbor search to find each point's neighbors, and an appropriately set threshold to judge whether a target point is an outlier. Experimental results show that the proposed algorithm helps delete gross errors in point cloud data, decreases memory consumption, and improves efficiency.
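The pipeline described above (build a Kd-tree, query the k nearest neighbors, apply a threshold) might look roughly like the sketch below. The outlier criterion used here, mean distance to the k nearest neighbors exceeding a fixed threshold, is an assumption; the abstract does not state the exact rule. All names and the toy data are hypothetical.

```python
import heapq
import math


def build_kdtree(points, depth=0):
    """Return a nested tuple (point, axis, left, right), or None if empty."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def knn(node, query, k, heap=None):
    """Collect the k nearest points as a max-heap of (-distance, point)."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    point, axis, left, right = node
    dist = math.dist(point, query)
    if len(heap) < k:
        heapq.heappush(heap, (-dist, point))
    elif dist < -heap[0][0]:
        heapq.heapreplace(heap, (-dist, point))
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    knn(near, query, k, heap)
    # Visit the far side only if the splitting plane is closer than the
    # current k-th best distance (or fewer than k points were found so far).
    if len(heap) < k or abs(diff) < -heap[0][0]:
        knn(far, query, k, heap)
    return heap


def flag_outliers(points, k, threshold):
    """Flag points whose mean distance to their k neighbors is too large."""
    tree = build_kdtree(points)
    flags = []
    for p in points:
        neighbours = knn(tree, p, k + 1)  # k + 1 because p matches itself
        mean_dist = sum(-neg_d for neg_d, _ in neighbours) / k
        flags.append(mean_dist > threshold)
    return flags


cloud = [(float(x), float(y), 0.0) for x in range(3) for y in range(3)]
cloud.append((50.0, 50.0, 50.0))  # one gross error far from the surface
outlier_flags = flag_outliers(cloud, k=3, threshold=5.0)
```

The tree is built once and reused for every query, which is where the memory and time savings over pairwise distance computation come from.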
Paradeisos: A perfect hashing algorithm for many-body eigenvalue problems

NASA Astrophysics Data System (ADS)

Jia, C. J.; Wang, Y.; Mendl, C. B.; Moritz, B.; Devereaux, T. P.

2018-03-01

We describe an essentially perfect hashing algorithm for calculating the position of an element in an ordered list, appropriate for the construction and manipulation of many-body Hamiltonian, sparse matrices. Each element of the list corresponds to an integer value whose binary representation reflects the occupation of single-particle basis states for each element in the many-body Hilbert space. The algorithm replaces conventional methods, such as binary search, for locating the elements of the ordered list, eliminating the need to store the integer representation for each element, without increasing the computational complexity. Combined with the "checkerboard" decomposition of the Hamiltonian matrix for distribution over parallel computing environments, this leads to a substantial savings in aggregate memory. While the algorithm can be applied broadly to many-body, correlated problems, we demonstrate its utility in reducing total memory consumption for a series of fermionic single-band Hubbard model calculations on small clusters with progressively larger Hilbert space dimension.
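To make the idea of computing a state's position without a stored lookup table concrete, here is a sketch based on the standard combinatorial number system. It ranks an occupation bit pattern among all integers with the same number of set bits, in ascending integer order. This is a well-known stand-in chosen for illustration and is not claimed to be the Paradeisos scheme itself; the function name `rank_state` is hypothetical.

```python
from math import comb


def rank_state(state: int) -> int:
    """Position of `state` among all integers with the same popcount.

    The k-th set bit (counting from the least significant, k = 1, 2, ...)
    at bit position p contributes C(p, k) to the rank.
    """
    rank = 0
    k = 0
    position = 0
    while state:
        if state & 1:
            k += 1
            rank += comb(position, k)
        state >>= 1
        position += 1
    return rank


# Two particles on four sites: 0b0011, 0b0101, 0b0110, 0b1001, 0b1010, 0b1100
ranks = [rank_state(s) for s in (3, 5, 6, 9, 10, 12)]  # → [0, 1, 2, 3, 4, 5]
```

Because the rank is computed from the bit pattern alone, neither the sorted list of integers nor a binary search over it is needed.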
Strategies for concurrent processing of complex algorithms in data driven architectures

NASA Technical Reports Server (NTRS)

Stoughton, John W.; Mielke, Roland R.

1988-01-01

The purpose is to document research to develop strategies for concurrent processing of complex algorithms in data driven architectures. The problem domain consists of decision-free algorithms having large-grained, computationally complex primitive operations. Such are often found in signal processing and control applications. The anticipated multiprocessor environment is a data flow architecture containing between two and twenty computing elements. Each computing element is a processor having local program memory, and which communicates with a common global data memory. A new graph theoretic model called ATAMM which establishes rules for relating a decomposed algorithm to its execution in a data flow architecture is presented. The ATAMM model is used to determine strategies to achieve optimum time performance and to develop a system diagnostic software tool. In addition, preliminary work on a new multiprocessor operating system based on the ATAMM specifications is described.
Adaptive efficient compression of genomes

PubMed Central

2012-01-01

Modern high-throughput sequencing technologies are able to generate DNA sequences at an ever increasing rate. In parallel to the decreasing experimental time and cost necessary to produce DNA sequences, computational requirements for analysis and storage of the sequences are steeply increasing. Compression is a key technology to deal with this challenge. Recently, referential compression schemes, storing only the differences between a to-be-compressed input and a known reference sequence, gained a lot of interest in this field. However, memory requirements of the current algorithms are high and run times are often slow. In this paper, we propose an adaptive, parallel and highly efficient referential sequence compression method which allows fine-tuning of the trade-off between required memory and compression speed. When using 12 MB of memory, our method is, for human genomes, on par with the best previous algorithms in terms of compression ratio (400:1) and compression speed. In contrast, it compresses a complete human genome in just 11 seconds when provided with 9 GB of main memory, which is almost three times faster than the best competitor while using less main memory. PMID:23146997
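The principle of referential compression, storing only how the input differs from a known reference, can be sketched with a deliberately naive encoder. The greedy longest-match search below is quadratic and is meant only to show the encoding idea; it is not the paper's method, and the names `compress`, `decompress`, and `min_match` as well as the toy sequences are hypothetical.

```python
def compress(reference: str, target: str, min_match: int = 4):
    """Encode target as copies from the reference plus literal characters."""
    ops = []
    i = 0
    while i < len(target):
        best_pos, best_len = -1, 0
        for pos in range(len(reference)):  # naive search, for clarity only
            length = 0
            while (pos + length < len(reference)
                   and i + length < len(target)
                   and reference[pos + length] == target[i + length]):
                length += 1
            if length > best_len:
                best_pos, best_len = pos, length
        if best_len >= min_match:
            ops.append(("copy", best_pos, best_len))
            i += best_len
        else:
            ops.append(("literal", target[i]))
            i += 1
    return ops


def decompress(reference: str, ops) -> str:
    """Rebuild the target from the reference and the operation list."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            _, pos, length = op
            parts.append(reference[pos:pos + length])
        else:
            parts.append(op[1])
    return "".join(parts)


reference_seq = "ACGTACGTTTGACCA"   # toy reference
target_seq = "ACGTACGATTGACCA"      # differs by a single substitution
encoded = compress(reference_seq, target_seq)
decoded = decompress(reference_seq, encoded)
```

A practical tool would replace the inner search with an index over the reference, which is the kind of structure where a memory versus speed trade-off arises.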
BLESS 2: accurate, memory-efficient and fast error correction method.

PubMed

Heo, Yun; Ramachandran, Anand; Hwu, Wen-Mei; Ma, Jian; Chen, Deming

2016-08-01

The most important features of error correction tools for sequencing data are accuracy, memory efficiency and fast runtime. The previous version of BLESS was highly memory-efficient and accurate, but it was too slow to handle reads from large genomes. We have developed a new version of BLESS to improve runtime and accuracy while maintaining a small memory usage. The new version, called BLESS 2, has an error correction algorithm that is more accurate than BLESS, and the algorithm has been parallelized using hybrid MPI and OpenMP programming. BLESS 2 was compared with five top-performing tools, and it was found to be the fastest when it was executed on two computing nodes using MPI, with each node containing twelve cores. Also, BLESS 2 showed at least 11% higher gain while retaining the memory efficiency of the previous version for large genomes. Availability: freely available at https://sourceforge.net/projects/bless-ec. Contact: dchen@illinois.edu. Supplementary data are available at Bioinformatics online.
An on-line calibration algorithm for external parameters of visual system based on binocular stereo cameras

NASA Astrophysics Data System (ADS)

Wang, Liqiang; Liu, Zhen; Zhang, Zhonghua

2014-11-01

Stereo vision is key to visual measurement, robot vision, and autonomous navigation. Before a stereo vision system is put to use, the intrinsic parameters of each camera and the external parameters of the system must be calibrated. In engineering practice, the intrinsic parameters remain unchanged after the cameras are calibrated, but the positional relationship between the cameras can change because of vibration, knocks and pressures in the vicinity of railways or motor workshops. Especially for large baselines, even minute changes in translation or rotation can affect the epipolar geometry and scene triangulation to such a degree that the visual system becomes disabled. A technology including both real-time examination and on-line recalibration of the external parameters of the stereo system therefore becomes particularly important. This paper presents an on-line method for checking and recalibrating the positional relationship between stereo cameras. In epipolar geometry, the external parameters of the cameras can be obtained by factorization of the fundamental matrix, which offers a way to calculate the external camera parameters without any special targets. If the intrinsic camera parameters are known, the external parameters of the system can be calculated from a number of randomly matched points. The process is: (i) estimating the fundamental matrix from the feature point correspondences; (ii) computing the essential matrix from the fundamental matrix; (iii) obtaining the external parameters by decomposition of the essential matrix. In the step of computing the fundamental matrix, the traditional methods are sensitive to noise and cannot ensure estimation accuracy. We consider the feature distribution in actual scene images and introduce a regional weighted normalization algorithm to improve the accuracy of the fundamental matrix estimation. Experiments on simulated data show that, in contrast to traditional algorithms, the method improves the robustness and accuracy of fundamental matrix estimation. Finally, we carry out an experiment computing the relationship of a pair of stereo cameras to demonstrate the accurate performance of the algorithm.
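Steps (ii) and (iii) follow the standard two-view geometry relations, written out below for reference. The notation is an assumption introduced here: K_1 and K_2 are the intrinsic matrices of the two cameras, and the fundamental matrix F is taken to satisfy x_2^T F x_1 = 0 for matched image points. The abstract does not give the authors' notation or their regional weighted normalization.

```latex
% Step (ii): essential matrix from the fundamental matrix and intrinsics
E = K_2^{\top} F \, K_1

% Step (iii): decomposition via the singular value decomposition
E = U \, \mathrm{diag}(\sigma, \sigma, 0) \, V^{\top},
\qquad
W = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}

% Candidate rotations and translation (translation known only up to scale)
R \in \{\, U W V^{\top},\; U W^{\top} V^{\top} \,\},
\qquad
t = \pm\, u_3 \quad (\text{third column of } U)
```

Of the four (R, t) combinations, the physically valid one is the one with det R = +1 that places triangulated points in front of both cameras.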
Stochastic memory: Memory enhancement due to noise

NASA Astrophysics Data System (ADS)

Stotland, Alexander; di Ventra, Massimiliano

2012-01-01

There are certain classes of resistors, capacitors, and inductors that, when subject to a periodic input of appropriate frequency, develop hysteresis loops in their characteristic response. Here we show that the hysteresis of such memory elements can also be induced by white noise of appropriate intensity even at very low frequencies of the external driving field. We illustrate this phenomenon using a physical model of memory resistor realized by TiO2 thin films sandwiched between metallic electrodes and discuss under which conditions this effect can be observed experimentally. We also discuss its implications on existing memory systems described in the literature and the role of colored noise.
Quantum lattice representations for vector solitons in external potentials

NASA Astrophysics Data System (ADS)

Vahala, George; Vahala, Linda; Yepez, Jeffrey

2006-03-01

A quantum lattice algorithm is developed to examine the effect of an external potential well on exactly integrable vector Manakov solitons. It is found that the exact solutions to the coupled nonlinear Schrodinger equations act like quasi-solitons in weak potentials, leading to mode-locking, trapping and untrapping. Stronger potential wells will lead to the emission of radiation modes from the quasi-soliton initial conditions. If the external potential is applied to that particular mode polarization, then the radiation will be trapped within the potential well. The algorithm developed leads to a finite difference scheme that is unconditionally stable. The Manakov system in an external potential is very closely related to the Gross-Pitaevskii equation for the ground state wave functions of a coupled BEC state at T=0 K.
Automatic mesh adaptivity for hybrid Monte Carlo/deterministic neutronics modeling of difficult shielding problems

DOE PAGES

Ibrahim, Ahmad M.; Wilson, Paul P.H.; Sawan, Mohamed E.; ...

2015-06-30

The CADIS and FW-CADIS hybrid Monte Carlo/deterministic techniques dramatically increase the efficiency of neutronics modeling, but their use in the accurate design analysis of very large and geometrically complex nuclear systems has been limited by the large number of processors and memory requirements for their preliminary deterministic calculations and final Monte Carlo calculation. Three mesh adaptivity algorithms were developed to reduce the memory requirements of CADIS and FW-CADIS without sacrificing their efficiency improvement. First, a macromaterial approach enhances the fidelity of the deterministic models without changing the mesh. Second, a deterministic mesh refinement algorithm generates meshes that capture as much geometric detail as possible without exceeding a specified maximum number of mesh elements. Finally, a weight window coarsening algorithm decouples the weight window mesh and energy bins from the mesh and energy group structure of the deterministic calculations in order to remove the memory constraint of the weight window map from the deterministic mesh resolution. The three algorithms were used to enhance an FW-CADIS calculation of the prompt dose rate throughout the ITER experimental facility. Using these algorithms resulted in a 23.3% increase in the number of mesh tally elements in which the dose rates were calculated in a 10-day Monte Carlo calculation and, additionally, increased the efficiency of the Monte Carlo simulation by a factor of at least 3.4. The three algorithms enabled this difficult calculation to be accurately solved using an FW-CADIS simulation on a regular computer cluster, eliminating the need for a world-class supercomputer.
Memory-induced resonancelike suppression of spike generation in a resonate-and-fire neuron model

NASA Astrophysics Data System (ADS)

Mankin, Romi; Paekivi, Sander

2018-01-01

The behavior of a stochastic resonate-and-fire neuron model based on a reduction of a fractional noise-driven generalized Langevin equation (GLE) with a power-law memory kernel is considered. The effect of temporally correlated random activity of synaptic inputs, which arise from other neurons forming local and distant networks, is modeled as an additive fractional Gaussian noise in the GLE. Using a first-passage-time formulation, in certain system parameter domains exact expressions for the output interspike interval (ISI) density and for the survival probability (the probability that a spike is not generated) are derived and their dependence on input parameters, especially on the memory exponent, is analyzed. In the case of external white noise, it is shown that at intermediate values of the memory exponent the survival probability is significantly enhanced in comparison with the cases of strong and weak memory, which causes a resonancelike suppression of the probability of spike generation as a function of the memory exponent. Moreover, an examination of the dependence of multimodality in the ISI distribution on input parameters shows that there exists a critical memory exponent αc≈0.402 , which marks a dynamical transition in the behavior of the system. That phenomenon is illustrated by a phase diagram describing the emergence of three qualitatively different structures of the ISI distribution. Similarities and differences between the behavior of the model at internal and external noises are also discussed.
Bernard Stiegler's Philosophy of Technology: Invention, Decision, and Education in Times of Digitization

ERIC Educational Resources Information Center

Kouppanou, Anna

2015-01-01

Bernard Stiegler's concept of individuation suggests that the human being is co-constituted with technology. Technology precedes the individual in the respect that the latter is thrown in a technological world that always already contains externally inscribed memories--what he calls tertiary memories--that selectively form the individual and the…

Recognition Confidence under Violated and Confirmed Memory Expectations

ERIC Educational Resources Information Center

Jaeger, Antonio; Cox, Justin C.; Dobbins, Ian G.

2012-01-01

Individuals' memory experiences typically covary with those of others' around them, and on average, an item is more likely to be familiar if a companion recommends it as such. Although it would be ideal if observers could use the external recommendations of others' as statistical priors during recognition decisions, it is currently unclear how or…
Remembering the past and imagining the future: attachment effects on production of episodic details in close relationships.

PubMed

Cao, Xiancai; Madore, Kevin P; Wang, Dahua; Schacter, Daniel L

2018-09-01

Attachment theories and studies have shown that Internal Working Models (IWMs) can impact autobiographical memory and future-oriented information processing relevant to close relationships. According to the constructive episodic simulation hypothesis (CESH), both remembering the past and imagining the future rely on episodic memory. We hypothesised that one way IWMs may bridge past experiences and future adaptations is via episodic memory. The present study investigated the association between attachment and episodic specificity in attachment-relevant and attachment-irrelevant memory and imagination among young and older adults. We measured the attachment style of 37 young adults and 40 older adults, and then asked them to remember or imagine attachment-relevant and attachment-irrelevant events. Participants' narratives were coded for internal details (i.e., episodic) and external details (e.g., semantic, repetitions). The results showed that across age group, secure individuals generated more internal details and fewer external details in attachment-relevant tasks compared to attachment-irrelevant tasks; these differences were not observed in insecure individuals. These findings support the CESH and provide a new perspective to understand the function of IWMs.
Lack of color integration in visual short-term memory binding.

PubMed

Parra, Mario A; Cubelli, Roberto; Della Sala, Sergio

2011-10-01

Bicolored objects are retained in visual short-term memory (VSTM) less efficiently than unicolored objects. This is unlike shape-color combinations, whose retention in VSTM does not differ from that observed for shapes only. It is debated whether this is due to a lack of color integration and whether this may reflect the function of separate memory mechanisms. Participants judged whether the colors of bicolored objects (each with an external and an internal color) were the same or different across two consecutive screens. Colors had to be remembered either individually or in combination. In Experiment 1, external colors in the combined colors condition were remembered better than the internal colors, and performance for both was worse than that in the individual colors condition. The lack of color integration observed in Experiment 1 was further supported by a reduced capacity of VSTM to retain color combinations, relative to individual colors (Experiment 2). An additional account was found in Experiment 3, which showed spared color-color binding in the presence of impaired shape-color binding in a brain-damaged patient, thus suggesting that these two memory mechanisms are different.
Light-weight cryptography for resource constrained environments

NASA Astrophysics Data System (ADS)

Baier, Patrick; Szu, Harold

2006-04-01

We give a survey of "light-weight" encryption algorithms designed to maximise security within tight resource constraints (limited memory, power consumption, processor speed, chip area, etc.). The target applications of such algorithms are RFIDs, smart cards, mobile phones, etc., which may store, process and transmit sensitive data, but at the same time do not always support conventional strong algorithms. A survey of existing algorithms is given and a new proposal is introduced.
A fast optimization algorithm for multicriteria intensity modulated proton therapy planning.

PubMed

Chen, Wei; Craft, David; Madden, Thomas M; Zhang, Kewu; Kooy, Hanne M; Herman, Gabor T

2010-09-01

To describe a fast projection algorithm for optimizing intensity modulated proton therapy (IMPT) plans and to describe and demonstrate the use of this algorithm in multicriteria IMPT planning. The authors develop a projection-based solver for a class of convex optimization problems and apply it to IMPT treatment planning. The speed of the solver permits its use in multicriteria optimization, where several optimizations are performed which span the space of possible treatment plans. The authors describe a plan database generation procedure which is customized to the requirements of the solver. The optimality precision of the solver can be specified by the user. The authors apply the algorithm to three clinical cases: a pancreas case, an esophagus case, and a tumor along the rib cage case. Detailed analysis of the pancreas case shows that the algorithm is orders of magnitude faster than industry-standard general purpose algorithms (MOSEK's interior point optimizer, primal simplex optimizer, and dual simplex optimizer). Additionally, the projection solver has almost no memory overhead. The speed and guaranteed accuracy of the algorithm make it suitable for use in multicriteria treatment planning, which requires the computation of several diverse treatment plans. Additionally, given the low memory overhead of the algorithm, the method can be extended to include multiple geometric instances and proton range possibilities, for robust optimization.
High-speed parallel implementation of a modified PBR algorithm on DSP-based EH topology

NASA Astrophysics Data System (ADS)

Rajan, K.; Patnaik, L. M.; Ramakrishna, J.

1997-08-01

Algebraic Reconstruction Technique (ART) is an age-old method used for solving the problem of three-dimensional (3-D) reconstruction from projections in electron microscopy and radiology. In medical applications, direct 3-D reconstruction is at the forefront of investigation. The simultaneous iterative reconstruction technique (SIRT) is an ART-type algorithm with the potential of generating in a few iterations tomographic images of a quality comparable to that of convolution backprojection (CBP) methods. Pixel-based reconstruction (PBR) is similar to SIRT reconstruction, and it has been shown that PBR algorithms give better quality pictures compared to those produced by SIRT algorithms. In this work, we propose a few modifications to the PBR algorithms. The modified algorithms are shown to give better quality pictures compared to PBR algorithms. The PBR algorithm and the modified PBR algorithms are highly compute intensive. Not many attempts have been made to reconstruct objects in the true 3-D sense because of the high computational overhead. In this study, we have developed parallel two-dimensional (2-D) and 3-D reconstruction algorithms based on modified PBR. We attempt to solve the two problems encountered by the PBR and modified PBR algorithms, i.e., the long computational time and the large memory requirements, by parallelizing the algorithm on a multiprocessor system. We investigate the possible task and data partitioning schemes by exploiting the potential parallelism in the PBR algorithm subject to minimizing the memory requirement. We have implemented an extended hypercube (EH) architecture for the high-speed execution of the 3-D reconstruction algorithm using the commercially available fast floating point digital signal processor (DSP) chips as the processing elements (PEs) and dual-port random access memories (DPR) as channels between the PEs.
We discuss and compare the performances of the PBR algorithm on an IBM 6000 RISC workstation, on a Silicon Graphics Indigo 2 workstation, and on an EH system. The results show that an EH(3,1) using DSP chips as PEs executes the modified PBR algorithm about 100 times faster than an IBM 6000 RISC workstation. We have executed the algorithms on a 4-node IBM SP2 parallel computer. The results show that execution time of the algorithm on an EH(3,1) is better than that of a 4-node IBM SP2 system. The speed-up of an EH(3,1) system with eight PEs and one network controller is approximately 7.85.
FPGA Vision Data Architecture

NASA Technical Reports Server (NTRS)

Morfopoulos, Arin C.; Pham, Thang D.

2013-01-01

JPL has produced a series of FPGA (field programmable gate array) vision algorithms that were written with custom interfaces to get data in and out of each vision module. Each module has unique requirements on the data interface, and further vision modules are continually being developed, each with their own custom interfaces. Each memory module had also been designed for direct access to memory or to another memory module.
An episodic specificity induction enhances means-end problem solving in young and older adults.

PubMed

Madore, Kevin P; Schacter, Daniel L

2014-12-01

Episodic memory plays an important role not only in remembering past experiences, but also in constructing simulations of future experiences and solving means-end social problems. We recently found that an episodic specificity induction (brief training in recollecting details of past experiences) enhances performance of young and older adults on memory and imagination tasks. Here we tested the hypothesis that this specificity induction would also positively impact a means-end problem-solving task on which age-related changes have been linked to impaired episodic memory. Young and older adults received the specificity induction or a control induction before completing a means-end problem-solving task, as well as memory and imagination tasks. Consistent with previous findings, older adults provided fewer relevant steps on problem solving than did young adults, and their responses also contained fewer internal (i.e., episodic) details across the 3 tasks. There was no difference in the number of other (e.g., irrelevant) steps on problem solving or external (i.e., semantic) details generated on the 3 tasks as a function of age. Critically, the specificity induction increased the number of relevant steps and internal details (but not other steps or external details) that both young and older adults generated in problem solving compared with the control induction, as well as the number of internal details (but not external details) generated for memory and imagination. Our findings support the idea that episodic retrieval processes are involved in means-end problem solving, extend the range of tasks on which a specificity induction targets these processes, and show that the problem-solving performance of older adults can benefit from a specificity induction as much as that of young adults. (PsycINFO Database Record (c) 2014 APA, all rights reserved).
An episodic specificity induction enhances means-end problem solving in young and older adults

PubMed Central

Madore, Kevin P.; Schacter, Daniel L.

2014-01-01

Episodic memory plays an important role not only in remembering past experiences, but also in constructing simulations of future experiences and solving means-end social problems. We recently found that an episodic specificity induction (brief training in recollecting details of past experiences) enhances performance of young and older adults on memory and imagination tasks. Here we tested the hypothesis that this specificity induction would also positively impact a means-end problem solving task on which age-related changes have been linked to impaired episodic memory. Young and older adults received the specificity induction or a control induction before completing a means-end problem solving task as well as memory and imagination tasks. Consistent with previous findings, older adults provided fewer relevant steps on problem solving than did young adults, and their responses also contained fewer internal (i.e., episodic) details across the three tasks. There was no difference in the number of other (e.g., irrelevant) steps on problem solving or external (i.e., semantic) details generated on the three tasks as a function of age. Critically, the specificity induction increased the number of relevant steps and internal details (but not other steps or external details) that both young and older adults generated in problem solving compared with the control induction, as well as the number of internal details (but not external details) generated for memory and imagination. Our findings support the idea that episodic retrieval processes are involved in means-end problem solving, extend the range of tasks on which a specificity induction targets these processes, and show that the problem solving performance of older adults can benefit from a specificity induction as much as that of young adults. PMID:25365688
Memory-Intensive Benchmarks: IRAM vs. Cache-Based Machines

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Gaeke, Brian R.; Husbands, Parry; Li, Xiaoye S.; Oliker, Leonid; Yelick, Katherine A.; Biegel, Bryan (Technical Monitor)

2002-01-01

The increasing gap between processor and memory performance has led to new architectural models for memory-intensive applications. In this paper, we explore the performance of a set of memory-intensive benchmarks and use them to compare the performance of conventional cache-based microprocessors to a mixed logic and DRAM processor called VIRAM. The benchmarks are based on problem statements, rather than specific implementations, and in each case we explore the fundamental hardware requirements of the problem, as well as alternative algorithms and data structures that can help expose fine-grained parallelism or simplify memory access patterns. The benchmarks are characterized by their memory access patterns, their basic control structures, and the ratio of computation to memory operation.
Flexible Organic Tribotronic Transistor Memory for a Visible and Wearable Touch Monitoring System.

PubMed

Li, Jing; Zhang, Chi; Duan, Lian; Zhang, Li Min; Wang, Li Duo; Dong, Gui Fang; Wang, Zhong Lin

2016-01-06

A new type of flexible organic tribotronic transistor memory is proposed, which can be written and erased by externally applied touch actions as an active memory. By further coupling with an organic light-emitting diode (OLED), a visible and wearable touch monitoring system is achieved, in which touch triggering can be memorized and shown as the emission from the OLED. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Differential Effects of Paced and Unpaced Responding on Delayed Serial Order Recall in Schizophrenia

PubMed Central

Hill, S. Kristian; Griffin, Ginny B.; Houk, James C.; Sweeney, John A.

2011-01-01

Working memory for temporal order is a component of working memory that is especially dependent on striatal systems, but has not been extensively studied in schizophrenia. This study was designed to characterize serial order reproduction by adapting a spatial serial order task developed for nonhuman primate studies, while controlling for working memory load and whether responses were initiated freely (unpaced) or in an externally paced format. Clinically stable schizophrenia patients (n=27) and psychiatrically healthy individuals (n=25) were comparable on demographic variables and performance on standardized tests of immediate serial order recall (Digit Span, Spatial Span). No group differences were observed for serial order recall when read sequence reproduction was unpaced. However, schizophrenia patients exhibited significant impairments when responding was paced, regardless of sequence length or retention delay. Intact performance by schizophrenia patients during the unpaced condition indicates that prefrontal storage and striatal output systems are sufficiently intact to learn novel response sequences and hold them in working memory to perform serial order tasks. However, retention for newly learned response sequences was disrupted in schizophrenia patients by paced responding, when read-out of each element in the response sequence was externally controlled. The disruption of memory for serial order in paced read-out condition indicates a deficit in frontostriatal interaction characterized by an inability to update working memory stores and deconstruct ‘chunked’ information. PMID:21705197
Bundle block adjustment of large-scale remote sensing data with Block-based Sparse Matrix Compression combined with Preconditioned Conjugate Gradient

NASA Astrophysics Data System (ADS)

Zheng, Maoteng; Zhang, Yongjun; Zhou, Shunping; Zhu, Junfeng; Xiong, Xiaodong

2016-07-01

In recent years, new platforms and sensors in photogrammetry, remote sensing and computer vision areas have become available, such as Unmanned Aircraft Vehicles (UAV), oblique camera systems, common digital cameras and even mobile phone cameras. Images collected by all these kinds of sensors could be used as remote sensing data sources. These sensors can obtain large-scale remote sensing data which consist of a great number of images. Bundle block adjustment of large-scale data with a conventional algorithm consumes a great deal of time and memory because of the very large normal matrix arising from large-scale data. In this paper, an efficient Block-based Sparse Matrix Compression (BSMC) method combined with the Preconditioned Conjugate Gradient (PCG) algorithm is chosen to develop a stable and efficient bundle block adjustment system in order to deal with the large-scale remote sensing data. The main contribution of this work is the BSMC-based PCG algorithm, which is more efficient in time and memory than the traditional algorithm without compromising the accuracy. A total of 8 real datasets are used to test our proposed method. Preliminary results have shown that the BSMC method can efficiently decrease the time and memory requirement of large-scale data.
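As a rough illustration of the PCG solver named in this abstract, the sketch below runs conjugate gradient with a simple diagonal (Jacobi) preconditioner on a small dense symmetric positive-definite system. The function name `pcg`, the dense list-of-lists storage and the Jacobi preconditioner are assumptions made for illustration only; the abstract does not describe the BSMC storage scheme or the preconditioner the authors used, and neither is reproduced here.

```python
def pcg(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A (list of lists)
    using conjugate gradient with a diagonal (Jacobi) preconditioner."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n                              # initial guess
    r = list(b)                                # residual b - A x for x = 0
    z = [r[i] / A[i][i] for i in range(n)]     # apply M^-1 = diag(A)^-1
    p = list(z)                                # first search direction
    rz = dot(r, z)

    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:             # stop once the residual is small
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [r[i] / A[i][i] for i in range(n)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(pcg(A, b))
```

In a bundle adjustment setting the matrix would be the large sparse normal matrix, so only the matrix-vector product and the preconditioner application would need to change; the iteration itself stays the same.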
A portable approach for PIC on emerging architectures

NASA Astrophysics Data System (ADS)

Decyk, Viktor

2016-03-01

A portable approach for designing Particle-in-Cell (PIC) algorithms on emerging exascale computers is based on the recognition that 3 distinct programming paradigms are needed. They are: low level vector (SIMD) processing, middle level shared memory parallel programming, and high level distributed memory programming. In addition, there is a memory hierarchy associated with each level. Such algorithms can be initially developed using vectorizing compilers, OpenMP, and MPI. This is the approach recommended by Intel for the Phi processor. These algorithms can then be translated and possibly specialized to other programming models and languages, as needed. For example, the vector processing and shared memory programming might be done with CUDA instead of vectorizing compilers and OpenMP, but generally the algorithm itself is not greatly changed. The UCLA PICKSC web site at http://www.idre.ucla.edu/ contains example open source skeleton codes (mini-apps) illustrating each of these three programming models, individually and in combination. Fortran2003 now supports abstract data types, and design patterns can be used to support a variety of implementations within the same code base. Fortran2003 also supports interoperability with C so that implementations in C languages are also easy to use. Finally, main codes can be translated into dynamic environments such as Python, while still taking advantage of high performing compiled languages. Parallel languages are still evolving with interesting developments in Co-Array Fortran, UPC, and OpenACC, among others, and these can also be supported within the same software architecture. Work supported by NSF and DOE Grants.
Experimental Simulation of Active Control With On-line System Identification on Sound Transmission Through an Elastic Plate

NASA Technical Reports Server (NTRS)

1998-01-01

An adaptive control algorithm with on-line system identification capability has been developed. One of the great advantages of this scheme is that an additional system identification mechanism, such as an additional uncorrelated random signal generator as the source of system identification, is not required. A time-varying plate-cavity system is used to demonstrate the control performance of this algorithm. The time-varying system consists of a stainless-steel plate which is bolted down on a rigid cavity opening where the cavity depth was changed with respect to time. For a given externally located harmonic sound excitation, the system identification and the control are simultaneously executed to minimize the transmitted sound in the cavity. The control performance of the algorithm is examined for two cases. In the first case, with all the water drained, the external disturbance frequency is swept at 1 Hz/sec. The result shows an excellent frequency tracking capability with cavity internal sound suppression of 40 dB. In the second case, the cavity is initially empty and the water level is then raised to 3/20 full in 60 seconds while the external sound excitation is held at a fixed frequency. Hence, the cavity resonant frequency decreases and passes the external sound excitation frequency. The algorithm shows 40 dB transmitted noise suppression without compromising the system identification tracking capability.
High Performance Implementation of 3D Convolutional Neural Networks on a GPU.

PubMed

Lan, Qiang; Wang, Zelong; Wen, Mei; Zhang, Chunyuan; Wang, Yijie

2017-01-01

Convolutional neural networks have proven to be highly successful in applications such as image classification, object tracking, and many other tasks based on 2D inputs. Recently, researchers have started to apply convolutional neural networks to video classification, which constitutes a 3D input and requires far larger amounts of memory and much more computation. FFT based methods can reduce the amount of computation, but this generally comes at the cost of an increased memory requirement. On the other hand, the Winograd Minimal Filtering Algorithm (WMFA) can reduce the number of operations required and thus can speed up the computation, without increasing the required memory. This strategy was shown to be successful for 2D neural networks. We implement the algorithm for 3D convolutional neural networks and apply it to a popular 3D convolutional neural network which is used to classify videos and compare it to cuDNN. For our highly optimized implementation of the algorithm, we observe a twofold speedup for most of the 3D convolution layers of our test network compared to the cuDNN version.
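To make the operation-count claim concrete, here is a minimal sketch of the smallest one-dimensional Winograd case, usually written F(2,3): two outputs of a 3-tap filter are computed with four multiplications instead of the six a direct evaluation needs. The function names `winograd_f23` and `direct_f23` are hypothetical, and this 1D toy is not the authors' 3D GPU implementation; it only shows the arithmetic identity that the 2D and 3D variants nest.

```python
def winograd_f23(d, g):
    """Two outputs of a 3-tap filter g slid over four inputs d,
    using 4 multiplications between data and filter terms."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g

    # Filter transform: depends only on g, so it can be precomputed once.
    u0 = g0
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    u3 = g2

    # The four data-by-filter multiplications.
    m1 = (d0 - d2) * u0
    m2 = (d1 + d2) * u1
    m3 = (d2 - d1) * u2
    m4 = (d1 - d3) * u3

    # Output transform: additions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


def direct_f23(d, g):
    """Reference result: y[i] = sum_k d[i + k] * g[k], 6 multiplications."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


if __name__ == "__main__":
    d = [0.5, -1.0, 2.0, 3.0]
    g = [0.25, 0.5, -1.5]
    print(winograd_f23(d, g), direct_f23(d, g))
```

The two functions should agree up to floating-point rounding. The saving comes from moving work into the filter transform, which is paid once per filter rather than once per output tile.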
High Performance Implementation of 3D Convolutional Neural Networks on a GPU

PubMed Central

Wang, Zelong; Wen, Mei; Zhang, Chunyuan; Wang, Yijie

2017-01-01

Convolutional neural networks have proven to be highly successful in applications such as image classification, object tracking, and many other tasks based on 2D inputs. Recently, researchers have started to apply convolutional neural networks to video classification, which constitutes a 3D input and requires far larger amounts of memory and much more computation. FFT based methods can reduce the amount of computation, but this generally comes at the cost of an increased memory requirement. On the other hand, the Winograd Minimal Filtering Algorithm (WMFA) can reduce the number of operations required and thus can speed up the computation, without increasing the required memory. This strategy was shown to be successful for 2D neural networks. We implement the algorithm for 3D convolutional neural networks and apply it to a popular 3D convolutional neural network which is used to classify videos and compare it to cuDNN. For our highly optimized implementation of the algorithm, we observe a twofold speedup for most of the 3D convolution layers of our test network compared to the cuDNN version. PMID:29250109
Numerical arc segmentation algorithm for a radio conference-NASARC (version 2.0) technical manual

NASA Technical Reports Server (NTRS)

Whyte, Wayne A., Jr.; Heyward, Ann O.; Ponchak, Denise S.; Spence, Rodney L.; Zuzek, John E.

1987-01-01

The information contained in the NASARC (Version 2.0) Technical Manual (NASA TM-100160) and NASARC (Version 2.0) User's Manual (NASA TM-100161) relates to the state of NASARC software development through October 16, 1987. The Technical Manual describes the Numerical Arc Segmentation Algorithm for a Radio Conference (NASARC) concept and the algorithms used to implement the concept. The User's Manual provides information on computer system considerations, installation instructions, description of input files, and program operating instructions. Significant revisions have been incorporated in the Version 2.0 software. These revisions have enhanced the modeling capabilities of the NASARC procedure while greatly reducing the computer run time and memory requirements. Array dimensions within the software have been structured to fit within the currently available 6-megabyte memory capacity of the International Frequency Registration Board (IFRB) computer facility. A piecewise approach to predetermined arc generation in NASARC (Version 2.0) allows worldwide scenarios to be accommodated within these memory constraints while at the same time effecting an overall reduction in computer run time.
Numerical Arc Segmentation Algorithm for a Radio Conference-NASARC, Version 2.0: User's Manual

NASA Technical Reports Server (NTRS)

Whyte, Wayne A., Jr.; Heyward, Ann O.; Ponchak, Denise S.; Spence, Rodney L.; Zuzek, John E.

1987-01-01

The information contained in the NASARC (Version 2.0) Technical Manual (NASA TM-100160) and the NASARC (Version 2.0) User's Manual (NASA TM-100161) relates to the state of the Numerical Arc Segmentation Algorithm for a Radio Conference (NASARC) software development through October 16, 1987. The technical manual describes the NASARC concept and the algorithms which are used to implement it. The User's Manual provides information on computer system considerations, installation instructions, description of input files, and program operation instructions. Significant revisions have been incorporated in the Version 2.0 software over prior versions. These revisions have enhanced the modeling capabilities of the NASARC procedure while greatly reducing the computer run time and memory requirements. Array dimensions within the software have been structured to fit into the currently available 6-megabyte memory capacity of the International Frequency Registration Board (IFRB) computer facility. A piecewise approach to predetermined arc generation in NASARC (Version 2.0) allows worldwide scenarios to be accommodated within these memory constraints while at the same time reducing computer run time.
[Development of a video image system for wireless capsule endoscopes based on DSP].

PubMed

Yang, Li; Peng, Chenglin; Wu, Huafeng; Zhao, Dechun; Zhang, Jinhua

2008-02-01

A video image recorder for wireless capsule endoscopes was designed. The TMS320C6211 DSP from Texas Instruments Inc. is the core processor of this system. Images are periodically acquired from a Composite Video Broadcast Signal (CVBS) source and scaled by a video decoder (SAA7114H). Video data is transported from a high-speed First-in First-out (FIFO) buffer to the Digital Signal Processor (DSP) under the control of a Complex Programmable Logic Device (CPLD). The JPEG algorithm is adopted for image coding, and the compressed data in the DSP is stored to a Compact Flash (CF) card. The TMS320C6211 DSP is mainly used for image compression and data transport. A fast Discrete Cosine Transform (DCT) algorithm and a fast coefficient quantization algorithm are used to accelerate the DSP and reduce the size of the executable code. At the same time, an appropriate address is assigned to each memory, since the memories differ in speed; the memory structure is also optimized. In addition, this system makes extensive use of Extended Direct Memory Access (EDMA) to transport and process image data, which results in stable and high performance.

Loci-STREAM Version 0.9

NASA Technical Reports Server (NTRS)

Wright, Jeffrey; Thakur, Siddharth

2006-01-01

Loci-STREAM is an evolving computational fluid dynamics (CFD) software tool for simulating possibly chemically reacting, possibly unsteady flows in diverse settings, including rocket engines, turbomachines, oil refineries, etc. Loci-STREAM implements a pressure-based flow-solving algorithm that utilizes unstructured grids. (The benefit of low memory usage by pressure-based algorithms is well recognized by experts in the field.) The algorithm is robust for flows at all speeds from zero to hypersonic. The flexibility of arbitrary polyhedral grids enables accurate, efficient simulation of flows in complex geometries, including those of plume-impingement problems. The present version - Loci-STREAM version 0.9 - includes an interface with the Portable, Extensible Toolkit for Scientific Computation (PETSc) library for access to enhanced linear-equation-solving programs therein that accelerate convergence toward a solution. The name "Loci" reflects the creation of this software within the Loci computational framework, which was developed at Mississippi State University for the primary purpose of simplifying the writing of complex multidisciplinary application programs to run in distributed-memory computing environments including clusters of personal computers. Loci has been designed to relieve application programmers of the details of programming for distributed-memory computers.
On ways to overcome the magical capacity limit of working memory.

PubMed

Turi, Zsolt; Alekseichuk, Ivan; Paulus, Walter

2018-04-01

The ability to simultaneously process and maintain multiple pieces of information is limited. Over the past 50 years, observational methods have provided a large amount of insight regarding the neural mechanisms that underpin the mental capacity that we refer to as "working memory." More than 20 years ago, a neural coding scheme was proposed for working memory. As a result of technological developments, we can now not only observe but can also influence brain rhythms in humans. Building on these novel developments, we have begun to externally control brain oscillations in order to extend the limits of working memory.
Multimodal properties and dynamics of gradient echo quantum memory.

PubMed

Hétet, G; Longdell, J J; Sellars, M J; Lam, P K; Buchler, B C

2008-11-14

We investigate the properties of a recently proposed gradient echo memory (GEM) scheme for information mapping between optical and atomic systems. We show that GEM can be described by the dynamic formation of polaritons in k space. This picture highlights the flexibility and robustness with regards to the external control of the storage process. Our results also show that, as GEM is a frequency-encoding memory, it can accurately preserve the shape of signals that have large time-bandwidth products, even at moderate optical depths. At higher optical depths, we show that GEM is a high fidelity multimode quantum memory.
Rapid classification of hippocampal replay content for real-time applications

PubMed Central

Liu, Daniel F.; Karlsson, Mattias P.; Frank, Loren M.; Eden, Uri T.

2016-01-01

Sharp-wave ripple (SWR) events in the hippocampus replay millisecond-timescale patterns of place cell activity related to the past experience of an animal. Interrupting SWR events leads to learning and memory impairments, but how the specific patterns of place cell spiking seen during SWRs contribute to learning and memory remains unclear. A deeper understanding of this issue will require the ability to manipulate SWR events based on their content. Accurate real-time decoding of SWR replay events requires new algorithms that are able to estimate replay content and the associated uncertainty, along with software and hardware that can execute these algorithms for biological interventions on a millisecond timescale. Here we develop an efficient estimation algorithm to categorize the content of replay from multiunit spiking activity. Specifically, we apply real-time decoding methods to each SWR event and then compute the posterior probability of the replay feature. We illustrate this approach by classifying SWR events from data recorded in the hippocampus of a rat performing a spatial memory task into four categories: whether they represent outbound or inbound trajectories and whether the activity is replayed forward or backward in time. We show that our algorithm can classify the majority of SWR events in a recording epoch within 20 ms of the replay onset with high certainty, which makes the algorithm suitable for a real-time implementation with short latencies to incorporate into content-based feedback experiments. PMID:27535369
Efficient parallel and out of core algorithms for constructing large bi-directed de Bruijn graphs.

PubMed

Kundeti, Vamsi K; Rajasekaran, Sanguthevar; Dinh, Hieu; Vaughn, Matthew; Thapar, Vishal

2010-11-15

Assembling genomic sequences from a set of overlapping reads is one of the most fundamental problems in computational biology. Algorithms addressing the assembly problem fall into two broad categories based on the data structures they employ. The first class uses an overlap/string graph and the second type uses a de Bruijn graph. However, with the recent advances in short read sequencing technology, de Bruijn graph based algorithms seem to play a vital role in practice. Efficient algorithms for building these massive de Bruijn graphs are essential in large sequencing projects based on short reads. In an earlier work, an O(n/p) time parallel algorithm has been given for this problem. Here n is the size of the input and p is the number of processors. This algorithm enumerates all possible bi-directed edges which can overlap with a node and ends up generating Θ(nΣ) messages (Σ being the size of the alphabet). In this paper we present a Θ(n/p) time parallel algorithm with a communication complexity that is equal to that of parallel sorting and is not sensitive to Σ. The generality of our algorithm makes it very easy to extend it even to the out-of-core model, and in this case it has an optimal I/O complexity of Θ((n log(n/B)) / (B log(M/B))) (M being the main memory size and B being the size of the disk block). We demonstrate the scalability of our parallel algorithm on an SGI/Altix computer. A comparison of our algorithm with the previous approaches reveals that our algorithm is faster, both asymptotically and practically. We demonstrate the scalability of our sequential out-of-core algorithm by comparing it with the algorithm used by VELVET to build the bi-directed de Bruijn graph. Our experiments reveal that our algorithm can build the graph with a constant amount of memory, which clearly outperforms VELVET. We also provide efficient algorithms for the bi-directed chain compaction problem.
The bi-directed de Bruijn graph is a fundamental data structure for any sequence assembly program based on Eulerian approach. Our algorithms for constructing Bi-directed de Bruijn graphs are efficient in parallel and out of core settings. These algorithms can be used in building large scale bi-directed de Bruijn graphs. Furthermore, our algorithms do not employ any all-to-all communications in a parallel setting and perform better than the prior algorithms. Finally our out-of-core algorithm is extremely memory efficient and can replace the existing graph construction algorithm in VELVET.
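To make the "construction reduces to sorting" idea in the record above concrete, here is a minimal in-memory sketch. It is not the authors' algorithm: the function names (`canonical`, `bidirected_edges`) and the edge encoding are assumptions made for illustration, and an odd k is assumed so that no k-mer equals its own reverse complement. Each (k+1)-mer contributes one edge record, and a single sort brings duplicate edges together; in an out-of-core setting that in-memory sort would be replaced by an external merge sort.

```python
# Illustrative sketch only; names and edge encoding are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Pick the lexicographically smaller of a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def bidirected_edges(reads, k):
    """Return the sorted, distinct bi-directed edges induced by the reads.

    An edge is (u, u_forward, v, v_forward): u and v are canonical k-mers and
    each flag says whether the k-mer was read in its canonical orientation.
    Assumes k is odd, so a k-mer never equals its reverse complement.
    """
    records = []
    for read in reads:
        for i in range(len(read) - k):
            left, right = read[i:i + k], read[i + 1:i + k + 1]
            u, v = canonical(left), canonical(right)
            edge = (u, left == u, v, right == v)
            # The same edge seen from the opposite strand; keep one fixed form.
            mirror = (v, right != v, u, left != u)
            records.append(min(edge, mirror))
    records.sort()  # duplicates become adjacent after sorting
    distinct = []
    for rec in records:
        if not distinct or distinct[-1] != rec:
            distinct.append(rec)
    return distinct


edges = bidirected_edges(["ACGGT"], 3)
```

A read and its reverse complement (`"ACGGT"` and `"ACCGT"`) produce the same edge list, which is the property that makes the graph bi-directed rather than strand-specific.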
Neural bases of prospective memory: a meta-analysis and the "Attention to Delayed Intention" (AtoDI) model.

PubMed

Cona, Giorgia; Scarpazza, Cristina; Sartori, Giuseppe; Moscovitch, Morris; Bisiacchi, Patrizia Silvia

2015-05-01

Remembering to realize delayed intentions is a multi-phase process, labelled as prospective memory (PM), and involves a plurality of neural networks. The present study utilized the activation likelihood estimation method of meta-analysis to provide a complete overview of the brain regions that are consistently activated in each PM phase. We formulated the 'Attention to Delayed Intention' (AtoDI) model to explain the neural dissociation found between intention maintenance and retrieval phases. The dorsal frontoparietal network is involved mainly in the maintenance phase and seems to mediate the strategic monitoring processes, such as the allocation of top-down attention both towards external stimuli, to monitor for the occurrence of the PM cues, and to internal memory contents, to maintain the intention active in memory. The ventral frontoparietal network is recruited in the retrieval phase and might subserve the bottom-up attention captured externally by the PM cues and, internally, by the intention stored in memory. Together with other brain regions (i.e., insula and posterior cingulate cortex), the ventral frontoparietal network would support the spontaneous retrieval processes. The functional contribution of the anterior prefrontal cortex is discussed extensively for each PM phase. Copyright © 2015 Elsevier Ltd. All rights reserved.
Noise reduction improves memory for target language speech in competing native but not foreign language speech.

PubMed

Ng, Elaine Hoi Ning; Rudner, Mary; Lunner, Thomas; Rönnberg, Jerker

2015-01-01

A hearing aid noise reduction (NR) algorithm reduces the adverse effect of competing speech on memory for target speech for individuals with hearing impairment with high working memory capacity. In the present study, we investigated whether the positive effect of NR could be extended to individuals with low working memory capacity, as well as how NR influences recall performance for target native speech when the masker language is non-native. A sentence-final word identification and recall (SWIR) test was administered to 26 experienced hearing aid users. In this test, target spoken native language (Swedish) sentence lists were presented in competing native (Swedish) or foreign (Cantonese) speech with or without binary masking NR algorithm. After each sentence list, free recall of sentence final words was prompted. Working memory capacity was measured using a reading span (RS) test. Recall performance was associated with RS. However, the benefit obtained from NR was not associated with RS. Recall performance was more disrupted by native than foreign speech babble and NR improved recall performance in native but not foreign competing speech. Noise reduction improved memory for speech heard in competing speech for hearing aid users. Memory for native speech was more disrupted by native babble than foreign babble, but the disruptive effect of native speech babble was reduced to that of foreign babble when there was NR.
Computational mechanics analysis tools for parallel-vector supercomputers

NASA Technical Reports Server (NTRS)

Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.; Qin, J.

1993-01-01

Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigen-solution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization algorithm and domain decomposition. The source code for many of these algorithms is available from NASA Langley.
The Linked Neighbour List (LNL) method for fast off-lattice Monte Carlo simulations of fluids

NASA Astrophysics Data System (ADS)

Mazzeo, M. D.; Ricci, M.; Zannoni, C.

2010-03-01

We present a new algorithm, called linked neighbour list (LNL), useful to substantially speed up off-lattice Monte Carlo simulations of fluids by avoiding the computation of the molecular energy before every attempted move. We introduce a few variants of the LNL method targeted to minimise memory footprint or augment memory coherence and cache utilisation. Additionally, we present a few algorithms which drastically accelerate neighbour finding. We test our methods on the simulation of a dense off-lattice Gay-Berne fluid subjected to periodic boundary conditions observing a speedup factor of about 2.5 with respect to a well-coded implementation based on a conventional link-cell. We provide several implementation details of the different key data structures and algorithms used in this work.
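For orientation, the conventional link-cell baseline that the record above compares against can be sketched as follows. This is not the LNL method itself, whose details the abstract does not give; the function names, the cubic periodic box, and the assumption that coordinates lie in [0, box) are all illustrative choices.

```python
import math
from itertools import product


def build_cells(positions, box, cutoff):
    """Bin particle indices into cubic cells at least `cutoff` wide."""
    n = max(1, int(box // cutoff))  # number of cells per box side
    size = box / n
    cells = {}
    for idx, pos in enumerate(positions):
        key = tuple(math.floor(c / size) % n for c in pos)
        cells.setdefault(key, []).append(idx)
    return cells, n


def neighbours(i, positions, cells, n, box, cutoff):
    """Find particles within `cutoff` of particle i using only adjacent cells."""
    size = box / n
    home = tuple(math.floor(c / size) % n for c in positions[i])
    found = set()  # a set, because wrapped cells can repeat when n < 3
    for shift in product((-1, 0, 1), repeat=3):
        key = tuple((h + s) % n for h, s in zip(home, shift))
        for j in cells.get(key, ()):
            if j == i:
                continue
            d2 = 0.0
            for a, b in zip(positions[i], positions[j]):
                d = a - b
                d -= box * round(d / box)  # minimum-image convention
                d2 += d * d
            if d2 < cutoff * cutoff:
                found.add(j)
    return sorted(found)


positions = [(0.5, 0.5, 0.5), (9.8, 0.5, 0.5), (5.0, 5.0, 5.0)]
cells, n = build_cells(positions, box=10.0, cutoff=1.5)
near_first = neighbours(0, positions, cells, n, box=10.0, cutoff=1.5)
```

Particles 0 and 1 sit on opposite faces of the box but are only 0.7 apart through the periodic boundary, so the search over the wrapped neighbouring cell finds them without scanning every pair.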
Simplifying and speeding the management of intra-node cache coherence

DOEpatents

Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton on Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Phillip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Yorktown Heights, NY

2012-04-17

A method and apparatus for managing coherence between two processors of a two processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activities required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.
A Parallel Saturation Algorithm on Shared Memory Architectures

NASA Technical Reports Server (NTRS)

Ezekiel, Jonathan; Siminiceanu

2007-01-01

Symbolic state-space generators are notoriously hard to parallelize. However, the Saturation algorithm implemented in the SMART verification tool differs from other sequential symbolic state-space generators in that it exploits the locality of firing events in asynchronous system models. This paper explores whether event locality can be utilized to efficiently parallelize Saturation on shared-memory architectures. Conceptually, we propose to parallelize the firing of events within a decision diagram node, which is technically realized via a thread pool. We discuss the challenges involved in our parallel design and conduct experimental studies on its prototypical implementation. On a dual-processor dual core PC, our studies show speed-ups for several example models, e.g., of up to 50% for a Kanban model, when compared to running our algorithm only on a single core.
Managing coherence via put/get windows

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blumrich, Matthias A; Chen, Dong; Coteus, Paul W

A method and apparatus for managing coherence between two processors of a two processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activities required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.
An efficient parallel termination detection algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, A. H.; Crivelli, S.; Jessup, E. R.

2004-05-27

Information local to any one processor is insufficient to monitor the overall progress of most distributed computations. Typically, a second distributed computation for detecting termination of the main computation is necessary. In order to be a useful computational tool, the termination detection routine must operate concurrently with the main computation, adding minimal overhead, and it must promptly and correctly detect termination when it occurs. In this paper, we present a new algorithm for detecting the termination of a parallel computation on distributed-memory MIMD computers that satisfies all of those criteria. A variety of termination detection algorithms have been devised. Of these, the algorithm presented by Sinha, Kale, and Ramkumar (henceforth, the SKR algorithm) is unique in its ability to adapt to the load conditions of the system on which it runs, thereby minimizing the impact of termination detection on performance. Because their algorithm also detects termination quickly, we consider it to be the most efficient practical algorithm presently available. The termination detection algorithm presented here was developed for use in the PMESC programming library for distributed-memory MIMD computers. Like the SKR algorithm, our algorithm adapts to system loads and imposes little overhead. Also like the SKR algorithm, ours is tree-based, and it does not depend on any assumptions about the physical interconnection topology of the processors or the specifics of the distributed computation. In addition, our algorithm is easier to implement and requires only half as many tree traverses as does the SKR algorithm. This paper is organized as follows. In section 2, we define our computational model. In section 3, we review the SKR algorithm. We introduce our new algorithm in section 4, and prove its correctness in section 5. We discuss its efficiency and present experimental results in section 6.
Synaptic Correlates of Working Memory Capacity.

PubMed

Mi, Yuanyuan; Katkov, Mikhail; Tsodyks, Misha

2017-01-18

Psychological studies indicate that human ability to keep information in readily accessible working memory is limited to four items for most people. This extremely low capacity severely limits execution of many cognitive tasks, but its neuronal underpinnings remain unclear. Here we show that in the framework of synaptic theory of working memory, capacity can be analytically estimated to scale with characteristic time of short-term synaptic depression relative to synaptic current time constant. The number of items in working memory can be regulated by external excitation, enabling the system to be tuned to the desired load and to clear the working memory of currently held items to make room for new ones. Copyright © 2017 Elsevier Inc. All rights reserved.
Brief Report: Memory Performance on the California Verbal Learning Test-Children's Version in Autism Spectrum Disorder

ERIC Educational Resources Information Center

Phelan, Heather L.; Filliter, Jillian H.; Johnson, Shannon A.

2011-01-01

According to the Task Support Hypothesis (TSH; Bowler et al. in Neuropsychologia 35:65-70, 1997) individuals with autism spectrum disorder (ASD) perform more similarly to their typically developing peers on learning and memory tasks when provided with external support at retrieval. We administered the California Verbal Learning Test-Children's…
Internalizing versus Externalizing Control: Different Ways to Perform a Time-Based Prospective Memory Task

ERIC Educational Resources Information Center

Huang, Tracy; Loft, Shayne; Humphreys, Michael S.

2014-01-01

"Time-based prospective memory" (PM) refers to performing intended actions at a future time. Participants with time-based PM tasks can be slower to perform ongoing tasks (costs) than participants without PM tasks because internal control is required to maintain the PM intention or to make prospective-timing estimates. However, external…
Rapid knowledge assessment (RKA): Assessing students content knowledge through rapid, in class assessment of expertise

NASA Astrophysics Data System (ADS)

O'Connell, Erin

Understanding how students go about problem solving in chemistry lends many possible advantages for interventions in teaching strategies for the college classroom. The work presented here is the development of an in-classroom, real-time, formative instrument to assess student expertise in chemistry with the purpose of developing classroom interventions. The development of appropriate interventions requires the understanding of how students go about starting to solve tasks presented to them, what their mental effort (load on working memory) is, and whether or not their performance was accurate. To measure this, the Rapid Knowledge Assessment (RKA) instrument uses clickers (handheld electronic instruments for submitting answers) as a means of data collection. The classroom data was used to develop an algorithm to deliver student assessment scores, which when correlated to external measure of standardized American Chemical Society (ACS) examinations and class score show a significant relationship between the accuracy of knowledge assessment (p=0.000). Use of eye-tracking technology and student interviews supports the measurements found in the classroom.
Integrated Network Decompositions and Dynamic Programming for Graph Optimization (INDDGO)

DOE Office of Scientific and Technical Information (OSTI.GOV)

The INDDGO software package offers a set of tools for finding exact solutions to graph optimization problems via tree decompositions and dynamic programming algorithms. Currently the framework offers serial and parallel (distributed memory) algorithms for finding tree decompositions and solving the maximum weighted independent set problem. The parallel dynamic programming algorithm is implemented on top of the MADNESS task-based runtime.
PACE: Power-Aware Computing Engines

DTIC Science & Technology

2005-02-01

more costly than computation on our test platform, and it is memory access that dominates most lossless data compression algorithms. In fact, even...Performance and implementation concerns A compression algorithm may be implemented with many different, yet reasonable, data structures (including...Related work This section discusses data compression for low-bandwidth devices and optimizing algorithms for low energy. Though much work has gone
Efficient parallel linear scaling construction of the density matrix for Born-Oppenheimer molecular dynamics.

PubMed

Mniszewski, S M; Cawkwell, M J; Wall, M E; Mohd-Yusof, J; Bock, N; Germann, T C; Niklasson, A M N

2015-10-13

We present an algorithm for the calculation of the density matrix that for insulators scales linearly with system size and parallelizes efficiently on multicore, shared memory platforms with small and controllable numerical errors. The algorithm is based on an implementation of the second-order spectral projection (SP2) algorithm [Niklasson, A. M. N. Phys. Rev. B 2002, 66, 155115] in sparse matrix algebra with the ELLPACK-R data format. We illustrate the performance of the algorithm within self-consistent tight binding theory by total energy calculations of gas phase poly(ethylene) molecules and periodic liquid water systems containing up to 15,000 atoms on up to 16 CPU cores. We consider algorithm-specific performance aspects, such as local vs nonlocal memory access and the degree of matrix sparsity. Comparisons to sparse matrix algebra implementations using off-the-shelf libraries on multicore CPUs, graphics processing units (GPUs), and the Intel many integrated core (MIC) architecture are also presented. The accuracy and stability of the algorithm are illustrated with long duration Born-Oppenheimer molecular dynamics simulations of 1000 water molecules and a 303 atom Trp cage protein solvated by 2682 water molecules.
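The ELLPACK-R layout named in the record above can be illustrated with a small sketch. ELLPACK-R stores the nonzeros of each row in padded, fixed-width arrays of values and column indices, plus a per-row count of real entries so that padding is never read. The helper names below (`to_ellpack_r`, `spmv`) are hypothetical, and the code shows only a matrix-vector product in this format, not the SP2 density-matrix algorithm or the authors' implementation.

```python
def to_ellpack_r(dense):
    """Convert a dense row-major matrix to (values, cols, row_len) in ELLPACK-R form."""
    row_len = [sum(1 for v in row if v != 0) for row in dense]
    width = max(row_len, default=0)  # padded width = longest row
    values, cols = [], []
    for row in dense:
        nonzeros = [(j, v) for j, v in enumerate(row) if v != 0]
        pad = width - len(nonzeros)
        cols.append([j for j, _ in nonzeros] + [0] * pad)
        values.append([v for _, v in nonzeros] + [0.0] * pad)
    return values, cols, row_len


def spmv(values, cols, row_len, x):
    """Compute y = A x, reading only the first row_len[i] entries of row i."""
    y = []
    for vals, col_idx, count in zip(values, cols, row_len):
        y.append(sum(vals[k] * x[col_idx[k]] for k in range(count)))
    return y


dense = [[2, 0, 0],
         [0, 0, 3],
         [1, 4, 0]]
values, cols, row_len = to_ellpack_r(dense)
y = spmv(values, cols, row_len, [1, 2, 3])
```

The fixed row width gives regular, predictable memory access, and the row-length array keeps the inner loop from doing arithmetic on padding; both properties matter for the local versus nonlocal memory access behaviour the abstract mentions.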

Content addressable memory project

NASA Technical Reports Server (NTRS)

Hall, J. Storrs; Levy, Saul; Smith, Donald E.; Miyake, Keith M.

1992-01-01

A parameterized version of the tree processor was designed and tested (by simulation). The leaf processor design is 90 percent complete. We expect to complete and test a combination of tree and leaf cell designs in the next period. Work is proceeding on algorithms for the content addressable memory (CAM), and once the design is complete we will begin simulating algorithms for large problems. The following topics are covered: (1) the practical implementation of content addressable memory; (2) design of a LEAF cell for the Rutgers CAM architecture; (3) a circuit design tool user's manual; and (4) design and analysis of efficient hierarchical interconnection networks.
Distributed-Memory Computing With the Langley Aerothermodynamic Upwind Relaxation Algorithm (LAURA)

NASA Technical Reports Server (NTRS)

Riley, Christopher J.; Cheatwood, F. McNeil

1997-01-01

The Langley Aerothermodynamic Upwind Relaxation Algorithm (LAURA), a Navier-Stokes solver, has been modified for use in a parallel, distributed-memory environment using the Message-Passing Interface (MPI) standard. A standard domain decomposition strategy is used in which the computational domain is divided into subdomains with each subdomain assigned to a processor. Performance is examined on dedicated parallel machines and a network of desktop workstations. The effect of domain decomposition and frequency of boundary updates on performance and convergence is also examined for several realistic configurations and conditions typical of large-scale computational fluid dynamic analysis.
Hysteresis modeling of magnetic shape memory alloy actuator based on Krasnosel'skii-Pokrovskii model.

PubMed

Zhou, Miaolei; Wang, Shoubin; Gao, Wei

2013-01-01

As a new type of intelligent material, magnetic shape memory alloy (MSMA) performs well in actuator manufacturing applications. Compared with traditional actuators, the MSMA actuator offers fast response and large deformation; however, its hysteresis nonlinearity limits further improvement of control precision. In this paper, an improved Krasnosel'skii-Pokrovskii (KP) model is used to establish the hysteresis model of the MSMA actuator. To identify the weighting parameters of the KP operators, an improved gradient correction algorithm and a variable step-size recursive least square estimation algorithm are proposed. To demonstrate the validity of the proposed modeling approach, simulation experiments are performed with the improved gradient correction algorithm and the variable step-size recursive least square estimation algorithm, respectively. Simulation results of both identification algorithms demonstrate that the proposed modeling approach can establish an effective and accurate hysteresis model for the MSMA actuator, and it provides a foundation for improving the control precision of the MSMA actuator.
Hysteresis Modeling of Magnetic Shape Memory Alloy Actuator Based on Krasnosel'skii-Pokrovskii Model

PubMed Central

Wang, Shoubin; Gao, Wei

2013-01-01

As a new type of intelligent material, magnetic shape memory alloy (MSMA) performs well in actuator manufacturing applications. Compared with traditional actuators, the MSMA actuator offers fast response and large deformation; however, its hysteresis nonlinearity limits further improvement of control precision. In this paper, an improved Krasnosel'skii-Pokrovskii (KP) model is used to establish the hysteresis model of the MSMA actuator. To identify the weighting parameters of the KP operators, an improved gradient correction algorithm and a variable step-size recursive least square estimation algorithm are proposed. To demonstrate the validity of the proposed modeling approach, simulation experiments are performed with the improved gradient correction algorithm and the variable step-size recursive least square estimation algorithm, respectively. Simulation results of both identification algorithms demonstrate that the proposed modeling approach can establish an effective and accurate hysteresis model for the MSMA actuator, and it provides a foundation for improving the control precision of the MSMA actuator. PMID:23737730
A simple algorithm to compute the peak power output of GaAs/Ge solar cells on the Martian surface

DOE Office of Scientific and Technical Information (OSTI.GOV)

Glueck, P.R.; Bahrami, K.A.

1995-12-31

The Jet Propulsion Laboratory's (JPL's) Mars Pathfinder Project will deploy a robotic "microrover" on the surface of Mars in the summer of 1997. This vehicle will derive primary power from a GaAs/Ge solar array during the day and will "sleep" at night. This strategy requires that the rover be able to (1) determine when it is necessary to save the contents of volatile memory late in the afternoon and (2) determine when sufficient power is available to resume operations in the morning. An algorithm was developed that estimates the peak power point of the solar array from the solar array short-circuit current and temperature telemetry, and provides functional redundancy for both measurements using the open-circuit voltage telemetry. The algorithm minimizes vehicle processing and memory utilization by using linear equations instead of look-up tables to estimate peak power with very little loss in accuracy. This paper describes the method used to obtain the algorithm and presents the detailed algorithm design.
Prefrontal Cortex Networks Shift from External to Internal Modes during Learning.

PubMed

Brincat, Scott L; Miller, Earl K

2016-09-14

As we learn about items in our environment, their neural representations become increasingly enriched with our acquired knowledge. But there is little understanding of how network dynamics and neural processing related to external information changes as it becomes laden with "internal" memories. We sampled spiking and local field potential activity simultaneously from multiple sites in the lateral prefrontal cortex (PFC) and the hippocampus (HPC)-regions critical for sensory associations-of monkeys performing an object paired-associate learning task. We found that in the PFC, evoked potentials to, and neural information about, external sensory stimulation decreased while induced beta-band (∼11-27 Hz) oscillatory power and synchrony associated with "top-down" or internal processing increased. By contrast, the HPC showed little evidence of learning-related changes in either spiking activity or network dynamics. The results suggest that during associative learning, PFC networks shift their resources from external to internal processing. As we learn about items in our environment, their representations in our brain become increasingly enriched with our acquired "top-down" knowledge. We found that in the prefrontal cortex, but not the hippocampus, processing of external sensory inputs decreased while internal network dynamics related to top-down processing increased. The results suggest that during learning, prefrontal cortex networks shift their resources from external (sensory) to internal (memory) processing. Copyright © 2016 the authors 0270-6474/16/369739-16$15.00/0.
Prefrontal Cortex Networks Shift from External to Internal Modes during Learning

PubMed Central

Brincat, Scott L.

2016-01-01

As we learn about items in our environment, their neural representations become increasingly enriched with our acquired knowledge. But there is little understanding of how network dynamics and neural processing related to external information changes as it becomes laden with “internal” memories. We sampled spiking and local field potential activity simultaneously from multiple sites in the lateral prefrontal cortex (PFC) and the hippocampus (HPC)—regions critical for sensory associations—of monkeys performing an object paired-associate learning task. We found that in the PFC, evoked potentials to, and neural information about, external sensory stimulation decreased while induced beta-band (∼11–27 Hz) oscillatory power and synchrony associated with “top-down” or internal processing increased. By contrast, the HPC showed little evidence of learning-related changes in either spiking activity or network dynamics. The results suggest that during associative learning, PFC networks shift their resources from external to internal processing. SIGNIFICANCE STATEMENT As we learn about items in our environment, their representations in our brain become increasingly enriched with our acquired “top-down” knowledge. We found that in the prefrontal cortex, but not the hippocampus, processing of external sensory inputs decreased while internal network dynamics related to top-down processing increased. The results suggest that during learning, prefrontal cortex networks shift their resources from external (sensory) to internal (memory) processing. PMID:27629722
FPGA accelerator for protein secondary structure prediction based on the GOR algorithm

PubMed Central

2011-01-01

Background: Protein is an important molecule that performs a wide range of functions in biological systems. Recently, protein folding has attracted much more attention since the function of a protein can generally be derived from its molecular structure. The GOR algorithm is one of the most successful computational methods and has been widely used as an efficient analysis tool to predict secondary structure from protein sequence. However, the execution time is still intolerable given the steep growth of protein databases. Recently, FPGA chips have emerged as a promising application accelerator for bioinformatics algorithms by exploiting fine-grained custom design. Results: In this paper, we propose a complete fine-grained parallel hardware implementation on FPGA to accelerate the GOR-IV package for 2D protein structure prediction. To improve computing efficiency, we partition the parameter table into small segments and access them in parallel. We aggressively exploit data reuse schemes to minimize the need for loading data from external memory. The whole computation structure is carefully pipelined to overlap the sequence loading, computing and back-writing operations as much as possible. We implemented a complete GOR desktop system based on an FPGA chip XC5VLX330. Conclusions: The experimental results show a speedup factor of more than 430x over the original GOR-IV version and 110x speedup over the optimized version with multi-thread SIMD implementation running on a PC platform with AMD Phenom 9650 Quad CPU for 2D protein structure prediction. However, the power consumption is only about 30% of that of current general-purpose CPUs. PMID:21342582
Age differences in perceptions of memory strategy effectiveness for recent and remote memory.

PubMed

Lineweaver, Tara T; Horhota, Michelle; Crumley, Jessica; Geanon, Catherine T; Juett, Jacqueline J

2018-03-01

We examined whether young and older adults hold different beliefs about the effectiveness of memory strategies for specific types of memory tasks and whether memory strategies are perceived to be differentially effective for young, middle-aged, and older targets. Participants rated the effectiveness of five memory strategies for 10 memory tasks at three target ages (20, 50, and 80 years old). Older adults did not strongly differentiate strategy effectiveness, viewing most strategies as similarly effective across memory tasks. Young adults held strategy-specific beliefs, endorsing external aids and physical health as more effective than a positive attitude or internal strategies, without substantial differentiation based on task. We also found differences in anticipated strategy effectiveness for targets of different ages. Older adults described cognitive and physical health strategies as more effective for older than middle-aged targets, whereas young adults expected these strategies to be equally effective for middle-aged and older target adults.
FPGA architecture and implementation of sparse matrix vector multiplication for the finite element method

NASA Astrophysics Data System (ADS)

Elkurdi, Yousef; Fernández, David; Souleimanov, Evgueni; Giannacopoulos, Dennis; Gross, Warren J.

2008-04-01

The Finite Element Method (FEM) is a computationally intensive scientific and engineering analysis tool that has diverse applications ranging from structural engineering to electromagnetic simulation. The trends in floating-point performance are moving in favor of Field-Programmable Gate Arrays (FPGAs), hence increasing interest has grown in the scientific community to exploit this technology. We present an architecture and implementation of an FPGA-based sparse matrix-vector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from FEM applications. FEM matrices display specific sparsity patterns that can be exploited to improve the efficiency of hardware designs. Our architecture exploits FEM matrix sparsity structure to achieve a balance between performance and hardware resource requirements by relying on external SDRAM for data storage while utilizing the FPGAs computational resources in a stream-through systolic approach. The architecture is based on a pipelined linear array of processing elements (PEs) coupled with a hardware-oriented matrix striping algorithm and a partitioning scheme which enables it to process arbitrarily big matrices without changing the number of PEs in the architecture. Therefore, this architecture is only limited by the amount of external RAM available to the FPGA. The implemented SMVM-pipeline prototype contains 8 PEs and is clocked at 110 MHz obtaining a peak performance of 1.76 GFLOPS. For 8 GB/s of memory bandwidth typical of recent FPGA systems, this architecture can achieve 1.5 GFLOPS sustained performance. Using multiple instances of the pipeline, linear scaling of the peak and sustained performance can be achieved. 
Our stream-through architecture provides the added advantage of enabling an iterative implementation of the SMVM computation required by iterative solution techniques such as the conjugate gradient method, avoiding initialization time due to data loading and setup inside the FPGA internal memory.
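The kernel that the abstract accelerates, and its quoted peak figure, can be illustrated with a minimal software sketch. The compressed sparse row (CSR) layout and the assumption that each processing element completes one multiply and one add per clock cycle are illustrative choices made here, not details confirmed by the abstract; under that assumption 8 PEs at 110 MHz give the quoted 1.76 GFLOPS.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a sparse matrix stored in CSR form by a dense vector x."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Only the stored non-zeros of this row are visited.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def peak_gflops(num_pes, clock_hz, flops_per_pe_per_cycle=2):
    """Peak rate, assuming (hypothetically) one multiply-add per PE per cycle."""
    return num_pes * clock_hz * flops_per_pe_per_cycle / 1e9


# The 3x3 matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]] in CSR form.
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
result = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

The inner loop is the part a pipelined array of PEs would stream through; the hardware striping and partitioning scheme mentioned in the abstract is not modelled here.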
Committing to Memory: Memory Prosthetics Show Promise in Helping Those with Neurodegenerative Disorders.

PubMed

Solis, Michele

2017-01-01

Cell phone chimes, sticky notes, even the proverbial string around a finger: these time-honored external cues help guard against our inevitable memory lapses. But some internal help to the brain itself may be on the way in the form of what's being called memory prosthetics. Once considered to be on the fringes of neuroscience, the idea of adding hardware to the brain to help with memory has gathered steam. In 2014, the U.S. Defense Advanced Research Projects Agency (DARPA) made a US$30 million investment in memory prosthetic research as part of the Obama administration's Brain Research through Advancing Innovative Neurotechnologies initiative. In August 2016, Kernel, a startup based in Los Angeles, California, announced its goal to develop a clinical memory device for those debilitated by neurodegenerative disorders such as Alzheimer's disease.
Working memory-driven attention improves spatial resolution: Support for perceptual enhancement.

PubMed

Pan, Yi; Luo, Qianying; Cheng, Min

2016-08-01

Previous research has indicated that attention can be biased toward those stimuli matching the contents of working memory and thereby facilitates visual processing at the location of the memory-matching stimuli. However, whether this working memory-driven attentional modulation takes place on early perceptual processes remains unclear. Our present results showed that working memory-driven attention improved identification of a brief Landolt target presented alone in the visual field. Because the suprathreshold target appeared without any external noise added (i.e., no distractors or masks), the results suggest that working memory-driven attention enhances the target signal at early perceptual stages of visual processing. Furthermore, given that performance in the Landolt target identification task indexes spatial resolution, this attentional facilitation indicates that working memory-driven attention can boost early perceptual processing via enhancement of spatial resolution at the attended location.
Numerical simulation of a helical shape electric arc in the external axial magnetic field

NASA Astrophysics Data System (ADS)

Urusov, R. M.; Urusova, I. R.

2016-10-01

Within the framework of a non-stationary three-dimensional mathematical model, in the approximation of partial local thermodynamic equilibrium, the characteristics of a DC electric arc burning in a cylindrical channel in a uniform external axial magnetic field were calculated numerically. A method for numerical simulation of a helical arc in a uniform external axial magnetic field is proposed. The method consists in supplementing the computational algorithm with a "scheme" analogue of fluctuations of the electron temperature. The "scheme" analogue of fluctuations amplifies a weak numerical asymmetry of the electron temperature distribution, which occurs randomly in the course of computing. This asymmetry can be "picked up" by the external magnetic field that continues to increase up to a certain value, which is sufficient for the formation of a helical structure of the arc column. In the absence of fluctuations in the computational algorithm, the arc column in the external axial magnetic field maintains cylindrical axial symmetry, and a helical form of the arc is not observed.
$\mathscr{H}_2$ optimal control techniques for resistive wall mode feedback in tokamaks

NASA Astrophysics Data System (ADS)

Clement, Mitchell; Hanson, Jeremy; Bialek, Jim; Navratil, Gerald

2018-04-01

DIII-D experiments show that a new, advanced algorithm enables resistive wall mode (RWM) stability control in high performance discharges using external coils. DIII-D can excite strong, locked or nearly locked external kink modes whose rotation frequencies and growth rates are on the order of the magnetic flux diffusion time of the vacuum vessel wall. Experiments have shown that modern control techniques like linear quadratic Gaussian (LQG) control require less current than the proportional controller in use at DIII-D when using control coils external to DIII-D’s vacuum vessel. Experiments were conducted to develop control of a rotating n = 1 perturbation using an LQG controller derived from VALEN and external coils. Feedback using this LQG algorithm outperformed a proportional gain only controller in these perturbation experiments over a range of frequencies. Results from high βN experiments also show that advanced feedback techniques using external control coils may be as effective as internal control coil feedback using classical control techniques.
Optimized Laplacian image sharpening algorithm based on graphic processing unit

NASA Astrophysics Data System (ADS)

Ma, Tinghuai; Li, Lu; Ji, Sai; Wang, Xin; Tian, Yuan; Al-Dhelaan, Abdullah; Al-Rodhaan, Mznah

2014-12-01

In classical Laplacian image sharpening, all pixels are processed one by one, which leads to a large amount of computation. Traditional Laplacian sharpening on a CPU is considerably time-consuming, especially for large pictures. In this paper, we propose a parallel implementation of Laplacian sharpening based on Compute Unified Device Architecture (CUDA), a computing platform for Graphics Processing Units (GPUs), and analyze the impact of picture size on performance and the relationship between data transfer time and parallel computing time. Further, according to the different features of different memory types, an improved scheme of our method is developed, which exploits shared memory in the GPU instead of global memory and further increases efficiency. Experimental results show that the two novel algorithms outperform a traditional sequential method based on OpenCV in terms of computing speed.
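As a point of reference for the pixel-by-pixel baseline the abstract describes, here is a minimal sequential sketch. The 4-neighbour Laplacian kernel, the clamping to the 0-255 range and the untouched border are assumptions made for illustration; the abstract does not state which kernel or border policy the authors use.

```python
def laplacian_sharpen(img):
    """Sharpen a grayscale image given as a list of rows of 0-255 integers.

    Each interior pixel becomes pixel - laplacian, using the 4-neighbour
    Laplacian (an assumed kernel). Border pixels are copied unchanged.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (img[y - 1][x] + img[y + 1][x]
                   + img[y][x - 1] + img[y][x + 1]
                   - 4 * img[y][x])
            out[y][x] = min(255, max(0, img[y][x] - lap))
    return out


flat = [[100] * 3 for _ in range(3)]
bump = [[100, 100, 100], [100, 120, 100], [100, 100, 100]]
sharp_flat = laplacian_sharpen(flat)
sharp_bump = laplacian_sharpen(bump)
```

Every output pixel depends only on a fixed neighbourhood of the input, which is why the computation maps naturally onto one GPU thread per pixel, with neighbouring threads able to share loaded tiles.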
The Design and Implementation of Indoor Localization System Using Magnetic Field Based on Smartphone

NASA Astrophysics Data System (ADS)

Liu, J.; Jiang, C.; Shi, Z.

2017-09-01

Mainstream research on indoor localization mostly requires a sufficient number of signal nodes. The magnetic field offers high precision, stability and reliability, and receiving magnetic field signals is reliable and uncomplicated: it can be done with the geomagnetic sensor on a smartphone, without any external device. After a study of indoor positioning technologies, geomagnetic field data were chosen as fingerprints to design a smartphone-based indoor localization system. A localization algorithm suited to geomagnetic matching is designed, and a filtering algorithm and a coordinate conversion algorithm are presented. By plotting geomagnetic fingerprints, indoor positioning of a smartphone can be achieved without depending on external devices. Finally, an indoor positioning system based on the Android platform was designed, and experiments demonstrated the capability and effectiveness of the indoor localization algorithm.
Component deficits of visual neglect: "Magnetic" attraction of attention vs. impaired spatial working memory.

PubMed

Toba, Monica N; Rabuffetti, Marco; Duret, Christophe; Pradat-Diehl, Pascale; Gainotti, Guido; Bartolomeo, Paolo

2018-01-31

Visual neglect is a disabling consequence of right hemisphere damage, whereby patients fail to detect left-sided objects. Its precise mechanisms are debated, but there is some consensus that distinct component deficits may variously associate and interact in different patients. Here we used a touch-screen based procedure to study two putative component deficits of neglect, rightward "magnetic" attraction of attention and impaired spatial working memory, in a group of 47 right brain-damaged patients, of whom 33 had signs of left neglect. Patients performed a visual search task on three distinct conditions, whereby touched targets could (1) be tagged, (2) disappear or (3) show no change. Magnetic attraction of attention was defined as more left neglect on the tag condition than on the disappear condition, where right-sided disappeared targets could not capture patients' attention. Impaired spatial working memory should instead produce more neglect on the no change condition, where no external cue indicated that a target had already been explored, than on the tag condition. Using a specifically developed analysis algorithm, we identified significant differences of performance between the critical conditions. Neglect patients as a group performed better on the disappear condition than on the no change condition and also better in the tag condition comparing with the no change condition. No difference was found between the tag condition and the disappear condition. Some of our neglect patients had dissociated patterns of performance, with predominant magnetic attraction or impaired spatial working memory. 
Anatomical results issued from both grey matter analysis and fiber tracking were consistent with the typical patterns of fronto-parietal and occipito-frontal disconnection in neglect, but did not identify lesional patterns specifically associated with one or another deficit, thus suggesting the possible co-localization of attentional and working memory processes in fronto-parietal networks. These findings give support to the hypothesis of the co-occurrence of distinct cognitive deficits in visual neglect and stress the necessity of multi-component models of visuospatial disorders. Copyright © 2017 Elsevier Ltd. All rights reserved.
A Three-Threshold Learning Rule Approaches the Maximal Capacity of Recurrent Neural Networks

PubMed Central

Alemi, Alireza; Baldassi, Carlo; Brunel, Nicolas; Zecchina, Riccardo

2015-01-01

Understanding the theoretical foundations of how memories are encoded and retrieved in neural populations is a central challenge in neuroscience. A popular theoretical scenario for modeling memory function is the attractor neural network scenario, whose prototype is the Hopfield model. The model simplicity and the locality of the synaptic update rules come at the cost of a poor storage capacity, compared with the capacity achieved with perceptron learning algorithms. Here, by transforming the perceptron learning rule, we present an online learning rule for a recurrent neural network that achieves near-maximal storage capacity without an explicit supervisory error signal, relying only upon locally accessible information. The fully-connected network consists of excitatory binary neurons with plastic recurrent connections and non-plastic inhibitory feedback stabilizing the network dynamics; the memory patterns to be memorized are presented online as strong afferent currents, producing a bimodal distribution for the neuron synaptic inputs. Synapses corresponding to active inputs are modified as a function of the value of the local fields with respect to three thresholds. Above the highest threshold, and below the lowest threshold, no plasticity occurs. In between these two thresholds, potentiation/depression occurs when the local field is above/below an intermediate threshold. We simulated and analyzed a network of binary neurons implementing this rule and measured its storage capacity for different sizes of the basins of attraction. The storage capacity obtained through numerical simulations is shown to be close to the value predicted by analytical calculations. We also measured the dependence of capacity on the strength of external inputs. 
Finally, we quantified the statistics of the resulting synaptic connectivity matrix, and found that both the fraction of zero weight synapses and the degree of symmetry of the weight matrix increase with the number of stored patterns. PMID:26291608
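The update described in the abstract can be sketched for a single neuron as follows. The function name, the fixed step size `eta`, the clipping of weights at zero (suggested by the excitatory synapses) and the handling of a field exactly equal to the intermediate threshold are all assumptions made here; the published rule may differ in these details.

```python
def three_threshold_update(weights, inputs, local_field,
                           theta_low, theta_mid, theta_high, eta=0.1):
    """Return updated incoming weights for one neuron.

    No plasticity if the local field is at or beyond the outer thresholds.
    Between them, synapses with active inputs are potentiated when the
    field is above theta_mid and depressed otherwise.
    """
    if local_field <= theta_low or local_field >= theta_high:
        return list(weights)
    delta = eta if local_field > theta_mid else -eta
    # Only synapses whose presynaptic input is active are modified;
    # excitatory weights are kept non-negative (assumption).
    return [max(0.0, w + delta) if x else w
            for w, x in zip(weights, inputs)]


w0 = [0.5, 0.5, 0.05]
active = [1, 0, 1]
potentiated = three_threshold_update(w0, active, 1.2, 0.0, 1.0, 2.0)
depressed = three_threshold_update(w0, active, 0.5, 0.0, 1.0, 2.0)
unchanged = three_threshold_update(w0, active, 3.0, 0.0, 1.0, 2.0)
```

Clipping at zero is one simple way in which depressed synapses could accumulate at zero weight, in line with the growing fraction of zero-weight synapses reported in the abstract.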
A Three-Threshold Learning Rule Approaches the Maximal Capacity of Recurrent Neural Networks.

PubMed

Alemi, Alireza; Baldassi, Carlo; Brunel, Nicolas; Zecchina, Riccardo

2015-08-01

Understanding the theoretical foundations of how memories are encoded and retrieved in neural populations is a central challenge in neuroscience. A popular theoretical scenario for modeling memory function is the attractor neural network scenario, whose prototype is the Hopfield model. The model simplicity and the locality of the synaptic update rules come at the cost of a poor storage capacity, compared with the capacity achieved with perceptron learning algorithms. Here, by transforming the perceptron learning rule, we present an online learning rule for a recurrent neural network that achieves near-maximal storage capacity without an explicit supervisory error signal, relying only upon locally accessible information. The fully-connected network consists of excitatory binary neurons with plastic recurrent connections and non-plastic inhibitory feedback stabilizing the network dynamics; the memory patterns to be memorized are presented online as strong afferent currents, producing a bimodal distribution for the neuron synaptic inputs. Synapses corresponding to active inputs are modified as a function of the value of the local fields with respect to three thresholds. Above the highest threshold, and below the lowest threshold, no plasticity occurs. In between these two thresholds, potentiation/depression occurs when the local field is above/below an intermediate threshold. We simulated and analyzed a network of binary neurons implementing this rule and measured its storage capacity for different sizes of the basins of attraction. The storage capacity obtained through numerical simulations is shown to be close to the value predicted by analytical calculations. We also measured the dependence of capacity on the strength of external inputs. 
Finally, we quantified the statistics of the resulting synaptic connectivity matrix, and found that both the fraction of zero weight synapses and the degree of symmetry of the weight matrix increase with the number of stored patterns.
User-Assisted Store Recycling for Dynamic Task Graph Schedulers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kurt, Mehmet Can; Krishnamoorthy, Sriram; Agrawal, Gagan

The emergence of the multi-core era has led to increased interest in designing effective yet practical parallel programming models. Models based on task graphs that operate on single-assignment data are attractive in several ways: they can support dynamic applications and precisely represent the available concurrency. However, they also require nuanced algorithms for scheduling and memory management for efficient execution. In this paper, we consider memory-efficient dynamic scheduling of task graphs. Specifically, we present a novel approach for dynamically recycling the memory locations assigned to data items as they are produced by tasks. We develop algorithms to identify memory-efficient store recycling functions by systematically evaluating the validity of a set of (user-provided or automatically generated) alternatives. Because recycling function can be input data-dependent, we have also developed support for continued correct execution of a task graph in the presence of a potentially incorrect store recycling function. Experimental evaluation demonstrates that our approach to automatic store recycling incurs little to no overheads, achieves memory usage comparable to the best manually derived solutions, often produces recycling functions valid across problem sizes and input parameters, and efficiently recovers from an incorrect choice of store recycling functions.

Episodic and semantic content of memory and imagination: A multilevel analysis.

PubMed

Devitt, Aleea L; Addis, Donna Rose; Schacter, Daniel L

2017-10-01

Autobiographical memories of past events and imaginations of future scenarios comprise both episodic and semantic content. Correlating the amount of "internal" (episodic) and "external" (semantic) details generated when describing autobiographical events can illuminate the relationship between the processes supporting these constructs. Yet previous studies performing such correlations were limited by aggregating data across all events generated by an individual, potentially obscuring the underlying relationship within the events themselves. In the current article, we reanalyzed datasets from eight studies using a multilevel approach, allowing us to explore the relationship between internal and external details within events. We also examined whether this relationship changes with healthy aging. Our reanalyses demonstrated a largely negative relationship between the internal and external details produced when describing autobiographical memories and future imaginations. This negative relationship was stronger and more consistent for older adults and was evident both in direct and indirect measures of semantic content. Moreover, this relationship appears to be specific to episodic tasks, as no relationship was observed for a nonepisodic picture description task. This negative association suggests that people do not generate semantic information indiscriminately, but do so in a compensatory manner, to embellish episodically impoverished events. Our reanalysis further lends support for dissociable processes underpinning episodic and semantic information generation when remembering and imagining autobiographical events.
Blind equalization with criterion with memory nonlinearity

NASA Astrophysics Data System (ADS)

Chen, Yuanjie; Nikias, Chrysostomos L.; Proakis, John G.

1992-06-01

Blind equalization methods usually combat the linear distortion caused by a nonideal channel via a transversal filter, without resorting to the a priori known training sequences. We introduce a new criterion with memory nonlinearity (CRIMNO) for the blind equalization problem. The basic idea of this criterion is to augment the Godard [or constant modulus algorithm (CMA)] cost function with additional terms that penalize the autocorrelations of the equalizer outputs. Several variations of the CRIMNO algorithms are derived, with the variations dependent on (1) whether the empirical averages or the single point estimates are used to approximate the expectations, (2) whether the recent or the delayed equalizer coefficients are used, and (3) whether the weights applied to the autocorrelation terms are fixed or are allowed to adapt. Simulation experiments show that the CRIMNO algorithm, and especially its adaptive weight version, exhibits faster convergence speed than the Godard (or CMA) algorithm. Extensions of the CRIMNO criterion to accommodate the case of correlated inputs to the channel are also presented.
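The structure of the criterion, a Godard (CMA) term plus weighted penalties on the autocorrelations of the equalizer output, can be sketched with empirical averages as below. The exponent p = 2 for the Godard term, the symbol names and the use of block averages are assumptions for illustration; the abstract does not give the exact form.

```python
def crimno_cost(y, r2, weights):
    """Empirical CRIMNO-style cost for a block of equalizer outputs y.

    y       : sequence of complex equalizer outputs
    r2      : Godard dispersion constant (assumed p = 2 form)
    weights : weights[i-1] multiplies the squared lag-i autocorrelation
    """
    n = len(y)
    # Godard / constant modulus term.
    cma = sum((abs(v) ** 2 - r2) ** 2 for v in y) / n
    # Memory term: penalize correlation between outputs at different lags.
    penalty = 0.0
    for lag, w in enumerate(weights, start=1):
        corr = sum(y[k] * y[k - lag].conjugate()
                   for k in range(lag, n)) / (n - lag)
        penalty += w * abs(corr) ** 2
    return cma + penalty


alternating = [1 + 0j, -1 + 0j, 1 + 0j, -1 + 0j]
cost_plain = crimno_cost(alternating, 1.0, [])
cost_memory = crimno_cost(alternating, 1.0, [0.5])
```

The alternating sequence has constant modulus, so the Godard term alone cannot distinguish it from an uncorrelated output, while the lag-1 penalty can. This is the kind of extra information the additional terms are meant to supply.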
Paradeisos: A perfect hashing algorithm for many-body eigenvalue problems

DOE PAGES

Jia, C. J.; Wang, Y.; Mendl, C. B.; ...

2017-12-02

Here, we describe an essentially perfect hashing algorithm for calculating the position of an element in an ordered list, appropriate for the construction and manipulation of many-body Hamiltonian, sparse matrices. Each element of the list corresponds to an integer value whose binary representation reflects the occupation of single-particle basis states for each element in the many-body Hilbert space. The algorithm replaces conventional methods, such as binary search, for locating the elements of the ordered list, eliminating the need to store the integer representation for each element, without increasing the computational complexity. Combined with the “checkerboard” decomposition of the Hamiltonian matrix for distribution over parallel computing environments, this leads to a substantial savings in aggregate memory. While the algorithm can be applied broadly to many-body, correlated problems, we demonstrate its utility in reducing total memory consumption for a series of fermionic single-band Hubbard model calculations on small clusters with progressively larger Hilbert space dimension.
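To make the idea of computing a state's position directly from its bit pattern concrete, the sketch below uses the standard combinatorial number system for states with a fixed number of occupied sites. This is a textbook technique chosen here for illustration; it is not claimed to be the Paradeisos scheme itself, which the abstract does not spell out.

```python
from math import comb


def state_rank(state):
    """Position of `state` among all integers with the same number of set
    bits, listed in ascending order (combinatorial number system)."""
    rank = 0
    occupied = 0
    position = 0
    while state:
        if state & 1:
            occupied += 1
            # Count the patterns whose `occupied`-th particle sits lower.
            rank += comb(position, occupied)
        state >>= 1
        position += 1
    return rank


# All two-particle states on four sites, in ascending integer order.
two_particle_states = [0b0011, 0b0101, 0b0110, 0b1001, 0b1010, 0b1100]
ranks = [state_rank(s) for s in two_particle_states]
```

Because the rank is computed from the bits alone, no table of integer representations has to be stored or searched, which is the kind of memory saving the abstract refers to.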
Freeing Space for NASA: Incorporating a Lossless Compression Algorithm into NASA's FOSS System

NASA Technical Reports Server (NTRS)

Fiechtner, Kaitlyn; Parker, Allen

2011-01-01

NASA's Fiber Optic Strain Sensing (FOSS) system can gather and store up to 1,536,000 bytes (1.46 megabytes) per second. Since the FOSS system typically acquires hours - or even days - of data, the system can gather hundreds of gigabytes of data for a given test event. To store such large quantities of data more effectively, NASA is modifying a Lempel-Ziv-Oberhumer (LZO) lossless data compression program to compress data as it is being acquired in real time. After proving that the algorithm is capable of compressing the data from the FOSS system, the LZO program will be modified and incorporated into the FOSS system. Implementing an LZO compression algorithm will instantly free up memory space without compromising any data obtained. With the availability of memory space, the FOSS system can be used more efficiently on test specimens, such as Unmanned Aerial Vehicles (UAVs) that can be in flight for days. By integrating the compression algorithm, the FOSS system can continue gathering data, even on longer flights.
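The data volumes quoted in the abstract, and the idea of compressing data as it is acquired, can be checked with a short sketch. Python's standard library has no LZO binding, so `zlib` is used below purely as a stand-in lossless compressor; the chunked-stream framing is likewise an assumption for illustration.

```python
import zlib

BYTES_PER_SECOND = 1_536_000  # acquisition rate quoted in the abstract


def mebibytes_per_second():
    """1,536,000 bytes/s in units of 2**20 bytes (about 1.46)."""
    return BYTES_PER_SECOND / 2**20


def gigabytes_acquired(hours):
    """Uncompressed volume in units of 10**9 bytes after `hours` of data."""
    return BYTES_PER_SECOND * 3600 * hours / 1e9


def compress_stream(chunks):
    """Compress chunks as they arrive, without buffering the whole record.

    zlib stands in for LZO here; both are lossless."""
    compressor = zlib.compressobj()
    pieces = [compressor.compress(chunk) for chunk in chunks]
    pieces.append(compressor.flush())
    return b"".join(pieces)


chunks = [bytes(1000), b"abc" * 100]
packed = compress_stream(chunks)
restored = zlib.decompress(packed)
```

At this rate one hour of acquisition is about 5.5 GB uncompressed and two days is about 265 GB, consistent with the "hundreds of gigabytes" mentioned for multi-day tests.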
A fuzzy discrete harmony search algorithm applied to annual cost reduction in radial distribution systems

NASA Astrophysics Data System (ADS)

Ameli, Kazem; Alfi, Alireza; Aghaebrahimi, Mohammadreza

2016-09-01

Similarly to other optimization algorithms, harmony search (HS) is quite sensitive to the tuning parameters. Several variants of the HS algorithm have been developed to decrease the parameter-dependency character of HS. This article proposes a novel version of the discrete harmony search (DHS) algorithm, namely fuzzy discrete harmony search (FDHS), for optimizing capacitor placement in distribution systems. In the FDHS, a fuzzy system is employed to dynamically adjust two parameter values, i.e. harmony memory considering rate and pitch adjusting rate, with respect to normalized mean fitness of the harmony memory. The key aspect of FDHS is that it needs substantially fewer iterations to reach convergence in comparison with classical discrete harmony search (CDHS). To the authors' knowledge, this is the first application of DHS to specify appropriate capacitor locations and their best amounts in the distribution systems. Simulations are provided for 10-, 34-, 85- and 141-bus distribution systems using CDHS and FDHS. The results show the effectiveness of FDHS over previous related studies.
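The two parameters that the fuzzy system adjusts, the harmony memory considering rate (HMCR) and the pitch adjusting rate (PAR), act in the improvisation step of discrete harmony search. The sketch below shows a generic version of that step with fixed parameter values; the function name, the one-position pitch adjustment and the list of allowed values are illustrative assumptions, and the fuzzy controller itself is not modelled.

```python
import random


def improvise(memory, hmcr, par, choices, rng):
    """Build one new harmony from the harmony memory.

    memory  : list of stored solutions, each a list of values from `choices`
    hmcr    : probability of taking a value from memory
    par     : probability of shifting a memory value to a neighbouring choice
    choices : ordered list of allowed discrete values
    """
    harmony = []
    for i in range(len(memory[0])):
        if rng.random() < hmcr:
            value = rng.choice(memory)[i]
            if rng.random() < par:
                # Pitch adjustment: move one position up or down, clamped.
                idx = choices.index(value) + rng.choice((-1, 1))
                value = choices[min(len(choices) - 1, max(0, idx))]
        else:
            # Random consideration: any allowed value.
            value = rng.choice(choices)
        harmony.append(value)
    return harmony


rng = random.Random(0)
memory = [[1, 2, 3], [3, 2, 1]]
choices = [1, 2, 3, 4]
new_harmony = improvise(memory, 0.9, 0.3, choices, rng)
memory_only = improvise(memory, 1.0, 0.0, choices, rng)
```

In a fuzzy variant such as the one described, `hmcr` and `par` would be recomputed between iterations rather than held constant.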
A Linked List-Based Algorithm for Blob Detection on Embedded Vision-Based Sensors.

PubMed

Acevedo-Avila, Ricardo; Gonzalez-Mendoza, Miguel; Garcia-Garcia, Andres

2016-05-28

Blob detection is a common task in vision-based applications. Most existing algorithms are aimed at execution on general purpose computers; while very few can be adapted to the computing restrictions present in embedded platforms. This paper focuses on the design of an algorithm capable of real-time blob detection that minimizes system memory consumption. The proposed algorithm detects objects in one image scan; it is based on a linked-list data structure tree used to label blobs depending on their shape and node information. An example application showing the results of a blob detection co-processor has been built on a low-powered field programmable gate array hardware as a step towards developing a smart video surveillance system. The detection method is intended for general purpose application. As such, several test cases focused on character recognition are also examined. The results obtained present a fair trade-off between accuracy and memory requirements; and prove the validity of the proposed approach for real-time implementation on resource-constrained computing platforms.
Paradeisos: A perfect hashing algorithm for many-body eigenvalue problems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jia, C. J.; Wang, Y.; Mendl, C. B.

Here, we describe an essentially perfect hashing algorithm for calculating the position of an element in an ordered list, appropriate for the construction and manipulation of many-body Hamiltonian, sparse matrices. Each element of the list corresponds to an integer value whose binary representation reflects the occupation of single-particle basis states for each element in the many-body Hilbert space. The algorithm replaces conventional methods, such as binary search, for locating the elements of the ordered list, eliminating the need to store the integer representation for each element, without increasing the computational complexity. Combined with the “checkerboard” decomposition of the Hamiltonian matrix for distribution over parallel computing environments, this leads to a substantial savings in aggregate memory. While the algorithm can be applied broadly to many-body, correlated problems, we demonstrate its utility in reducing total memory consumption for a series of fermionic single-band Hubbard model calculations on small clusters with progressively larger Hilbert space dimension.
GPU-accelerated computing for Lagrangian coherent structures of multi-body gravitational regimes

NASA Astrophysics Data System (ADS)

Lin, Mingpei; Xu, Ming; Fu, Xiaoyu

2017-04-01

Based on a well-established theoretical foundation, Lagrangian Coherent Structures (LCSs) have elicited widespread research on the intrinsic structures of dynamical systems in many fields, including the field of astrodynamics. Although the application of LCSs in dynamical problems seems straightforward theoretically, its associated computational cost is prohibitive. We propose a block decomposition algorithm developed on Compute Unified Device Architecture (CUDA) platform for the computation of the LCSs of multi-body gravitational regimes. In order to take advantage of GPU's outstanding computing properties, such as Shared Memory, Constant Memory, and Zero-Copy, the algorithm utilizes a block decomposition strategy to facilitate computation of finite-time Lyapunov exponent (FTLE) fields of arbitrary size and timespan. Simulation results demonstrate that this GPU-based algorithm can satisfy double-precision accuracy requirements and greatly decrease the time needed to calculate final results, increasing speed by approximately 13 times. Additionally, this algorithm can be generalized to various large-scale computing problems, such as particle filters, constellation design, and Monte-Carlo simulation.
The effectiveness of a new algorithm on a three-dimensional finite element model construction of bone trabeculae in implant biomechanics.

PubMed

Sato, Y; Teixeira, E R; Tsuga, K; Shindoi, N

1999-08-01

Greater validity of finite element analysis (FEA) in implant biomechanics requires element downsizing. However, excessive downsizing demands computer memory and calculation time. To evaluate the effectiveness of a new algorithm established for more valid FEA model construction without downsizing, three-dimensional FEA bone trabeculae models with different element sizes (300, 150 and 75 micron) were constructed. Four algorithms of stepwise (1 to 4 ranks) assignment of Young's modulus according to bone volume in the individual cubic element were used, and then stress distribution under vertical loading was analysed. The model with 300 micron element size, with 4 ranks of Young's moduli assigned according to bone volume in each element, presented a similar stress distribution to the model with the 75 micron element size. These results show that the new algorithm was effective, and the use of the 300 micron element for bone trabeculae representation was proposed, without critical changes in stress values and with possible savings in computer memory and calculation time in the laboratory.
Improving Memory for Optimization and Learning in Dynamic Environments

DTIC Science & Technology

2011-07-01

algorithm uses simple, incremental clustering to separate solutions into memory entries. The cluster centers are used as the models in the memory. This is...entire days of traffic with realistic traffic demands and turning ratios on a 32 intersection network modeled on downtown Pittsburgh, Pennsylvania...early/tardy problem. Management Science, 35(2):177–191, 1989. [78] Daniel Parrott and Xiaodong Li. A particle swarm model for tracking multiple peaks in
Efficient implementation of parallel three-dimensional FFT on clusters of PCs

NASA Astrophysics Data System (ADS)

Takahashi, Daisuke

2003-05-01

In this paper, we propose a high-performance parallel three-dimensional fast Fourier transform (FFT) algorithm on clusters of PCs. The three-dimensional FFT algorithm can be altered into a block three-dimensional FFT algorithm to reduce the number of cache misses. We show that the block three-dimensional FFT algorithm improves performance by utilizing the cache memory effectively. We use the block three-dimensional FFT algorithm to implement the parallel three-dimensional FFT algorithm. We succeeded in obtaining performance of over 1.3 GFLOPS on an 8-node dual Pentium III 1 GHz PC SMP cluster.
The Metropolis Monte Carlo method with CUDA enabled Graphic Processing Units

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hall, Clifford; School of Physics, Astronomy, and Computational Sciences, George Mason University, 4400 University Dr., Fairfax, VA 22030; Ji, Weixiao

2014-02-01

We present a CPU–GPU system for runtime acceleration of large molecular simulations using GPU computation and memory swaps. The memory architecture of the GPU can be used both as a container for simulation data stored on the graphics card and as a floating-point code target, providing an effective means for the manipulation of atomistic or molecular data on the GPU. To fully take advantage of this mechanism, efficient GPU realizations of algorithms used to perform atomistic and molecular simulations are essential. Our system implements a versatile molecular engine, including inter-molecule interactions and orientational variables for performing the Metropolis Monte Carlo (MMC) algorithm, which is one type of Markov chain Monte Carlo. By combining memory objects with floating-point code fragments we have implemented an MMC parallel engine that entirely avoids the communication time of molecular data at runtime. Our runtime acceleration system is a forerunner of a new class of CPU–GPU algorithms exploiting memory concepts combined with threading for avoiding bus bandwidth and communication. The testbed molecular system used here is a condensed phase system of oligopyrrole chains. A benchmark shows a size scaling speedup of 60 for systems with 210,000 pyrrole monomers. Our implementation can easily be combined with MPI to connect in parallel several CPU–GPU duets. -- Highlights: • We parallelize the Metropolis Monte Carlo (MMC) algorithm on one CPU–GPU duet. • The Adaptive Tempering Monte Carlo employs MMC and profits from this CPU–GPU implementation. • Our benchmark shows a size scaling-up speedup of 62 for systems with 225,000 particles. • The testbed involves a polymeric system of oligopyrroles in the condensed phase. • The CPU–GPU parallelization includes dipole–dipole and Mie–Jones classic potentials.
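For readers unfamiliar with the Metropolis Monte Carlo step that the abstract parallelizes, the following is a minimal serial sketch in Python. It is not the authors' GPU engine: the one-dimensional harmonic energy, the function names, the step size and the seed are all assumptions chosen only to show the accept/reject rule.

```python
import math
import random


def metropolis_sample(energy, x0, n_steps, step_size=2.0, kT=1.0, seed=0):
    """Sample from exp(-energy(x)/kT) with the Metropolis accept/reject rule.

    All parameters here are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        # Propose a symmetric random displacement.
        x_new = x + rng.uniform(-step_size, step_size)
        e_new = energy(x_new)
        # Accept downhill moves always, uphill moves with probability exp(-dE/kT).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / kT):
            x, e = x_new, e_new
        # A rejected move repeats the current state in the chain.
        samples.append(x)
    return samples


def harmonic(x):
    """Hypothetical test potential: E(x) = x^2 / 2."""
    return 0.5 * x * x


samples = metropolis_sample(harmonic, x0=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
```

For this test potential at kT = 1 the target distribution is a unit Gaussian, so the sample variance should approach 1 as the chain grows.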
Synthesizing Dynamic Programming Algorithms from Linear Temporal Logic Formulae

NASA Technical Reports Server (NTRS)

Rosu, Grigore; Havelund, Klaus

2001-01-01

The problem of testing a linear temporal logic (LTL) formula on a finite execution trace of events, generated by an executing program, occurs naturally in runtime analysis of software. We present an algorithm which takes an LTL formula and generates an efficient dynamic programming algorithm. The generated algorithm tests whether the LTL formula is satisfied by a finite trace of events given as input. The generated algorithm runs in linear time, its constant depending on the size of the LTL formula. The memory needed is constant, also depending on the size of the formula.
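The idea of checking an LTL formula by dynamic programming over a finite trace can be sketched as follows. This is an illustrative Python interpreter, not the code generator described in the abstract. The tuple encoding of formulas, the function names, and the finite-trace convention (the last event is treated as if it repeated forever) are assumptions made for this sketch.

```python
def subformulas(formula, out=None):
    """List the distinct subformulas with children before parents (post-order)."""
    if out is None:
        out = []
    for child in formula[1:]:
        if isinstance(child, tuple):
            subformulas(child, out)
    if formula not in out:
        out.append(formula)
    return out


def check(formula, trace):
    """Evaluate `formula` at the first event of a non-empty finite trace.

    Each event is a set of atomic proposition names. The pass runs backward and
    keeps only two truth-value vectors, one per adjacent trace position.
    """
    if not trace:
        raise ValueError("trace must contain at least one event")
    subs = subformulas(formula)
    index = {f: i for i, f in enumerate(subs)}
    nxt = None  # truth values at the following position (None at the last event)
    for event in reversed(trace):
        now = [False] * len(subs)
        for i, f in enumerate(subs):
            op = f[0]
            if op == "ap":
                now[i] = f[1] in event
            elif op == "not":
                now[i] = not now[index[f[1]]]
            elif op == "and":
                now[i] = now[index[f[1]]] and now[index[f[2]]]
            elif op == "or":
                now[i] = now[index[f[1]]] or now[index[f[2]]]
            elif op == "next":
                j = index[f[1]]
                now[i] = now[j] if nxt is None else nxt[j]
            elif op == "always":
                a = now[index[f[1]]]
                now[i] = a if nxt is None else (a and nxt[i])
            elif op == "eventually":
                a = now[index[f[1]]]
                now[i] = a if nxt is None else (a or nxt[i])
            elif op == "until":
                a, b = now[index[f[1]]], now[index[f[2]]]
                now[i] = b if nxt is None else (b or (a and nxt[i]))
            else:
                raise ValueError("unknown operator: %r" % (op,))
        nxt = now
    return nxt[index[formula]]


p, q = ("ap", "p"), ("ap", "q")
# "Whenever p holds, q eventually holds": always (not p or eventually q).
response = ("always", ("or", ("not", p), ("eventually", q)))
ok = check(response, [{"p"}, set(), {"q"}, set()])
bad = check(response, [{"p"}, {"q"}, {"p"}])
```

The work per event is proportional to the number of subformulas, and the two vectors `now` and `nxt` have a size that depends only on the formula, which mirrors the linear-time and constant-memory claims in the abstract. This sketch does hold the whole trace in a list so that it can be traversed backward.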
Real-time implementations of image segmentation algorithms on shared memory multicore architecture: a survey (Conference Presentation)

NASA Astrophysics Data System (ADS)

Akil, Mohamed

2017-05-01

Real-time processing is becoming more and more important in many image processing applications. Image segmentation is one of the most fundamental tasks in image analysis. As a consequence, many different approaches for image segmentation have been proposed. The watershed transform is a well-known image segmentation tool. The watershed transform is a very data intensive task. To achieve acceleration and obtain real-time processing of watershed algorithms, parallel architectures and programming models for multicore computing have been developed. This paper surveys the approaches for parallel implementation of sequential watershed algorithms on multicore general purpose CPUs: homogeneous multicore processors with shared memory. To achieve an efficient parallel implementation, it is necessary to explore different strategies (parallelization/distribution/distributed scheduling) combined with different acceleration and optimization techniques to enhance parallelism. In this paper, we give a comparison of various parallelizations of sequential watershed algorithms on shared memory multicore architecture. We analyze the performance measurements of each parallel implementation and the impact of the different sources of overhead on the performance of the parallel implementations. In this comparison study, we also discuss the advantages and disadvantages of the parallel programming models. Thus, we compare OpenMP (an application programming interface for multi-processing) with Pthreads (POSIX Threads) to illustrate the impact of each parallel programming model on the performance of the parallel implementations.
GaAs Supercomputing: Architecture, Language, And Algorithms For Image Processing

NASA Astrophysics Data System (ADS)

Johl, John T.; Baker, Nick C.

1988-10-01

The application of high-speed GaAs processors in a parallel system matches the demanding computational requirements of image processing. The architecture of the McDonnell Douglas Astronautics Company (MDAC) vector processor is described along with the algorithms and language translator. Most image and signal processing algorithms can utilize parallel processing and show a significant performance improvement over sequential versions. The parallelization performed by this system is within each vector instruction. Since each vector has many elements, each requiring some computation, useful concurrent arithmetic operations can easily be performed. Balancing the memory bandwidth with the computation rate of the processors is an important design consideration for high efficiency and utilization. The architecture features a bus-based execution unit consisting of four to eight 32-bit GaAs RISC microprocessors running at a 200 MHz clock rate for a peak performance of 1.6 BOPS. The execution unit is connected to a vector memory with three buses capable of transferring two input words and one output word every 10 nsec. The address generators inside the vector memory perform different vector addressing modes and feed the data to the execution unit. The functions discussed in this paper include basic MATRIX OPERATIONS, 2-D SPATIAL CONVOLUTION, HISTOGRAM, and FFT. For each of these algorithms, assembly language programs were run on a behavioral model of the system to obtain performance figures.
Efficient Aho-Corasick String Matching on Emerging Multicore Architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tumeo, Antonino; Villa, Oreste; Secchi, Simone

String matching algorithms are critical to several scientific fields. Besides text processing and databases, emerging applications such as DNA protein sequence analysis, data mining, information security software, antivirus, and machine learning all exploit string matching algorithms [3]. All these applications usually process large quantities of textual data and require high performance and/or predictable execution times. Among all the string matching algorithms, one of the most studied, especially for text processing and security applications, is the Aho-Corasick algorithm. Aho-Corasick is an exact, multi-pattern string matching algorithm which performs the search in a time linearly proportional to the length of the input text, independently of the pattern set size. However, depending on the implementation, when the number of patterns increases, the memory occupation may rise drastically. In turn, this can lead to significant variability in the performance, due to the memory access times and the caching effects. This is a significant concern for many mission critical applications and modern high performance architectures. For example, security applications such as Network Intrusion Detection Systems (NIDS) must be able to scan network traffic against very large dictionaries in real time. Modern Ethernet links reach up to 10 Gbps, and malicious threats are already well over 1 million, and exponentially growing [28]. When performing the search, a NIDS should not slow down the network, or let network packets pass unchecked. Nevertheless, on the current state-of-the-art cache based processors, there may be a large performance variability when dealing with big dictionaries and inputs that have different frequencies of matching patterns. In particular, when few patterns are matched and they are all in the cache, the procedure is fast. Instead, when they are not in the cache, often because many patterns are matched and the caches are continuously thrashed, they must be retrieved from the system memory and the procedure is slowed down by the increased latency. Efficient implementations of string matching algorithms have been the focus of several works, targeting Field Programmable Gate Arrays [4, 25, 15, 5], highly multi-threaded solutions like the Cray XMT [34], multicore processors [19] or heterogeneous processors like the Cell Broadband Engine [35, 22]. Recently, several researchers have also started to investigate the use of Graphic Processing Units (GPUs) for string matching algorithms in security applications [20, 10, 32, 33]. Most of these approaches mainly focus on reaching high peak performance, or try to optimize the memory occupation, rather than looking at performance stability. However, hardware solutions support only small dictionary sizes due to lack of memory and are difficult to customize, while platforms such as the Cell/B.E. are very complex to program.
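The multi-pattern search described above can be illustrated with a textbook Aho-Corasick automaton. This is a minimal single-threaded Python sketch, not the multicore implementation the chapter discusses; the dictionary-based trie layout and the function names are assumptions made for readability rather than for cache behaviour.

```python
from collections import deque


def build_automaton(patterns):
    """Build trie transitions, failure links and output lists for the patterns."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # patterns recognised on reaching each state
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pattern)
    # Breadth-first traversal: depth-1 states keep failure link 0.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, child in goto[state].items():
            queue.append(child)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit the matches that end at the failure state.
            out[child] = out[child] + out[fail[child]]
    return goto, fail, out


def search(text, automaton):
    """Return (start_index, pattern) for every occurrence, in one pass over text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in out[state]:
            matches.append((i - len(pattern) + 1, pattern))
    return matches


automaton = build_automaton(["he", "she", "his", "hers"])
matches = search("ushers", automaton)
```

The text is scanned once regardless of how many patterns are loaded, while the size of `goto` grows with the dictionary, which is the memory-occupation effect the passage describes.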
Semiconductor diode with external field modulation

DOEpatents

Nasby, Robert D.

2000-01-01

A non-destructive-readout nonvolatile semiconductor diode switching device that may be used as a memory element is disclosed. The diode switching device is formed with a ferroelectric material disposed above a rectifying junction to control the conduction characteristics therein by means of a remanent polarization. The invention may be used for the formation of integrated circuit memories for the storage of information.
Optical mass memory system (AMM-13). AMM/DBMS interface control document

NASA Technical Reports Server (NTRS)

Bailey, G. A.

1980-01-01

The baseline for external interfaces of a 10 to the 13th power bit, optical archival mass memory system (AMM-13) is established. The types of interfaces addressed include data transfer; AMM-13, Data Base Management System, NASA End-to-End Data System computer interconnect; data/control input and output interfaces; test input data source; file management; and facilities interface.
Prospective Memory in an Air Traffic Control Simulation: External Aids that Signal when to Act

ERIC Educational Resources Information Center

Loft, Shayne; Smith, Rebekah E.; Bhaskara, Adella

2011-01-01

At work and in our personal life we often need to remember to perform intended actions at some point in the future, referred to as Prospective Memory. Individuals sometimes forget to perform intentions in safety-critical work contexts. Holding intentions can also interfere with ongoing tasks. We applied theories and methods from the experimental…
GPU-accelerated algorithms for compressed signals recovery with application to astronomical imagery deblurring

NASA Astrophysics Data System (ADS)

Fiandrotti, Attilio; Fosson, Sophie M.; Ravazzi, Chiara; Magli, Enrico

2018-04-01

Compressive sensing promises to enable bandwidth-efficient on-board compression of astronomical data by lifting the encoding complexity from the source to the receiver. The signal is recovered off-line, exploiting the parallel computation capabilities of GPUs to speed up the reconstruction process. However, inherent GPU hardware constraints limit the size of the recoverable signal and the speedup practically achievable. In this work, we design parallel algorithms that exploit the properties of circulant matrices for efficient GPU-accelerated sparse signal recovery. Our approach reduces the memory requirements, allowing us to recover very large signals with limited memory. In addition, it achieves a tenfold signal recovery speedup thanks to ad-hoc parallelization of matrix-vector multiplications and matrix inversions. Finally, we practically demonstrate our algorithms in a typical application of circulant matrices: deblurring a sparse astronomical image in the compressed domain.
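One well-known property of circulant matrices that such methods can exploit is that a circulant matrix is fully described by its first column and that multiplying by it is a circular convolution, which an FFT diagonalizes. The sketch below is a CPU-only Python illustration under stated assumptions, not the authors' GPU code: it assumes real inputs and a power-of-two length, and the function names are hypothetical.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); length must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def circulant_matvec(first_col, x):
    """Compute C @ x where C[i][j] = first_col[(i - j) % n], storing only first_col."""
    n = len(first_col)
    if n == 0 or n & (n - 1) or len(x) != n:
        raise ValueError("inputs must share a power-of-two length")
    spectrum = [a * b for a, b in zip(fft(first_col), fft(x))]
    # Inverse transform, normalised by n; inputs are assumed real.
    return [v.real / n for v in fft(spectrum, inverse=True)]


def circulant_matvec_dense(first_col, x):
    """Reference O(n^2) product, used only to check the FFT version."""
    n = len(first_col)
    return [sum(first_col[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]


c = [1.0, 2.0, 3.0, 4.0]
x = [1.0, 0.0, -1.0, 2.0]
fast = circulant_matvec(c, x)
slow = circulant_matvec_dense(c, x)
```

Storing `first_col` needs n numbers instead of n squared, and the FFT route costs O(n log n) instead of O(n^2), which is the kind of memory and time saving the abstract refers to.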

A multiresolution approach to iterative reconstruction algorithms in X-ray computed tomography.

PubMed

De Witte, Yoni; Vlassenbroeck, Jelle; Van Hoorebeke, Luc

2010-09-01

In computed tomography, the application of iterative reconstruction methods in practical situations is impeded by their high computational demands. Especially in high resolution X-ray computed tomography, where reconstruction volumes contain a high number of volume elements (several giga voxels), this computational burden prevents their actual breakthrough. Besides the large amount of calculations, iterative algorithms require the entire volume to be kept in memory during reconstruction, which quickly becomes cumbersome for large data sets. To overcome this obstacle, we present a novel multiresolution reconstruction, which greatly reduces the required amount of memory without significantly affecting the reconstructed image quality. It is shown that, combined with an efficient implementation on a graphical processing unit, the multiresolution approach enables the application of iterative algorithms in the reconstruction of large volumes at an acceptable speed using only limited resources.
Improved cache performance in Monte Carlo transport calculations using energy banding

NASA Astrophysics Data System (ADS)

Siegel, A.; Smith, K.; Felker, K.; Romano, P.; Forget, B.; Beckman, P.

2014-04-01

We present an energy banding algorithm for Monte Carlo (MC) neutral particle transport simulations which depend on large cross section lookup tables. In MC codes, read-only cross section data tables are accessed frequently, exhibit poor locality, and are typically much too large to fit in fast memory. Thus, performance is often limited by long latencies to RAM, or by off-node communication latencies when the data footprint is very large and must be decomposed on a distributed memory machine. The proposed energy banding algorithm allows maximal temporal reuse of data in band sizes that can flexibly accommodate different architectural features. The energy banding algorithm is general and has a number of benefits compared to the traditional approach. In the present analysis we explore its potential to achieve improvements in time-to-solution on modern cache-based architectures.
A highly efficient multi-core algorithm for clustering extremely large datasets

PubMed Central

2010-01-01

Background: In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms largely rely on network communication protocols connecting and requiring multiple computers. One answer to this problem is to utilize the intrinsic capabilities in current multi-core hardware to distribute the tasks among the different cores of one computer. Results: We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorial SNP data. Our new shared memory parallel algorithms prove to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. Computation speed of our Java based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy compared to single-core implementations and a recently published network based parallelization. Conclusions: Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer. PMID:20370922
An efficient parallel algorithm: Poststack and prestack Kirchhoff 3D depth migration using flexi-depth iterations

NASA Astrophysics Data System (ADS)

Rastogi, Richa; Srivastava, Abhishek; Khonde, Kiran; Sirasala, Kirannmayi M.; Londhe, Ashutosh; Chavhan, Hitesh

2015-07-01

This paper presents an efficient parallel 3D Kirchhoff depth migration algorithm suitable for the current class of multicore architectures. The fundamental Kirchhoff depth migration algorithm exhibits inherent parallelism; however, when it comes to 3D data migration, the resource requirements of the algorithm increase with the data size. This challenges its practical implementation even on current generation high performance computing systems. Therefore a smart parallelization approach is essential to handle 3D data for migration. The most compute intensive part of the Kirchhoff depth migration algorithm is the calculation of traveltime tables due to its resource requirements such as memory/storage and I/O. In the current research work, we target this area and develop a competent parallel algorithm for post and prestack 3D Kirchhoff depth migration, using hybrid MPI+OpenMP programming techniques. We introduce a concept of flexi-depth iterations while depth migrating data in parallel imaging space, using optimized traveltime table computations. This concept provides flexibility to the algorithm by migrating data in a number of depth iterations, which depends upon the available node memory and the size of data to be migrated during runtime. Furthermore, it minimizes the requirements of storage, I/O and inter-node communication, thus making it advantageous over the conventional parallelization approaches. The developed parallel algorithm is demonstrated and analysed on Yuva II, a PARAM series supercomputer. Optimization, performance and scalability experiment results along with the migration outcome show the effectiveness of the parallel algorithm.
Memory and Spin Injection Devices Involving Half Metals

DOE PAGES

Shaughnessy, M.; Snow, Ryan; Damewood, L.; ...

2011-01-01

We suggest memory and spin injection devices fabricated with half-metallic materials and based on the anomalous Hall effect. Schematic diagrams of the memory chips, in thin film and bulk crystal form, are presented. Spin injection devices made in thin film form are also suggested. These devices do not need any external magnetic field but make use of their own magnetization. Only a gate voltage is needed. The carriers are 100% spin polarized. Memory devices may potentially be smaller, faster, and less volatile than existing ones, and the injection devices may be much smaller and more efficient than existing spin injection devices.
Evaluation of the performance of existing non-laboratory based cardiovascular risk assessment algorithms

PubMed Central

2013-01-01

Background: The high burden and rising incidence of cardiovascular disease (CVD) in resource constrained countries necessitates implementation of robust and pragmatic primary and secondary prevention strategies. Many current CVD management guidelines recommend absolute cardiovascular (CV) risk assessment as a clinically sound guide to preventive and treatment strategies. Development of non-laboratory based cardiovascular risk assessment algorithms enables absolute risk assessment in resource constrained countries. The objective of this review is to evaluate the performance of existing non-laboratory based CV risk assessment algorithms using the benchmarks for clinically useful CV risk assessment algorithms outlined by Cooney and colleagues. Methods: A literature search to identify non-laboratory based risk prediction algorithms was performed in MEDLINE, CINAHL, Ovid Premier Nursing Journals Plus, and PubMed databases. The identified algorithms were evaluated using the benchmarks for clinically useful cardiovascular risk assessment algorithms outlined by Cooney and colleagues. Results: Five non-laboratory based CV risk assessment algorithms were identified. The Gaziano and Framingham algorithms met the criteria for appropriateness of statistical methods used to derive the algorithms and endpoints. The Swedish Consultation, Framingham and Gaziano algorithms demonstrated good discrimination in derivation datasets. Only the Gaziano algorithm was externally validated, where it had optimal discrimination. The Gaziano and WHO algorithms had chart formats which made them simple and user friendly for clinical application. Conclusion: Both the Gaziano and Framingham non-laboratory based algorithms met most of the criteria outlined by Cooney and colleagues. External validation of the algorithms in diverse samples is needed to ascertain their performance and applicability to different populations and to enhance clinicians’ confidence in them. PMID:24373202
Shaping memory consolidation via targeted memory reactivation during sleep.

PubMed

Cellini, Nicola; Capuozzo, Alessandra

2018-05-15

Recent studies have shown that the reactivation of specific memories during sleep can be modulated using external stimulation. Specifically, it has been reported that matching a sensory stimulus (e.g., odor or sound cue) with target information (e.g., pairs of words, pictures, and motor sequences) during wakefulness, and then presenting the cue alone during sleep, facilitates memory of the target information. Thus, presenting learned cues while asleep may reactivate related declarative, procedural, and emotional material, and facilitate the neurophysiological processes underpinning memory consolidation in humans. This paradigm, which has been named targeted memory reactivation, has been successfully used to improve visuospatial and verbal memories, strengthen motor skills, modify implicit social biases, and enhance fear extinction. However, these studies also show that results depend on the type of memory investigated, the task employed, the sensory cue used, and the specific sleep stage of stimulation. Here, we present a review of how memory consolidation may be shaped using noninvasive sensory stimulation during sleep. © 2018 New York Academy of Sciences.
3D Kirchhoff depth migration algorithm: A new scalable approach for parallelization on multicore CPU based cluster

NASA Astrophysics Data System (ADS)

Rastogi, Richa; Londhe, Ashutosh; Srivastava, Abhishek; Sirasala, Kirannmayi M.; Khonde, Kiran

2017-03-01

In this article, a new scalable 3D Kirchhoff depth migration algorithm is presented for a state-of-the-art multicore CPU based cluster. Parallelization of 3D Kirchhoff depth migration is challenging due to its high demand for compute time, memory, storage and I/O, along with the need for their effective management. The most resource intensive modules of the algorithm are traveltime calculations and migration summation, which exhibit an inherent trade-off between compute time and other resources. The parallelization strategy of the algorithm largely depends on the storage of calculated traveltimes and its feeding mechanism to the migration process. The presented work is an extension of our previous work, wherein a 3D Kirchhoff depth migration application for a multicore CPU based parallel system had been developed. Recently, we have worked on improving the parallel performance of this application by re-designing the parallelization approach. The new algorithm is capable of efficiently migrating both prestack and poststack 3D data. It exhibits flexibility for migrating a large number of traces within the available node memory and with minimal requirement of storage, I/O and inter-node communication. The resultant application is tested using 3D Overthrust data on PARAM Yuva II, which is a Xeon E5-2670 based multicore CPU cluster with 16 cores/node and 64 GB shared memory. Parallel performance of the algorithm is studied using different numerical experiments, and the scalability results show striking improvement over its previous version. An impressive 49.05X speedup with 76.64% efficiency is achieved for 3D prestack data and 32.00X speedup with 50.00% efficiency for 3D poststack data, using 64 nodes. The results also demonstrate the effectiveness and robustness of the improved algorithm with high scalability and efficiency on a multicore CPU cluster.
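The quoted efficiency figures are consistent with the usual definition of parallel efficiency as speedup divided by the number of workers, here taken to be the 64 nodes. The short Python check below assumes that definition; the function name is hypothetical and the abstract does not state the baseline used for the speedup.

```python
def parallel_efficiency(speedup, n_workers):
    """Parallel efficiency as the fraction of ideal linear speedup achieved."""
    return speedup / n_workers


# Figures quoted in the abstract, both measured on 64 nodes.
prestack = parallel_efficiency(49.05, 64)
poststack = parallel_efficiency(32.00, 64)
prestack_percent = round(prestack * 100, 2)
poststack_percent = round(poststack * 100, 2)
```

Under this definition the prestack run reaches 76.64% and the poststack run 50.00%, matching the reported values.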
Efficient Graph Based Assembly of Short-Read Sequences on Hybrid Core Architecture

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sczyrba, Alex; Pratap, Abhishek; Canon, Shane

2011-03-22

Advanced architectures can deliver dramatically increased throughput for genomics and proteomics applications, reducing time-to-completion in some cases from days to minutes. One such architecture, hybrid-core computing, marries a traditional x86 environment with a reconfigurable coprocessor, based on field programmable gate array (FPGA) technology. In addition to higher throughput, increased performance can fundamentally improve research quality by allowing more accurate, previously impractical approaches. We will discuss the approach used by Convey's de Bruijn graph constructor for short-read, de-novo assembly. Bioinformatics applications that have random access patterns to large memory spaces, such as graph-based algorithms, experience memory performance limitations on cache-based x86 servers. Convey's highly parallel memory subsystem allows application-specific logic to simultaneously access 8192 individual words in memory, significantly increasing effective memory bandwidth over cache-based memory systems. Many algorithms, such as Velvet and other de Bruijn graph based, short-read, de-novo assemblers, can greatly benefit from this type of memory architecture. Furthermore, small data type operations (each of the four nucleotides can be represented in two bits) make more efficient use of logic gates than the data types dictated by conventional programming models. JGI is comparing the performance of Convey's graph constructor and Velvet on both synthetic and real data. We will present preliminary results on memory usage and run time metrics for various data sets with different sizes, from small microbial and fungal genomes to a very large cow rumen metagenome. For genomes with references we will also present assembly quality comparisons between the two assemblers.
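The two-bit nucleotide representation mentioned above, and how it feeds a de Bruijn graph, can be sketched in a few lines of Python. This is an illustration only, not Convey's or Velvet's implementation; the A/C/G/T code assignment and the function names are assumptions, and reads are assumed to contain only those four letters.

```python
# Hypothetical code assignment: two bits per nucleotide.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack_kmer(kmer):
    """Pack a k-mer into an integer, two bits per base, first base most significant."""
    value = 0
    for base in kmer:
        value = (value << 2) | CODE[base]
    return value


def unpack_kmer(value, k):
    """Recover the k-mer string from its packed integer form."""
    bases = []
    for _ in range(k):
        bases.append(BASES[value & 3])
        value >>= 2
    return "".join(reversed(bases))


def de_bruijn_edges(read, k):
    """Return one (prefix, suffix) edge of packed (k-1)-mers per k-mer in the read."""
    mask = (1 << (2 * (k - 1))) - 1
    edges = []
    for i in range(len(read) - k + 1):
        kmer = pack_kmer(read[i:i + k])
        prefix = kmer >> 2      # drop the last base
        suffix = kmer & mask    # drop the first base
        edges.append((prefix, suffix))
    return edges


packed = pack_kmer("ACGT")
edges = de_bruijn_edges("ACGTA", 3)
```

With this packing a 32-base k-mer fits in a single 64-bit word, and prefix and suffix nodes are obtained with a shift and a mask rather than string operations.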
Algorithmic formulation of control problems in manipulation

NASA Technical Reports Server (NTRS)

Bejczy, A. K.

1975-01-01

The basic characteristics of manipulator control algorithms are discussed. The state of the art in the development of manipulator control algorithms is briefly reviewed. Different end-point control techniques are described together with control algorithms which operate on external sensor (imaging, proximity, tactile, and torque/force) signals in realtime. Manipulator control development at JPL is briefly described and illustrated with several figures. The JPL work pays special attention to the front or operator input end of the control algorithms.
A random utility model of delay discounting and its application to people with externalizing psychopathology.

PubMed

Dai, Junyi; Gunn, Rachel L; Gerst, Kyle R; Busemeyer, Jerome R; Finn, Peter R

2016-10-01

Previous studies have demonstrated that working memory capacity plays a central role in delay discounting in people with externalizing psychopathology. These studies used a hyperbolic discounting model, and its single parameter (a measure of delay discounting) was estimated using the standard method of searching for indifference points between intertemporal options. However, there are several problems with this approach. First, the deterministic perspective on delay discounting underlying the indifference point method might be inappropriate. Second, the estimation procedure using the R² measure often leads to poor model fit. Third, when parameters are estimated using indifference points only, much of the information collected in a delay discounting decision task is wasted. To overcome these problems, this article proposes a random utility model of delay discounting. The proposed model has 2 parameters, 1 for delay discounting and 1 for choice variability. It was fit to choice data obtained from a recently published data set using both maximum-likelihood and Bayesian parameter estimation. As in previous studies, the delay discounting parameter was significantly associated with both externalizing problems and working memory capacity. Furthermore, choice variability was also found to be significantly associated with both variables. This finding suggests that randomness in decisions may be a mechanism by which externalizing problems and low working memory capacity are associated with poor decision making. The random utility model thus has the advantage of disclosing the role of choice variability, which had been masked by the traditional deterministic model. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Fast, noise-free memory for photon synchronization at room temperature.

PubMed

Finkelstein, Ran; Poem, Eilon; Michel, Ohad; Lahad, Ohr; Firstenberg, Ofer

2018-01-01

Future quantum photonic networks require coherent optical memories for synchronizing quantum sources and gates of probabilistic nature. We demonstrate a fast ladder memory (FLAME) mapping the optical field onto the superposition between electronic orbitals of rubidium vapor. Using a ladder-level system of orbital transitions with nearly degenerate frequencies simultaneously enables high bandwidth, low noise, and long memory lifetime. We store and retrieve 1.7-ns-long pulses, containing 0.5 photons on average, and observe short-time external efficiency of 25%, memory lifetime (1/e) of 86 ns, and below 10^-4 added noise photons. Consequently, coupling this memory to a probabilistic source would enhance the on-demand photon generation probability by a factor of 12, the highest number yet reported for a noise-free, room temperature memory. This paves the way toward the controlled production of large quantum states of light from probabilistic photon sources.
Reconfigurable photonic crystals enabled by pressure-responsive shape-memory polymers

PubMed Central

Fang, Yin; Ni, Yongliang; Leo, Sin-Yen; Taylor, Curtis; Basile, Vito; Jiang, Peng

2015-01-01

Smart shape-memory polymers can memorize and recover their permanent shape in response to an external stimulus (for example, heat). They have been extensively exploited for a wide spectrum of applications ranging from biomedical devices to aerospace morphing structures. However, most of the existing shape-memory polymers are thermoresponsive and their performance is hindered by heat-demanding programming and recovery steps. Although pressure, like temperature, is an easily adjustable process variable, pressure-responsive shape-memory polymers are largely unexplored. Here we report a series of shape-memory polymers that enable unusual 'cold' programming and instantaneous shape recovery triggered by applying a contact pressure at ambient conditions. Moreover, the interdisciplinary integration of scientific principles drawn from two disparate fields (the fast-growing photonic crystal and shape-memory polymer technologies) enables fabrication of reconfigurable photonic crystals and simultaneously provides a simple and sensitive optical technique for investigating the intriguing shape-memory effects at nanoscale. PMID:26074349
Resolution of singularities for multi-loop integrals

NASA Astrophysics Data System (ADS)

Bogner, Christian; Weinzierl, Stefan

2008-04-01

We report on a program for the numerical evaluation of divergent multi-loop integrals. The program is based on iterated sector decomposition. We improve the original algorithm of Binoth and Heinrich such that the program is guaranteed to terminate. The program can be used to compute numerically the Laurent expansion of divergent multi-loop integrals regulated by dimensional regularisation. The symbolic and the numerical steps of the algorithm are combined into one program. Program summary: Program title: sector_decomposition; Catalogue identifier: AEAG_v1_0; Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEAG_v1_0.html; Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland; Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html; No. of lines in distributed program, including test data, etc.: 47 506; No. of bytes in distributed program, including test data, etc.: 328 485; Distribution format: tar.gz; Programming language: C++; Computer: all; Operating system: Unix; RAM: Depending on the complexity of the problem; Classification: 4.4; External routines: GiNaC, available from http://www.ginac.de, GNU scientific library, available from http://www.gnu.org/software/gsl; Nature of problem: Computation of divergent multi-loop integrals. Solution method: Sector decomposition. Restrictions: Only limited by the available memory and CPU time. Running time: Depending on the complexity of the problem.
Methodology for Sensitivity Analysis, Approximate Analysis, and Design Optimization in CFD for Multidisciplinary Applications

NASA Technical Reports Server (NTRS)

Taylor, Arthur C., III; Hou, Gene W.

1996-01-01

An incremental iterative formulation, together with the well-known spatially split approximate-factorization algorithm, is presented for solving the large, sparse systems of linear equations that are associated with aerodynamic sensitivity analysis. This formulation is also known as the 'delta' or 'correction' form. For the smaller two-dimensional problems, a direct method can be applied to solve these linear equations in either the standard or the incremental form, in which case the two are equivalent. However, iterative methods are needed for larger two-dimensional and three-dimensional applications because direct methods require more computer memory than is currently available. Iterative methods for solving these equations in the standard form are generally unsatisfactory due to an ill-conditioned coefficient matrix; this problem is overcome when these equations are cast in the incremental form. The methodology is successfully implemented and tested using an upwind cell-centered finite-volume formulation applied in two dimensions to the thin-layer Navier-Stokes equations for external flow over an airfoil. In three dimensions this methodology is demonstrated with a marching-solution algorithm for the Euler equations to calculate supersonic flow over the High-Speed Civil Transport configuration (HSCT 24E). The sensitivity derivatives obtained with the incremental iterative method from a marching Euler code are used in a design-improvement study of the HSCT configuration that involves thickness, camber, and planform design variables.
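The incremental ('delta') form mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' solver: the approximate operator M is assumed here to be the diagonal of A (a simple stand-in for the approximate factorization), and the function name `solve_incremental` and the 3x3 system are hypothetical. Each sweep solves M dx = b - A x and then updates x by dx, so the exact residual drives convergence even though M is only approximate.

```python
def solve_incremental(A, b, tol=1e-10, max_iter=500):
    """Solve A x = b in incremental form: M dx = b - A x, then x += dx.

    M is a cheap approximation of A; as an assumption for this sketch,
    M = diag(A), which is adequate for a diagonally dominant A.
    """
    n = len(b)
    x = [0.0] * n
    for it in range(max_iter):
        # Residual of the exact equations.
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) < tol:
            return x, it
        # Approximate solve for the correction dx, then update.
        dx = [r[i] / A[i][i] for i in range(n)]
        x = [x[i] + dx[i] for i in range(n)]
    return x, max_iter


# Hypothetical diagonally dominant test system with solution (1, 1, 3).
A = [[4.0, 1.0, 0.0],
     [1.0, 5.0, 2.0],
     [0.0, 2.0, 6.0]]
b = [5.0, 12.0, 20.0]
x, iterations = solve_incremental(A, b)
```

Because the right-hand side is always the true residual, the quality of M affects only the number of sweeps, not the converged answer.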
Bi-periodicity evoked by periodic external inputs in delayed Cohen-Grossberg-type bidirectional associative memory networks

NASA Astrophysics Data System (ADS)

Cao, Jinde; Wang, Yanyan

2010-05-01

In this paper, the bi-periodicity issue is discussed for Cohen-Grossberg-type (CG-type) bidirectional associative memory (BAM) neural networks (NNs) with time-varying delays and standard activation functions. It is shown that the model considered in this paper has two periodic orbits located in saturation regions and they are locally exponentially stable. Meanwhile, some conditions are derived to ensure that, in any designated region, the model has a locally exponentially stable or globally exponentially attractive periodic orbit located in it. As a special case of bi-periodicity, some results are also presented for the system with constant external inputs. Finally, four examples are given to illustrate the effectiveness of the obtained results.
Mechanisms of Working Memory Disruption by External Interference

PubMed Central

Rubens, Michael T.; Gazzaley, Adam

2010-01-01

The negative impact of external interference on working memory (WM) performance is well documented; yet, the mechanisms underlying this disruption are not sufficiently understood. In this study, electroencephalogram and functional magnetic resonance imaging (fMRI) data were recorded in separate experiments that each introduced different types of visual interference during a period of WM maintenance: distraction (irrelevant stimuli) and interruption (stimuli that required attention). The data converged to reveal that regardless of the type of interference, the magnitude of processing interfering stimuli in the visual cortex (as rapidly as 100 ms) predicted subsequent WM recognition accuracy for stored items. fMRI connectivity analyses suggested that in the presence of distraction, encoded items were maintained throughout the delay period via connectivity between the middle frontal gyrus and visual association cortex, whereas memoranda were not maintained when subjects were interrupted but rather reactivated in the postinterruption period. These results elucidate the mechanisms of external interference on WM performance and highlight similarities and differences of distraction and multitasking. PMID:19648173
Reducing Memory Cost of Exact Diagonalization using Singular Value Decomposition

NASA Astrophysics Data System (ADS)

Weinstein, Marvin; Chandra, Ravi; Auerbach, Assa

2012-02-01

We present a modified Lanczos algorithm to diagonalize lattice Hamiltonians with dramatically reduced memory requirements. In contrast to variational approaches and most implementations of DMRG, Lanczos rotations towards the ground state do not involve incremental minimizations (e.g. sweeping procedures), which may get stuck in false local minima. The lattice of size N is partitioned into two subclusters. At each iteration the rotating Lanczos vector is compressed into two sets of $n_{svd}$ small subcluster vectors using singular value decomposition. For low entanglement entropy $S_{ee}$ (satisfied by short-range Hamiltonians), the truncation error is bounded by $\exp(-n_{svd}^{1/S_{ee}})$. Convergence is tested for the Heisenberg model on Kagomé clusters of 24, 30 and 36 sites, with no lattice symmetries exploited, using less than 15GB of dynamical memory. Generalization of the Lanczos-SVD algorithm to multiple partitioning is discussed, and comparisons to other techniques are given. Reference: arXiv:1105.0007
Computational Issues in Damping Identification for Large Scale Problems

NASA Technical Reports Server (NTRS)

Pilkey, Deborah L.; Roe, Kevin P.; Inman, Daniel J.

1997-01-01

Two damping identification methods are tested for efficiency in large-scale applications. One is an iterative routine, and the other a least squares method. Numerical simulations have been performed on multiple degree-of-freedom models to test the effectiveness of the algorithm and the usefulness of parallel computation for the problems. High Performance Fortran is used to parallelize the algorithm. Tests were performed using the IBM-SP2 at NASA Ames Research Center. The least squares method tested incurs high communication costs, which reduces the benefit of high performance computing. This method's memory requirement grows at a very rapid rate, meaning that larger problems can quickly exceed available computer memory. The iterative method's memory requirement grows at a much slower pace and is able to handle problems with 500+ degrees of freedom on a single processor. This method benefits from parallelization, and significant speedup can be seen for problems of 100+ degrees of freedom.
A Very Low Cost BCH Decoder for High Immunity of On-Chip Memories

NASA Astrophysics Data System (ADS)

Seo, Haejun; Han, Sehwan; Heo, Yoonseok; Cho, Taewon

BCH (Bose-Chaudhuri-Hocquenghem) codes, a class of cyclic block codes, have very strong error-correcting ability, which is vital for error protection in memory systems. Among the many algorithms for BCH codes, the PGZ (Peterson-Gorenstein-Zierler) algorithm is advantageous because it corrects errors through simple calculation for a given t value. However, it is problematic when a divisor becomes 0 (division by zero) in the case ν ≠ t. In this paper, the circuit is simplified by a multi-mode hardware architecture that prepares for ν ranging from 0 to 3. First, the production cost is lower thanks to the smaller number of gates. Second, the reduced power consumption could lengthen the recharging period. The very low cost and simple datapath make our design a good choice for small-footprint SoCs (System on Chip) as ECC (Error Correction Code/Circuit) in memory systems.

Hybrid Parallelism for Volume Rendering on Large-, Multi-, and Many-Core Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Howison, Mark; Bethel, E. Wes; Childs, Hank

2012-01-01

With the computing industry trending towards multi- and many-core processors, we study how a standard visualization algorithm, ray-casting volume rendering, can benefit from a hybrid parallelism approach. Hybrid parallelism provides the best of both worlds: using distributed-memory parallelism across large numbers of nodes increases available FLOPs and memory, while exploiting shared-memory parallelism among the cores within each node ensures that each node performs its portion of the larger calculation as efficiently as possible. We demonstrate results from weak and strong scaling studies, at levels of concurrency ranging up to 216,000, and with datasets as large as 12.2 trillion cells. The greatest benefit from hybrid parallelism lies in the communication portion of the algorithm, the dominant cost at higher levels of concurrency. We show that reducing the number of participants with a hybrid approach significantly improves performance.
Decoding of DBEC-TBED Reed-Solomon codes. [Double-Byte-Error-Correcting, Triple-Byte-Error-Detecting]

NASA Technical Reports Server (NTRS)

Deng, Robert H.; Costello, Daniel J., Jr.

1987-01-01

A problem in designing semiconductor memories is to provide some measure of error control without requiring excessive coding overhead or decoding time. In LSI and VLSI technology, memories are often organized on a multiple bit (or byte) per chip basis. For example, some 256 K bit DRAM's are organized in 32 K x 8 bit-bytes. Byte-oriented codes such as Reed-Solomon (RS) codes can provide efficient low overhead error control for such memories. However, the standard iterative algorithm for decoding RS codes is too slow for these applications. The paper presents a special decoding technique for double-byte-error-correcting, triple-byte-error-detecting RS codes which is capable of high-speed operation. This technique is designed to find the error locations and the error values directly from the syndrome without having to use the iterative algorithm to find the error locator polynomial.
INDDGO: Integrated Network Decomposition & Dynamic programming for Graph Optimization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Groer, Christopher S; Sullivan, Blair D; Weerapurage, Dinesh P

2012-10-01

It is well known that dynamic programming algorithms can utilize tree decompositions to provide a way to solve some NP-hard problems on graphs where the complexity is polynomial in the number of nodes and edges in the graph, but exponential in the width of the underlying tree decomposition. However, there has been relatively little computational work done to determine the practical utility of such dynamic programming algorithms. We have developed software to construct tree decompositions using various heuristics and have created a fast, memory-efficient dynamic programming implementation for solving maximum weighted independent set. We describe our software and the algorithms we have implemented, focusing on memory saving techniques for the dynamic programming. We compare the running time and memory usage of our implementation with other techniques for solving maximum weighted independent set, including a commercial integer programming solver and a semi-definite programming solver. Our results indicate that it is possible to solve some instances where the underlying decomposition has width much larger than suggested by the literature. For certain types of problems, our dynamic programming code runs several times faster than these other methods.
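To make the dynamic programming idea in the abstract above concrete, here is a minimal sketch for the simplest case, where the graph is itself a tree (tree decomposition of width 1). It is not the INDDGO implementation; the function name and the example path graph are hypothetical. For each node the table keeps two values: the best weight in its subtree with the node taken, and with the node skipped.

```python
def max_weight_independent_set_tree(adj, weight, root=0):
    """Maximum weighted independent set on a tree by dynamic programming.

    adj: dict mapping node -> list of neighbours (must form a tree).
    weight: dict mapping node -> non-negative weight.
    """
    # Iterative DFS: record parents and a parent-before-child order.
    parent = {root: None}
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if v != parent[u]:
                parent[v] = u
                stack.append(v)

    take = {}  # best weight in the subtree of u when u is in the set
    skip = {}  # best weight in the subtree of u when u is not in the set
    for u in reversed(order):  # children are handled before their parent
        take[u] = weight[u]
        skip[u] = 0
        for v in adj[u]:
            if v != parent[u]:
                take[u] += skip[v]               # children of a taken node are skipped
                skip[u] += max(take[v], skip[v])  # otherwise choose freely
    return max(take[root], skip[root])


# Hypothetical example: the path 0-1-2-3-4 with weights 3, 2, 5, 1, 4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
weight = {0: 3, 1: 2, 2: 5, 3: 1, 4: 4}
best = max_weight_independent_set_tree(adj, weight)  # nodes 0, 2, 4 give 12
```

For wider decompositions the per-node table grows from two entries to one entry per independent subset of a bag, which is where the exponential dependence on width and the memory pressure described in the abstract come from.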
Proceedings: Sisal '93

DOE Office of Scientific and Technical Information (OSTI.GOV)

Feo, J.T.

1993-10-01

This report contains papers on: Programmability and performance issues; The case of an iterative partial differential equation solver; Implementing the kernel of the Australian Region Weather Prediction Model in Sisal; Even and quarter-even prime length symmetric FFTs and their Sisal implementations; Top-down thread generation for Sisal; Overlapping communications and computations on NUMA architectures; Compiling technique based on dataflow analysis for functional programming language Valid; Copy elimination for true multidimensional arrays in Sisal 2.0; Increasing parallelism for an optimization that reduces copying in IF2 graphs; Caching in on Sisal; Cache performance of Sisal vs. FORTRAN; FFT algorithms on a shared-memory multiprocessor; A parallel implementation of nonnumeric search problems in Sisal; Computer vision algorithms in Sisal; Compilation of Sisal for a high-performance data driven vector processor; Sisal on distributed memory machines; A virtual shared addressing system for distributed memory Sisal; Developing a high-performance FFT algorithm in Sisal for a vector supercomputer; Implementation issues for IF2 on a static data-flow architecture; and Systematic control of parallelism in array-based data-flow computation. Selected papers have been indexed separately for inclusion in the Energy Science and Technology Database.
The influence of self-awareness on emotional memory formation: an fMRI study

PubMed Central

Wing, Erik A.; Cabeza, Roberto

2016-01-01

Evidence from functional neuroimaging studies of emotional perception shows that when attention is focused on external features of emotional stimuli (external perceptual orienting—EPO), the amygdala is primarily engaged, but when attention is turned inwards towards one’s own emotional state (interoceptive self-orienting—ISO), regions of the salience network, such as the anterior insula (AI) and the dorsal anterior cingulate cortex (dACC), also play a major role. Yet, it is unknown if ISO boosts the contributions of AI and dACC not only to emotional ‘perception’ but also to emotional ‘memory’. To investigate this issue, participants were scanned with functional magnetic resonance imaging (fMRI) while viewing emotional and neutral pictures under ISO or EPO, and memory was tested several days later. The study yielded three main findings: (i) emotion boosted perception-related activity in the amygdala during both ISO and EPO and in the right AI exclusively during ISO; (ii) emotion augmented activity predicting subsequent memory in AI and dACC during ISO but not during EPO and (iii) high confidence memory was associated with increased amygdala–dACC connectivity, selectively for ISO encoding. These findings show, for the first time, that ISO promotes emotional memory formation via regions associated with interoceptive awareness of emotional experience, such as AI and dACC. PMID:26645274
A multiresolution halftoning algorithm for progressive display

NASA Astrophysics Data System (ADS)

Mukherjee, Mithun; Sharma, Gaurav

2005-01-01

We describe and implement an algorithmic framework for memory efficient, 'on-the-fly' halftoning in a progressive transmission environment. Instead of a conventional approach which repeatedly recalls the continuous tone image from memory and subsequently halftones it for display, the proposed method achieves significant memory efficiency by storing only the halftoned image and updating it in response to additional information received through progressive transmission. Thus the method requires only a single frame-buffer of bits for storage of the displayed binary image and no additional storage is required for the contone data. The additional image data received through progressive transmission is accommodated through in-place updates of the buffer. The method is thus particularly advantageous for high resolution bi-level displays where it can result in significant savings in memory. The proposed framework is implemented using a suitable multi-resolution, multi-level modification of error diffusion that is motivated by the presence of a single binary frame-buffer. Aggregates of individual display bits constitute the multiple output levels at a given resolution. This creates a natural progression of increasing resolution with decreasing bit-depth.
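The abstract above builds on error diffusion. As background only, the following sketch shows plain one-dimensional error diffusion on a single row; it is not the multi-resolution, multi-level variant the authors propose, and the function name and the 0.5 threshold are assumptions made for illustration.

```python
def error_diffuse_row(row, threshold=0.5):
    """Binarize one row of gray values in [0, 1] by 1-D error diffusion."""
    out = []
    carry = 0.0  # quantization error passed on to the next pixel
    for value in row:
        v = value + carry
        bit = 1 if v >= threshold else 0
        out.append(bit)
        carry = v - bit  # what the chosen bit failed to represent
    return out


# A flat 25% gray row turns on one pixel in every four.
bits = error_diffuse_row([0.25] * 8)  # [0, 1, 0, 0, 0, 1, 0, 0]
```

The local average of the output bits tracks the input gray level, which is the property that lets aggregates of display bits act as multiple output levels at a coarser resolution.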
Optimal colour quality of LED clusters based on memory colours.

PubMed

Smet, Kevin; Ryckaert, Wouter R; Pointer, Michael R; Deconinck, Geert; Hanselaer, Peter

2011-03-28

The spectral power distributions of tri- and tetrachromatic clusters of Light-Emitting-Diodes, composed of simulated and commercially available LEDs, were optimized with a genetic algorithm to maximize the luminous efficacy of radiation and the colour quality as assessed by the memory colour quality metric developed by the authors. The trade-off of the colour quality as assessed by the memory colour metric and the luminous efficacy of radiation was investigated by calculating the Pareto optimal front using the NSGA-II genetic algorithm. Optimal peak wavelengths and spectral widths of the LEDs were derived, and over half of them were found to be close to Thornton's prime colours. The Pareto optimal fronts of real LED clusters were always found to be smaller than those of the simulated clusters. The effect of binning on designing a real LED cluster was investigated and was found to be quite large. Finally, a real LED cluster of commercially available AlGaInP, InGaN and phosphor white LEDs was optimized to obtain a higher score on memory colour quality scale than its corresponding CIE reference illuminant.
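The Pareto optimal front mentioned in the abstract above is the set of designs that no other design beats on both objectives at once. The sketch below shows only that dominance filter, not the NSGA-II search the authors used; the function name and the (luminous efficacy, colour quality score) pairs are made up for illustration.

```python
def pareto_front(points):
    """Return the non-dominated points when both objectives are maximized."""
    front = []
    for p in points:
        # q dominates p if it is at least as good in both objectives
        # and is not the same point.
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)


# Hypothetical (efficacy, colour quality) pairs for six candidate clusters.
candidates = [(300, 0.70), (320, 0.65), (280, 0.80),
              (310, 0.60), (290, 0.75), (270, 0.78)]
front = pareto_front(candidates)
# (310, 0.60) and (270, 0.78) are dominated; the other four form the front.
```

Along the front, every gain in one objective costs something in the other, which is the trade-off the abstract reports between colour quality and luminous efficacy of radiation.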
Optical tomographic memories: algorithms for the efficient information readout

NASA Astrophysics Data System (ADS)

Pantelic, Dejan V.

1990-07-01

Tomographic algorithms are modified in order to reconstruct the information previously stored by focusing laser radiation in a volume of photosensitive media. A priori information about the position of bits of information is used. 1. THE PRINCIPLES OF TOMOGRAPHIC MEMORIES Tomographic principles can be used to store and reconstruct the information artificially stored in a bulk of a photosensitive media [1]. The information is stored by changing some characteristics of a memory material (e.g. refractive index). Radiation from the two independent light sources (e.g. lasers) is focused inside the memory material. In this way the intensity of the light is above the threshold only in the localized point where the light rays intersect. By scanning the material the information can be stored in binary or n-ary format. When the information is stored it can be read by tomographic methods. However the situation is quite different from the classical tomographic problem. Here a lot of a priori information is present regarding the positions of the bits of information, the profile representing a single bit and a mode of operation (binary or n-ary). 2. ALGORITHMS FOR THE READOUT OF THE TOMOGRAPHIC MEMORIES A priori information enables efficient reconstruction of the memory contents. In this paper a few methods for the information readout together with the simulation results will be presented. Special attention will be given to the noise considerations. Two different
A communication-avoiding, hybrid-parallel, rank-revealing orthogonalization method.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hoemmen, Mark

2010-11-01

Orthogonalization consumes much of the run time of many iterative methods for solving sparse linear systems and eigenvalue problems. Commonly used algorithms, such as variants of Gram-Schmidt or Householder QR, have performance dominated by communication. Here, 'communication' includes both data movement between the CPU and memory, and messages between processors in parallel. Our Tall Skinny QR (TSQR) family of algorithms requires asymptotically fewer messages between processors and data movement between CPU and memory than typical orthogonalization methods, yet achieves the same accuracy as Householder QR factorization. Furthermore, in block orthogonalizations, TSQR is faster and more accurate than existing approaches for orthogonalizing the vectors within each block ('normalization'). TSQR's rank-revealing capability also makes it useful for detecting deflation in block iterative methods, for which existing approaches sacrifice performance, accuracy, or both. We have implemented a version of TSQR that exploits both distributed-memory and shared-memory parallelism, and supports real and complex arithmetic. Our implementation is optimized for the case of orthogonalizing a small number (5-20) of very long vectors. The shared-memory parallel component uses Intel's Threading Building Blocks, though its modular design supports other shared-memory programming models as well, including computation on the GPU. Our implementation achieves speedups of 2 times or more over competing orthogonalizations. It is available now in the development branch of the Trilinos software package, and will be included in the 10.8 release.
Direct evidence of detwinning in polycrystalline Ni-Mn-Ga ferromagnetic shape memory alloys during deformation.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nie, Z. H.; Lin Peng, R.; Johansson, S.

2008-01-01

In situ time-of-flight neutron diffraction and high-energy x-ray diffraction techniques were used to reveal the preferred reselection of martensite variants through a detwinning process in polycrystalline Ni-Mn-Ga ferromagnetic shape memory alloys under uniaxial compressive stress. The variant reorientation via detwinning during loading can be explained by considering the influence of external stress on the grain/variant orientation-dependent distortion energy. These direct observations of detwinning provide a good understanding of the deformation mechanisms in shape memory alloys.
A class hierarchical, object-oriented approach to virtual memory management

NASA Technical Reports Server (NTRS)

Russo, Vincent F.; Campbell, Roy H.; Johnston, Gary M.

1989-01-01

The Choices family of operating systems exploits class hierarchies and object-oriented programming to facilitate the construction of customized operating systems for shared memory and networked multiprocessors. The software is being used in the Tapestry laboratory to study the performance of algorithms, mechanisms, and policies for parallel systems. Described here are the architectural design and class hierarchy of the Choices virtual memory management system. The software and hardware mechanisms and policies of a virtual memory system implement a memory hierarchy that exploits the trade-off between response times and storage capacities. In Choices, the notion of a memory hierarchy is captured by abstract classes. Concrete subclasses of those abstractions implement a virtual address space, segmentation, paging, physical memory management, secondary storage, and remote (that is, networked) storage. Captured in the notion of a memory hierarchy are classes that represent memory objects. These classes provide a storage mechanism that contains encapsulated data and have methods to read or write the memory object. Each of these classes provides specializations to represent the memory hierarchy.
Automatic Detection of Steganographic Content

DTIC Science & Technology

2005-06-30

Practically, it is mostly embedded into the media files, especially the image files. Consequently, a lot of the anti-steganography algorithms work with raw...1: not enough memory * -2: error running the removal algorithm EXPORT IMAGE *StegRemove( IMAGE * image , int *error); 2.8 Steganography Extraction API...researcher just invented a reliable algorithm that can detect the existence of a steganography if it is embedded anywhere in any uncompressed image. The
Computational mechanics analysis tools for parallel-vector supercomputers

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.; Nguyen, Duc T.; Baddourah, Majdi; Qin, Jiangning

1993-01-01

Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigensolution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization search analysis and domain decomposition. The source code for many of these algorithms is available.
Cognitive Machine-Learning Algorithm for Cardiac Imaging: A Pilot Study for Differentiating Constrictive Pericarditis From Restrictive Cardiomyopathy.

PubMed

Sengupta, Partho P; Huang, Yen-Min; Bansal, Manish; Ashrafi, Ali; Fisher, Matt; Shameer, Khader; Gall, Walt; Dudley, Joel T

2016-06-01

Associating a patient's profile with the memories of prototypical patients built through previous repeat clinical experience is a key process in clinical judgment. We hypothesized that a similar process using a cognitive computing tool would be well suited for learning and recalling multidimensional attributes of speckle tracking echocardiography data sets derived from patients with known constrictive pericarditis and restrictive cardiomyopathy. Clinical and echocardiographic data of 50 patients with constrictive pericarditis and 44 with restrictive cardiomyopathy were used for developing an associative memory classifier-based machine-learning algorithm. The speckle tracking echocardiography data were normalized in reference to 47 controls with no structural heart disease, and the diagnostic area under the receiver operating characteristic curve of the associative memory classifier was evaluated for differentiating constrictive pericarditis from restrictive cardiomyopathy. Using only speckle tracking echocardiography variables, associative memory classifier achieved a diagnostic area under the curve of 89.2%, which improved to 96.2% with addition of 4 echocardiographic variables. In comparison, the area under the curve of early diastolic mitral annular velocity and left ventricular longitudinal strain were 82.1% and 63.7%, respectively. Furthermore, the associative memory classifier demonstrated greater accuracy and shorter learning curves than other machine-learning approaches, with accuracy asymptotically approaching 90% after a training fraction of 0.3 and remaining flat at higher training fractions. This study demonstrates feasibility of a cognitive machine-learning approach for learning and recalling patterns observed during echocardiographic evaluations. Incorporation of machine-learning algorithms in cardiac imaging may aid standardized assessments and support the quality of interpretations, particularly for novice readers with limited experience. 
© 2016 American Heart Association, Inc.
LiteNet: Lightweight Neural Network for Detecting Arrhythmias at Resource-Constrained Mobile Devices.

PubMed

He, Ziyang; Zhang, Xiaoqing; Cao, Yangjie; Liu, Zhi; Zhang, Bo; Wang, Xiaoyan

2018-04-17

By running applications and services closer to the user, edge processing provides many advantages, such as short response time and reduced network traffic. Deep-learning based algorithms provide significantly better performance than traditional algorithms in many fields but demand more resources, such as higher computational power and more memory. Hence, designing deep learning algorithms that are more suitable for resource-constrained mobile devices is vital. In this paper, we build a lightweight neural network, termed LiteNet, which uses a deep learning algorithm design to diagnose arrhythmias, as an example to show how we design deep learning schemes for resource-constrained mobile devices. Compared to other deep learning models with an equivalent accuracy, LiteNet has several advantages. It requires less memory, incurs lower computational cost, and is more feasible for deployment on resource-constrained mobile devices. It can be trained faster than other neural network algorithms and requires less communication across different processing units during distributed training. It uses filters of heterogeneous size in a convolutional layer, which contributes to the generation of various feature maps. The algorithm was tested using the MIT-BIH electrocardiogram (ECG) arrhythmia database; the results showed that LiteNet outperforms comparable schemes in diagnosing arrhythmias, and in its feasibility for use on mobile devices.
Algorithms and Libraries

NASA Technical Reports Server (NTRS)

Dongarra, Jack

1998-01-01

This exploratory study initiated our inquiry into algorithms and applications that would benefit from a latency-tolerant approach to algorithm building, including the construction of new algorithms where appropriate. In a multithreaded execution, when a processor reaches a point where remote memory access is necessary, the request is sent out on the network and a context switch occurs to a new thread of computation. This effectively masks a long and unpredictable latency due to remote loads, thereby providing tolerance to remote access latency. We began to develop standards to profile various algorithm and application parameters, such as the degree of parallelism, granularity, precision, instruction set mix, interprocessor communication, latency, etc. These tools will continue to develop and evolve as the Information Power Grid environment matures. To provide a richer context for this research, the project also focused on issues of fault-tolerance and computation migration of numerical algorithms and software. During the initial phase we tried to increase our understanding of the bottlenecks in single processor performance. Our work began by developing an approach for the automatic generation and optimization of numerical software for processors with deep memory hierarchies and pipelined functional units. Based on the results we achieved in this study we are planning to study other architectures of interest, including development of cost models, and developing code generators appropriate to these architectures.
LiteNet: Lightweight Neural Network for Detecting Arrhythmias at Resource-Constrained Mobile Devices

PubMed Central

Zhang, Xiaoqing; Cao, Yangjie; Liu, Zhi; Zhang, Bo; Wang, Xiaoyan

2018-01-01

By running applications and services closer to the user, edge processing provides many advantages, such as short response time and reduced network traffic. Deep-learning based algorithms provide significantly better performance than traditional algorithms in many fields but demand more resources, such as higher computational power and more memory. Hence, designing deep learning algorithms that are more suitable for resource-constrained mobile devices is vital. In this paper, we build a lightweight neural network, termed LiteNet, which uses a deep learning algorithm design to diagnose arrhythmias, as an example to show how we design deep learning schemes for resource-constrained mobile devices. Compared to other deep learning models with an equivalent accuracy, LiteNet has several advantages. It requires less memory, incurs lower computational cost, and is more feasible for deployment on resource-constrained mobile devices. It can be trained faster than other neural network algorithms and requires less communication across different processing units during distributed training. It uses filters of heterogeneous size in a convolutional layer, which contributes to the generation of various feature maps. The algorithm was tested using the MIT-BIH electrocardiogram (ECG) arrhythmia database; the results showed that LiteNet outperforms comparable schemes in diagnosing arrhythmias, and in its feasibility for use on mobile devices. PMID:29673171
Accelerate quasi Monte Carlo method for solving systems of linear algebraic equations through shared memory

NASA Astrophysics Data System (ADS)

Lai, Siyan; Xu, Ying; Shao, Bo; Guo, Menghan; Lin, Xiaola

2017-04-01

In this paper we study the Monte Carlo method for solving systems of linear algebraic equations (SLAE) based on shared memory. Former research demonstrated that GPUs can effectively speed up the computations for this problem. Our purpose is to optimize the Monte Carlo simulation specifically for the GPU memory architecture. Random numbers are organized to be stored in shared memory, which aims to accelerate the parallel algorithm. Bank conflicts can be avoided by our Collaborative Thread Arrays (CTA) scheme. The results of experiments show that the shared-memory-based strategy can speed up the computations by more than 3X in the best case.
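For readers unfamiliar with Monte Carlo solvers for linear systems, the sketch below shows the basic random-walk estimator for a system written as x = Hx + b. It is a CPU illustration with pseudo-random numbers, not the authors' quasi-Monte Carlo GPU code; the function name, the uniform transition probabilities, the fixed walk length, and the 2x2 system are all assumptions. Each walk accumulates the terms of the Neumann series b + Hb + H²b + ..., reweighted by the ratio of matrix entry to transition probability.

```python
import random


def mc_solve_component(H, b, i, n_walks=20000, n_steps=20, seed=0):
    """Estimate x[i] for x = H x + b by averaging weighted random walks.

    Assumes the Neumann series converges (e.g. a norm of H below 1).
    """
    rng = random.Random(seed)
    n = len(b)
    p = 1.0 / n  # uniform transition probability (illustrative choice)
    total = 0.0
    for _ in range(n_walks):
        state, w = i, 1.0
        score = b[state]  # zeroth term of the series
        for _ in range(n_steps):
            nxt = rng.randrange(n)
            w *= H[state][nxt] / p  # importance weight for this step
            state = nxt
            score += w * b[state]
        total += score
    return total / n_walks


# Hypothetical system; the exact solution of (I - H) x = b is (26/15, 2.8).
H = [[0.1, 0.2],
     [0.3, 0.1]]
b = [1.0, 2.0]
estimate = [mc_solve_component(H, b, i, seed=i) for i in range(2)]
```

Every walk is independent of the others, which is why the method maps naturally onto many GPU threads and why the placement of the random numbers in fast memory matters for throughput.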
Parental Verbal Strategies and Children's Capacities at 3 and 5 Years during a Memory Task

ERIC Educational Resources Information Center

Labrell, Florence; Ubersfeld, Guillaume

2004-01-01

In order to study the influence on memorization of external inputs as well as children's own strategies, we examined both parental discourses in terms of distancing (Sigel, 1970) and spontaneous rehearsal by children during a memory task. Our aim was to assess the influence of each factor for children between 3 and 5 years of age. In our study of…
A comparison of common programming languages used in bioinformatics.

PubMed

Fourment, Mathieu; Gillings, Michael R

2008-02-05

The performance of different programming languages has previously been benchmarked using abstract mathematical algorithms, but not using standard bioinformatics algorithms. We compared the memory usage and speed of execution for three standard bioinformatics methods, implemented in programs using one of six different programming languages. Programs for the Sellers algorithm, the Neighbor-Joining tree construction algorithm and an algorithm for parsing BLAST file outputs were implemented in C, C++, C#, Java, Perl and Python. Implementations in C and C++ were fastest and used the least memory. Programs in these languages generally contained more lines of code. Java and C# appeared to be a compromise between the flexibility of Perl and Python and the fast performance of C and C++. The relative performance of the tested languages did not change from Windows to Linux and no clear evidence of a faster operating system was found. Source code and additional information are available from http://www.bioinformatics.org/benchmark/. This benchmark provides a comparison of six commonly used programming languages under two different operating systems. The overall comparison shows that a developer should choose an appropriate language carefully, taking into account the performance expected and the library availability for each language.
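As a rough illustration of how speed and memory can be measured for one of the benchmarked methods, the sketch below times a small Python implementation of Sellers-style approximate string matching and records its peak allocation. This is not the benchmark's own code (which the abstract says is available at the URL above); the helper `measure`, the test strings, and the use of `tracemalloc` are assumptions made for this example.

```python
import time
import tracemalloc

def sellers(pattern, text):
    """Smallest edit distance between pattern and any substring of text."""
    prev = [0] * (len(text) + 1)           # row 0 is all zeros: a match may start anywhere
    for i, p in enumerate(pattern, 1):
        cur = [i] + [0] * len(text)
        for j, t in enumerate(text, 1):
            cur[j] = min(prev[j - 1] + (p != t),  # match or substitution
                         prev[j] + 1,             # pattern character unmatched
                         cur[j - 1] + 1)          # extra text character
        prev = cur
    return min(prev)                        # a match may end anywhere

def measure(func, *args):
    """Return (result, elapsed seconds, peak traced bytes) for one call."""
    tracemalloc.start()
    t0 = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak

distance, seconds, peak_bytes = measure(sellers, "abc", "xxabdxx" * 200)
```

Tracing allocations slows execution, so a careful benchmark would probably time and profile memory in separate runs.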

Chaotic Traversal (CHAT): Very Large Graphs Traversal Using Chaotic Dynamics

NASA Astrophysics Data System (ADS)

Changaival, Boonyarit; Rosalie, Martin; Danoy, Grégoire; Lavangnananda, Kittichai; Bouvry, Pascal

2017-12-01

Graph traversal algorithms find applications in various fields such as routing problems, natural language processing or even database querying. The exploration can be considered as a first stepping stone into knowledge extraction from the graph, which is now a popular topic. Classical solutions such as Breadth First Search (BFS) and Depth First Search (DFS) require huge amounts of memory for exploring very large graphs. In this research, we present a novel memoryless graph traversal algorithm, Chaotic Traversal (CHAT), which integrates chaotic dynamics to traverse large unknown graphs via the Lozi map and the Rössler system. To compare the effects of various dynamics on our algorithm, we present an original way to perform the exploration of a parameter space using a bifurcation diagram with respect to the topological structure of attractors. The resulting algorithm is efficient and not resource-demanding, and is therefore very suitable for partial traversal of very large and/or unknown environment graphs. CHAT performance using the Lozi map is shown to be superior to that of the commonly known Random Walk in terms of number of nodes visited (coverage percentage) and computation time where the environment is unknown and memory usage is restricted.
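The idea of steering a walk with a chaotic map can be sketched as follows. The Lozi map itself is standard (x' = 1 - a|x| + y, y' = bx); everything else here is an assumption for illustration, not the CHAT algorithm as published: the parameter values a = 1.7 and b = 0.5, the starting point, the way the map state is turned into a neighbour index, and the toy graph.

```python
def lozi(x, y, a=1.7, b=0.5):
    """One iteration of the Lozi map (parameter values assumed for this sketch)."""
    return 1.0 - a * abs(x) + y, b * x

def chaotic_walk(graph, start, steps, x=0.0, y=0.0):
    """Walk a graph, choosing each next neighbour from the current map state."""
    node, path = start, [start]
    for _ in range(steps):
        x, y = lozi(x, y)
        neighbours = graph[node]
        # Hypothetical mapping from the chaotic state to a neighbour index.
        idx = int(abs(x) * 1_000_003) % len(neighbours)
        node = neighbours[idx]
        path.append(node)       # recorded only so the walk can be inspected
    return path

# Toy undirected graph as an adjacency list.
graph = {0: [1, 3, 4], 1: [0, 2], 2: [1, 3], 3: [0, 2, 4], 4: [0, 3]}
path = chaotic_walk(graph, start=0, steps=50)
coverage = len(set(path)) / len(graph)
```

The walker itself only needs the current node and the two map variables; no visited set or frontier queue is kept, which is the sense in which such a traversal is memoryless.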
A comparison of common programming languages used in bioinformatics

PubMed Central

Fourment, Mathieu; Gillings, Michael R

2008-01-01

Background: The performance of different programming languages has previously been benchmarked using abstract mathematical algorithms, but not using standard bioinformatics algorithms. We compared the memory usage and speed of execution for three standard bioinformatics methods, implemented in programs using one of six different programming languages. Programs for the Sellers algorithm, the Neighbor-Joining tree construction algorithm and an algorithm for parsing BLAST file outputs were implemented in C, C++, C#, Java, Perl and Python. Results: Implementations in C and C++ were fastest and used the least memory. Programs in these languages generally contained more lines of code. Java and C# appeared to be a compromise between the flexibility of Perl and Python and the fast performance of C and C++. The relative performance of the tested languages did not change from Windows to Linux and no clear evidence of a faster operating system was found. Source code and additional information are available from http://www.bioinformatics.org/benchmark/. Conclusion: This benchmark provides a comparison of six commonly used programming languages under two different operating systems. The overall comparison shows that a developer should choose an appropriate language carefully, taking into account the performance expected and the library availability for each language. PMID:18251993
A tunable algorithm for collective decision-making.

PubMed

Pratt, Stephen C; Sumpter, David J T

2006-10-24

Complex biological systems are increasingly understood in terms of the algorithms that guide the behavior of system components and the information pathways that link them. Much attention has been given to robust algorithms, or those that allow a system to maintain its functions in the face of internal or external perturbations. At the same time, environmental variation imposes a complementary need for algorithm versatility, or the ability to alter system function adaptively as external circumstances change. An important goal of systems biology is thus the identification of biological algorithms that can meet multiple challenges rather than being narrowly specified to particular problems. Here we show that emigrating colonies of the ant Temnothorax curvispinosus tune the parameters of a single decision algorithm to respond adaptively to two distinct problems: rapid abandonment of their old nest in a crisis and deliberative selection of the best available new home when their old nest is still intact. The algorithm uses a stepwise commitment scheme and a quorum rule to integrate information gathered by numerous individual ants visiting several candidate homes. By varying the rates at which they search for and accept these candidates, the ants yield a colony-level response that adaptively emphasizes either speed or accuracy. We propose such general but tunable algorithms as a design feature of complex systems, each algorithm providing elegant solutions to a wide range of problems.
Statistical prediction with Kanerva's sparse distributed memory

NASA Technical Reports Server (NTRS)

Rogers, David

1989-01-01

A new viewpoint of the processing performed by Kanerva's sparse distributed memory (SDM) is presented. In conditions of near- or over-capacity, where the associative-memory behavior of the model breaks down, the processing performed by the model can be interpreted as that of a statistical predictor. Mathematical results are presented which serve as the framework for a new statistical viewpoint of sparse distributed memory and for which the standard formulation of SDM is a special case. This viewpoint suggests possible enhancements to the SDM model, including a procedure for improving the predictiveness of the system based on Holland's work with genetic algorithms, and a method for improving the capacity of SDM even when used as an associative memory.
"Being there" and remembering it: Presence improves memory encoding.

PubMed

Makowski, Dominique; Sperduti, Marco; Nicolas, Serge; Piolino, Pascale

2017-08-01

Few studies have investigated the link between episodic memory and presence: the feeling of "being there" and reacting to a stimulus as if it were real. We collected data from 244 participants after they had watched the movie Avengers: Age of Ultron. They answered questions about factual (details of the movie) and temporal memory (order of the scenes) about the movie, as well as their emotion experience and their sense of presence during the projection. Both higher emotion experience and sense of presence were related to better factual memory, but not to temporal order memory. Crucially, the link between emotion and factual memory was mediated by the sense of presence. We interpreted the role of presence as an external absorption of the attentional focus toward the stimulus, thus enhancing memory encoding. Our findings could shed light on the cognitive processes underlying memory impairments in psychiatric conditions characterized by an altered sense of reality. Copyright © 2017 Elsevier Inc. All rights reserved.
Efficient Parallel Algorithm For Direct Numerical Simulation of Turbulent Flows

NASA Technical Reports Server (NTRS)

Moitra, Stuti; Gatski, Thomas B.

1997-01-01

A distributed algorithm for a high-order-accurate finite-difference approach to the direct numerical simulation (DNS) of transition and turbulence in compressible flows is described. This work has two major objectives. The first objective is to demonstrate that parallel and distributed-memory machines can be successfully and efficiently used to solve computationally intensive and input/output intensive algorithms of the DNS class. The second objective is to show that the computational complexity involved in solving the tridiagonal systems inherent in the DNS algorithm can be reduced by algorithm innovations that obviate the need to use a parallelized tridiagonal solver.
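For context, tridiagonal systems are commonly solved with the Thomas algorithm, whose forward sweep and back-substitution each depend on the previous row; that sequential dependency is what makes parallelizing such solvers awkward. The sketch below is the standard textbook method, not the authors' innovation, and the argument convention (`lower[0]` and `upper[-1]` unused) is an assumption of this example.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n   # modified super-diagonal
    d = [0.0] * n   # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):                      # forward sweep: row i needs row i-1
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):             # back-substitution: row i needs row i+1
        x[i] = d[i] - c[i] * x[i + 1]
    return x

# 4x4 system with 2 on the diagonal and -1 off the diagonal; the solution is all ones.
solution = thomas([0.0, -1.0, -1.0, -1.0],
                  [2.0, 2.0, 2.0, 2.0],
                  [-1.0, -1.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0, 1.0])
```

No pivoting is performed, so the sketch is only safe for well-conditioned systems such as diagonally dominant ones.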
Selective, age-related autobiographical memory deficits in children with severe traumatic brain injury.

PubMed

Lah, Suncica; Gott, Chloe; Parry, Louise; Black, Carly; Epps, Adrienne; Gascoigne, Michael

2017-12-18

Autobiographical memory (AM) is a complex function that involves re-experiencing of past personal events (episodic memory) scaffolded by personal facts (semantic memory). While AM is supported by a brain network and cognitive skills that are vulnerable to disruption by child traumatic brain injury (TBI), AM has not been examined in this patient population. Cross-sectional study. Participants included children with severe closed TBI (n = 14) and healthy control (NC) children (n = 20) of comparable age, sex, and socioeconomic status. Participants completed (1) the Child Autobiographical Interview (Willoughby et al., Front. Psychol., 3, 53), which required recall of autobiographical events and distinguished episodic (internal) from non-episodic (external) details, and self-rating of event phenomenological qualities, and (2) a battery of neuropsychological tests. Children with TBI recalled significantly fewer internal details relative to NCs, but the between-group difference was eliminated when specific probes were provided. The groups did not differ in either recall of external details or in ratings of events' phenomenological qualities. The gap between the groups in recall of internal details increased with age, as the greater number of internal details was associated with older age in the NC group, but not in the TBI group. Poorer verbal memory and lower IQ were related to recall of fewer internal details in the TBI group. This study unveils, to our knowledge for the first time, that severe child TBI is associated with a selective deficit in autobiographical memory that involves episodic, but spares semantic details, and identifies the risk factors for this impairment. © 2017 The British Psychological Society.
Method of preparing a two-way shape memory alloy

DOEpatents

Johnson, Alfred D.

1984-01-01

A two-way shape memory alloy, a method of training a shape memory alloy, and a heat engine employing the two-way shape memory alloy to do external work during both heating and cooling phases. The alloy is heated under a first training stress to a temperature which is above the upper operating temperature of the alloy, then cooled to a cold temperature below the zero-force transition temperature of the alloy, then deformed while applying a second training stress which is greater in magnitude than the stress at which the alloy is to be operated, then heated back to the hot temperature, changing from the second training stress back to the first training stress.
Cortical midline involvement in autobiographical memory

PubMed Central

Summerfield, Jennifer J.; Hassabis, Demis; Maguire, Eleanor A.

2009-01-01

Recollecting autobiographical memories of personal past experiences is an integral part of our everyday lives and relies on a distributed set of brain regions. Their occurrence externally in the real world (‘realness’) and their self-relevance (‘selfness’) are two defining features of these autobiographical events. Distinguishing between personally experienced events and those that happened to other individuals, and between events that really occurred and those that were mere figments of the imagination, is clearly advantageous, yet the respective neural correlates remain unclear. Here we experimentally manipulated and dissociated realness and selfness during fMRI using a novel paradigm where participants recalled self (autobiographical) and non-self (from a movie or television news clips) events that were either real or previously imagined. Distinct sub-regions within dorsal and ventral medial prefrontal cortex, retrosplenial cortex and along the parieto-occipital sulcus preferentially coded for events (real or imagined) involving the self. By contrast, recollection of autobiographical events that really happened in the external world activated different areas within ventromedial prefrontal cortex and posterior cingulate cortex. In addition, recall of externally experienced real events (self or non-self) was associated with increased activity in areas of dorsomedial prefrontal cortex and posterior cingulate cortex. Taken together our results permitted a functional deconstruction of anterior (medial prefrontal) and posterior (retrosplenial cortex, posterior cingulate cortex, precuneus) cortical midline regions widely associated with autobiographical memory but whose roles have hitherto been poorly understood. PMID:18973817
When the “I” Looks at the “Me”: Autobiographical Memory, Visual Perspective, and the Self

PubMed Central

Sutin, Angelina R.; Robins, Richard W.

2009-01-01

This article presents a theoretical model of the self processes involved in autobiographical memories and proposes competing hypotheses for the role of visual perspective in autobiographical memory retrieval. Autobiographical memories can be retrieved from either the 1st person perspective, in which individuals see the event through their own eyes, or from the 3rd person perspective, in which individuals see themselves and the event from the perspective of an external observer. A growing body of research suggests that the visual perspective from which a memory is retrieved has important implications for a person's thoughts, feelings, and goals, and is integrally related to a host of self-evaluative processes. We review the relevant research literature, present our theoretical model, and outline directions for future research. PMID:18848783
When the "I" looks at the "Me": autobiographical memory, visual perspective, and the self.

PubMed

Sutin, Angelina R; Robins, Richard W

2008-12-01

This article presents a theoretical model of the self processes involved in autobiographical memories and proposes competing hypotheses for the role of visual perspective in autobiographical memory retrieval. Autobiographical memories can be retrieved from either the 1st person perspective, in which individuals see the event through their own eyes, or from the 3rd person perspective, in which individuals see themselves and the event from the perspective of an external observer. A growing body of research suggests that the visual perspective from which a memory is retrieved has important implications for a person's thoughts, feelings, and goals, and is integrally related to a host of self-evaluative processes. We review the relevant research literature, present our theoretical model, and outline directions for future research.
Optimization of image processing algorithms on mobile platforms

NASA Astrophysics Data System (ADS)

Poudel, Pramod; Shirvaikar, Mukul

2011-03-01

This work presents a technique to optimize popular image processing algorithms on mobile platforms such as cell phones, net-books and personal digital assistants (PDAs). The increasing demand for video applications like context-aware computing on mobile embedded systems requires the use of computationally intensive image processing algorithms. The system engineer has a mandate to optimize them so as to meet real-time deadlines. A methodology to take advantage of the asymmetric dual-core processor, which includes an ARM and a DSP core supported by shared memory, is presented with implementation details. The target platform chosen is the popular OMAP 3530 processor for embedded media systems. It has an asymmetric dual-core architecture with an ARM Cortex-A8 and a TMS320C64x Digital Signal Processor (DSP). The development platform was the BeagleBoard with 256 MB of NAND RAM and 256 MB SDRAM memory. The basic image correlation algorithm is chosen for benchmarking as it finds widespread application for various template matching tasks such as face-recognition. The basic algorithm prototypes conform to OpenCV, a popular computer vision library. OpenCV algorithms can be easily ported to the ARM core which runs a popular operating system such as Linux or Windows CE. However, the DSP is architecturally more efficient at handling DFT algorithms. The algorithms are tested on a variety of images and performance results are presented measuring the speedup obtained due to dual-core implementation. A major advantage of this approach is that it allows the ARM processor to perform important real-time tasks, while the DSP addresses performance-hungry algorithms.
Cognitive Rehabilitation of Episodic Memory Disorders: From Theory to Practice

PubMed Central

Ptak, Radek; Van der Linden, Martial; Schnider, Armin

2010-01-01

Memory disorders are among the most frequent and most debilitating cognitive impairments following acquired brain damage. Cognitive remediation strategies attempt to restore lost memory capacity, provide compensatory techniques or teach the use of external memory aids. Memory rehabilitation has strongly been influenced by memory theory, and the interaction between both has stimulated the development of techniques such as spaced retrieval, vanishing cues or errorless learning. These techniques partly rely on implicit memory and therefore enable even patients with dense amnesia to acquire new information. However, knowledge acquired in this way is often strongly domain-specific and inflexible. In addition, individual patients with amnesia respond differently to distinct interventions. The factors underlying these differences have not yet been identified. Behavioral management of memory failures therefore often relies on a careful description of environmental factors and measurement of associated behavioral disorders such as unawareness of memory failures. The current evidence suggests that patients with less severe disorders benefit from self-management techniques and mnemonics whereas rehabilitation of severely amnesic patients should focus on behavior management, the transmission of domain-specific knowledge through implicit memory processes and the compensation for memory deficits with memory aids. PMID:20700383
Robust stability of bidirectional associative memory neural networks with time delays

NASA Astrophysics Data System (ADS)

Park, Ju H.

2006-01-01

Based on Lyapunov-Krasovskii functionals combined with the linear matrix inequality approach, a novel stability criterion is proposed for asymptotic stability of bidirectional associative memory neural networks with time delays. The delay-dependent stability criterion is given in terms of linear matrix inequalities, which can be solved easily by various optimization algorithms.
Enforcing Memory Policy Specifications in Reconfigurable Hardware

DTIC Science & Technology

2008-10-01

we explain the algorithms behind our reference monitor design flow. In Section 4, we describe our access policy language including several example...NFA from this regular expression using Thompson’s Algorithm [1] as implemented by Gerzic [19]. Figure 4 shows the NFA for our policy. Notice that the... Algorithm [1] as implemented by Grail [49] to minimize the DFA. Figure 5 shows the minimized DFA for our policy. Processing the Ranges Before we can
Homophyly/kinship hypothesis: Natural communities, and predicting in networks

NASA Astrophysics Data System (ADS)

Li, Angsheng; Li, Jiankou; Pan, Yicheng

2015-02-01

It has been a longstanding challenge to understand natural communities in real world networks. We proposed a community finding algorithm based on fitness of networks, two algorithms for prediction, accurate prediction and confirmation of keywords for papers in the citation network Arxiv HEP-TH (high energy physics theory), and the measures of internal centrality, external de-centrality, internal and external slopes to characterize the structures of communities. We implemented our algorithms on 2 citation and 5 cooperation graphs. Our experiments explored and validated a homophyly/kinship principle of real world networks. The homophyly/kinship principle includes: (1) homophyly is the natural selection in real world networks, similar to Darwin's kinship selection in nature, (2) real world networks consist of natural communities generated by the natural selection of homophyly, (3) most individuals in a natural community share a short list of common attributes, (4) natural communities have an internal centrality (or internal heterogeneity) in that a natural community has a few nodes dominating most of the individuals in the community, (5) natural communities have an external de-centrality (or external homogeneity) in that the external links of a natural community are homogeneously distributed in different communities, and (6) natural communities of a given network have typical structures determined by the internal slopes, and have typical patterns of outgoing links determined by external slopes, etc. Our homophyly/kinship principle perfectly matches Darwin's observation that animals from ants to people form social groups in which most individuals work for the common good, and that kinship could encourage altruistic behavior. Our homophyly/kinship principle is the network version of Darwinian theory, and builds a bridge between Darwinian evolution and network science.
Improvement and speed optimization of numerical tsunami modelling program using OpenMP technology

NASA Astrophysics Data System (ADS)

Chernov, A.; Zaytsev, A.; Yalciner, A.; Kurkin, A.

2009-04-01

Currently, the basic problem of tsunami modeling is the low speed of calculations, which is unacceptable for operational warning services. Existing algorithms for the numerical modeling of the hydrodynamic processes of tsunami waves were developed without taking advantage of modern computer facilities. Considerable acceleration of the calculations can be achieved by using parallel algorithms. We discuss here a new approach to parallelizing tsunami modeling code using OpenMP technology (for multiprocessing systems with shared memory). Nowadays, multiprocessing systems are easily accessible to everyone, and the cost of using such systems has become much lower than the cost of clusters. This also allows programmers to apply multithreading algorithms on researchers' desktop computers. Another important advantage of this approach is the shared-memory mechanism: there is no need to send data over slow networks (for example, Ethernet). All memory is common to all computing processes, which results in almost linear scalability of the program and processes. In the new version of NAMI DANCE, OpenMP technology and a multithreading algorithm provide an 80% gain in speed in comparison with the single-thread version on a dual-processor unit. A 320% gain was attained on a four-core processor unit of a PC. Thus, it was possible to considerably reduce the computation time on scientific workstations (desktops) without a complete change of the program and user interfaces. Further modernization of the algorithms for preparing initial data and processing results using OpenMP looks reasonable. The final version of NAMI DANCE, with its increased computational speed, can be used not only for research purposes but also in real-time tsunami warning systems.
Right unilateral electroconvulsive therapy does not cause more cognitive impairment than pharmacologic treatment in treatment-resistant bipolar depression: A 6-month randomized controlled trial follow-up study.

PubMed

Bjoerke-Bertheussen, Jeanette; Schoeyen, Helle; Andreassen, Ole A; Malt, Ulrik F; Oedegaard, Ketil J; Morken, Gunnar; Sundet, Kjetil; Vaaler, Arne E; Auestad, Bjoern; Kessler, Ute

2017-12-21

Electroconvulsive therapy is an effective treatment for bipolar depression, but there are concerns about whether it causes long-term neurocognitive impairment. In this multicenter randomized controlled trial, in-patients with treatment-resistant bipolar depression were randomized to either algorithm-based pharmacologic treatment or right unilateral electroconvulsive therapy. After the 6-week treatment period, all of the patients received maintenance pharmacotherapy as recommended by their clinician guided by a relevant treatment algorithm. Patients were assessed at baseline and at 6 months. Neurocognitive functions were assessed using the Measurement and Treatment Research to Improve Cognition in Schizophrenia (MATRICS) Consensus Cognitive Battery, and autobiographical memory consistency was assessed using the Autobiographical Memory Interview-Short Form. Seventy-three patients entered the trial, of whom 51 and 26 completed neurocognitive assessments at baseline and 6 months, respectively. The MATRICS Consensus Cognitive Battery composite score improved by 4.1 points in both groups (P = .042) from baseline to 6 months (from 40.8 to 44.9 and from 41.9 to 46.0 in the algorithm-based pharmacologic treatment and electroconvulsive therapy groups, respectively). The Autobiographical Memory Interview-Short Form consistency scores were reduced in both groups (72.3% vs 64.3% in the algorithm-based pharmacologic treatment and electroconvulsive therapy groups, respectively; P = .085). This study did not find that right unilateral electroconvulsive therapy caused long-term impairment in neurocognitive functions compared to algorithm-based pharmacologic treatment in bipolar depression as measured using standard neuropsychological tests, but due to the low number of patients in the study the results should be interpreted with caution. ClinicalTrials.gov: NCT00664976. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Havens: Explicit Reliable Memory Regions for HPC Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hukerikar, Saurabh; Engelmann, Christian

2016-01-01

Supporting error resilience in future exascale-class supercomputing systems is a critical challenge. Due to transistor scaling trends and increasing memory density, scientific simulations are expected to experience more interruptions caused by transient errors in the system memory. Existing hardware-based detection and recovery techniques will be inadequate to manage the presence of high memory fault rates. In this paper we propose a partial memory protection scheme based on region-based memory management. We define the concept of regions called havens that provide fault protection for program objects. We provide reliability for the regions through a software-based parity protection mechanism. Our approach enables critical program objects to be placed in these havens. The fault coverage provided by our approach is application agnostic, unlike algorithm-based fault tolerance techniques.
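A software parity scheme over a memory region can be sketched in a few lines. The class below is a hypothetical illustration, not the havens implementation: the name `Haven`, the use of a single XOR parity word over integer "words", and the detection-only check are all assumptions made for this example.

```python
from functools import reduce
from operator import xor

class Haven:
    """Hypothetical protected region: integer words plus one XOR parity word."""

    def __init__(self, words):
        self.words = list(words)
        self.parity = reduce(xor, self.words, 0)

    def write(self, index, value):
        # Incremental update: cancel the old word's contribution, add the new one.
        self.parity ^= self.words[index] ^ value
        self.words[index] = value

    def is_intact(self):
        # Recompute the parity and compare it with the stored parity word.
        return reduce(xor, self.words, 0) == self.parity

region = Haven([3, 14, 15, 92])
region.write(1, 65)             # a legitimate update keeps the parity consistent
ok_before = region.is_intact()
region.words[2] ^= 1 << 4       # simulate a transient bit flip that bypasses write()
ok_after = region.is_intact()
```

A single parity word like this detects any single-bit corruption in the region but cannot by itself say which word was hit.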
lsjk—a C++ library for arbitrary-precision numeric evaluation of the generalized log-sine functions

NASA Astrophysics Data System (ADS)

Kalmykov, M. Yu.; Sheplyakov, A.

2005-10-01

Generalized log-sine functions $\mathrm{Ls}_j^{(k)}(\theta)$ appear in higher order ε-expansion of different Feynman diagrams. We present an algorithm for the numerical evaluation of these functions for real arguments. This algorithm is implemented as a C++ library with arbitrary-precision arithmetics for integer $0 \le k \le 9$ and $j \ge 2$. Some new relations and representations of the generalized log-sine functions are given. Program summary. Title of program: lsjk. Catalogue number: ADVS. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADVS. Program obtained from: CPC Program Library, Queen's University of Belfast, N. Ireland. Licensing terms: GNU General Public License. Computers: all. Operating systems: POSIX. Programming language: C++. Memory required to execute: Depending on the complexity of the problem, at least 32 MB RAM recommended. No. of lines in distributed program, including testing data, etc.: 41 975. No. of bytes in distributed program, including testing data, etc.: 309 156. Distribution format: tar.gz. Other programs called: The CLN library for arbitrary-precision arithmetics is required at version 1.1.5 or greater. External files needed: none. Nature of the physical problem: Numerical evaluation of the generalized log-sine functions for real argument in the region $0 < \theta < \pi$. These functions appear in Feynman integrals. Method of solution: Series representation for the real argument in the region $0 < \theta < \pi$. Restriction on the complexity of the problem: Limited up to $\mathrm{Ls}_j^{(9)}(\theta)$, and $j$ is an arbitrary integer number. Thus, all function up to the weight 12 in the region $0 < \theta < \pi$ can be evaluated. The algorithm can be extended up to higher values of $k$ ($k > 9$) without modification. Typical running time: Depending on the complexity of problem. See text below.
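As a small double-precision illustration of evaluating a log-sine function by a series, the sketch below treats the simplest case, j = 2 and k = 0. It assumes the usual definition Ls_j^(k)(θ) = -∫_0^θ φ^k ln^(j-k-1)|2 sin(φ/2)| dφ, under which Ls_2(θ) equals the Clausen function Cl_2(θ) = Σ sin(nθ)/n². It is not the lsjk library's API or algorithm, and the function names are hypothetical.

```python
import math

def ls2_quadrature(theta, n=20000):
    """Midpoint-rule estimate of -integral_0^theta ln|2 sin(phi/2)| dphi."""
    h = theta / n
    return -h * sum(math.log(abs(2.0 * math.sin((i + 0.5) * h / 2.0)))
                    for i in range(n))

def ls2_series(theta, terms=20000):
    """Clausen series: sum over m of sin(m * theta) / m^2."""
    return sum(math.sin(m * theta) / (m * m) for m in range(1, terms + 1))

# At theta = pi/2 both should approach Catalan's constant.
CATALAN = 0.915965594177219
by_quadrature = ls2_quadrature(math.pi / 2)
by_series = ls2_series(math.pi / 2)
```

The series reaches far higher accuracy than the quadrature for the same number of terms here, because the integrand has a logarithmic singularity at φ = 0.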

Limited Memory Block Krylov Subspace Optimization for Computing Dominant Singular Value Decompositions

DTIC Science & Technology

2012-03-22

with performance profiles, Math. Program., 91 (2002), pp. 201–213. [6] P. DRINEAS, R. KANNAN, AND M. W. MAHONEY, Fast Monte Carlo algorithms for matrices...computing invariant subspaces of non-Hermitian matrices, Numer. Math., 25 (1975/76), pp. 123–136. [25] , Matrix algorithms Vol. II: Eigensystems
Fast maximum intensity projections of large medical data sets by exploiting hierarchical memory architectures.

PubMed

Kiefer, Gundolf; Lehmann, Helko; Weese, Jürgen

2006-04-01

Maximum intensity projections (MIPs) are an important visualization technique for angiographic data sets. Efficient data inspection requires frame rates of at least five frames per second at preserved image quality. Despite the advances in computer technology, this task remains a challenge. On the one hand, the sizes of computed tomography and magnetic resonance images are increasing rapidly. On the other hand, rendering algorithms do not automatically benefit from the advances in processor technology, especially for large data sets. This is due to the faster evolving processing power and the slower evolving memory access speed, which is bridged by hierarchical cache memory architectures. In this paper, we investigate memory access optimization methods and use them for generating MIPs on general-purpose central processing units (CPUs) and graphics processing units (GPUs), respectively. These methods can work on any level of the memory hierarchy, and we show that properly combined methods can optimize memory access on multiple levels of the hierarchy at the same time. We present performance measurements to compare different algorithm variants and illustrate the influence of the respective techniques. On current hardware, the efficient handling of the memory hierarchy for CPUs improves the rendering performance by a factor of 3 to 4. On GPUs, we observed that the effect is even larger, especially for large data sets. The methods can easily be adjusted to different hardware specifics, although their impact can vary considerably. They can also be used for other rendering techniques than MIPs, and their use for more general image processing task could be investigated in the future.
Parallelization strategies for continuum-generalized method of moments on the multi-thread systems

NASA Astrophysics Data System (ADS)

Bustamam, A.; Handhika, T.; Ernastuti; Kerami, D.

2017-07-01

Continuum-Generalized Method of Moments (C-GMM) addresses a shortfall of the Generalized Method of Moments (GMM), which is not as efficient as the maximum likelihood estimator, by using a continuum of moment conditions in a GMM framework. However, this computation takes a very long time because the regularization parameter must be optimized. Unfortunately, these calculations are processed sequentially, whereas modern computers are supported by hierarchical memory systems and hyperthreading technology, which allow for parallel computing. This paper aims to speed up the calculation process of C-GMM by designing a parallel algorithm for C-GMM on multi-thread systems. First, parallel regions are detected in the original C-GMM algorithm. There are two parallel regions in the original C-GMM algorithm that contribute significantly to the reduction of computational time: the outer loop and the inner loop. Furthermore, this parallel algorithm is implemented with a standard shared-memory application programming interface, i.e. Open Multi-Processing (OpenMP). The experiment shows that outer-loop parallelization is the best strategy for any number of observations.
A Linked List-Based Algorithm for Blob Detection on Embedded Vision-Based Sensors

PubMed Central

Acevedo-Avila, Ricardo; Gonzalez-Mendoza, Miguel; Garcia-Garcia, Andres

2016-01-01

Blob detection is a common task in vision-based applications. Most existing algorithms are aimed at execution on general purpose computers; while very few can be adapted to the computing restrictions present in embedded platforms. This paper focuses on the design of an algorithm capable of real-time blob detection that minimizes system memory consumption. The proposed algorithm detects objects in one image scan; it is based on a linked-list data structure tree used to label blobs depending on their shape and node information. An example application showing the results of a blob detection co-processor has been built on a low-powered field programmable gate array hardware as a step towards developing a smart video surveillance system. The detection method is intended for general purpose application. As such, several test cases focused on character recognition are also examined. The results obtained present a fair trade-off between accuracy and memory requirements; and prove the validity of the proposed approach for real-time implementation on resource-constrained computing platforms. PMID:27240382
Memory: Enduring Traces of Perceptual and Reflective Attention

PubMed Central

Chun, Marvin M.; Johnson, Marcia K.

2011-01-01

Attention and memory are typically studied as separate topics, but they are highly intertwined. Here we discuss the relation between memory and two fundamental types of attention: perceptual and reflective. Memory is the persisting consequence of cognitive activities initiated by and/or focused on external information from the environment (perceptual attention) and initiated by and/or focused on internal mental representations (reflective attention). We consider three key questions for advancing a cognitive neuroscience of attention and memory: To what extent do perception and reflection share representational areas? To what extent are the control processes that select, maintain, and manipulate perceptual and reflective information subserved by common areas and networks? During perception and reflection, to what extent are common areas responsible for binding features together to create complex, episodic memories and for reviving them later? Considering similarities and differences in perceptual and reflective attention helps integrate a broad range of findings and raises important unresolved issues. PMID:22099456
Biasing the content of hippocampal replay during sleep

PubMed Central

Bendor, Daniel; Wilson, Matthew A.

2013-01-01

The hippocampus plays an essential role in encoding self-experienced events into memory. During sleep, neural activity in the hippocampus related to a recent experience has been observed to spontaneously reoccur, and this “replay” has been postulated to be important for memory consolidation. Task-related cues can enhance memory consolidation when presented during a post-training sleep session, and if memories are consolidated by hippocampal replay, a specific enhancement for this replay should also be observed. To test this, we have trained rats on an auditory-spatial association task, while recording from neuronal ensembles in the hippocampus. Here we report that during sleep, a task-related auditory cue biases reactivation events towards replaying the spatial memory associated with that cue. These results indicate that sleep replay can be manipulated by external stimulation, and provide further evidence for the role of hippocampal replay in memory consolidation. PMID:22941111
An efficient and robust algorithm for two dimensional time dependent incompressible Navier-Stokes equations: High Reynolds number flows

NASA Technical Reports Server (NTRS)

Goodrich, John W.

1991-01-01

An algorithm is presented for unsteady two-dimensional incompressible Navier-Stokes calculations. This algorithm is based on the fourth order partial differential equation for incompressible fluid flow which uses the streamfunction as the only dependent variable. The algorithm is second order accurate in both time and space. It uses a multigrid solver at each time step. It is extremely efficient with respect to the use of both CPU time and physical memory. It is extremely robust with respect to Reynolds number.
The relationship between cognitive function and life space: the potential role of personal control beliefs.

PubMed

Sartori, Andrea C; Wadley, Virginia G; Clay, Olivio J; Parisi, Jeanine M; Rebok, George W; Crowe, Michael

2012-06-01

We examined the relationship of cognitive and functional measures with life space (a measure of spatial mobility examining extent of movement within a person's environment) in older adults, and investigated the potential moderating role of personal control beliefs. Internal control beliefs reflect feelings of competence and personal agency, while attributions of external control imply a more dependent or passive point of view. Participants were 2,737 adults from the ACTIVE study, with a mean age of 74 years. Females comprised 76% of the sample, with good minority representation (27% African American). In multiple regression models controlling for demographic factors, cognitive domains of memory, reasoning, and processing speed were significantly associated with life space (p < .001 for each), and reasoning ability appeared most predictive (B = .117). Measures of everyday function also showed significant associations with life space, independent from the traditional cognitive measures. Interactions between cognitive function and control beliefs were tested, and external control beliefs moderated the relationship between memory and life space, with the combination of high objective memory and low external control beliefs yielding the highest life space (t = -2.07; p = .039). In conclusion, older adults with better cognitive function have a larger overall life space. Performance-based measures of everyday function may also be useful in assessing the functional outcome of life space. Additionally, subjective external control beliefs may moderate the relationship between objective cognitive function and life space. Future studies examining the relationships between these factors longitudinally appear worthwhile to further elucidate the interrelationships of cognitive function, control beliefs, and life space. PsycINFO Database Record (c) 2012 APA, all rights reserved
Electrophysiological evidence for parts and wholes in visual face memory.

PubMed

Towler, John; Eimer, Martin

2016-10-01

It is often assumed that upright faces are represented in a holistic fashion, while representations of inverted faces are essentially part-based. To assess this hypothesis, we recorded event-related potentials (ERPs) during a sequential face identity matching task where successively presented pairs of upright or inverted faces were either identical or differed with respect to their internal features, their external features, or both. Participants' task was to report on each trial whether the face pair was identical or different. To track the activation of visual face memory representations, we measured N250r components that emerge over posterior face-selective regions during the activation of visual face memory representations by a successful identity match. N250r components to full identity repetitions were smaller and emerged later for inverted as compared to upright faces, demonstrating that image inversion impairs face identity matching processes. For upright faces, N250r components were also elicited by partial repetitions of external or internal features, which suggest that the underlying identity matching processes are not exclusively based on non-decomposable holistic representations. However, the N250r to full identity repetitions was super-additive (i.e., larger than the sum of the two N250r components to partial repetitions of external or internal features) for upright faces, demonstrating that holistic representations were involved in identity matching processes. For inverted faces, N250r components to full and partial identity repetitions were strictly additive, indicating that the identity matching of external and internal features operated in an entirely part-based fashion. These results provide new electrophysiological evidence for qualitative differences between representations of upright and inverted faces in the occipital-temporal face processing system. Copyright © 2016 Elsevier Ltd. All rights reserved.
Use of the preconditioned conjugate gradient algorithm as a generic solver for mixed-model equations in animal breeding applications.

PubMed

Tsuruta, S; Misztal, I; Strandén, I

2001-05-01

Utility of the preconditioned conjugate gradient algorithm with a diagonal preconditioner for solving mixed-model equations in animal breeding applications was evaluated with 16 test problems. The problems included single- and multiple-trait analyses, with data on beef, dairy, and swine ranging from small examples to national data sets. Multiple-trait models considered low and high genetic correlations. Convergence was based on relative differences between left- and right-hand sides. The ordering of equations was fixed effects followed by random effects, with no special ordering within random effects. The preconditioned conjugate gradient program implemented with double precision converged for all models. However, when implemented in single precision, the preconditioned conjugate gradient algorithm did not converge for seven large models. The preconditioned conjugate gradient and successive overrelaxation algorithms were subsequently compared for 13 of the test problems. The preconditioned conjugate gradient algorithm was easy to implement with the iteration on data for general models. However, successive overrelaxation requires specific programming for each set of models. On average, the preconditioned conjugate gradient algorithm converged in three times fewer rounds of iteration than successive overrelaxation. With straightforward implementations, programs using the preconditioned conjugate gradient algorithm may be two or more times faster than those using successive overrelaxation. However, programs using the preconditioned conjugate gradient algorithm would use more memory than would comparable implementations using successive overrelaxation. Extensive optimization of either algorithm can influence rankings. 
The preconditioned conjugate gradient implemented with iteration on data, a diagonal preconditioner, and in double precision may be the algorithm of choice for solving mixed-model equations when sufficient memory is available and ease of implementation is essential.
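For readers unfamiliar with the method, a minimal sketch of conjugate gradient with a diagonal (Jacobi) preconditioner follows. It assumes a small dense symmetric positive definite matrix held in memory as a list of lists, whereas the study iterates on data; the name `pcg_jacobi`, the tolerance, and the stopping rule (residual norm relative to the right-hand side norm) are illustrative assumptions, not the authors' program.

```python
def pcg_jacobi(A, b, tol=1e-10, max_rounds=1000):
    """Solve A x = b for symmetric positive definite A (list of lists).

    Returns (x, rounds). The preconditioner is the inverse of diag(A).
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    m_inv = [1.0 / A[i][i] for i in range(n)]      # diagonal preconditioner
    x = [0.0] * n
    r = list(b)                                    # residual b - A x with x = 0
    z = [mi * ri for mi, ri in zip(m_inv, r)]      # preconditioned residual
    p = list(z)                                    # first search direction
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0

    for rounds in range(max_rounds):
        # Stop when the two sides of the system agree to the tolerance.
        if dot(r, r) ** 0.5 / b_norm < tol:
            return x, rounds
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [mi * ri for mi, ri in zip(m_inv, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_rounds
```

Besides the solution, the sketch keeps the vectors `r`, `z`, `p`, and `Ap` alive at once, which is consistent with the abstract's remark that the method uses more memory than successive overrelaxation.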
External Verification of SCADA System Embedded Controller Firmware

DTIC Science & Technology

2012-03-01

microprocessor and read-only memory (ROM) or flash memory for storing firmware and control logic [5],[8]. A PLC typically has three software levels as shown in...implementing different firmware. Because PLCs are in effect a microprocessor device, an analysis of the current research on embedded devices is important...Electronics Engineers (IEEE) published a 15 best practices guide for firmware control on microprocessors [44]. IEEE suggests that microprocessors
Saturation: An efficient iteration strategy for symbolic state-space generation

NASA Technical Reports Server (NTRS)

Ciardo, Gianfranco; Luettgen, Gerald; Siminiceanu, Radu; Bushnell, Dennis M. (Technical Monitor)

2001-01-01

This paper presents a novel algorithm for generating state spaces of asynchronous systems using Multi-valued Decision Diagrams. In contrast to related work, the next-state function of a system is not encoded as a single Boolean function, but as cross-products of integer functions. This permits the application of various iteration strategies to build a system's state space. In particular, this paper introduces a new elegant strategy, called saturation, and implements it in the tool SMART. On top of usually performing several orders of magnitude faster than existing BDD-based state-space generators, the algorithm's required peak memory is often close to the final memory needed for storing the overall state spaces.
A multiarchitecture parallel-processing development environment

NASA Technical Reports Server (NTRS)

Townsend, Scott; Blech, Richard; Cole, Gary

1993-01-01

A description is given of the hardware and software of a multiprocessor test bed - the second generation Hypercluster system. The Hypercluster architecture consists of a standard hypercube distributed-memory topology, with multiprocessor shared-memory nodes. By using standard, off-the-shelf hardware, the system can be upgraded to use rapidly improving computer technology. The Hypercluster's multiarchitecture nature makes it suitable for researching parallel algorithms in computational field simulation applications (e.g., computational fluid dynamics). The dedicated test-bed environment of the Hypercluster and its custom-built software allows experiments with various parallel-processing concepts such as message passing algorithms, debugging tools, and computational 'steering'. Such research would be difficult, if not impossible, to achieve on shared, commercial systems.
Convolution of large 3D images on GPU and its decomposition

NASA Astrophysics Data System (ADS)

Karas, Pavel; Svoboda, David

2011-12-01

In this article, we propose a method for computing convolution of large 3D images. The convolution is performed in a frequency domain using a convolution theorem. The algorithm is accelerated on a graphics card by means of the CUDA parallel computing model. Convolution is decomposed in a frequency domain using the decimation in frequency algorithm. We pay attention to keeping our approach efficient in terms of both time and memory consumption and also in terms of memory transfers between CPU and GPU which have a significant influence on overall computational time. We also study the implementation on multiple GPUs and compare the results between the multi-GPU and multi-CPU implementations.
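The two ideas named in the abstract, the convolution theorem and a decimation-in-frequency split, can be sketched in one dimension as follows. This is an illustrative sketch only: it uses a naive O(n²) DFT in plain Python instead of a GPU FFT, works on 1D signals of even length rather than 3D images, and all function names are hypothetical. The point it shows is that one circular convolution of length n can be replaced by two independent circular convolutions of length n/2, which is what allows a large problem to be processed in pieces.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (stand-in for a real FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
               for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def circular_convolve(f, g):
    """Convolution theorem: multiply the spectra, then transform back."""
    F, G = dft(f), dft(g)
    return dft([a * b for a, b in zip(F, G)], inverse=True)

def dif_split(x):
    """Decimation in frequency: two half-length signals whose DFTs are the
    even-indexed and odd-indexed frequency bins of the DFT of x."""
    n, h = len(x), len(x) // 2
    even = [x[t] + x[t + h] for t in range(h)]
    odd = [(x[t] - x[t + h]) * cmath.exp(-2j * cmath.pi * t / n)
           for t in range(h)]
    return even, odd

def convolve_decomposed(f, g):
    """Same result as circular_convolve, via two half-size sub-problems."""
    n, h = len(f), len(f) // 2
    (fe, fo), (ge, go) = dif_split(f), dif_split(g)
    a = circular_convolve(fe, ge)   # sub-problem 1, independent of 2
    b = circular_convolve(fo, go)   # sub-problem 2, independent of 1
    y = [0j] * n
    for t in range(h):
        bt = b[t] * cmath.exp(2j * cmath.pi * t / n)  # undo the twiddle
        y[t] = (a[t] + bt) / 2
        y[t + h] = (a[t] - bt) / 2
    return y
```

Because the two sub-problems share no data, they could in principle be sent to a device one after the other or to different devices; how the authors schedule them is not specified here.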
Study of self-compliance behaviors and internal filament characteristics in intrinsic SiOx-based resistive switching memory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chang, Yao-Feng, E-mail: yfchang@utexas.edu; Zhou, Fei; Chen, Ying-Chen

2016-01-18

Self-compliance characteristics and reliability optimization are investigated in intrinsic unipolar silicon oxide (SiOx)-based resistive switching (RS) memory using TiW/SiOx/TiW device structures. The program window (difference between SET voltage and RESET voltage) is dependent on external series resistance, demonstrating that the SET process is due to a voltage-triggered mechanism. The program window has been optimized for program/erase disturbance immunity and reliability for circuit-level applications. The SET and RESET transitions have also been characterized using a dynamic conductivity method, which distinguishes the self-compliance behavior due to an internal series resistance effect (filament) in SiOx-based RS memory. By using a conceptual “filament/resistive gap (GAP)” model of the conductive filament and a proton exchange model with appropriate assumptions, the internal filament resistance and GAP resistance can be estimated for high- and low-resistance states (HRS and LRS), and are found to be independent of external series resistance. Our experimental results not only provide insights into potential reliability issues but also help to clarify the switching mechanisms and device operating characteristics of SiOx-based RS memory.
Common mechanisms of spatial attention in memory and perception: a tactile dual-task study.

PubMed

Katus, Tobias; Andersen, Søren K; Müller, Matthias M

2014-03-01

Orienting attention to locations in mnemonic representations engages processes that functionally and anatomically overlap the neural circuitry guiding prospective shifts of spatial attention. The attention-based rehearsal account predicts that the requirement to withdraw attention from a memorized location impairs memory accuracy. In a dual-task study, we simultaneously presented retro-cues and pre-cues to guide spatial attention in short-term memory (STM) and perception, respectively. The spatial direction of each cue was independent of the other. The locations indicated by the combined cues could be compatible (same hand) or incompatible (opposite hands). Incompatible directional cues decreased lateralized activity in brain potentials evoked by visual cues, indicating interference in the generation of prospective attention shifts. The detection of external stimuli at the prospectively cued location was impaired when the memorized location was part of the perceptually ignored hand. The disruption of attention-based rehearsal by means of incompatible pre-cues reduced memory accuracy and affected encoding of tactile test stimuli at the retrospectively cued hand. These findings highlight the functional significance of spatial attention for spatial STM. The bidirectional interactions between both tasks demonstrate that spatial attention is a shared neural resource of a capacity-limited system that regulates information processing in internal and external stimulus representations.
A Pervasive Parallel Processing Framework for Data Visualization and Analysis at Extreme Scale

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ma, Kwan-Liu

Most of today’s visualization libraries and applications are based on what is known today as the visualization pipeline. In the visualization pipeline model, algorithms are encapsulated as “filtering” components with inputs and outputs. These components can be combined by connecting the outputs of one filter to the inputs of another filter. The visualization pipeline model is popular because it provides a convenient abstraction that allows users to combine algorithms in powerful ways. Unfortunately, the visualization pipeline cannot run effectively on exascale computers. Experts agree that the exascale machine will comprise processors that contain many cores. Furthermore, physical limitations will prevent data movement in and out of the chip (that is, between main memory and the processing cores) from keeping pace with improvements in overall compute performance. To use these processors to their fullest capability, it is essential to carefully consider memory access. This is where the visualization pipeline fails. Each filtering component in the visualization library is expected to take a data set in its entirety, perform some computation across all of the elements, and output the complete results. The process of iterating over all elements must be repeated in each filter, which is one of the worst possible ways to traverse memory when trying to maximize the number of executions per memory access. This project investigates a new type of visualization framework that exhibits a pervasive parallelism necessary to run on exascale machines. Our framework achieves this by defining algorithms in terms of functors, which are localized, stateless operations. Functors can be composited in much the same way as filters in the visualization pipeline. But, functors’ design allows them to be concurrently running on massive amounts of lightweight threads. Only with such fine-grained parallelism can we hope to fill the billions of threads we expect will be necessary for efficient computation on an exascale computer. This project concludes with a functional prototype containing pervasively parallel algorithms that perform demonstrably well on many-core processors. These algorithms are fundamental for performing data analysis and visualization at extreme scale.
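The contrast between filters and functors can be sketched in a few lines. The example below is a hedged illustration, not the project's framework: `run_filters`, `compose`, and `run_fused` are hypothetical names, and the code runs serially. It shows only the memory-traversal difference: a filter pipeline makes one full pass over the data (and one full intermediate result) per filter, while composed stateless functors touch each element once.

```python
from functools import reduce

def run_filters(data, filters):
    """Pipeline style: each filter consumes and produces a whole data set."""
    for f in filters:
        data = [f(v) for v in data]   # one full pass and one full copy per filter
    return data

def compose(*functors):
    """Chain stateless per-element operations into a single functor."""
    return lambda v: reduce(lambda acc, fn: fn(acc), functors, v)

def run_fused(data, functors):
    """Functor style: a single pass, all operations applied per element."""
    fused = compose(*functors)
    return [fused(v) for v in data]
```

For example, with `scale = lambda v: v * 2` and `shift = lambda v: v + 1`, both `run_filters([1, 2, 3], [scale, shift])` and `run_fused([1, 2, 3], [scale, shift])` give the same values. Because the fused functor keeps no state between elements, each element could be handed to its own lightweight thread, which is the property the abstract relies on.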
Temporal Organization of Sound Information in Auditory Memory.

PubMed

Song, Kun; Luo, Huan

2017-01-01

Memory is a constructive and organizational process. Instead of being stored with all the fine details, external information is reorganized and structured at certain spatiotemporal scales. It is well acknowledged that time plays a central role in audition by segmenting sound inputs into temporal chunks of appropriate length. However, it remains largely unknown whether critical temporal structures exist to mediate sound representation in auditory memory. To address the issue, here we designed an auditory memory transferring study, by combining a previously developed unsupervised white noise memory paradigm with a reversed sound manipulation method. Specifically, we systematically measured the memory transferring from a random white noise sound to its locally temporal reversed version on various temporal scales in seven experiments. We demonstrate a U-shape memory-transferring pattern with the minimum value around temporal scale of 200 ms. Furthermore, neither auditory perceptual similarity nor physical similarity as a function of the manipulating temporal scale can account for the memory-transferring results. Our results suggest that sounds are not stored with all the fine spectrotemporal details but are organized and structured at discrete temporal chunks in long-term auditory memory representation.
Quantification of the memory effect of steady-state currents from interaction-induced transport in quantum systems

NASA Astrophysics Data System (ADS)

Lai, Chen-Yen; Chien, Chih-Chun

2017-09-01

Dynamics of a system in general depends on its initial state and how the system is driven, but in many-body systems the memory is usually averaged out during evolution. Here, interacting quantum systems without external relaxations are shown to retain long-time memory effects in steady states. To identify memory effects, we first show quasi-steady-state currents form in finite, isolated Bose- and Fermi-Hubbard models driven by interaction imbalance and they become steady-state currents in the thermodynamic limit. By comparing the steady-state currents from different initial states or ramping rates of the imbalance, long-time memory effects can be quantified. While the memory effects of initial states are more ubiquitous, the memory effects of switching protocols are mostly visible in interaction-induced transport in lattices. Our simulations suggest that the systems enter a regime governed by a generalized Fick's law and memory effects lead to initial-state-dependent diffusion coefficients. We also identify conditions for enhancing memory effects and discuss possible experimental implications.
Individual Differences Influencing Immediate Effects of Internal and External Focus Instructions on Children's Motor Performance.

PubMed

van Abswoude, Femke; Nuijen, Nienke B; van der Kamp, John; Steenbergen, Bert

2018-06-01

A large pool of evidence supports the beneficial effect of an external focus of attention on motor skill performance in adults. In children, this effect has been studied less and results are inconclusive. Importantly, individual differences are often not taken into account. We investigated the role of working memory, conscious motor control, and task-specific focus preferences on performance with an internal and external focus of attention in children. Twenty-five children practiced a golf putting task in both an internal focus condition and external focus condition. Performance was defined as the average distance toward the hole in 3 blocks of 10 trials. Task-specific focus preference was determined by asking how much effort it took to apply the instruction in each condition. In addition, working memory capacity and conscious motor control were assessed. Children improved performance in both the internal focus condition and external focus condition (ηp² = .47), with no difference between conditions (ηp² = .01). Task-specific focus preference was the only factor moderately related to the difference between performance with an internal focus and performance with an external focus (r = .56), indicating better performance for the preferred instruction in Block 3. Children can benefit from instruction with both an internal and external focus of attention to improve short-term motor performance. Individual, task-specific focus preference influenced the effect of the instructions, with children performing better with their preferred focus. The results highlight that individual differences are a key factor in the effectiveness in children's motor performance. The precise mechanisms underpinning this effect warrant further research.
