The Effects of Block Size on the Performance of Coherent Caches in Shared-Memory Multiprocessors
1993-05-01
increase with the bandwidth and latency. For those applications with poor spatial locality, the best choice of cache line size is determined by the...observation was used in the design of two schemes: LimitLESS di- rectories and Tag caches. LimitLESS directories [15] were designed for the ALEWIFE...small packets may be used to avoid network congestion. The most important factor influencing the choice of cache line size for a multipro- cessor is the
Agulleiro, Jose-Ignacio; Fernandez, Jose-Jesus
2015-01-01
Cache blocking is a technique widely used in scientific computing to minimize the exchange of information with main memory by reusing the data kept in cache memory. In tomographic reconstruction on standard computers using vector instructions, cache blocking turns out to be central to optimize performance. To this end, sinograms of the tilt-series and slices of the volumes to be reconstructed have to be divided into small blocks that fit into the different levels of cache memory. The code is then reorganized so as to operate with a block as much as possible before proceeding with another one. This data article is related to the research article titled Tomo3D 2.0 – Exploitation of Advanced Vector eXtensions (AVX) for 3D reconstruction (Agulleiro and Fernandez, 2015) [1]. Here we present data of a thorough study of the performance of tomographic reconstruction by varying cache block sizes, which allows derivation of expressions for their automatic quasi-optimal tuning. PMID:26217710
Agulleiro, Jose-Ignacio; Fernandez, Jose-Jesus
2015-06-01
Cache blocking is a technique widely used in scientific computing to minimize the exchange of information with main memory by reusing the data kept in cache memory. In tomographic reconstruction on standard computers using vector instructions, cache blocking turns out to be central to optimize performance. To this end, sinograms of the tilt-series and slices of the volumes to be reconstructed have to be divided into small blocks that fit into the different levels of cache memory. The code is then reorganized so as to operate with a block as much as possible before proceeding with another one. This data article is related to the research article titled Tomo3D 2.0 - Exploitation of Advanced Vector eXtensions (AVX) for 3D reconstruction (Agulleiro and Fernandez, 2015) [1]. Here we present data of a thorough study of the performance of tomographic reconstruction by varying cache block sizes, which allows derivation of expressions for their automatic quasi-optimal tuning.
NASA Astrophysics Data System (ADS)
Tu, H.-Yu.; Tasneem, Sarah
Most of modern microprocessors employ on—chip cache memories to meet the memory bandwidth demand. These caches are now occupying a greater real es tate of chip area. Also, continuous down scaling of transistors increases the possi bility of defects in the cache area which already starts to occupies more than 50% of chip area. For this reason, various techniques have been proposed to tolerate defects in cache blocks. These techniques can be classified into three different cat egories, namely, cache line disabling, replacement with spare block, and decoder reconfiguration without spare blocks. This chapter examines each of those fault tol erant techniques with a fixed typical size and organization of L1 cache, through extended simulation using SPEC2000 benchmark on individual techniques. The de sign and characteristics of each technique are summarized with a view to evaluate the scheme. We then present our simulation results and comparative study of the three different methods.
EqualWrites: Reducing Intra-set Write Variations for Enhancing Lifetime of Non-volatile Caches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh; Vetter, Jeffrey S.
Driven by the trends of increasing core-count and bandwidth-wall problem, the size of last level caches (LLCs) has greatly increased and hence, the researchers have explored non-volatile memories (NVMs) which provide high density and consume low-leakage power. Since NVMs have low write-endurance and the existing cache management policies are write variation-unaware, effective wear-leveling techniques are required for achieving reasonable cache lifetimes using NVMs. We present EqualWrites, a technique for mitigating intra-set write variation. In this paper, our technique works by recording the number of writes on a block and changing the cache-block location of a hot data-item to redirect themore » future writes to a cold block to achieve wear-leveling. Simulation experiments have been performed using an x86-64 simulator and benchmarks from SPEC06 and HPC (high-performance computing) field. The results show that for single, dual and quad-core system configurations, EqualWrites improves cache lifetime by 6.31X, 8.74X and 10.54X, respectively. In addition, its implementation overhead is very small and it provides larger improvement in lifetime than three other intra-set wear-leveling techniques and a cache replacement policy.« less
EqualWrites: Reducing Intra-set Write Variations for Enhancing Lifetime of Non-volatile Caches
Mittal, Sparsh; Vetter, Jeffrey S.
2015-01-29
Driven by the trends of increasing core-count and bandwidth-wall problem, the size of last level caches (LLCs) has greatly increased and hence, the researchers have explored non-volatile memories (NVMs) which provide high density and consume low-leakage power. Since NVMs have low write-endurance and the existing cache management policies are write variation-unaware, effective wear-leveling techniques are required for achieving reasonable cache lifetimes using NVMs. We present EqualWrites, a technique for mitigating intra-set write variation. In this paper, our technique works by recording the number of writes on a block and changing the cache-block location of a hot data-item to redirect themore » future writes to a cold block to achieve wear-leveling. Simulation experiments have been performed using an x86-64 simulator and benchmarks from SPEC06 and HPC (high-performance computing) field. The results show that for single, dual and quad-core system configurations, EqualWrites improves cache lifetime by 6.31X, 8.74X and 10.54X, respectively. In addition, its implementation overhead is very small and it provides larger improvement in lifetime than three other intra-set wear-leveling techniques and a cache replacement policy.« less
An Effective Cache Algorithm for Heterogeneous Storage Systems
Li, Yong; Feng, Dan
2013-01-01
Modern storage environment is commonly composed of heterogeneous storage devices. However, traditional cache algorithms exhibit performance degradation in heterogeneous storage systems because they were not designed to work with the diverse performance characteristics. In this paper, we present a new cache algorithm called HCM for heterogeneous storage systems. The HCM algorithm partitions the cache among the disks and adopts an effective scheme to balance the work across the disks. Furthermore, it applies benefit-cost analysis to choose the best allocation of cache block to improve the performance. Conducting simulations with a variety of traces and a wide range of cache size, our experiments show that HCM significantly outperforms the existing state-of-the-art storage-aware cache algorithms. PMID:24453890
Binary mesh partitioning for cache-efficient visualization.
Tchiboukdjian, Marc; Danjean, Vincent; Raffin, Bruno
2010-01-01
One important bottleneck when visualizing large data sets is the data transfer between processor and memory. Cache-aware (CA) and cache-oblivious (CO) algorithms take into consideration the memory hierarchy to design cache efficient algorithms. CO approaches have the advantage to adapt to unknown and varying memory hierarchies. Recent CA and CO algorithms developed for 3D mesh layouts significantly improve performance of previous approaches, but they lack of theoretical performance guarantees. We present in this paper a {\\schmi O}(N\\log N) algorithm to compute a CO layout for unstructured but well shaped meshes. We prove that a coherent traversal of a N-size mesh in dimension d induces less than N/B+{\\schmi O}(N/M;{1/d}) cache-misses where B and M are the block size and the cache size, respectively. Experiments show that our layout computation is faster and significantly less memory consuming than the best known CO algorithm. Performance is comparable to this algorithm for classical visualization algorithm access patterns, or better when the BSP tree produced while computing the layout is used as an acceleration data structure adjusted to the layout. We also show that cache oblivious approaches lead to significant performance increases on recent GPU architectures.
An Analysis of Instruction-Cached SIMD Computer Architecture
1993-12-01
ASSEBLE SIMULATE SCHEDULE VERIFY :t og ... . .. ... V~JSRUCTONSFOR PECIIEDCOMPARE ASSEMBLEI SIMULATE Ift*U1II ~ ~ SCHEDULEIinw ;. & VERIFY...Cache to Place Blocks ................. 70 4.5.4 Step 4: Schedule Cache Blocks ............................. 70 4.5.5 Step 5: Store Cache Blocks...167 B.4 Scheduler .............................................. 167 B.4.1 Basic Block Definition
Temperature and leakage aware techniques to improve cache reliability
NASA Astrophysics Data System (ADS)
Akaaboune, Adil
Decreasing power consumption in small devices such as handhelds, cell phones and high-performance processors is now one of the most critical design concerns. On-chip cache memories dominate the chip area in microprocessors and thus arises the need for power efficient cache memories. Cache is the simplest cost effective method to attain high speed memory hierarchy and, its performance is extremely critical for high speed computers. Cache is used by the microprocessor for channeling the performance gap between processor and main memory (RAM) hence the memory bandwidth is frequently a bottleneck which can affect the peak throughput significantly. In the design of any cache system, the tradeoffs of area/cost, performance, power consumption, and thermal management must be taken into consideration. Previous work has mainly concentrated on performance and area/cost constraints. More recent works have focused on low power design especially for portable devices and media-processing systems, however fewer research has been done on the relationship between heat management, Leakage power and cost per die. Lately, the focus of power dissipation in the new generations of microprocessors has shifted from dynamic power to idle power, a previously underestimated form of power loss that causes battery charge to drain and shutdown too early due the waste of energy. The problem has been aggravated by the aggressive scaling of process; device level method used originally by designers to enhance performance, conserve dissipation and reduces the sizes of digital circuits that are increasingly condensed. This dissertation studies the impact of hotspots, in the cache memory, on leakage consumption and microprocessor reliability and durability. The work will first prove that by eliminating hotspots in the cache memory, leakage power will be reduced and therefore, the reliability will be improved. The second technique studied is data quality management that improves the quality of the data stored in the cache to reduce power consumption. The initial work done on this subject focuses on the type of data that increases leakage consumption and ways to manage without impacting the performance of the microprocessor. The second phase of the project focuses on managing the data storage in different blocks of the cache to smooth the leakage power as well as dynamic power consumption. The last technique is a voltage controlled cache to reduce the leakage consumption of the cache while in execution and even in idle state. Two blocks of the 4-way set associative cache go through a voltage regulator before getting to the voltage well, and the other two are directly connected to the voltage well. The idea behind this technique is to use the replacement algorithm information to increase or decrease voltage of the two blocks depending on the need of the information stored on them.
Efficacy of Code Optimization on Cache-based Processors
NASA Technical Reports Server (NTRS)
VanderWijngaart, Rob F.; Chancellor, Marisa K. (Technical Monitor)
1997-01-01
The current common wisdom in the U.S. is that the powerful, cost-effective supercomputers of tomorrow will be based on commodity (RISC) micro-processors with cache memories. Already, most distributed systems in the world use such hardware as building blocks. This shift away from vector supercomputers and towards cache-based systems has brought about a change in programming paradigm, even when ignoring issues of parallelism. Vector machines require inner-loop independence and regular, non-pathological memory strides (usually this means: non-power-of-two strides) to allow efficient vectorization of array operations. Cache-based systems require spatial and temporal locality of data, so that data once read from main memory and stored in high-speed cache memory is used optimally before being written back to main memory. This means that the most cache-friendly array operations are those that feature zero or unit stride, so that each unit of data read from main memory (a cache line) contains information for the next iteration in the loop. Moreover, loops ought to be 'fat', meaning that as many operations as possible are performed on cache data-provided instruction caches do not overflow and enough registers are available. If unit stride is not possible, for example because of some data dependency, then care must be taken to avoid pathological strides, just ads on vector computers. For cache-based systems the issues are more complex, due to the effects of associativity and of non-unit block (cache line) size. But there is more to the story. Most modern micro-processors are superscalar, which means that they can issue several (arithmetic) instructions per clock cycle, provided that there are enough independent instructions in the loop body. This is another argument for providing fat loop bodies. With these restrictions, it appears fairly straightforward to produce code that will run efficiently on any cache-based system. It can be argued that although some of the important computational algorithms employed at NASA Ames require different programming styles on vector machines and cache-based machines, respectively, neither architecture class appeared to be favored by particular algorithms in principle. Practice tells us that the situation is more complicated. This report presents observations and some analysis of performance tuning for cache-based systems. We point out several counterintuitive results that serve as a cautionary reminder that memory accesses are not the only factors that determine performance, and that within the class of cache-based systems, significant differences exist.
Memory hierarchy using row-based compression
Loh, Gabriel H.; O'Connor, James M.
2016-10-25
A system includes a first memory and a device coupleable to the first memory. The device includes a second memory to cache data from the first memory. The second memory includes a plurality of rows, each row including a corresponding set of compressed data blocks of non-uniform sizes and a corresponding set of tag blocks. Each tag block represents a corresponding compressed data block of the row. The device further includes decompression logic to decompress data blocks accessed from the second memory. The device further includes compression logic to compress data blocks to be stored in the second memory.
A Measurement and Simulation Based Methodology for Cache Performance Modeling and Tuning
NASA Technical Reports Server (NTRS)
Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)
1998-01-01
We present a cache performance modeling methodology that facilitates the tuning of uniprocessor cache performance for applications executing on shared memory multiprocessors by accurately predicting the effects of source code level modifications. Measurements on a single processor are initially used for identifying parts of code where cache utilization improvements may significantly impact the overall performance. Cache simulation based on trace-driven techniques can be carried out without gathering detailed address traces. Minimal runtime information for modeling cache performance of a selected code block includes: base virtual addresses of arrays, virtual addresses of variables, and loop bounds for that code block. Rest of the information is obtained from the source code. We show that the cache performance predictions are as reliable as those obtained through trace-driven simulations. This technique is particularly helpful to the exploration of various "what-if' scenarios regarding the cache performance impact for alternative code structures. We explain and validate this methodology using a simple matrix-matrix multiplication program. We then apply this methodology to predict and tune the cache performance of two realistic scientific applications taken from the Computational Fluid Dynamics (CFD) domain.
High Performance Analytics with the R3-Cache
NASA Astrophysics Data System (ADS)
Eavis, Todd; Sayeed, Ruhan
Contemporary data warehouses now represent some of the world’s largest databases. As these systems grow in size and complexity, however, it becomes increasingly difficult for brute force query processing approaches to meet the performance demands of end users. Certainly, improved indexing and more selective view materialization are helpful in this regard. Nevertheless, with warehouses moving into the multi-terabyte range, it is clear that the minimization of external memory accesses must be a primary performance objective. In this paper, we describe the R 3-cache, a natively multi-dimensional caching framework designed specifically to support sophisticated warehouse/OLAP environments. R 3-cache is based upon an in-memory version of the R-tree that has been extended to support buffer pages rather than disk blocks. A key strength of the R 3-cache is that it is able to utilize multi-dimensional fragments of previous query results so as to significantly minimize the frequency and scale of disk accesses. Moreover, the new caching model directly accommodates the standard relational storage model and provides mechanisms for pro-active updates that exploit the existence of query “hot spots”. The current prototype has been evaluated as a component of the Sidera DBMS, a “shared nothing” parallel OLAP server designed for multi-terabyte analytics. Experimental results demonstrate significant performance improvements relative to simpler alternatives.
NASA Astrophysics Data System (ADS)
Gardner, R. W.; Hanushevsky, A.; Vukotic, I.; Yang, W.
2017-10-01
As many LHC Tier-3 and some Tier-2 centers look toward streamlining operations, they are considering autonomously managed storage elements as part of the solution. These storage elements are essentially file caching servers. They can operate as whole file or data block level caches. Several implementations exist. In this paper we explore using XRootD caching servers that can operate in either mode. They can also operate autonomously (i.e. demand driven), be centrally managed (i.e. a Rucio managed cache), or operate in both modes. We explore the pros and cons of various configurations as well as practical requirements for caching to be effective. While we focus on XRootD caches, the analysis should apply to other kinds of caches as well.
An area model for on-chip memories and its application
NASA Technical Reports Server (NTRS)
Mulder, Johannes M.; Quach, Nhon T.; Flynn, Michael J.
1991-01-01
An area model suitable for comparing data buffers of different organizations and arbitrary sizes is described. The area model considers the supplied bandwidth of a memory cell and includes such buffer overhead as control logic, driver logic, and tag storage. The model gave less than 10 percent error when verified against real caches and register files. It is shown that, comparing caches and register files in terms of area for the same storage capacity, caches generally occupy more area per bit than register files for small caches because the overhead dominates the cache area at these sizes. For larger caches, the smaller storage cells in the cache provide a smaller total cache area per bit than the register set. Studying cache performance (traffic ratio) as a function of area, it is shown that, for small caches, direct-mapped caches perform significantly better than four-way set-associative caches and, for caches of medium areas, both direct-mapped and set-associative caches perform better than fully associative caches.
ASIC-based architecture for the real-time computation of 2D convolution with large kernel size
NASA Astrophysics Data System (ADS)
Shao, Rui; Zhong, Sheng; Yan, Luxin
2015-12-01
Bidimensional convolution is a low-level processing algorithm of interest in many areas, but its high computational cost constrains the size of the kernels, especially in real-time embedded systems. This paper presents a hardware architecture for the ASIC-based implementation of 2-D convolution with medium-large kernels. Aiming to improve the efficiency of storage resources on-chip, reducing off-chip bandwidth of these two issues, proposed construction of a data cache reuse. Multi-block SPRAM to cross cached images and the on-chip ping-pong operation takes full advantage of the data convolution calculation reuse, design a new ASIC data scheduling scheme and overall architecture. Experimental results show that the structure can achieve 40× 32 size of template real-time convolution operations, and improve the utilization of on-chip memory bandwidth and on-chip memory resources, the experimental results show that the structure satisfies the conditions to maximize data throughput output , reducing the need for off-chip memory bandwidth.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nicolae, Bogdan; Riteau, Pierre; Keahey, Kate
Storage elasticity on IaaS clouds is a crucial feature in the age of data-intensive computing, especially when considering fluctuations of I/O throughput. This paper provides a transparent solution that automatically boosts I/O bandwidth during peaks for underlying virtual disks, effectively avoiding over-provisioning without performance loss. The authors' proposal relies on the idea of leveraging short-lived virtual disks of better performance characteristics (and thus more expensive) to act during peaks as a caching layer for the persistent virtual disks where the application data is stored. Furthermore, they introduce a performance and cost prediction methodology that can be used both independently tomore » estimate in advance what trade-off between performance and cost is possible, as well as an optimization technique that enables better cache size selection to meet the desired performance level with minimal cost. The authors demonstrate the benefits of their proposal both for microbenchmarks and for two real-life applications using large-scale experiments.« less
Methods for compressible fluid simulation on GPUs using high-order finite differences
NASA Astrophysics Data System (ADS)
Pekkilä, Johannes; Väisälä, Miikka S.; Käpylä, Maarit J.; Käpylä, Petri J.; Anjum, Omer
2017-08-01
We focus on implementing and optimizing a sixth-order finite-difference solver for simulating compressible fluids on a GPU using third-order Runge-Kutta integration. Since graphics processing units perform well in data-parallel tasks, this makes them an attractive platform for fluid simulation. However, high-order stencil computation is memory-intensive with respect to both main memory and the caches of the GPU. We present two approaches for simulating compressible fluids using 55-point and 19-point stencils. We seek to reduce the requirements for memory bandwidth and cache size in our methods by using cache blocking and decomposing a latency-bound kernel into several bandwidth-bound kernels. Our fastest implementation is bandwidth-bound and integrates 343 million grid points per second on a Tesla K40t GPU, achieving a 3 . 6 × speedup over a comparable hydrodynamics solver benchmarked on two Intel Xeon E5-2690v3 processors. Our alternative GPU implementation is latency-bound and achieves the rate of 168 million updates per second.
The effect of code expanding optimizations on instruction cache design
NASA Technical Reports Server (NTRS)
Chen, William Y.; Chang, Pohua P.; Conte, Thomas M.; Hwu, Wen-Mei W.
1991-01-01
It is shown that code expanding optimizations have strong and non-intuitive implications on instruction cache design. Three types of code expanding optimizations are studied: instruction placement, function inline expansion, and superscalar optimizations. Overall, instruction placement reduces the miss ratio of small caches. Function inline expansion improves the performance for small cache sizes, but degrades the performance of medium caches. Superscalar optimizations increases the cache size required for a given miss ratio. On the other hand, they also increase the sequentiality of instruction access so that a simple load-forward scheme effectively cancels the negative effects. Overall, it is shown that with load forwarding, the three types of code expanding optimizations jointly improve the performance of small caches and have little effect on large caches.
Efficient implementation of parallel three-dimensional FFT on clusters of PCs
NASA Astrophysics Data System (ADS)
Takahashi, Daisuke
2003-05-01
In this paper, we propose a high-performance parallel three-dimensional fast Fourier transform (FFT) algorithm on clusters of PCs. The three-dimensional FFT algorithm can be altered into a block three-dimensional FFT algorithm to reduce the number of cache misses. We show that the block three-dimensional FFT algorithm improves performance by utilizing the cache memory effectively. We use the block three-dimensional FFT algorithm to implement the parallel three-dimensional FFT algorithm. We succeeded in obtaining performance of over 1.3 GFLOPS on an 8-node dual Pentium III 1 GHz PC SMP cluster.
A set-associative, fault-tolerant cache design
NASA Technical Reports Server (NTRS)
Lamet, Dan; Frenzel, James F.
1992-01-01
The design of a defect-tolerant control circuit for a set-associative cache memory is presented. The circuit maintains the stack ordering necessary for implementing the Least Recently Used (LRU) replacement algorithm. A discussion of programming techniques for bypassing defective blocks is included.
dCache on Steroids - Delegated Storage Solutions
Mkrtchyan, Tigran; Adeyemi, F.; Ashish, A.; ...
2017-11-23
For over a decade, dCache.org has delivered a robust software used at more than 80 Universities and research institutes around the world, allowing these sites to provide reliable storage services for the WLCG experiments as well as many other scientific communities. The flexible architecture of dCache allows running it in a wide variety of configurations and platforms - from a SoC based all-in-one Raspberry-Pi up to hundreds of nodes in a multipetabyte installation. Due to lack of managed storage at the time, dCache implemented data placement, replication and data integrity directly. Today, many alternatives are available: S3, GlusterFS, CEPH andmore » others. While such solutions position themselves as scalable storage systems, they cannot be used by many scientific communities out of the box. The absence of community-accepted authentication and authorization mechanisms, the use of product specific protocols and the lack of namespace are some of the reasons that prevent wide-scale adoption of these alternatives. Most of these limitations are already solved by dCache. By delegating low-level storage management functionality to the above-mentioned new systems and providing the missing layer through dCache, we provide a solution which combines the benefits of both worlds - industry standard storage building blocks with the access protocols and authentication required by scientific communities. In this paper, we focus on CEPH, a popular software for clustered storage that supports file, block and object interfaces. CEPH is often used in modern computing centers, for example as a backend to OpenStack services. We will show prototypes of dCache running with a CEPH backend and discuss the benefits and limitations of such an approach. As a result, we will also outline the roadmap for supporting ‘delegated storage’ within the dCache releases.« less
dCache on Steroids - Delegated Storage Solutions
NASA Astrophysics Data System (ADS)
Mkrtchyan, T.; Adeyemi, F.; Ashish, A.; Behrmann, G.; Fuhrmann, P.; Litvintsev, D.; Millar, P.; Rossi, A.; Sahakyan, M.; Starek, J.
2017-10-01
For over a decade, dCache.org has delivered a robust software used at more than 80 Universities and research institutes around the world, allowing these sites to provide reliable storage services for the WLCG experiments as well as many other scientific communities. The flexible architecture of dCache allows running it in a wide variety of configurations and platforms - from a SoC based all-in-one Raspberry-Pi up to hundreds of nodes in a multipetabyte installation. Due to lack of managed storage at the time, dCache implemented data placement, replication and data integrity directly. Today, many alternatives are available: S3, GlusterFS, CEPH and others. While such solutions position themselves as scalable storage systems, they cannot be used by many scientific communities out of the box. The absence of community-accepted authentication and authorization mechanisms, the use of product specific protocols and the lack of namespace are some of the reasons that prevent wide-scale adoption of these alternatives. Most of these limitations are already solved by dCache. By delegating low-level storage management functionality to the above-mentioned new systems and providing the missing layer through dCache, we provide a solution which combines the benefits of both worlds - industry standard storage building blocks with the access protocols and authentication required by scientific communities. In this paper, we focus on CEPH, a popular software for clustered storage that supports file, block and object interfaces. CEPH is often used in modern computing centers, for example as a backend to OpenStack services. We will show prototypes of dCache running with a CEPH backend and discuss the benefits and limitations of such an approach. We will also outline the roadmap for supporting ‘delegated storage’ within the dCache releases.
dCache on Steroids - Delegated Storage Solutions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mkrtchyan, Tigran; Adeyemi, F.; Ashish, A.
For over a decade, dCache.org has delivered a robust software used at more than 80 Universities and research institutes around the world, allowing these sites to provide reliable storage services for the WLCG experiments as well as many other scientific communities. The flexible architecture of dCache allows running it in a wide variety of configurations and platforms - from a SoC based all-in-one Raspberry-Pi up to hundreds of nodes in a multipetabyte installation. Due to lack of managed storage at the time, dCache implemented data placement, replication and data integrity directly. Today, many alternatives are available: S3, GlusterFS, CEPH andmore » others. While such solutions position themselves as scalable storage systems, they cannot be used by many scientific communities out of the box. The absence of community-accepted authentication and authorization mechanisms, the use of product specific protocols and the lack of namespace are some of the reasons that prevent wide-scale adoption of these alternatives. Most of these limitations are already solved by dCache. By delegating low-level storage management functionality to the above-mentioned new systems and providing the missing layer through dCache, we provide a solution which combines the benefits of both worlds - industry standard storage building blocks with the access protocols and authentication required by scientific communities. In this paper, we focus on CEPH, a popular software for clustered storage that supports file, block and object interfaces. CEPH is often used in modern computing centers, for example as a backend to OpenStack services. We will show prototypes of dCache running with a CEPH backend and discuss the benefits and limitations of such an approach. As a result, we will also outline the roadmap for supporting ‘delegated storage’ within the dCache releases.« less
Cache Scheme Based on Pre-Fetch Operation in ICN
Duan, Jie; Wang, Xiong; Xu, Shizhong; Liu, Yuanni; Xu, Chuan; Zhao, Guofeng
2016-01-01
Many recent researches focus on ICN (Information-Centric Network), in which named content becomes the first citizen instead of end-host. In ICN, Named content can be further divided into many small sized chunks, and chunk-based communication has merits over content-based communication. The universal in-network cache is one of the fundamental infrastructures for ICN. In this work, a chunk-level cache mechanism based on pre-fetch operation is proposed. The main idea is that, routers with cache store should pre-fetch and cache the next chunks which may be accessed in the near future according to received requests and cache policy for reducing the users’ perceived latency. Two pre-fetch driven modes are present to answer when and how to pre-fetch. The LRU (Least Recently Used) is employed for the cache replacement. Simulation results show that the average user perceived latency and hops can be decreased by employed this cache mechanism based on pre-fetch operation. Furthermore, we also demonstrate that the results are influenced by many factors, such as the cache capacity, Zipf parameters and pre-fetch window size. PMID:27362478
EqualChance: Addressing Intra-set Write Variation to Increase Lifetime of Non-volatile Caches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh; Vetter, Jeffrey S
To address the limitations of SRAM such as high-leakage and low-density, researchers have explored use of non-volatile memory (NVM) devices, such as ReRAM (resistive RAM) and STT-RAM (spin transfer torque RAM) for designing on-chip caches. A crucial limitation of NVMs, however, is that their write endurance is low and the large intra-set write variation introduced by existing cache management policies may further exacerbate this problem, thereby reducing the cache lifetime significantly. We present EqualChance, a technique to increase cache lifetime by reducing intra-set write variation. EqualChance works by periodically changing the physical cache-block location of a write-intensive data item withinmore » a set to achieve wear-leveling. Simulations using workloads from SPEC CPU2006 suite and HPC (high-performance computing) field show that EqualChance improves the cache lifetime by 4.29X. Also, its implementation overhead is small, and it incurs very small performance and energy loss.« less
Effective Padding of Multi-Dimensional Arrays to Avoid Cache Conflict Misses
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hong, Changwan; Bao, Wenlei; Cohen, Albert
Caches are used to significantly improve performance. Even with high degrees of set-associativity, the number of accessed data elements mapping to the same set in a cache can easily exceed the degree of associativity, causing conflict misses and lowered performance, even if the working set is much smaller than cache capacity. Array padding (increasing the size of array dimensions) is a well known optimization technique that can reduce conflict misses. In this paper, we develop the first algorithms for optimal padding of arrays for a set associative cache for arbitrary tile sizes, In addition, we develop the first solution tomore » padding for nested tiles and multi-level caches. The techniques are in implemented in PAdvisor tool. Experimental results with multiple benchmarks demonstrate significant performance improvement from use of PAdvisor for padding.« less
Solutions and debugging for data consistency in multiprocessors with noncoherent caches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bernstein, D.; Mendelson, B.; Breternitz, M. Jr.
1995-02-01
We analyze two important problems that arise in shared-memory multiprocessor systems. The stale data problem involves ensuring that data items in local memory of individual processors are current, independent of writes done by other processors. False sharing occurs when two processors have copies of the same shared data block but update different portions of the block. The false sharing problem involves guaranteeing that subsequent writes are properly combined. In modern architectures these problems are usually solved in hardware, by exploiting mechanisms for hardware controlled cache consistency. This leads to more expensive and nonscalable designs. Therefore, we are concentrating on softwaremore » methods for ensuring cache consistency that would allow for affordable and scalable multiprocessing systems. Unfortunately, providing software control is nontrivial, both for the compiler writer and for the application programmer. For this reason we are developing a debugging environment that will facilitate the development of compiler-based techniques and will help the programmer to tune his or her application using explicit cache management mechanisms. We extend the notion of a race condition for IBM Shared Memory System POWER/4, taking into consideration its noncoherent caches, and propose techniques for detection of false sharing problems. Identification of the stale data problem is discussed as well, and solutions are suggested.« less
Is random access memory random?
NASA Technical Reports Server (NTRS)
Denning, P. J.
1986-01-01
Most software is contructed on the assumption that the programs and data are stored in random access memory (RAM). Physical limitations on the relative speeds of processor and memory elements lead to a variety of memory organizations that match processor addressing rate with memory service rate. These include interleaved and cached memory. A very high fraction of a processor's address requests can be satified from the cache without reference to the main memory. The cache requests information from main memory in blocks that can be transferred at the full memory speed. Programmers who organize algorithms for locality can realize the highest performance from these computers.
Behavior-aware cache hierarchy optimization for low-power multi-core embedded systems
NASA Astrophysics Data System (ADS)
Zhao, Huatao; Luo, Xiao; Zhu, Chen; Watanabe, Takahiro; Zhu, Tianbo
2017-07-01
In modern embedded systems, the increasing number of cores requires efficient cache hierarchies to ensure data throughput, but such cache hierarchies are restricted by their tumid size and interference accesses which leads to both performance degradation and wasted energy. In this paper, we firstly propose a behavior-aware cache hierarchy (BACH) which can optimally allocate the multi-level cache resources to many cores and highly improved the efficiency of cache hierarchy, resulting in low energy consumption. The BACH takes full advantage of the explored application behaviors and runtime cache resource demands as the cache allocation bases, so that we can optimally configure the cache hierarchy to meet the runtime demand. The BACH was implemented on the GEM5 simulator. The experimental results show that energy consumption of a three-level cache hierarchy can be saved from 5.29% up to 27.94% compared with other key approaches while the performance of the multi-core system even has a slight improvement counting in hardware overhead.
NASA Technical Reports Server (NTRS)
Gunawardena, J. A.
1992-01-01
This cache mechanism is transparent but does not contain associative circuits. It does not rely on locality of reference of instructions or data. No redundant instructions or data are encached. Items in the cache are accessed without address arithmetic. A cache miss is detected by the simplest test; compare two bits. These features would result in faster access, higher hit rate, reduced chip area, and less power dissipation in comparison with associative systems of similar size.
a Cache Design Method for Spatial Information Visualization in 3d Real-Time Rendering Engine
NASA Astrophysics Data System (ADS)
Dai, X.; Xiong, H.; Zheng, X.
2012-07-01
A well-designed cache system has positive impacts on the 3D real-time rendering engine. As the amount of visualization data getting larger, the effects become more obvious. They are the base of the 3D real-time rendering engine to smoothly browsing through the data, which is out of the core memory, or from the internet. In this article, a new kind of caches which are based on multi threads and large file are introduced. The memory cache consists of three parts, the rendering cache, the pre-rendering cache and the elimination cache. The rendering cache stores the data that is rendering in the engine; the data that is dispatched according to the position of the view point in the horizontal and vertical directions is stored in the pre-rendering cache; the data that is eliminated from the previous cache is stored in the eliminate cache and is going to write to the disk cache. Multi large files are used in the disk cache. When a disk cache file size reaches the limit length(128M is the top in the experiment), no item will be eliminated from the file, but a new large cache file will be created. If the large file number is greater than the maximum number that is pre-set, the earliest file will be deleted from the disk. In this way, only one file is opened for writing and reading, and the rest are read-only so the disk cache can be used in a high asynchronous way. The size of the large file is limited in order to map to the core memory to save loading time. Multi-thread is used to update the cache data. The threads are used to load data to the rendering cache as soon as possible for rendering, to load data to the pre-rendering cache for rendering next few frames, and to load data to the elimination cache which is not necessary for the moment. In our experiment, two threads are designed. The first thread is to organize the memory cache according to the view point, and created two threads: the adding list and the deleting list, the adding list index the data that should be loaded to the pre-rendering cache immediately, the deleting list index the data that is no longer visible in the rendering scene and should be moved to the eliminate cache; the other thread is to move the data in the memory and disk cache according to the adding and the deleting list, and create the download requests when the data is indexed in the adding but cannot be found either in memory cache or disk cache, eliminate cache data is moved to the disk cache when the adding list and deleting are empty. The cache designed as described above in our experiment shows reliable and efficient, and the data loading time and files I/O time decreased sharply, especially when the rendering data getting larger.
Don’t make cache too complex: A simple probability-based cache management scheme for SSDs
Cho, Sangyeun; Choi, Jongmoo
2017-01-01
Solid-state drives (SSDs) have recently become a common storage component in computer systems, and they are fueled by continued bit cost reductions achieved with smaller feature sizes and multiple-level cell technologies. However, as the flash memory stores more bits per cell, the performance and reliability of the flash memory degrade substantially. To solve this problem, a fast non-volatile memory (NVM-)based cache has been employed within SSDs to reduce the long latency required to write data. Absorbing small writes in a fast NVM cache can also reduce the number of flash memory erase operations. To maximize the benefits of an NVM cache, it is important to increase the NVM cache utilization. In this paper, we propose and study ProCache, a simple NVM cache management scheme, that makes cache-entrance decisions based on random probability testing. Our scheme is motivated by the observation that frequently written hot data will eventually enter the cache with a high probability, and that infrequently accessed cold data will not enter the cache easily. Owing to its simplicity, ProCache is easy to implement at a substantially smaller cost than similar previously studied techniques. We evaluate ProCache and conclude that it achieves comparable performance compared to a more complex reference counter-based cache-management scheme. PMID:28358897
Don't make cache too complex: A simple probability-based cache management scheme for SSDs.
Baek, Seungjae; Cho, Sangyeun; Choi, Jongmoo
2017-01-01
Solid-state drives (SSDs) have recently become a common storage component in computer systems, and they are fueled by continued bit cost reductions achieved with smaller feature sizes and multiple-level cell technologies. However, as the flash memory stores more bits per cell, the performance and reliability of the flash memory degrade substantially. To solve this problem, a fast non-volatile memory (NVM-)based cache has been employed within SSDs to reduce the long latency required to write data. Absorbing small writes in a fast NVM cache can also reduce the number of flash memory erase operations. To maximize the benefits of an NVM cache, it is important to increase the NVM cache utilization. In this paper, we propose and study ProCache, a simple NVM cache management scheme, that makes cache-entrance decisions based on random probability testing. Our scheme is motivated by the observation that frequently written hot data will eventually enter the cache with a high probability, and that infrequently accessed cold data will not enter the cache easily. Owing to its simplicity, ProCache is easy to implement at a substantially smaller cost than similar previously studied techniques. We evaluate ProCache and conclude that it achieves comparable performance compared to a more complex reference counter-based cache-management scheme.
Distributed Name Servers: Naming and Caching in Large Distributed Computing Environments
1985-12-01
transmission rate of the communication medium1, transmission over a 56K bps line costs approx- imately 54r, and similarly, communication over a 9.6K...memories for modem computer systems attempt to maximize the hit ratio for a fixed-size cache by utilizing intelligent cache replacement algorithms
A Survey of Architectural Techniques For Improving Cache Power Efficiency
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh
Modern processors are using increasingly larger sized on-chip caches. Also, with each CMOS technology generation, there has been a significant increase in their leakage energy consumption. For this reason, cache power management has become a crucial research issue in modern processor design. To address this challenge and also meet the goals of sustainable computing, researchers have proposed several techniques for improving energy efficiency of cache architectures. This paper surveys recent architectural techniques for improving cache power efficiency and also presents a classification of these techniques based on their characteristics. For providing an application perspective, this paper also reviews several real-worldmore » processor chips that employ cache energy saving techniques. The aim of this survey is to enable engineers and researchers to get insights into the techniques for improving cache power efficiency and motivate them to invent novel solutions for enabling low-power operation of caches.« less
Cache-Aware Asymptotically-Optimal Sampling-Based Motion Planning
Ichnowski, Jeffrey; Prins, Jan F.; Alterovitz, Ron
2014-01-01
We present CARRT* (Cache-Aware Rapidly Exploring Random Tree*), an asymptotically optimal sampling-based motion planner that significantly reduces motion planning computation time by effectively utilizing the cache memory hierarchy of modern central processing units (CPUs). CARRT* can account for the CPU’s cache size in a manner that keeps its working dataset in the cache. The motion planner progressively subdivides the robot’s configuration space into smaller regions as the number of configuration samples rises. By focusing configuration exploration in a region for periods of time, nearest neighbor searching is accelerated since the working dataset is small enough to fit in the cache. CARRT* also rewires the motion planning graph in a manner that complements the cache-aware subdivision strategy to more quickly refine the motion planning graph toward optimality. We demonstrate the performance benefit of our cache-aware motion planning approach for scenarios involving a point robot as well as the Rethink Robotics Baxter robot. PMID:25419474
Cache-Aware Asymptotically-Optimal Sampling-Based Motion Planning.
Ichnowski, Jeffrey; Prins, Jan F; Alterovitz, Ron
2014-05-01
We present CARRT* (Cache-Aware Rapidly Exploring Random Tree*), an asymptotically optimal sampling-based motion planner that significantly reduces motion planning computation time by effectively utilizing the cache memory hierarchy of modern central processing units (CPUs). CARRT* can account for the CPU's cache size in a manner that keeps its working dataset in the cache. The motion planner progressively subdivides the robot's configuration space into smaller regions as the number of configuration samples rises. By focusing configuration exploration in a region for periods of time, nearest neighbor searching is accelerated since the working dataset is small enough to fit in the cache. CARRT* also rewires the motion planning graph in a manner that complements the cache-aware subdivision strategy to more quickly refine the motion planning graph toward optimality. We demonstrate the performance benefit of our cache-aware motion planning approach for scenarios involving a point robot as well as the Rethink Robotics Baxter robot.
A performance study of the time-varying cache behavior: a study on APEX, Mantevo, NAS, and PARSEC
Siddique, Nafiul A.; Grubel, Patricia A.; Badawy, Abdel-Hameed A.; ...
2017-09-20
Cache has long been used to minimize the latency of main memory accesses by storing frequently used data near the processor. Processor performance depends on the underlying cache performance. Therefore, significant research has been done to identify the most crucial metrics of cache performance. Although the majority of research focuses on measuring cache hit rates and data movement as the primary cache performance metrics, cache utilization is significantly important. We investigate the application’s locality using cache utilization metrics. In addition, we present cache utilization and traditional cache performance metrics as the program progresses providing detailed insights into the dynamic applicationmore » behavior on parallel applications from four benchmark suites running on multiple cores. We explore cache utilization for APEX, Mantevo, NAS, and PARSEC, mostly scientific benchmark suites. Our results indicate that 40% of the data bytes in a cache line are accessed at least once before line eviction. Also, on average a byte is accessed two times before the cache line is evicted for these applications. Moreover, we present runtime cache utilization, as well as, conventional performance metrics that illustrate a holistic understanding of cache behavior. To facilitate this research, we build a memory simulator incorporated into the Structural Simulation Toolkit (Rodrigues et al. in SIGMETRICS Perform Eval Rev 38(4):37–42, 2011). Finally, our results suggest that variable cache line size can result in better performance and can also conserve power.« less
A performance study of the time-varying cache behavior: a study on APEX, Mantevo, NAS, and PARSEC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Siddique, Nafiul A.; Grubel, Patricia A.; Badawy, Abdel-Hameed A.
Cache has long been used to minimize the latency of main memory accesses by storing frequently used data near the processor. Processor performance depends on the underlying cache performance. Therefore, significant research has been done to identify the most crucial metrics of cache performance. Although the majority of research focuses on measuring cache hit rates and data movement as the primary cache performance metrics, cache utilization is significantly important. We investigate the application’s locality using cache utilization metrics. In addition, we present cache utilization and traditional cache performance metrics as the program progresses providing detailed insights into the dynamic applicationmore » behavior on parallel applications from four benchmark suites running on multiple cores. We explore cache utilization for APEX, Mantevo, NAS, and PARSEC, mostly scientific benchmark suites. Our results indicate that 40% of the data bytes in a cache line are accessed at least once before line eviction. Also, on average a byte is accessed two times before the cache line is evicted for these applications. Moreover, we present runtime cache utilization, as well as, conventional performance metrics that illustrate a holistic understanding of cache behavior. To facilitate this research, we build a memory simulator incorporated into the Structural Simulation Toolkit (Rodrigues et al. in SIGMETRICS Perform Eval Rev 38(4):37–42, 2011). Finally, our results suggest that variable cache line size can result in better performance and can also conserve power.« less
Communication-avoiding symmetric-indefinite factorization
Ballard, Grey Malone; Becker, Dulcenia; Demmel, James; ...
2014-11-13
We describe and analyze a novel symmetric triangular factorization algorithm. The algorithm is essentially a block version of Aasen's triangular tridiagonalization. It factors a dense symmetric matrix A as the product A=PLTL TP T where P is a permutation matrix, L is lower triangular, and T is block tridiagonal and banded. The algorithm is the first symmetric-indefinite communication-avoiding factorization: it performs an asymptotically optimal amount of communication in a two-level memory hierarchy for almost any cache-line size. Adaptations of the algorithm to parallel computers are likely to be communication efficient as well; one such adaptation has been recently published. Asmore » a result, the current paper describes the algorithm, proves that it is numerically stable, and proves that it is communication optimal.« less
Communication-avoiding symmetric-indefinite factorization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ballard, Grey Malone; Becker, Dulcenia; Demmel, James
We describe and analyze a novel symmetric triangular factorization algorithm. The algorithm is essentially a block version of Aasen's triangular tridiagonalization. It factors a dense symmetric matrix A as the product A=PLTL TP T where P is a permutation matrix, L is lower triangular, and T is block tridiagonal and banded. The algorithm is the first symmetric-indefinite communication-avoiding factorization: it performs an asymptotically optimal amount of communication in a two-level memory hierarchy for almost any cache-line size. Adaptations of the algorithm to parallel computers are likely to be communication efficient as well; one such adaptation has been recently published. Asmore » a result, the current paper describes the algorithm, proves that it is numerically stable, and proves that it is communication optimal.« less
Enabling MPEG-2 video playback in embedded systems through improved data cache efficiency
NASA Astrophysics Data System (ADS)
Soderquist, Peter; Leeser, Miriam E.
1999-01-01
Digital video decoding, enabled by the MPEG-2 Video standard, is an important future application for embedded systems, particularly PDAs and other information appliances. Many such system require portability and wireless communication capabilities, and thus face severe limitations in size and power consumption. This places a premium on integration and efficiency, and favors software solutions for video functionality over specialized hardware. The processors in most embedded system currently lack the computational power needed to perform video decoding, but a related and equally important problem is the required data bandwidth, and the need to cost-effectively insure adequate data supply. MPEG data sets are very large, and generate significant amounts of excess memory traffic for standard data caches, up to 100 times the amount required for decoding. Meanwhile, cost and power limitations restrict cache sizes in embedded systems. Some systems, including many media processors, eliminate caches in favor of memories under direct, painstaking software control in the manner of digital signal processors. Yet MPEG data has locality which caches can exploit if properly optimized, providing fast, flexible, and automatic data supply. We propose a set of enhancements which target the specific needs of the heterogeneous types within the MPEG decoder working set. These optimizations significantly improve the efficiency of small caches, reducing cache-memory traffic by almost 70 percent, and can make an enhanced 4 KB cache perform better than a standard 1 MB cache. This performance improvement can enable high-resolution, full frame rate video playback in cheaper, smaller system than woudl otherwise be possible.
DARPA Status Report - November 1988
1988-11-01
style used in the applic4#ons reference to that block was by processor j. where j It. We was influenced by it. MACH is a multiprocessor operating S call...it can be order they occurred. However. the exact time at which the treated specially in memory management , and so most of the reference wa, made is...on cache consistency performance, sophisti- peak can be explained as clinging references that occur when cated cache management schemes that take
Effects of simulated mountain lion caching on decomposition of ungulate carcasses
Bischoff-Mattson, Z.; Mattson, D.
2009-01-01
Caching of animal remains is common among carnivorous species of all sizes, yet the effects of caching on larger prey are unstudied. We conducted a summer field experiment designed to test the effects of simulated mountain lion (Puma concolor) caching on mass loss, relative temperature, and odor dissemination of 9 prey-like carcasses. We deployed all but one of the carcasses in pairs, with one of each pair exposed and the other shaded and shallowly buried (cached). Caching substantially reduced wastage during dry and hot (drought) but not wet and cool (monsoon) periods, and it also reduced temperature and discernable odor to some degree during both seasons. These results are consistent with the hypotheses that caching serves to both reduce competition from arthropods and microbes and reduce odds of detection by larger vertebrates such as bears (Ursus spp.), wolves (Canis lupus), or other lions.
Optoelectronic-cache memory system architecture.
Chiarulli, D M; Levitan, S P
1996-05-10
We present an investigation of the architecture of an optoelectronic cache that can integrate terabit optical memories with the electronic caches associated with high-performance uniprocessors and multiprocessors. The use of optoelectronic-cache memories enables these terabit technologies to provide transparently low-latency secondary memory with frame sizes comparable with disk pages but with latencies that approach those of electronic secondary-cache memories. This enables the implementation of terabit memories with effective access times comparable with the cycle times of current microprocessors. The cache design is based on the use of a smart-pixel array and combines parallel free-space optical input-output to-and-from optical memory with conventional electronic communication to the processor caches. This cache and the optical memory system to which it will interface provide a large random-access memory space that has a lower overall latency than that of magnetic disks and disk arrays. In addition, as a consequence of the high-bandwidth parallel input-output capabilities of optical memories, fault service times for the optoelectronic cache are substantially less than those currently achievable with any rotational media.
Prefetching in file systems for MIMD multiprocessors
NASA Technical Reports Server (NTRS)
Kotz, David F.; Ellis, Carla Schlatter
1990-01-01
The question of whether prefetching blocks on the file into the block cache can effectively reduce overall execution time of a parallel computation, even under favorable assumptions, is considered. Experiments have been conducted with an interleaved file system testbed on the Butterfly Plus multiprocessor. Results of these experiments suggest that (1) the hit ratio, the accepted measure in traditional caching studies, may not be an adequate measure of performance when the workload consists of parallel computations and parallel file access patterns, (2) caching with prefetching can significantly improve the hit ratio and the average time to perform an I/O (input/output) operation, and (3) an improvement in overall execution time has been observed in most cases. In spite of these gains, prefetching sometimes results in increased execution times (a negative result, given the optimistic nature of the study). The authors explore why it is not trivial to translate savings on individual I/O requests into consistently better overall performance and identify the key problems that need to be addressed in order to improve the potential of prefetching techniques in the environment.
NASA Astrophysics Data System (ADS)
Bauerdick, L. A. T.; Bloom, K.; Bockelman, B.; Bradley, D. C.; Dasu, S.; Dost, J. M.; Sfiligoi, I.; Tadel, A.; Tadel, M.; Wuerthwein, F.; Yagil, A.; Cms Collaboration
2014-06-01
Following the success of the XRootd-based US CMS data federation, the AAA project investigated extensions of the federation architecture by developing two sample implementations of an XRootd, disk-based, caching proxy. The first one simply starts fetching a whole file as soon as a file open request is received and is suitable when completely random file access is expected or it is already known that a whole file be read. The second implementation supports on-demand downloading of partial files. Extensions to the Hadoop Distributed File System have been developed to allow for an immediate fallback to network access when local HDFS storage fails to provide the requested block. Both cache implementations are in pre-production testing at UCSD.
Evaluating the effect of online data compression on the disk cache of a mass storage system
NASA Technical Reports Server (NTRS)
Pentakalos, Odysseas I.; Yesha, Yelena
1994-01-01
A trace driven simulation of the disk cache of a mass storage system was used to evaluate the effect of an online compression algorithm on various performance measures. Traces from the system at NASA's Center for Computational Sciences were used to run the simulation and disk cache hit ratios, number of files and bytes migrating to tertiary storage were measured. The measurements were performed for both an LRU and a size based migration algorithm. In addition to seeing the effect of online data compression on the disk cache performance measure, the simulation provided insight into the characteristics of the interactive references, suggesting that hint based prefetching algorithms are the only alternative for any future improvements to the disk cache hit ratio.
Pravosudov, V V; Lavenex, P; Clayton, N S
2002-05-01
Earlier reports suggested that seasonal variation in food-caching behavior (caching intensity and cache retrieval accuracy) might correlate with morphological changes in the hippocampal formation, a brain structure thought to play a role in remembering cache locations. We demonstrated that changes in cache retrieval accuracy can also be triggered by experimental variation in food supply: captive mountain chickadees (Poecile gambeli) maintained on limited and unpredictable food supply were more accurate at recovering their caches and performed better on spatial memory tests than birds maintained on ad libitum food. In this study, we investigated whether these two treatment groups also differed in the volume and neuron number of the hippocampal formation. If variation in memory for food caches correlates with hippocampal size, then our birds with enhanced cache recovery and spatial memory performance should have larger hippocampal volumes and total neuron numbers. Contrary to this prediction we found no significant differences in volume or total neuron number of the hippocampal formation between the two treatment groups. Our results therefore indicate that changes in food-caching behavior and spatial memory performance, as mediated by experimental variations in food supply, are not necessarily accompanied by morphological changes in volume or neuron number of the hippocampal formation in fully developed, experienced food-caching birds. Copyright 2002 Wiley Periodicals, Inc.
WATCHMAN: A Data Warehouse Intelligent Cache Manager
NASA Technical Reports Server (NTRS)
Scheuermann, Peter; Shim, Junho; Vingralek, Radek
1996-01-01
Data warehouses store large volumes of data which are used frequently by decision support applications. Such applications involve complex queries. Query performance in such an environment is critical because decision support applications often require interactive query response time. Because data warehouses are updated infrequently, it becomes possible to improve query performance by caching sets retrieved by queries in addition to query execution plans. In this paper we report on the design of an intelligent cache manager for sets retrieved by queries called WATCHMAN, which is particularly well suited for data warehousing environment. Our cache manager employs two novel, complementary algorithms for cache replacement and for cache admission. WATCHMAN aims at minimizing query response time and its cache replacement policy swaps out entire retrieved sets of queries instead of individual pages. The cache replacement and admission algorithms make use of a profit metric, which considers for each retrieved set its average rate of reference, its size, and execution cost of the associated query. We report on a performance evaluation based on the TPC-D and Set Query benchmarks. These experiments show that WATCHMAN achieves a substantial performance improvement in a decision support environment when compared to a traditional LRU replacement algorithm.
Novel dynamic caching for hierarchically distributed video-on-demand systems
NASA Astrophysics Data System (ADS)
Ogo, Kenta; Matsuda, Chikashi; Nishimura, Kazutoshi
1998-02-01
It is difficult to simultaneously serve the millions of video streams that will be needed in the age of 'Mega-Media' networks by using only one high-performance server. To distribute the service load, caching servers should be location near users. However, in previously proposed caching mechanisms, the grade of service depends on whether the data is already cached at a caching server. To make the caching servers transparent to the users, the ability to randomly access the large volume of data stored in the central server should be supported, and the operational functions of the provided service should not be narrowly restricted. We propose a mechanism for constructing a video-stream-caching server that is transparent to the users and that will always support all special playback functions for all available programs to all the contents with a latency of only 1 or 2 seconds. This mechanism uses Variable-sized-quantum-segment- caching technique derived from an analysis of the historical usage log data generated by a line-on-demand-type service experiment and based on the basic techniques used by a time- slot-based multiple-stream video-on-demand server.
Optimizing transformations of stencil operations for parallel cache-based architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bassetti, F.; Davis, K.
This paper describes a new technique for optimizing serial and parallel stencil- and stencil-like operations for cache-based architectures. This technique takes advantage of the semantic knowledge implicity in stencil-like computations. The technique is implemented as a source-to-source program transformation; because of its specificity it could not be expected of a conventional compiler. Empirical results demonstrate a uniform factor of two speedup. The experiments clearly show the benefits of this technique to be a consequence, as intended, of the reduction in cache misses. The test codes are based on a 5-point stencil obtained by the discretization of the Poisson equation andmore » applied to a two-dimensional uniform grid using the Jacobi method as an iterative solver. Results are presented for a 1-D tiling for a single processor, and in parallel using 1-D data partition. For the parallel case both blocking and non-blocking communication are tested. The same scheme of experiments has bee n performed for the 2-D tiling case. However, for the parallel case the 2-D partitioning is not discussed here, so the parallel case handled for 2-D is 2-D tiling with 1-D data partitioning.« less
Cache-enabled small cell networks: modeling and tradeoffs.
Baştuǧ, Ejder; Bennis, Mehdi; Kountouris, Marios; Debbah, Mérouane
We consider a network model where small base stations (SBSs) have caching capabilities as a means to alleviate the backhaul load and satisfy users' demand. The SBSs are stochastically distributed over the plane according to a Poisson point process (PPP) and serve their users either (i) by bringing the content from the Internet through a finite rate backhaul or (ii) by serving them from the local caches. We derive closed-form expressions for the outage probability and the average delivery rate as a function of the signal-to-interference-plus-noise ratio (SINR), SBS density, target file bitrate, storage size, file length, and file popularity. We then analyze the impact of key operating parameters on the system performance. It is shown that a certain outage probability can be achieved either by increasing the number of base stations or the total storage size. Our results and analysis provide key insights into the deployment of cache-enabled small cell networks (SCNs), which are seen as a promising solution for future heterogeneous cellular networks.
Value-Based Caching in Information-Centric Wireless Body Area Networks
Al-Turjman, Fadi M.; Imran, Muhammad; Vasilakos, Athanasios V.
2017-01-01
We propose a resilient cache replacement approach based on a Value of sensed Information (VoI) policy. To resolve and fetch content when the origin is not available due to isolated in-network nodes (fragmentation) and harsh operational conditions, we exploit a content caching approach. Our approach depends on four functional parameters in sensory Wireless Body Area Networks (WBANs). These four parameters are: age of data based on periodic request, popularity of on-demand requests, communication interference cost, and the duration for which the sensor node is required to operate in active mode to capture the sensed readings. These parameters are considered together to assign a value to the cached data to retain the most valuable information in the cache for prolonged time periods. The higher the value, the longer the duration for which the data will be retained in the cache. This caching strategy provides significant availability for most valuable and difficult to retrieve data in the WBANs. Extensive simulations are performed to compare the proposed scheme against other significant caching schemes in the literature while varying critical aspects in WBANs (e.g., data popularity, cache size, publisher load, connectivity-degree, and severe probabilities of node failures). These simulation results indicate that the proposed VoI-based approach is a valid tool for the retrieval of cached content in disruptive and challenging scenarios, such as the one experienced in WBANs, since it allows the retrieval of content for a long period even while experiencing severe in-network node failures. PMID:28106817
Wang, Bo; Ives, Anthony R
2017-03-01
Individual variation in seed size and seed production is high in many plant species. How does this variation affect seed-dispersing animals and, in turn, the fitness of individual plants? In this study, we first surveyed intraspecific variation in seed mass and production in a population of a Chinese white pine, Pinus armandii. For 134 target trees investigated in 2012, there was very high variation in seed size, with mean seed mass varying among trees almost tenfold, from 0.038 to 0.361 g. Furthermore, 30 of the 134 trees produced seeds 2 years later, and for these individuals there was a correlation in seed mass of 0.59 between years, implying consistent differences among individuals. For a subset of 67 trees, we monitored the foraging preferences of scatter-hoarding rodents on a total of 15,301 seeds: 8380 were ignored, 3184 were eaten in situ, 2651 were eaten after being cached, and 395 were successfully dispersed (cached and left intact). At the scale of individual seeds, seed mass affected almost every decision that rodents made to eat, remove, and cache individual seeds. At the level of individual trees, larger seeds had increased probabilities of both predation and successful dispersal: the effects of mean seed size on costs (predation) and benefits (caching) balanced out. Thus, despite seed size affecting rodent decisions, variation among trees in dispersal success associated with mean seed size was small once seeds were harvested. This might explain, at least in part, the maintenance of high variation in mean seed mass among tree individuals.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gala, Alan; Ohmacht, Martin
A multiprocessor system includes nodes. Each node includes a data path that includes a core, a TLB, and a first level cache implementing disambiguation. The system also includes at least one second level cache and a main memory. For thread memory access requests, the core uses an address associated with an instruction format of the core. The first level cache uses an address format related to the size of the main memory plus an offset corresponding to hardware thread meta data. The second level cache uses a physical main memory address plus software thread meta data to store the memorymore » access request. The second level cache accesses the main memory using the physical address with neither the offset nor the thread meta data after resolving speculation. In short, this system includes mapping of a virtual address to a different physical addresses for value disambiguation for different threads.« less
A Scalable proxy cache for Grid Data Access
NASA Astrophysics Data System (ADS)
Cristian Cirstea, Traian; Just Keijser, Jan; Koeroo, Oscar Arthur; Starink, Ronald; Templon, Jeffrey Alan
2012-12-01
We describe a prototype grid proxy cache system developed at Nikhef, motivated by a desire to construct the first building block of a future https-based Content Delivery Network for grid infrastructures. Two goals drove the project: firstly to provide a “native view” of the grid for desktop-type users, and secondly to improve performance for physics-analysis type use cases, where multiple passes are made over the same set of data (residing on the grid). We further constrained the design by requiring that the system should be made of standard components wherever possible. The prototype that emerged from this exercise is a horizontally-scalable, cooperating system of web server / cache nodes, fronted by a customized webDAV server. The webDAV server is custom only in the sense that it supports http redirects (providing horizontal scaling) and that the authentication module has, as back end, a proxy delegation chain that can be used by the cache nodes to retrieve files from the grid. The prototype was deployed at Nikhef and tested at a scale of several terabytes of data and approximately one hundred fast cores of computing. Both small and large files were tested, in a number of scenarios, and with various numbers of cache nodes, in order to understand the scaling properties of the system. For properly-dimensioned cache-node hardware, the system showed speedup of several integer factors for the analysis-type use cases. These results and others are presented and discussed.
Single-pass memory system evaluation for multiprogramming workloads
NASA Technical Reports Server (NTRS)
Conte, Thomas M.; Hwu, Wen-Mei W.
1990-01-01
Modern memory systems are composed of levels of cache memories, a virtual memory system, and a backing store. Varying more than a few design parameters and measuring the performance of such systems has traditionally be constrained by the high cost of simulation. Models of cache performance recently introduced reduce the cost simulation but at the expense of accuracy of performance prediction. Stack-based methods predict performance accurately using one pass over the trace for all cache sizes, but these techniques have been limited to fully-associative organizations. This paper presents a stack-based method of evaluating the performance of cache memories using a recurrence/conflict model for the miss ratio. Unlike previous work, the performance of realistic cache designs, such as direct-mapped caches, are predicted by the method. The method also includes a new approach to the problem of the effects of multiprogramming. This new technique separates the characteristics of the individual program from that of the workload. The recurrence/conflict method is shown to be practical, general, and powerful by comparing its performance to that of a popular traditional cache simulator. The authors expect that the availability of such a tool will have a large impact on future architectural studies of memory systems.
Vectorization, threading, and cache-blocking considerations for hydrocodes on emerging architectures
Fung, J.; Aulwes, R. T.; Bement, M. T.; ...
2015-07-14
This work reports on considerations for improving computational performance in preparation for current and expected changes to computer architecture. The algorithms studied will include increasingly complex prototypes for radiation hydrodynamics codes, such as gradient routines and diffusion matrix assembly (e.g., in [1-6]). The meshes considered for the algorithms are structured or unstructured meshes. The considerations applied for performance improvements are meant to be general in terms of architecture (not specifically graphical processing unit (GPUs) or multi-core machines, for example) and include techniques for vectorization, threading, tiling, and cache blocking. Out of a survey of optimization techniques on applications such asmore » diffusion and hydrodynamics, we make general recommendations with a view toward making these techniques conceptually accessible to the applications code developer. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.« less
Improving energy efficiency of Embedded DRAM Caches for High-end Computing Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh; Vetter, Jeffrey S; Li, Dong
2014-01-01
With increasing system core-count, the size of last level cache (LLC) has increased and since SRAM consumes high leakage power, power consumption of LLCs is becoming a significant fraction of processor power consumption. To address this, researchers have used embedded DRAM (eDRAM) LLCs which consume low-leakage power. However, eDRAM caches consume a significant amount of energy in the form of refresh energy. In this paper, we propose ESTEEM, an energy saving technique for embedded DRAM caches. ESTEEM uses dynamic cache reconfiguration to turn-off a portion of the cache to save both leakage and refresh energy. It logically divides the cachemore » sets into multiple modules and turns-off possibly different number of ways in each module. Microarchitectural simulations confirm that ESTEEM is effective in improving performance and energy efficiency and provides better results compared to a recently-proposed eDRAM cache energy saving technique, namely Refrint. For single and dual-core simulations, the average saving in memory subsystem (LLC+main memory) on using ESTEEM is 25.8% and 32.6%, respectively and average weighted speedup are 1.09X and 1.22X, respectively. Additional experiments confirm that ESTEEM works well for a wide-range of system parameters.« less
A Survey Of Architectural Approaches for Managing Embedded DRAM and Non-volatile On-chip Caches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh; Vetter, Jeffrey S; Li, Dong
Recent trends of CMOS scaling and increasing number of on-chip cores have led to a large increase in the size of on-chip caches. Since SRAM has low density and consumes large amount of leakage power, its use in designing on-chip caches has become more challenging. To address this issue, researchers are exploring the use of several emerging memory technologies, such as embedded DRAM, spin transfer torque RAM, resistive RAM, phase change RAM and domain wall memory. In this paper, we survey the architectural approaches proposed for designing memory systems and, specifically, caches with these emerging memory technologies. To highlight theirmore » similarities and differences, we present a classification of these technologies and architectural approaches based on their key characteristics. We also briefly summarize the challenges in using these technologies for architecting caches. We believe that this survey will help the readers gain insights into the emerging memory device technologies, and their potential use in designing future computing systems.« less
Study of cache performance in distributed environment for data processing
NASA Astrophysics Data System (ADS)
Makatun, Dzmitry; Lauret, Jérôme; Šumbera, Michal
2014-06-01
Processing data in distributed environment has found its application in many fields of science (Nuclear and Particle Physics (NPP), astronomy, biology to name only those). Efficiently transferring data between sites is an essential part of such processing. The implementation of caching strategies in data transfer software and tools, such as the Reasoner for Intelligent File Transfer (RIFT) being developed in the STAR collaboration, can significantly decrease network load and waiting time by reusing the knowledge of data provenance as well as data placed in transfer cache to further expand on the availability of sources for files and data-sets. Though, a great variety of caching algorithms is known, a study is needed to evaluate which one can deliver the best performance in data access considering the realistic demand patterns. Records of access to the complete data-sets of NPP experiments were analyzed and used as input for computer simulations. Series of simulations were done in order to estimate the possible cache hits and cache hits per byte for known caching algorithms. The simulations were done for cache of different sizes within interval 0.001 - 90% of complete data-set and low-watermark within 0-90%. Records of data access were taken from several experiments and within different time intervals in order to validate the results. In this paper, we will discuss the different data caching strategies from canonical algorithms to hybrid cache strategies, present the results of our simulations for the diverse algorithms, debate and identify the choice for the best algorithm in the context of Physics Data analysis in NPP. While the results of those studies have been implemented in RIFT, they can also be used when setting up cache in any other computational work-flow (Cloud processing for example) or managing data storages with partial replicas of the entire data-set.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shoopman, J. D.
This report documents Livermore Computing (LC) activities in support of ASC L2 milestone 5589: Modernization and Expansion of LLNL Archive Disk Cache, due March 31, 2016. The full text of the milestone is included in Attachment 1. The description of the milestone is: Description: Configuration of archival disk cache systems will be modernized to reduce fragmentation, and new, higher capacity disk subsystems will be deployed. This will enhance archival disk cache capability for ASC archive users, enabling files written to the archives to remain resident on disk for many (6–12) months, regardless of file size. The milestone was completed inmore » three phases. On August 26, 2015 subsystems with 6PB of disk cache were deployed for production use in LLNL’s unclassified HPSS environment. Following that, on September 23, 2015 subsystems with 9 PB of disk cache were deployed for production use in LLNL’s classified HPSS environment. On January 31, 2016, the milestone was fully satisfied when the legacy Data Direct Networks (DDN) archive disk cache subsystems were fully retired from production use in both LLNL’s unclassified and classified HPSS environments, and only the newly deployed systems were in use.« less
Can High Bandwidth and Latency Justify Large Cache Blocks in Scalable Multiprocessors?
1994-01-01
400 MB/second. 4 Dubnicki’s work used trace-driven simulation, with traces collected on an 8-processor machine. We would expect such small-scale...312 1 6 32 64 of odk Sb* Bad64.M Figure 17: Miss rate of Ind Blocked LU. Figure 18: MCPR of Ind Blocked LU. overall miss rate of TGauss is a factor of...easily. 17 (’his approach assunics that the model paramelers we collect from simulations with infinite band- width (such as the miss rate and the
Empirical study of parallel LRU simulation algorithms
NASA Technical Reports Server (NTRS)
Carr, Eric; Nicol, David M.
1994-01-01
This paper reports on the performance of five parallel algorithms for simulating a fully associative cache operating under the LRU (Least-Recently-Used) replacement policy. Three of the algorithms are SIMD, and are implemented on the MasPar MP-2 architecture. Two other algorithms are parallelizations of an efficient serial algorithm on the Intel Paragon. One SIMD algorithm is quite simple, but its cost is linear in the cache size. The two other SIMD algorithm are more complex, but have costs that are independent on the cache size. Both the second and third SIMD algorithms compute all stack distances; the second SIMD algorithm is completely general, whereas the third SIMD algorithm presumes and takes advantage of bounds on the range of reference tags. Both MIMD algorithm implemented on the Paragon are general and compute all stack distances; they differ in one step that may affect their respective scalability. We assess the strengths and weaknesses of these algorithms as a function of problem size and characteristics, and compare their performance on traces derived from execution of three SPEC benchmark programs.
Domain Wall Fermion Inverter on Pentium 4
NASA Astrophysics Data System (ADS)
Pochinsky, Andrew
2005-03-01
A highly optimized domain wall fermion inverter has been developed as part of the SciDAC lattice initiative. By designing the code to minimize memory bus traffic, it achieves high cache reuse and performance in excess of 2 GFlops for out of L2 cache problem sizes on a GigE cluster with 2.66 GHz Xeon processors. The code uses the SciDAC QMP communication library.
Tier 3 batch system data locality via managed caches
NASA Astrophysics Data System (ADS)
Fischer, Max; Giffels, Manuel; Jung, Christopher; Kühn, Eileen; Quast, Günter
2015-05-01
Modern data processing increasingly relies on data locality for performance and scalability, whereas the common HEP approaches aim for uniform resource pools with minimal locality, recently even across site boundaries. To combine advantages of both, the High- Performance Data Analysis (HPDA) Tier 3 concept opportunistically establishes data locality via coordinated caches. In accordance with HEP Tier 3 activities, the design incorporates two major assumptions: First, only a fraction of data is accessed regularly and thus the deciding factor for overall throughput. Second, data access may fallback to non-local, making permanent local data availability an inefficient resource usage strategy. Based on this, the HPDA design generically extends available storage hierarchies into the batch system. Using the batch system itself for scheduling file locality, an array of independent caches on the worker nodes is dynamically populated with high-profile data. Cache state information is exposed to the batch system both for managing caches and scheduling jobs. As a result, users directly work with a regular, adequately sized storage system. However, their automated batch processes are presented with local replications of data whenever possible.
Shrager, Jeff; Billman, Dorrit; Convertino, Gregorio; Massar, J P; Pirolli, Peter
2010-01-01
Science is a form of distributed analysis involving both individual work that produces new knowledge and collaborative work to exchange information with the larger community. There are many particular ways in which individual and community can interact in science, and it is difficult to assess how efficient these are, and what the best way might be to support them. This paper reports on a series of experiments in this area and a prototype implementation using a research platform called CACHE. CACHE both supports experimentation with different structures of interaction between individual and community cognition and serves as a prototype for computational support for those structures. We particularly focus on CACHE-BC, the Bayes community version of CACHE, within which the community can break up analytical tasks into "mind-sized" units and use provenance tracking to keep track of the relationship between these units. Copyright © 2009 Cognitive Science Society, Inc.
Efficient image data distribution and management with application to web caching architectures
NASA Astrophysics Data System (ADS)
Han, Keesook J.; Suter, Bruce W.
2003-03-01
We present compact image data structures and associated packet delivery techniques for effective Web caching architectures. Presently, images on a web page are inefficiently stored, using a single image per file. Our approach is to use clustering to merge similar images into a single file in order to exploit the redundancy between images. Our studies indicate that a 30-50% image data size reduction can be achieved by eliminating the redundancies of color indexes. Attached to this file is new metadata to permit an easy extraction of images. This approach will permit a more efficient use of the cache, since a shorter list of cache references will be required. Packet and transmission delays can be reduced by 50% eliminating redundant TCP/IP headers and connection time. Thus, this innovative paradigm for the elimination of redundancy may provide valuable benefits for optimizing packet delivery in IP networks by reducing latency and minimizing the bandwidth requirements.
Interference Lattice-based Loop Nest Tilings for Stencil Computations
NASA Technical Reports Server (NTRS)
VanderWijngaart, Rob F.; Frumkin, Michael
2000-01-01
A common method for improving performance of stencil operations on structured multi-dimensional discretization grids is loop tiling. Tile shapes and sizes are usually determined heuristically, based on the size of the primary data cache. We provide a lower bound on the numbers of cache misses that must be incurred by any tiling, and a close achievable bound using a particular tiling based on the grid interference lattice. The latter tiling is used to derive highly efficient loop orderings. The total number of cache misses of a code is the sum of (necessary) cold misses and misses caused by elements being dropped from the cache between successive loads (replacement misses). Maximizing temporal locality is equivalent to minimizing replacement misses. Temporal locality of loop nests implementing stencil operations is optimized by tilings that avoid data conflicts. We divide the loop nest iteration space into conflict-free tiles, derived from the cache miss equation. The tiling involves the definition of the grid interference lattice an equivalence class of grid points whose images in main memory map to the same location in the cache-and the construction of a special basis for the lattice. Conflicts only occur on the boundaries of the tiles, unless the tiles are too thin. We show that the surface area of the tiles is bounded for grids of any dimensionality, and for caches of any associativity, provided the eccentricity of the fundamental parallelepiped (the tile spanned by the basis) of the lattice is bounded. Eccentricity is determined by two factors, aspect ratio and skewness. The aspect ratio of the parallelepiped can be bounded by appropriate array padding. The skewness can be bounded by the choice of a proper basis. Combining these two strategies ensures that pathologically thin tiles are avoided. They do not, however, minimize replacement misses per se. The reason is that tile visitation order influences the number of data conflicts on the tile boundaries. If two adjacent tiles are visited successively, there will be no replacement misses on the shared boundary. The iteration space may be covered with pencils larger than the size of the cache while avoiding data conflicts if the pencils are traversed by a scanning-face method. Replacement misses are incurred only on the boundaries of the pencils, and the number of misses is minimized by maximizing the volume of the scanning face, not the volume of the tile. We present an algorithm for constructing the most efficient scanning face for a given grid and stencil operator. In two dimensions it is based on a continued fraction algorithm. In three dimensions it follows Voronoi's successive minima algorithm. We show experimental results of using the scanning face, and compare with canonical loop orderings.
Cache-Oblivious parallel SIMD Viterbi decoding for sequence search in HMMER.
Ferreira, Miguel; Roma, Nuno; Russo, Luis M S
2014-05-30
HMMER is a commonly used bioinformatics tool based on Hidden Markov Models (HMMs) to analyze and process biological sequences. One of its main homology engines is based on the Viterbi decoding algorithm, which was already highly parallelized and optimized using Farrar's striped processing pattern with Intel SSE2 instruction set extension. A new SIMD vectorization of the Viterbi decoding algorithm is proposed, based on an SSE2 inter-task parallelization approach similar to the DNA alignment algorithm proposed by Rognes. Besides this alternative vectorization scheme, the proposed implementation also introduces a new partitioning of the Markov model that allows a significantly more efficient exploitation of the cache locality. Such optimization, together with an improved loading of the emission scores, allows the achievement of a constant processing throughput, regardless of the innermost-cache size and of the dimension of the considered model. The proposed optimized vectorization of the Viterbi decoding algorithm was extensively evaluated and compared with the HMMER3 decoder to process DNA and protein datasets, proving to be a rather competitive alternative implementation. Being always faster than the already highly optimized ViterbiFilter implementation of HMMER3, the proposed Cache-Oblivious Parallel SIMD Viterbi (COPS) implementation provides a constant throughput and offers a processing speedup as high as two times faster, depending on the model's size.
Interactive high-resolution isosurface ray casting on multicore processors.
Wang, Qin; JaJa, Joseph
2008-01-01
We present a new method for the interactive rendering of isosurfaces using ray casting on multi-core processors. This method consists of a combination of an object-order traversal that coarsely identifies possible candidate 3D data blocks for each small set of contiguous pixels, and an isosurface ray casting strategy tailored for the resulting limited-size lists of candidate 3D data blocks. While static screen partitioning is widely used in the literature, our scheme performs dynamic allocation of groups of ray casting tasks to ensure almost equal loads among the different threads running on multi-cores while maintaining spatial locality. We also make careful use of memory management environment commonly present in multi-core processors. We test our system on a two-processor Clovertown platform, each consisting of a Quad-Core 1.86-GHz Intel Xeon Processor, for a number of widely different benchmarks. The detailed experimental results show that our system is efficient and scalable, and achieves high cache performance and excellent load balancing, resulting in an overall performance that is superior to any of the previous algorithms. In fact, we achieve an interactive isosurface rendering on a 1024(2) screen for all the datasets tested up to the maximum size of the main memory of our platform.
High-bandwidth prefetcher for high-bandwidth memory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mehta, Sanyam; Kohn, James Robert; Ernst, Daniel Jonathan
A method for prefetching data into a cache is provided. The method allocates an outstanding request buffer ("ORB"). The method stores in an address field of the ORB an address and a number of blocks. The method issues prefetch requests for a degree number of blocks starting at the address. When a prefetch response is received for all the prefetch requests, the method adjusts the address of the next block to prefetch and adjusts the number of blocks remaining to be retrieved and then issues prefetch requests for a degree number of blocks starting at the adjusted address. The prefetchingmore » pauses when a maximum distance between the reads of the prefetched blocks and the last prefetched block is reached. When a read request for a prefetched block is received, the method resumes prefetching when a resume criterion is satisfied.« less
A Case for Tamper-Resistant and Tamper-Evident Computer Systems
2007-02-01
such as Kerberos is hard to apply [2] B . Gassend, G. Sub, D. Clarke, M. Dijk, and S. Devadas . Caches and Hash Trees for Efficient Memory Integrity...the block’s data from DRAM. For authentication, Merkle [14] G. Suh, D. Clarke, B . Gassend, M. van Dijk, and S. Devadas . Efficient Memory Integrity...wwi4serverwatch.com/news/article.php/ tion where a data block is encrypted or decrypted through an XOR 1399451, 2000. [11] B . Rogers, Y. Solihin
Cache-Oblivious parallel SIMD Viterbi decoding for sequence search in HMMER
2014-01-01
Background HMMER is a commonly used bioinformatics tool based on Hidden Markov Models (HMMs) to analyze and process biological sequences. One of its main homology engines is based on the Viterbi decoding algorithm, which was already highly parallelized and optimized using Farrar’s striped processing pattern with Intel SSE2 instruction set extension. Results A new SIMD vectorization of the Viterbi decoding algorithm is proposed, based on an SSE2 inter-task parallelization approach similar to the DNA alignment algorithm proposed by Rognes. Besides this alternative vectorization scheme, the proposed implementation also introduces a new partitioning of the Markov model that allows a significantly more efficient exploitation of the cache locality. Such optimization, together with an improved loading of the emission scores, allows the achievement of a constant processing throughput, regardless of the innermost-cache size and of the dimension of the considered model. Conclusions The proposed optimized vectorization of the Viterbi decoding algorithm was extensively evaluated and compared with the HMMER3 decoder to process DNA and protein datasets, proving to be a rather competitive alternative implementation. Being always faster than the already highly optimized ViterbiFilter implementation of HMMER3, the proposed Cache-Oblivious Parallel SIMD Viterbi (COPS) implementation provides a constant throughput and offers a processing speedup as high as two times faster, depending on the model’s size. PMID:24884826
Protecting privacy in a clinical data warehouse.
Kong, Guilan; Xiao, Zhichun
2015-06-01
Peking University has several prestigious teaching hospitals in China. To make secondary use of massive medical data for research purposes, construction of a clinical data warehouse is imperative in Peking University. However, a big concern for clinical data warehouse construction is how to protect patient privacy. In this project, we propose to use a combination of symmetric block ciphers, asymmetric ciphers, and cryptographic hashing algorithms to protect patient privacy information. The novelty of our privacy protection approach lies in message-level data encryption, the key caching system, and the cryptographic key management system. The proposed privacy protection approach is scalable to clinical data warehouse construction with any size of medical data. With the composite privacy protection approach, the clinical data warehouse can be secure enough to keep the confidential data from leaking to the outside world. © The Author(s) 2014.
Dynamic Allocation of SPM Based on Time-Slotted Cache Conflict Graph for System Optimization
NASA Astrophysics Data System (ADS)
Wu, Jianping; Ling, Ming; Zhang, Yang; Mei, Chen; Wang, Huan
This paper proposes a novel dynamic Scratch-pad Memory allocation strategy to optimize the energy consumption of the memory sub-system. Firstly, the whole program execution process is sliced into several time slots according to the temporal dimension; thereafter, a Time-Slotted Cache Conflict Graph (TSCCG) is introduced to model the behavior of Data Cache (D-Cache) conflicts within each time slot. Then, Integer Nonlinear Programming (INP) is implemented, which can avoid time-consuming linearization process, to select the most profitable data pages. Virtual Memory System (VMS) is adopted to remap those data pages, which will cause severe Cache conflicts within a time slot, to SPM. In order to minimize the swapping overhead of dynamic SPM allocation, a novel SPM controller with a tightly coupled DMA is introduced to issue the swapping operations without CPU's intervention. Last but not the least, this paper discusses the fluctuation of system energy profit based on different MMU page size as well as the Time Slot duration quantitatively. According to our design space exploration, the proposed method can optimize all of the data segments, including global data, heap and stack data in general, and reduce the total energy consumption by 27.28% on average, up to 55.22% with a marginal performance promotion. And comparing to the conventional static CCG (Cache Conflicts Graph), our approach can obtain 24.7% energy profit on average, up to 30.5% with a sight boost in performance.
Distributed shared memory for roaming large volumes.
Castanié, Laurent; Mion, Christophe; Cavin, Xavier; Lévy, Bruno
2006-01-01
We present a cluster-based volume rendering system for roaming very large volumes. This system allows to move a gigabyte-sized probe inside a total volume of several tens or hundreds of gigabytes in real-time. While the size of the probe is limited by the total amount of texture memory on the cluster, the size of the total data set has no theoretical limit. The cluster is used as a distributed graphics processing unit that both aggregates graphics power and graphics memory. A hardware-accelerated volume renderer runs in parallel on the cluster nodes and the final image compositing is implemented using a pipelined sort-last rendering algorithm. Meanwhile, volume bricking and volume paging allow efficient data caching. On each rendering node, a distributed hierarchical cache system implements a global software-based distributed shared memory on the cluster. In case of a cache miss, this system first checks page residency on the other cluster nodes instead of directly accessing local disks. Using two Gigabit Ethernet network interfaces per node, we accelerate data fetching by a factor of 4 compared to directly accessing local disks. The system also implements asynchronous disk access and texture loading, which makes it possible to overlap data loading, volume slicing and rendering for optimal volume roaming.
The Acceleration of Structural Microarchitectural Simulation via Scheduling
2006-11-01
193 viii List of Tables 1.1 Size of Intel R ©Processors...Table 1.1 shows the total and estimated non-cache transistor counts in succeeding generations of Intel R ©microprocessors. (Cache array transistors are...Intel486TM 1989 1,200,000 800,000 Intel R ©Pentium R © 1993 3,100,000 2,300,000 Intel R ©Pentium R ©II 1997 7,500,000 5,500,000 Intel R ©Pentium R ©III 1999
Using shadow page cache to improve isolated drivers performance.
Zheng, Hao; Dong, Xiaoshe; Wang, Endong; Chen, Baoke; Zhu, Zhengdong; Liu, Chengzhe
2015-01-01
With the advantage of the reusability property of the virtualization technology, users can reuse various types and versions of existing operating systems and drivers in a virtual machine, so as to customize their application environment. In order to prevent users' virtualization environments being impacted by driver faults in virtual machine, Chariot examines the correctness of driver's write operations by the method of combining a driver's write operation capture and a driver's private access control table. However, this method needs to keep the write permission of shadow page table as read-only, so as to capture isolated driver's write operations through page faults, which adversely affect the performance of the driver. Based on delaying setting frequently used shadow pages' write permissions to read-only, this paper proposes an algorithm using shadow page cache to improve the performance of isolated drivers and carefully study the relationship between the performance of drivers and the size of shadow page cache. Experimental results show that, through the shadow page cache, the performance of isolated drivers can be greatly improved without impacting Chariot's reliability too much.
Using Shadow Page Cache to Improve Isolated Drivers Performance
Dong, Xiaoshe; Wang, Endong; Chen, Baoke; Zhu, Zhengdong; Liu, Chengzhe
2015-01-01
With the advantage of the reusability property of the virtualization technology, users can reuse various types and versions of existing operating systems and drivers in a virtual machine, so as to customize their application environment. In order to prevent users' virtualization environments being impacted by driver faults in virtual machine, Chariot examines the correctness of driver's write operations by the method of combining a driver's write operation capture and a driver's private access control table. However, this method needs to keep the write permission of shadow page table as read-only, so as to capture isolated driver's write operations through page faults, which adversely affect the performance of the driver. Based on delaying setting frequently used shadow pages' write permissions to read-only, this paper proposes an algorithm using shadow page cache to improve the performance of isolated drivers and carefully study the relationship between the performance of drivers and the size of shadow page cache. Experimental results show that, through the shadow page cache, the performance of isolated drivers can be greatly improved without impacting Chariot's reliability too much. PMID:25815373
Programmable partitioning for high-performance coherence domains in a multiprocessor system
Blumrich, Matthias A [Ridgefield, CT; Salapura, Valentina [Chappaqua, NY
2011-01-25
A multiprocessor computing system and a method of logically partitioning a multiprocessor computing system are disclosed. The multiprocessor computing system comprises a multitude of processing units, and a multitude of snoop units. Each of the processing units includes a local cache, and the snoop units are provided for supporting cache coherency in the multiprocessor system. Each of the snoop units is connected to a respective one of the processing units and to all of the other snoop units. The multiprocessor computing system further includes a partitioning system for using the snoop units to partition the multitude of processing units into a plurality of independent, memory-consistent, adjustable-size processing groups. Preferably, when the processor units are partitioned into these processing groups, the partitioning system also configures the snoop units to maintain cache coherency within each of said groups.
Brew, D.A.; Karl, S.M.; Barnes, D.F.; Jachens, R.C.; Ford, A.B.; Horner, R.
1991-01-01
The 155 km wide, 310 km long Sitka Sound - Atlin Lake continent-ocean transect includes almost all the geologic, geophysical, and geotectonic elements of the Canadian Cordillera. It crosses the Chugach, Wrangellia, Alexander, Stikine, and Cache Creek terranes, the Gravina and Laberge overlap assemblages, intrusive and metamorphic belts, and neotectonic faults that bound major blocks. -from Authors
77 FR 15369 - Mobility Fund Phase I Auction GIS Data of Potentially Eligible Census Blocks
Federal Register 2010, 2011, 2012, 2013, 2014
2012-03-15
....fcc.gov/auctions/901/ , are the following: Downloadable shapefile Web mapping service MapBox map tiles... GIS software allows you to add this service as a layer to your session or project. 6. MapBox map tiles are cached map tiles of the data. With this open source software approach, these image tiles can be...
Megafloods and Clovis cache at Wenatchee, Washington
NASA Astrophysics Data System (ADS)
Waitt, Richard B.
2016-05-01
Immense late Wisconsin floods from glacial Lake Missoula drowned the Wenatchee reach of Washington's Columbia valley by different routes. The earliest debacles, nearly 19,000 cal yr BP, raged 335 m deep down the Columbia and built high Pangborn bar at Wenatchee. As advancing ice blocked the northwest of Columbia valley, several giant floods descended Moses Coulee and backflooded up the Columbia past Wenatchee. Ice then blocked Moses Coulee, and Grand Coulee to Quincy basin became the westmost floodway. From Quincy basin many Missoula floods backflowed 50 km upvalley to Wenatchee 18,000 to 15,500 years ago. Receding ice dammed glacial Lake Columbia centuries more-till it burst about 15,000 years ago. After Glacier Peak ashfall about 13,600 years ago, smaller great flood(s) swept down the Columbia from glacial Lake Kootenay in British Columbia. The East Wenatchee cache of huge fluted Clovis points had been laid atop Pangborn bar after the Glacier Peak ashfall, then buried by loess. Clovis people came five and a half millennia after the early gigantic Missoula floods, two and a half millennia after the last small Missoula flood, and two millennia after the glacial Lake Columbia flood. People likely saw outburst flood(s) from glacial Lake Kootenay.
Soft-core processor study for node-based architectures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van Houten, Jonathan Roger; Jarosz, Jason P.; Welch, Benjamin James
2008-09-01
Node-based architecture (NBA) designs for future satellite projects hold the promise of decreasing system development time and costs, size, weight, and power and positioning the laboratory to address other emerging mission opportunities quickly. Reconfigurable Field Programmable Gate Array (FPGA) based modules will comprise the core of several of the NBA nodes. Microprocessing capabilities will be necessary with varying degrees of mission-specific performance requirements on these nodes. To enable the flexibility of these reconfigurable nodes, it is advantageous to incorporate the microprocessor into the FPGA itself, either as a hardcore processor built into the FPGA or as a soft-core processor builtmore » out of FPGA elements. This document describes the evaluation of three reconfigurable FPGA based processors for use in future NBA systems--two soft cores (MicroBlaze and non-fault-tolerant LEON) and one hard core (PowerPC 405). Two standard performance benchmark applications were developed for each processor. The first, Dhrystone, is a fixed-point operation metric. The second, Whetstone, is a floating-point operation metric. Several trials were run at varying code locations, loop counts, processor speeds, and cache configurations. FPGA resource utilization was recorded for each configuration. Cache configurations impacted the results greatly; for optimal processor efficiency it is necessary to enable caches on the processors. Processor caches carry a penalty; cache error mitigation is necessary when operating in a radiation environment.« less
Sidhu, Swati; Datta, Aparajita
2015-01-01
Rodents affect the post-dispersal fate of seeds by acting either as on-site seed predators or as secondary dispersers when they scatter-hoard seeds. The tropical forests of north-east India harbour a high diversity of little-studied terrestrial murid and hystricid rodents. We examined the role played by these rodents in determining the seed fates of tropical evergreen tree species in a forest site in north-east India. We selected ten tree species (3 mammal-dispersed and 7 bird-dispersed) that varied in seed size and followed the fates of 10,777 tagged seeds. We used camera traps to determine the identity of rodent visitors, visitation rates and their seed-handling behavior. Seeds of all tree species were handled by at least one rodent taxon. Overall rates of seed removal (44.5%) were much higher than direct on-site seed predation (9.9%), but seed-handling behavior differed between the terrestrial rodent groups: two species of murid rodents removed and cached seeds, and two species of porcupines were on-site seed predators. In addition, a true cricket, Brachytrupes sp., cached seeds of three species underground. We found 309 caches formed by the rodents and the cricket; most were single-seeded (79%) and seeds were moved up to 19 m. Over 40% of seeds were re-cached from primary cache locations, while about 12% germinated in the primary caches. Seed removal rates varied widely amongst tree species, from 3% in Beilschmiedia assamica to 97% in Actinodaphne obovata. Seed predation was observed in nine species. Chisocheton cumingianus (57%) and Prunus ceylanica (25%) had moderate levels of seed predation while the remaining species had less than 10% seed predation. We hypothesized that seed traits that provide information on resource quantity would influence rodent choice of a seed, while traits that determine resource accessibility would influence whether seeds are removed or eaten. Removal rates significantly decreased (p < 0.001) while predation rates increased (p = 0.06) with seed size. Removal rates were significantly lower for soft seeds (p = 0.002), whereas predation rates were significantly higher on soft seeds (p = 0.01). Our results show that murid rodents play a very important role in affecting the seed fates of tropical trees in the Eastern Himalayas. We also found that the different rodent groups differed in their seed handling behavior and responses to changes in seed characteristics. PMID:26247616
Caching Joint Shortcut Routing to Improve Quality of Service for Information-Centric Networking.
Huang, Baixiang; Liu, Anfeng; Zhang, Chengyuan; Xiong, Naixue; Zeng, Zhiwen; Cai, Zhiping
2018-05-29
Hundreds of thousands of ubiquitous sensing (US) devices have provided an enormous number of data for Information-Centric Networking (ICN), which is an emerging network architecture that has the potential to solve a great variety of issues faced by the traditional network. A Caching Joint Shortcut Routing (CJSR) scheme is proposed in this paper to improve the Quality of service (QoS) for ICN. The CJSR scheme mainly has two innovations which are different from other in-network caching schemes: (1) Two routing shortcuts are set up to reduce the length of routing paths. Because of some inconvenient transmission processes, the routing paths of previous schemes are prolonged, and users can only request data from Data Centers (DCs) until the data have been uploaded from Data Producers (DPs) to DCs. Hence, the first kind of shortcut is built from DPs to users directly. This shortcut could release the burden of whole network and reduce delay. Moreover, in the second shortcut routing method, a Content Router (CR) which could yield shorter length of uploading routing path from DPs to DCs is chosen, and then data packets are uploaded through this chosen CR. In this method, the uploading path shares some segments with the pre-caching path, thus the overall length of routing paths is reduced. (2) The second innovation of the CJSR scheme is that a cooperative pre-caching mechanism is proposed so that QoS could have a further increase. Besides being used in downloading routing, the pre-caching mechanism can also be used when data packets are uploaded towards DCs. Combining uploading and downloading pre-caching, the cooperative pre-caching mechanism exhibits high performance in different situations. Furthermore, to address the scarcity of storage size, an algorithm that could make use of storage from idle CRs is proposed. After comparing the proposed scheme with five existing schemes via simulations, experiments results reveal that the CJSR scheme could reduce the total number of processed interest packets by 54.8%, enhance the cache hits of each CR and reduce the number of total hop counts by 51.6% and cut down the length of routing path for users to obtain their interested data by 28.6⁻85.7% compared with the traditional NDN scheme. Moreover, the length of uploading routing path could be decreased by 8.3⁻33.3%.
Enhancement web proxy cache performance using Wrapper Feature Selection methods with NB and J48
NASA Astrophysics Data System (ADS)
Mahmoud Al-Qudah, Dua'a.; Funke Olanrewaju, Rashidah; Wong Azman, Amelia
2017-11-01
Web proxy cache technique reduces response time by storing a copy of pages between client and server sides. If requested pages are cached in the proxy, there is no need to access the server. Due to the limited size and excessive cost of cache compared to the other storages, cache replacement algorithm is used to determine evict page when the cache is full. On the other hand, the conventional algorithms for replacement such as Least Recently Use (LRU), First in First Out (FIFO), Least Frequently Use (LFU), Randomized Policy etc. may discard important pages just before use. Furthermore, using conventional algorithm cannot be well optimized since it requires some decision to intelligently evict a page before replacement. Hence, most researchers propose an integration among intelligent classifiers and replacement algorithm to improves replacement algorithms performance. This research proposes using automated wrapper feature selection methods to choose the best subset of features that are relevant and influence classifiers prediction accuracy. The result present that using wrapper feature selection methods namely: Best First (BFS), Incremental Wrapper subset selection(IWSS)embedded NB and particle swarm optimization(PSO)reduce number of features and have a good impact on reducing computation time. Using PSO enhance NB classifier accuracy by 1.1%, 0.43% and 0.22% over using NB with all features, using BFS and using IWSS embedded NB respectively. PSO rises J48 accuracy by 0.03%, 1.91 and 0.04% over using J48 classifier with all features, using IWSS-embedded NB and using BFS respectively. While using IWSS embedded NB fastest NB and J48 classifiers much more than BFS and PSO. However, it reduces computation time of NB by 0.1383 and reduce computation time of J48 by 2.998.
Massively parallel algorithms for trace-driven cache simulations
NASA Technical Reports Server (NTRS)
Nicol, David M.; Greenberg, Albert G.; Lubachevsky, Boris D.
1991-01-01
Trace driven cache simulation is central to computer design. A trace is a very long sequence of reference lines from main memory. At the t(exp th) instant, reference x sub t is hashed into a set of cache locations, the contents of which are then compared with x sub t. If at the t sup th instant x sub t is not present in the cache, then it is said to be a miss, and is loaded into the cache set, possibly forcing the replacement of some other memory line, and making x sub t present for the (t+1) sup st instant. The problem of parallel simulation of a subtrace of N references directed to a C line cache set is considered, with the aim of determining which references are misses and related statistics. A simulation method is presented for the Least Recently Used (LRU) policy, which regradless of the set size C runs in time O(log N) using N processors on the exclusive read, exclusive write (EREW) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/log N processors. Timings are presented of the second algorithm's implementation on the MasPar MP-1, a machine with 16384 processors. A broad class of reference based line replacement policies are considered, which includes LRU as well as the Least Frequently Used and Random replacement policies. A simulation method is presented for any such policy that on any trace of length N directed to a C line set runs in the O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.
Performance assessment of EMR systems based on post-relational database.
Yu, Hai-Yan; Li, Jing-Song; Zhang, Xiao-Guang; Tian, Yu; Suzuki, Muneou; Araki, Kenji
2012-08-01
Post-relational databases provide high performance and are currently widely used in American hospitals. As few hospital information systems (HIS) in either China or Japan are based on post-relational databases, here we introduce a new-generation electronic medical records (EMR) system called Hygeia, which was developed with the post-relational database Caché and the latest platform Ensemble. Utilizing the benefits of a post-relational database, Hygeia is equipped with an "integration" feature that allows all the system users to access data-with a fast response time-anywhere and at anytime. Performance tests of databases in EMR systems were implemented in both China and Japan. First, a comparison test was conducted between a post-relational database, Caché, and a relational database, Oracle, embedded in the EMR systems of a medium-sized first-class hospital in China. Second, a user terminal test was done on the EMR system Izanami, which is based on the identical database Caché and operates efficiently at the Miyazaki University Hospital in Japan. The results proved that the post-relational database Caché works faster than the relational database Oracle and showed perfect performance in the real-time EMR system.
NASA Astrophysics Data System (ADS)
Johnsson, L.; Netzer, G.
2016-10-01
Moore's law, the doubling of transistors per unit area for each CMOS technology generation, is expected to continue throughout the decade, while Dennard voltage scaling resulting in constant power per unit area stopped about a decade ago. The semiconductor industry's response to the loss of Dennard scaling and the consequent challenges in managing power distribution and dissipation has been leveled off clock rates, a die performance gain reduced from about a factor of 2.8 to 1.4 per technology generation, and multi-core processor dies with increased cache sizes. Increased caches sizes offers performance benefits for many applications as well as energy savings. Accessing data in cache is considerably more energy efficient than main memory accesses. Further, caches consume less power than a corresponding amount of functional logic. As feature sizes continue to be scaled down an increasing fraction of the die must be “underutilized” or “dark” due to power constraints. With power being a prime design constraint there is a concerted effort to find significantly more energy efficient chip architectures than dominant in servers today, with chips potentially incorporating several types of cores to cover a range of applications, or different functions in an application, as is already common for the mobile processor market. Digital Signal Processors (DSPs), largely targeting the embedded and mobile processor markets, typically have been designed for a power consumption of 10% or less of a typical x86 CPU, yet with much more than 10% of the floating-point capability of the same technology generation x86 CPUs. Thus, DSPs could potentially offer an energy efficient alternative to x86 CPUs. Here we report an assessment of the Texas Instruments TMS320C6678 DSP in regards to its energy efficiency for two common HPC benchmarks: STREAM (memory system benchmark) and HPL (CPU benchmark)
The ontogeny of food-caching behaviour in New Zealand robins (Petroica longipes).
Clark, Lisabertha L; Shaw, Rachael C
2018-06-01
Hoarding or caching behaviour is a widely-used paradigm for examining a range of cognitive processes in birds, such as social cognition and spatial memory. However, much is still unknown about how caching develops in young birds, especially in the wild. Studying the ontogeny of caching in the wild will help researchers to identify the mechanisms that shape this advantageous foraging strategy. We examined the ontogeny of food caching behaviour in a wild New Zealand passerine, the North Island robin (Petroica longipes). For 12-weeks following fledging, we observed 34 juveniles to examine the development of caching and cache retrieval. Additionally, we compared the caching behaviour of juveniles at 12 weeks post-fledging to 35 adult robins to determine whether juveniles had developed adult-like caching behaviour by this age. Juveniles began caching mealworms shortly after achieving foraging independency. Multivariate analyses revealed that caching rate increased and handling time decreased with increasing age. Juveniles spontaneously began retrieving caches as soon as they had begun to cache and their retrieval rates then remained constant throughout their ensuing development. Likewise, the number of sites used by juveniles did not change with age. Juvenile sex, caregiver sex and the duration of post-fledging parental care did not influence the development of caching, cache retrieval, the number of cache sites used and the time juveniles spent handling mealworms. At 12 weeks post-fledging, juveniles demonstrated levels of caching, cache retrieval and cache site usage that were comparable to adults. However, juvenile prey handling time was still longer than adults. The spontaneous emergence of cache retrieval and the consistency in the number of cache sites used throughout development suggests that these aspects of caching in North Island robins are likely to be innate, but that age and experience have an important role in the development of adult caching behaviours. Copyright © 2018 Elsevier B.V. All rights reserved.
Clark’s Nutcrackers (Nucifraga columbiana) Flexibly Adapt Caching Behavior to a Cooperative Context
Clary, Dawson; Kelly, Debbie M.
2016-01-01
Corvids recognize when their caches are at risk of being stolen by others and have developed strategies to protect these caches from pilferage. For instance, Clark’s nutcrackers will suppress the number of caches they make if being observed by a potential thief. However, cache protection has most often been studied using competitive contexts, so it is unclear whether corvids can adjust their caching in beneficial ways to accommodate non-competitive situations. Therefore, we examined whether Clark’s nutcrackers, a non-social corvid, would flexibly adapt their caching behaviors to a cooperative context. To do so, birds were given a caching task during which caches made by one individual were reciprocally exchanged for the caches of a partner bird over repeated trials. In this scenario, if caching behaviors can be flexibly deployed, then the birds should recognize the cooperative nature of the task and maintain or increase caching levels over time. However, if cache protection strategies are applied independent of social context and simply in response to cache theft, then cache suppression should occur. In the current experiment, we found that the birds maintained caching throughout the experiment. We report that males increased caching in response to a manipulation in which caches were artificially added, suggesting the birds could adapt to the cooperative nature of the task. Additionally, we show that caching decisions were not solely due to motivational factors, instead showing an additional influence attributed to the behavior of the partner bird. PMID:27826273
Automatic Blocking Of QR and LU Factorizations for Locality
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yi, Q; Kennedy, K; You, H
2004-03-26
QR and LU factorizations for dense matrices are important linear algebra computations that are widely used in scientific applications. To efficiently perform these computations on modern computers, the factorization algorithms need to be blocked when operating on large matrices to effectively exploit the deep cache hierarchy prevalent in today's computer memory systems. Because both QR (based on Householder transformations) and LU factorization algorithms contain complex loop structures, few compilers can fully automate the blocking of these algorithms. Though linear algebra libraries such as LAPACK provides manually blocked implementations of these algorithms, by automatically generating blocked versions of the computations, moremore » benefit can be gained such as automatic adaptation of different blocking strategies. This paper demonstrates how to apply an aggressive loop transformation technique, dependence hoisting, to produce efficient blockings for both QR and LU with partial pivoting. We present different blocking strategies that can be generated by our optimizer and compare the performance of auto-blocked versions with manually tuned versions in LAPACK, both using reference BLAS, ATLAS BLAS and native BLAS specially tuned for the underlying machine architectures.« less
A Scalable QoS-Aware VoD Resource Sharing Scheme for Next Generation Networks
NASA Astrophysics Data System (ADS)
Huang, Chenn-Jung; Luo, Yun-Cheng; Chen, Chun-Hua; Hu, Kai-Wen
In network-aware concept, applications are aware of network conditions and are adaptable to the varying environment to achieve acceptable and predictable performance. In this work, a solution for video on demand service that integrates wireless and wired networks by using the network aware concepts is proposed to reduce the blocking probability and dropping probability of mobile requests. Fuzzy logic inference system is employed to select appropriate cache relay nodes to cache published video streams and distribute them to different peers through service oriented architecture (SOA). SIP-based control protocol and IMS standard are adopted to ensure the possibility of heterogeneous communication and provide a framework for delivering real-time multimedia services over an IP-based network to ensure interoperability, roaming, and end-to-end session management. The experimental results demonstrate that effectiveness and practicability of the proposed work.
Megafloods and Clovis cache at Wenatchee, Washington
Waitt, Richard B.
2016-01-01
Immense late Wisconsin floods from glacial Lake Missoula drowned the Wenatchee reach of Washington's Columbia valley by different routes. The earliest debacles, nearly 19,000 cal yr BP, raged 335 m deep down the Columbia and built high Pangborn bar at Wenatchee. As advancing ice blocked the northwest of Columbia valley, several giant floods descended Moses Coulee and backflooded up the Columbia past Wenatchee. Ice then blocked Moses Coulee, and Grand Coulee to Quincy basin became the westmost floodway. From Quincy basin many Missoula floods backflowed 50 km upvalley to Wenatchee 18,000 to 15,500 years ago. Receding ice dammed glacial Lake Columbia centuries more—till it burst about 15,000 years ago. After Glacier Peak ashfall about 13,600 years ago, smaller great flood(s) swept down the Columbia from glacial Lake Kootenay in British Columbia. The East Wenatchee cache of huge fluted Clovis points had been laid atop Pangborn bar after the Glacier Peak ashfall, then buried by loess. Clovis people came five and a half millennia after the early gigantic Missoula floods, two and a half millennia after the last small Missoula flood, and two millennia after the glacial Lake Columbia flood. People likely saw outburst flood(s) from glacial Lake Kootenay.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Lingda; Hayes, Ari; Song, Shuaiwen
Modern GPUs employ cache to improve memory system efficiency. However, large amount of cache space is underutilized due to irregular memory accesses and poor spatial locality which exhibited commonly in GPU applications. Our experiments show that using smaller cache lines could improve cache space utilization, but it also frequently suffers from significant performance loss by introducing large amount of extra cache requests. In this work, we propose a novel cache design named tag-split cache (TSC) that enables fine-grained cache storage to address the problem of cache space underutilization while keeping memory request number unchanged. TSC divides tag into two partsmore » to reduce storage overhead, and it supports multiple cache line replacement in one cycle.« less
ASA-FTL: An adaptive separation aware flash translation layer for solid state drives
Xie, Wei; Chen, Yong; Roth, Philip C
2016-11-03
Here, the flash-memory based Solid State Drive (SSD) presents a promising storage solution for increasingly critical data-intensive applications due to its low latency (high throughput), high bandwidth, and low power consumption. Within an SSD, its Flash Translation Layer (FTL) is responsible for exposing the SSD’s flash memory storage to the computer system as a simple block device. The FTL design is one of the dominant factors determining an SSD’s lifespan and performance. To reduce the garbage collection overhead and deliver better performance, we propose a new, low-cost, adaptive separation-aware flash translation layer (ASA-FTL) that combines sampling, data clustering and selectivemore » caching of recency information to accurately identify and separate hot/cold data while incurring minimal overhead. We use sampling for light-weight identification of separation criteria, and our dedicated selective caching mechanism is designed to save the limited RAM resource in contemporary SSDs. Using simulations of ASA-FTL with both real-world and synthetic workloads, we have shown that our proposed approach reduces the garbage collection overhead by up to 28% and the overall response time by 15% compared to one of the most advanced existing FTLs. We find that the data clustering using a small sample size provides significant performance benefit while only incurring a very small computation and memory cost. In addition, our evaluation shows that ASA-FTL is able to adapt to the changes in the access pattern of workloads, which is a major advantage comparing to existing fixed data separation methods.« less
Sample Acquisition and Caching architecture for the Mars Sample Return mission
NASA Astrophysics Data System (ADS)
Zacny, K.; Chu, P.; Cohen, J.; Paulsen, G.; Craft, J.; Szwarc, T.
This paper presents a Mars Sample Return (MSR) Sample Acquisition and Caching (SAC) study developed for the three rover platforms: MER, MER+, and MSL. The study took into account 26 SAC requirements provided by the NASA Mars Exploration Program Office. For this SAC architecture, the reduction of mission risk was chosen by us as having greater priority than mass or volume. For this reason, we selected a “ One Bit per Core” approach. The enabling technology for this architecture is Honeybee Robotics' “ eccentric tubes” core breakoff approach. The breakoff approach allows the drill bits to be relatively small in diameter and in turn lightweight. Hence, the bits could be returned to Earth with the cores inside them with only a modest increase to the total returned mass, but a significant decrease in complexity. Having dedicated bits allows a reduction in the number of core transfer steps and actuators. It also alleviates the bit life problem, eliminates cross contamination, and aids in hermetic sealing. An added advantage is faster drilling time, lower power, lower energy, and lower Weight on Bit (which reduces Arm preload requirements). Drill bits are based on the BigTooth bit concept, which allows re-use of the same bit multiple times, if necessary. The proposed SAC consists of a 1) Rotary-Percussive Core Drill, 2) Bit Storage Carousel, 3) Cache, 4) Robotic Arm, and 5) Rock Abrasion and Brushing Bit (RABBit), which is deployed using the Drill. The system also includes PreView bits (for viewing of cores prior to caching) and Powder bits for acquisition of regolith or cuttings. The SAC total system mass is less than 22 kg for MER and MER+ size rovers and less than 32 kg for the MSL-size rover.
Parallel Semi-Implicit Spectral Element Atmospheric Model
NASA Astrophysics Data System (ADS)
Fournier, A.; Thomas, S.; Loft, R.
2001-05-01
The shallow-water equations (SWE) have long been used to test atmospheric-modeling numerical methods. The SWE contain essential wave-propagation and nonlinear effects of more complete models. We present a semi-implicit (SI) improvement of the Spectral Element Atmospheric Model to solve the SWE (SEAM, Taylor et al. 1997, Fournier et al. 2000, Thomas & Loft 2000). SE methods are h-p finite element methods combining the geometric flexibility of size-h finite elements with the accuracy of degree-p spectral methods. Our work suggests that exceptional parallel-computation performance is achievable by a General-Circulation-Model (GCM) dynamical core, even at modest climate-simulation resolutions (>1o). The code derivation involves weak variational formulation of the SWE, Gauss(-Lobatto) quadrature over the collocation points, and Legendre cardinal interpolators. Appropriate weak variation yields a symmetric positive-definite Helmholtz operator. To meet the Ladyzhenskaya-Babuska-Brezzi inf-sup condition and avoid spurious modes, we use a staggered grid. The SI scheme combines leapfrog and Crank-Nicholson schemes for the nonlinear and linear terms respectively. The localization of operations to elements ideally fits the method to cache-based microprocessor computer architectures --derivatives are computed as collections of small (8x8), naturally cache-blocked matrix-vector products. SEAM also has desirable boundary-exchange communication, like finite-difference models. Timings on on the IBM SP and Compaq ES40 supercomputers indicate that the SI code (20-min timestep) requires 1/3 the CPU time of the explicit code (2-min timestep) for T42 resolutions. Both codes scale nearly linearly out to 400 processors. We achieved single-processor performance up to 30% of peak for both codes on the 375-MHz IBM Power-3 processors. Fast computation and linear scaling lead to a useful climate-simulation dycore only if enough model time is computed per unit wall-clock time. An efficient SI solver is essential to substantially increase this rate. Parallel preconditioning for an iterative conjugate-gradient elliptic solver is described. We are building a GCM dycore capable of 200 GF% lOPS sustained performance on clustered RISC/cache architectures using hybrid MPI/OpenMP programming.
Effects of experience and social context on prospective caching strategies by scrub jays.
Emery, N J; Clayton, N S
2001-11-22
Social life has costs associated with competition for resources such as food. Food storing may reduce this competition as the food can be collected quickly and hidden elsewhere; however, it is a risky strategy because caches can be pilfered by others. Scrub jays (Aphelocoma coerulescens) remember 'what', 'where' and 'when' they cached. Like other corvids, they remember where conspecifics have cached, pilfering them when given the opportunity, but may also adjust their own caching strategies to minimize potential pilfering. To test this, jays were allowed to cache either in private (when the other bird's view was obscured) or while a conspecific was watching, and then recover their caches in private. Here we show that jays with prior experience of pilfering another bird's caches subsequently re-cached food in new cache sites during recovery trials, but only when they had been observed caching. Jays without pilfering experience did not, even though they had observed other jays caching. Our results suggest that jays relate information about their previous experience as a pilferer to the possibility of future stealing by another bird, and modify their caching strategy accordingly.
Cache-Cache Comparison for Supporting Meaningful Learning
ERIC Educational Resources Information Center
Wang, Jingyun; Fujino, Seiji
2015-01-01
The paper presents a meaningful discovery learning environment called "cache-cache comparison" for a personalized learning support system. The processing of seeking hidden relations or concepts in "cache-cache comparison" is intended to encourage learners to actively locate new knowledge in their knowledge framework and check…
Optimizing Maintenance of Constraint-Based Database Caches
NASA Astrophysics Data System (ADS)
Klein, Joachim; Braun, Susanne
Caching data reduces user-perceived latency and often enhances availability in case of server crashes or network failures. DB caching aims at local processing of declarative queries in a DBMS-managed cache close to the application. Query evaluation must produce the same results as if done at the remote database backend, which implies that all data records needed to process such a query must be present and controlled by the cache, i. e., to achieve “predicate-specific” loading and unloading of such record sets. Hence, cache maintenance must be based on cache constraints such that “predicate completeness” of the caching units currently present can be guaranteed at any point in time. We explore how cache groups can be maintained to provide the data currently needed. Moreover, we design and optimize loading and unloading algorithms for sets of records keeping the caching units complete, before we empirically identify the costs involved in cache maintenance.
Ostojić, Ljerka; Legg, Edward W; Brecht, Katharina F; Lange, Florian; Deininger, Chantal; Mendl, Michael; Clayton, Nicola S
2017-01-23
Many corvid species accurately remember the locations where they have seen others cache food, allowing them to pilfer these caches efficiently once the cachers have left the scene [1]. To protect their caches, corvids employ a suite of different cache-protection strategies that limit the observers' visual or acoustic access to the cache site [2,3]. In cases where an observer's sensory access cannot be reduced it has been suggested that cachers might be able to minimise the risk of pilfering if they avoid caching food the observer is most motivated to pilfer [4]. In the wild, corvids have been reported to pilfer others' caches as soon as possible after the caching event [5], such that the cacher might benefit from adjusting its caching behaviour according to the observer's current desire. In the current study, observers pilfered according to their current desire: they preferentially pilfered food that they were not sated on. Cachers adjusted their caching behaviour accordingly: they protected their caches by selectively caching food that observers were not motivated to pilfer. The same cache-protection behaviour was found when cachers could not see on which food the observers were sated. Thus, the cachers' ability to respond to the observer's desire might have been driven by the observer's behaviour at the time of caching. Copyright © 2017 The Author(s). Published by Elsevier Ltd.. All rights reserved.
Pilfering Eurasian jays use visual and acoustic information to locate caches.
Shaw, Rachael C; Clayton, Nicola S
2014-11-01
Pilfering corvids use observational spatial memory to accurately locate caches that they have seen another individual make. Accordingly, many corvid cache-protection strategies limit the transfer of visual information to potential thieves. Eurasian jays (Garrulus glandarius) employ strategies that reduce the amount of visual and auditory information that is available to competitors. Here, we test whether or not the jays recall and use both visual and auditory information when pilfering other birds' caches. When jays had no visual or acoustic information about cache locations, the proportion of available caches that they found did not differ from the proportion expected if jays were searching at random. By contrast, after observing and listening to a conspecific caching in gravel or sand, jays located a greater proportion of caches, searched more frequently in the correct substrate type and searched in fewer empty locations to find the first cache than expected. After only listening to caching in gravel and sand, jays also found a larger proportion of caches and searched in the substrate type where they had heard caching take place more frequently than expected. These experiments demonstrate that Eurasian jays possess observational spatial memory and indicate that pilfering jays may gain information about cache location merely by listening to caching. This is the first evidence that a corvid may use recalled acoustic information to locate and pilfer caches.
The Effects of Cache Modification on Food Caching and Retrieval Behavior by Rats
ERIC Educational Resources Information Center
McKenzie, T.L.B.; Bird, L.R.; Roberts, W.A.
2005-01-01
Rats cached pieces of cheese on four different arms of an eight-arm radial maze. On a retrieval test given 45min later, rats learned to return to arms where food was cached before arms where food had not been cached. Tests were then performed in which cache sites on one side of the maze were always modified (pilfered or degraded), but cache sites…
Arias, Michelle R.; Alpers, Charles N.; Marvin-DiPasquale, Mark C.; Fuller, Christopher C.; Agee, Jennifer L.; Sneed, Michelle; Morita, Andrew Y.; Salas, Antonia
2017-10-31
Cache Creek Settling Basin was constructed in 1937 to trap sediment from Cache Creek before delivery to the Yolo Bypass, a flood conveyance for the Sacramento River system that is tributary to the Sacramento–San Joaquin Delta. Sediment management options being considered by stakeholders in the Cache Creek Settling Basin include sediment excavation; however, that could expose sediments containing elevated mercury concentrations from historical mercury mining in the watershed. In cooperation with the California Department of Water Resources, the U.S. Geological Survey undertook sediment coring campaigns in 2011–12 (1) to describe lateral and vertical distributions of mercury concentrations in deposits of sediment in the Cache Creek Settling Basin and (2) to improve constraint of estimates of the rate of sediment deposition in the basin.Sediment cores were collected in the Cache Creek Settling Basin, Yolo County, California, during October 2011 at 10 locations and during August 2012 at 5 other locations. Total core depths ranged from approximately 4.6 to 13.7 meters (15 to 45 feet), with penetration to about 9.1 meters (30 feet) at most locations. Unsplit cores were logged for two geophysical parameters (gamma bulk density and magnetic susceptibility); then, selected cores were split lengthwise. One half of each core was then photographed and archived, and the other half was subsampled. Initial subsamples from the cores (20-centimeter composite samples from five predetermined depths in each profile) were analyzed for total mercury, methylmercury, total reduced sulfur, iron speciation, organic content (as the percentage of weight loss on ignition), and grain-size distribution. Detailed follow-up subsampling (3-centimeter intervals) was done at six locations along an east-west transect in the southern part of the Cache Creek Settling Basin and at one location in the northern part of the basin for analyses of total mercury; organic content; and cesium-137, which was used for dating. This report documents site characteristics; field and laboratory methods; and results of the analyses of each core section and subsample of these sediment cores, including associated quality-assurance and quality-control data.
Analysis of cache for streaming tape drive
NASA Technical Reports Server (NTRS)
Chinnaswamy, V.
1993-01-01
A tape subsystem consists of a controller and a tape drive. Tapes are used for backup, data interchange, and software distribution. The backup operation is addressed. During a backup operation, data is read from disk, processed in CPU, and then sent to tape. The processing speeds of a disk subsystem, CPU, and a tape subsystem are likely to be different. A powerful CPU can read data from a fast disk, process it, and supply the data to the tape subsystem at a faster rate than the tape subsystem can handle. On the other hand, a slow disk drive and a slow CPU may not be able to supply data fast enough to keep a tape drive busy all the time. The backup process may supply data to tape drive in bursts. Each burst may be followed by an idle period. Depending on the nature of the file distribution in the disk, the input stream to the tape subsystem may vary significantly during backup. To compensate for these differences and optimize the utilization of a tape subsystem, a cache or buffer is introduced in the tape controller. Most of the tape drives today are streaming tape drives. A streaming tape drive goes into reposition when there is no data from the controller. Once the drive goes into reposition, the controller can receive data, but it cannot supply data to the tape drive until the drive completes its reposition. A controller can also receive data from the host and send data to the tape drive at the same time. The relationship of cache size, host transfer rate, drive transfer rate, reposition, and ramp up times for optimal performance of the tape subsystem are investigated. Formulas developed will also show the advantages of cache watermarks to increase the streaming time of the tape drive, maximum loss due to insufficient cache, tradeoffs between cache and reposition times and the effectiveness of cache on a streaming tape drive due to idle times or interruptions due in host transfers. Several mathematical formulas are developed to predict the performance of the tape drive. Some examples are given illustrating the usefulness of these formulas. Finally, a summary and some conclusions are provided.
Turtle: identifying frequent k-mers with cache-efficient algorithms.
Roy, Rajat Shuvro; Bhattacharya, Debashish; Schliep, Alexander
2014-07-15
Counting the frequencies of k-mers in read libraries is often a first step in the analysis of high-throughput sequencing data. Infrequent k-mers are assumed to be a result of sequencing errors. The frequent k-mers constitute a reduced but error-free representation of the experiment, which can inform read error correction or serve as the input to de novo assembly methods. Ideally, the memory requirement for counting should be linear in the number of frequent k-mers and not in the, typically much larger, total number of k-mers in the read library. We present a novel method that balances time, space and accuracy requirements to efficiently extract frequent k-mers even for high-coverage libraries and large genomes such as human. Our method is designed to minimize cache misses in a cache-efficient manner by using a pattern-blocked Bloom filter to remove infrequent k-mers from consideration in combination with a novel sort-and-compact scheme, instead of a hash, for the actual counting. Although this increases theoretical complexity, the savings in cache misses reduce the empirical running times. A variant of method can resort to a counting Bloom filter for even larger savings in memory at the expense of false-negative rates in addition to the false-positive rates common to all Bloom filter-based approaches. A comparison with the state-of-the-art shows reduced memory requirements and running times. The tools are freely available for download at http://bioinformatics.rutgers.edu/Software/Turtle and http://figshare.com/articles/Turtle/791582. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Acquiring Domain Knowledge for Planning by Experimentation
1992-01-01
visel partt holding grinderi visel p=r1 size-of pull LENGTH k size-of part! LENGTH k ufsee-finish pertl side! ROUGH surface-tinish pan] side...prof-not-cached <nodal> <node2>) (current-goal <nodal> ( goali )) (current-goal (node2> <goal2>) (d@.s-top-protection-violat ion <goal2>) (not-does-top
Reducing Router Forwarding Table Size Using Aggregation and Caching
ERIC Educational Resources Information Center
Liu, Yaoqing
2013-01-01
The fast growth of global routing table size has been causing concerns that the Forwarding Information Base (FIB) will not be able to fit in existing routers' expensive line-card memory, and upgrades will lead to a higher cost for network operators and customers. FIB Aggregation, a technique that merges multiple FIB entries into one, is probably…
Checkpointing in speculative versioning caches
Eichenberger, Alexandre E; Gara, Alan; Gschwind, Michael K; Ohmacht, Martin
2013-08-27
Mechanisms for generating checkpoints in a speculative versioning cache of a data processing system are provided. The mechanisms execute code within the data processing system, wherein the code accesses cache lines in the speculative versioning cache. The mechanisms further determine whether a first condition occurs indicating a need to generate a checkpoint in the speculative versioning cache. The checkpoint is a speculative cache line which is made non-speculative in response to a second condition occurring that requires a roll-back of changes to a cache line corresponding to the speculative cache line. The mechanisms also generate the checkpoint in the speculative versioning cache in response to a determination that the first condition has occurred.
Rapid effects of corticosterone on cache recovery in mountain chickadees (Parus gambeli).
Saldanha, C J; Schlinger, B A; Clayton, N S
2000-03-01
Environmental perturbations increase adrenal activity in several vertebrates. Increases in corticosterone may serve as a proximate trigger whereby organisms can rapidly adapt their behavior to survive environmental fluctuations. In food-caching songbirds, inclement weather may present the need to alter caching and/or retrieval behaviors to ensure food supplies. We hypothesized that corticosterone may increase the rate of caching and/or retrieval behaviors in the mountain chickadee, a food-storing songbird, and tested if these potential effects were mediated by alterations in appetite, activity, or memory for cache sites. Corticosterone or vehicle was administered to subjects 5 min prior to either caching or recovery in a naturalistic laboratory paradigm during which we recorded the number of caching events, sites visited, and seeds eaten (caching) or caches recovered, total sites visited, cache-related visits, and non-cache-related visits (recovery). Data were analyzed using nested ANOVA for treatment within sequential trial. There was no effect on any caching behaviors following treatment. However, birds treated with corticosterone during retrieval recovered more seeds and tended to visit more cache-related sites than did controls. Since groups did not differ in the number of seeds eaten or the total number of sites visited, it seems unlikely that corticosterone affected appetite or activity. Rapid surges in corticosterone may increase the efficacy of an underlying memory process for cache sites which is reflected in higher cache recovery in corticosterone-treated birds than in controls. Thus, rapid alterations in plasma corticosterone following environmental change may alter memory-reliant behaviors which promote survival in the food-caching mountain chickadee. Copyright 2000 Academic Press.
Cache write generate for parallel image processing on shared memory architectures.
Wittenbrink, C M; Somani, A K; Chen, C H
1996-01-01
We investigate cache write generate, our cache mode invention. We demonstrate that for parallel image processing applications, the new mode improves main memory bandwidth, CPU efficiency, cache hits, and cache latency. We use register level simulations validated by the UW-Proteus system. Many memory, cache, and processor configurations are evaluated.
Combining instruction prefetching with partial cache locking to improve WCET in real-time systems.
Ni, Fan; Long, Xiang; Wan, Han; Gao, Xiaopeng
2013-01-01
Caches play an important role in embedded systems to bridge the performance gap between fast processor and slow memory. And prefetching mechanisms are proposed to further improve the cache performance. While in real-time systems, the application of caches complicates the Worst-Case Execution Time (WCET) analysis due to its unpredictable behavior. Modern embedded processors often equip locking mechanism to improve timing predictability of the instruction cache. However, locking the whole cache may degrade the cache performance and increase the WCET of the real-time application. In this paper, we proposed an instruction-prefetching combined partial cache locking mechanism, which combines an instruction prefetching mechanism (termed as BBIP) with partial cache locking to improve the WCET estimates of real-time applications. BBIP is an instruction prefetching mechanism we have already proposed to improve the worst-case cache performance and in turn the worst-case execution time. The estimations on typical real-time applications show that the partial cache locking mechanism shows remarkable WCET improvement over static analysis and full cache locking.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh; Zhang, Zhao
With each CMOS technology generation, leakage energy consumption has been dramatically increasing and hence, managing leakage power consumption of large last-level caches (LLCs) has become a critical issue in modern processor design. In this paper, we present EnCache, a novel software-based technique which uses dynamic profiling-based cache reconfiguration for saving cache leakage energy. EnCache uses a simple hardware component called profiling cache, which dynamically predicts energy efficiency of an application for 32 possible cache configurations. Using these estimates, system software reconfigures the cache to the most energy efficient configuration. EnCache uses dynamic cache reconfiguration and hence, it does not requiremore » offline profiling or tuning the parameter for each application. Furthermore, EnCache optimizes directly for the overall memory subsystem (LLC and main memory) energy efficiency instead of the LLC energy efficiency alone. The experiments performed with an x86-64 simulator and workloads from SPEC2006 suite confirm that EnCache provides larger energy saving than a conventional energy saving scheme. For single core and dual-core system configurations, the average savings in memory subsystem energy over a shared baseline configuration are 30.0% and 27.3%, respectively.« less
Combining Instruction Prefetching with Partial Cache Locking to Improve WCET in Real-Time Systems
Ni, Fan; Long, Xiang; Wan, Han; Gao, Xiaopeng
2013-01-01
Caches play an important role in embedded systems to bridge the performance gap between fast processor and slow memory. And prefetching mechanisms are proposed to further improve the cache performance. While in real-time systems, the application of caches complicates the Worst-Case Execution Time (WCET) analysis due to its unpredictable behavior. Modern embedded processors often equip locking mechanism to improve timing predictability of the instruction cache. However, locking the whole cache may degrade the cache performance and increase the WCET of the real-time application. In this paper, we proposed an instruction-prefetching combined partial cache locking mechanism, which combines an instruction prefetching mechanism (termed as BBIP) with partial cache locking to improve the WCET estimates of real-time applications. BBIP is an instruction prefetching mechanism we have already proposed to improve the worst-case cache performance and in turn the worst-case execution time. The estimations on typical real-time applications show that the partial cache locking mechanism shows remarkable WCET improvement over static analysis and full cache locking. PMID:24386133
A two-level cache for distributed information retrieval in search engines.
Zhang, Weizhe; He, Hui; Ye, Jianwei
2013-01-01
To improve the performance of distributed information retrieval in search engines, we propose a two-level cache structure based on the queries of the users' logs. We extract the highest rank queries of users from the static cache, in which the queries are the most popular. We adopt the dynamic cache as an auxiliary to optimize the distribution of the cache data. We propose a distribution strategy of the cache data. The experiments prove that the hit rate, the efficiency, and the time consumption of the two-level cache have advantages compared with other structures of cache.
A Two-Level Cache for Distributed Information Retrieval in Search Engines
Zhang, Weizhe; He, Hui; Ye, Jianwei
2013-01-01
To improve the performance of distributed information retrieval in search engines, we propose a two-level cache structure based on the queries of the users' logs. We extract the highest rank queries of users from the static cache, in which the queries are the most popular. We adopt the dynamic cache as an auxiliary to optimize the distribution of the cache data. We propose a distribution strategy of the cache data. The experiments prove that the hit rate, the efficiency, and the time consumption of the two-level cache have advantages compared with other structures of cache. PMID:24363621
Way-Scaling to Reduce Power of Cache with Delay Variation
NASA Astrophysics Data System (ADS)
Goudarzi, Maziar; Matsumura, Tadayuki; Ishihara, Tohru
The share of leakage in cache power consumption increases with technology scaling. Choosing a higher threshold voltage (Vth) and/or gate-oxide thickness (Tox) for cache transistors improves leakage, but impacts cell delay. We show that due to uncorrelated random within-die delay variation, only some (not all) of cells actually violate the cache delay after the above change. We propose to add a spare cache way to replace delay-violating cache-lines separately in each cache-set. By SPICE and gate-level simulations in a commercial 90nm process, we show that choosing higher Vth, Tox and adding one spare way to a 4-way 16KB cache reduces leakage power by 42%, which depending on the share of leakage in total cache power, gives up to 22.59% and 41.37% reduction of total energy respectively in L1 instruction- and L2 unified-cache with a negligible delay penalty, but without sacrificing cache capacity or timing-yield.
Seedling Establishment of Coast Live Oak in Relation to Seed Caching by Jays
Joe R. McBride; Ed Norberg; Sheauchi Cheng; Ahmad Mossadegh
1991-01-01
The purpose of this study was to simulate the caching of acorns by jays and rodents to see if less costly procedures could be developed for the establishment of coast live oak (Quercus agrifolia). Four treatments [(1) random - single acorn cache, (2) regular - single acorn cache, (3) regular - 5 acorn cache, (4) regular - 10 acorn cache] were planted...
A trace-driven analysis of name and attribute caching in a distributed system
NASA Technical Reports Server (NTRS)
Shirriff, Ken W.; Ousterhout, John K.
1992-01-01
This paper presents the results of simulating file name and attribute caching on client machines in a distributed file system. The simulation used trace data gathered on a network of about 40 workstations. Caching was found to be advantageous: a cache on each client containing just 10 directories had a 91 percent hit rate on name look ups. Entry-based name caches (holding individual directory entries) had poorer performance for several reasons, resulting in a maximum hit rate of about 83 percent. File attribute caching obtained a 90 percent hit rate with a cache on each machine of the attributes for 30 files. The simulations show that maintaining cache consistency between machines is not a significant problem; only 1 in 400 name component look ups required invalidation of a remotely cached entry. Process migration to remote machines had little effect on caching. Caching was less successful in heavily shared and modified directories such as /tmp, but there weren't enough references to /tmp overall to affect the results significantly. We estimate that adding name and attribute caching to the Sprite operating system could reduce server load by 36 percent and the number of network packets by 30 percent.
Kelley, Laura A; Clayton, Nicola S
2017-07-01
Some animals hide food to consume later; however, these caches are susceptible to theft by conspecifics and heterospecifics. Caching animals can use protective strategies to minimize sensory cues available to potential pilferers, such as caching in shaded areas and in quiet substrate. Background matching (where object patterning matches the visual background) is commonly seen in prey animals to reduce conspicuousness, and caching animals may also use this tactic to hide caches, for example, by hiding coloured food in a similar coloured substrate. We tested whether California scrub-jays ( Aphelocoma californica ) camouflage their food in this way by offering them caching substrates that either matched or did not match the colour of food available for caching. We also determined whether this caching behaviour was sensitive to social context by allowing the birds to cache when a conspecific potential pilferer could be both heard and seen (acoustic and visual cues present), or unseen (acoustic cues only). When caching events could be both heard and seen by a potential pilferer, birds cached randomly in matching and non-matching substrates. However, they preferentially hid food in the substrate that matched the food colour when only acoustic cues were present. This is a novel cache protection strategy that also appears to be sensitive to social context. We conclude that studies of cache protection strategies should consider the perceptual capabilities of the cacher and potential pilferers. © 2017 The Author(s).
Consolidation and reconsolidation of memory in black-capped chickadees (Poecile atricapillus).
Barrett, Matthew C; Sherry, David F
2012-12-01
Multiple phases of protein synthesis are necessary for the synaptic modifications that consolidate long-term memory. The reconsolidation hypothesis supposes that information in long-term memory becomes labile and subject to change when retrieved and must be reconsolidated into long-term memory. The current study used the protein synthesis inhibitor anisomycin to examine memory consolidation in birds and to test the reconsolidation hypothesis. Black-capped chickadees store food and usually remember which of their caches they have emptied and which they have left full. In Experiment 1, anisomycin was injected either immediately and 2 hr after food caching, or 4 and 6 hr after food caching. Inhibition of protein synthesis impaired memory for cache sites 24 and 48 hr later. In Experiment 2, it was hypothesized that long-term memory for food caches becomes labile as predicted by the reconsolidation hypothesis when birds search for caches. Anisomycin was administered immediately after chickadees had searched for their caches. Inhibition of protein synthesis should disrupt memory for caches left full if these sites are retrieved from long-term memory and require reconsolidation. Control birds were later more likely to revisit full caches than caches they had emptied. Birds given anisomycin revisited both kinds of caches and did not distinguish between them. This result shows that reconsolidation of full caches into long-term memory is not necessary following search for cache sites, but also shows that protein synthesis-dependent consolidation is required for updating the status of emptied caches.
Application and testing of a procedure to evaluate transferability of habitat suitability criteria
Thomas, Jeff A.; Bovee, Ken D.
1993-01-01
A procedure designed to test the transferability of habitat suitability criteria was evaluated in the Cache la Poudre River, Colorado. Habitat suitability criteria were developed for active adult and juvenile rainbow trout in the South Platte River, Colorado. These criteria were tested by comparing microhabitat use predicted from the criteria with observed microhabitat use by adult rainbow trout in the Cache la Poudre River. A one-sided X2 test, using counts of occupied and unoccupied cells in each suitability classification, was used to test for non-random selection for optimum habitat use over usable habitat and for suitable over unsuitable habitat. Criteria for adult rainbow trout were judged to be transferable to the Cache la Poudre River, but juvenile criteria (applied to adults) were not transferable. Random subsampling of occupied and unoccupied cells was conducted to determine the effect of sample size on the reliability of the test procedure. The incidence of type I and type II errors increased rapidly as the sample size was reduced below 55 occupied and 200 unoccupied cells. Recommended modifications to the procedure included the adoption of a systematic or randomized sampling design and direct measurement of microhabitat variables. With these modifications, the procedure is economical, simple and reliable. Use of the procedure as a quality assurance device in routine applications of the instream flow incremental methodology was encouraged.
Mobility-Aware Caching and Computation Offloading in 5G Ultra-Dense Cellular Networks
Chen, Min; Hao, Yixue; Qiu, Meikang; Song, Jeungeun; Wu, Di; Humar, Iztok
2016-01-01
Recent trends show that Internet traffic is increasingly dominated by content, which is accompanied by the exponential growth of traffic. To cope with this phenomena, network caching is introduced to utilize the storage capacity of diverse network devices. In this paper, we first summarize four basic caching placement strategies, i.e., local caching, Device-to-Device (D2D) caching, Small cell Base Station (SBS) caching and Macrocell Base Station (MBS) caching. However, studies show that so far, much of the research has ignored the impact of user mobility. Therefore, taking the effect of the user mobility into consideration, we proposes a joint mobility-aware caching and SBS density placement scheme (MS caching). In addition, differences and relationships between caching and computation offloading are discussed. We present a design of a hybrid computation offloading and support it with experimental results, which demonstrate improved performance in terms of energy cost. Finally, we discuss the design of an incentive mechanism by considering network dynamics, differentiated user’s quality of experience (QoE) and the heterogeneity of mobile terminals in terms of caching and computing capabilities. PMID:27347975
Mobility-Aware Caching and Computation Offloading in 5G Ultra-Dense Cellular Networks.
Chen, Min; Hao, Yixue; Qiu, Meikang; Song, Jeungeun; Wu, Di; Humar, Iztok
2016-06-25
Recent trends show that Internet traffic is increasingly dominated by content, which is accompanied by the exponential growth of traffic. To cope with this phenomena, network caching is introduced to utilize the storage capacity of diverse network devices. In this paper, we first summarize four basic caching placement strategies, i.e., local caching, Device-to-Device (D2D) caching, Small cell Base Station (SBS) caching and Macrocell Base Station (MBS) caching. However, studies show that so far, much of the research has ignored the impact of user mobility. Therefore, taking the effect of the user mobility into consideration, we proposes a joint mobility-aware caching and SBS density placement scheme (MS caching). In addition, differences and relationships between caching and computation offloading are discussed. We present a design of a hybrid computation offloading and support it with experimental results, which demonstrate improved performance in terms of energy cost. Finally, we discuss the design of an incentive mechanism by considering network dynamics, differentiated user's quality of experience (QoE) and the heterogeneity of mobile terminals in terms of caching and computing capabilities.
A cache-aided multiprocessor rollback recovery scheme
NASA Technical Reports Server (NTRS)
Wu, Kun-Lung; Fuchs, W. Kent
1989-01-01
This paper demonstrates how previous uniprocessor cache-aided recovery schemes can be applied to multiprocessor architectures, for recovering from transient processor failures, utilizing private caches and a global shared memory. As with cache-aided uniprocessor recovery, the multiprocessor cache-aided recovery scheme of this paper can be easily integrated into standard bus-based snoopy cache coherence protocols. A consistent shared memory state is maintained without the necessity of global check-pointing.
A distributed parallel storage architecture and its potential application within EOSDIS
NASA Technical Reports Server (NTRS)
Johnston, William E.; Tierney, Brian; Feuquay, Jay; Butzer, Tony
1994-01-01
We describe the architecture, implementation, use of a scalable, high performance, distributed-parallel data storage system developed in the ARPA funded MAGIC gigabit testbed. A collection of wide area distributed disk servers operate in parallel to provide logical block level access to large data sets. Operated primarily as a network-based cache, the architecture supports cooperation among independently owned resources to provide fast, large-scale, on-demand storage to support data handling, simulation, and computation.
Accessing Data Federations with CVMFS
Weitzel, Derek; Bockelman, Brian; Dykstra, Dave; ...
2017-11-23
Data federations have become an increasingly common tool for large collaborations such as CMS and Atlas to efficiently distribute large data files. Unfortunately, these typically are implemented with weak namespace semantics and a non-POSIX API. On the other hand, CVMFS has provided a POSIX-compliant read-only interface for use cases with a small working set size (such as software distribution). The metadata required for the CVMFS POSIX interface is distributed through a caching hierarchy, allowing it to scale to the level of about a hundred thousand hosts. In this paper, we will describe our contributions to CVMFS that merges the datamore » scalability of XRootD-based data federations (such as AAA) with metadata scalability and POSIX interface of CVMFS. We modified CVMFS so it can serve unmodified files without copying them to the repository server. CVMFS 2.2.0 is also able to redirect requests for data files to servers outside of the CVMFS content distribution network. Finally, we added the ability to manage authorization and authentication using security credentials such as X509 proxy certificates. We combined these modifications with the OSGs StashCache regional XRootD caching infrastructure to create a cached data distribution network. Here, we will show performance metrics accessing the data federation through CVMFS compared to direct data federation access. Additionally, we will discuss the improved user experience of providing access to a data federation through a POSIX filesystem.« less
Accessing Data Federations with CVMFS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weitzel, Derek; Bockelman, Brian; Dykstra, Dave
Data federations have become an increasingly common tool for large collaborations such as CMS and Atlas to efficiently distribute large data files. Unfortunately, these typically are implemented with weak namespace semantics and a non-POSIX API. On the other hand, CVMFS has provided a POSIX-compliant read-only interface for use cases with a small working set size (such as software distribution). The metadata required for the CVMFS POSIX interface is distributed through a caching hierarchy, allowing it to scale to the level of about a hundred thousand hosts. In this paper, we will describe our contributions to CVMFS that merges the datamore » scalability of XRootD-based data federations (such as AAA) with metadata scalability and POSIX interface of CVMFS. We modified CVMFS so it can serve unmodified files without copying them to the repository server. CVMFS 2.2.0 is also able to redirect requests for data files to servers outside of the CVMFS content distribution network. Finally, we added the ability to manage authorization and authentication using security credentials such as X509 proxy certificates. We combined these modifications with the OSGs StashCache regional XRootD caching infrastructure to create a cached data distribution network. Here, we will show performance metrics accessing the data federation through CVMFS compared to direct data federation access. Additionally, we will discuss the improved user experience of providing access to a data federation through a POSIX filesystem.« less
Accessing Data Federations with CVMFS
NASA Astrophysics Data System (ADS)
Weitzel, Derek; Bockelman, Brian; Dykstra, Dave; Blomer, Jakob; Meusel, Ren
2017-10-01
Data federations have become an increasingly common tool for large collaborations such as CMS and Atlas to efficiently distribute large data files. Unfortunately, these typically are implemented with weak namespace semantics and a non-POSIX API. On the other hand, CVMFS has provided a POSIX-compliant read-only interface for use cases with a small working set size (such as software distribution). The metadata required for the CVMFS POSIX interface is distributed through a caching hierarchy, allowing it to scale to the level of about a hundred thousand hosts. In this paper, we will describe our contributions to CVMFS that merges the data scalability of XRootD-based data federations (such as AAA) with metadata scalability and POSIX interface of CVMFS. We modified CVMFS so it can serve unmodified files without copying them to the repository server. CVMFS 2.2.0 is also able to redirect requests for data files to servers outside of the CVMFS content distribution network. Finally, we added the ability to manage authorization and authentication using security credentials such as X509 proxy certificates. We combined these modifications with the OSGs StashCache regional XRootD caching infrastructure to create a cached data distribution network. We will show performance metrics accessing the data federation through CVMFS compared to direct data federation access. Additionally, we will discuss the improved user experience of providing access to a data federation through a POSIX filesystem.
Gara, Alan; Ohmacht, Martin
2014-09-16
In a multiprocessor system with at least two levels of cache, a speculative thread may run on a core processor in parallel with other threads. When the thread seeks to do a write to main memory, this access is to be written through the first level cache to the second level cache. After the write though, the corresponding line is deleted from the first level cache and/or prefetch unit, so that any further accesses to the same location in main memory have to be retrieved from the second level cache. The second level cache keeps track of multiple versions of data, where more than one speculative thread is running in parallel, while the first level cache does not have any of the versions during speculation. A switch allows choosing between modes of operation of a speculation blind first level cache.
Re-caching by Western scrub-jays (Aphelocoma californica) cannot be attributed to stress.
Thom, James M; Clayton, Nicola S
2013-01-01
Western scrub-jays (Aphelocoma californica) live double lives, storing food for the future while raiding the stores of other birds. One tactic scrub-jays employ to protect stores is "re-caching"-relocating caches out of sight of would-be thieves. Recent computational modelling work suggests that re-caching might be mediated not by complex cognition, but by a combination of memory failure and stress. The "Stress Model" asserts that re-caching is a manifestation of a general drive to cache, rather than a desire to protect existing stores. Here, we present evidence strongly contradicting the central assumption of these models: that stress drives caching, irrespective of social context. In Experiment (i), we replicate the finding that scrub-jays preferentially relocate food they were watched hiding. In Experiment (ii) we find no evidence that stress increases caching. In light of our results, we argue that the Stress Model cannot account for scrub-jay re-caching.
Jakopak, Rhiannon P.; Hall, L. Embere; Chalfoun, Anna D.
2017-01-01
Many mammals create food stores to enhance overwinter survival in seasonal environments. Strategic arrangement of food within caches may facilitate the physical integrity of the cache or improve access to high-quality food to ensure that cached resources meet future nutritional demands. We used the American pika (Ochotona princeps), a food-caching lagomorph, to evaluate variation in haypile (cache) structure (i.e., horizontal layering by plant functional group) in Wyoming, United States. Fifty-five percent of 62 haypiles contained at least 2 discrete layers of vegetation. Adults and juveniles layered haypiles in similar proportions. The probability of layering increased with haypile volume, but not haypile number per individual or nearby forage diversity. Vegetation cached in layered haypiles was also higher in nitrogen compared to vegetation in unlayered piles. We found that American pikas frequently structured their food caches, structured caches were larger, and the cached vegetation in structured piles was of higher nutritional quality. Improving access to stable, high-quality vegetation in haypiles, a critical overwinter food resource, may allow individuals to better persist amidst harsh conditions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh; Zhang, Zhao; Vetter, Jeffrey S
Recent trends of CMOS scaling and use of large last level caches (LLCs) have led to significant increase in the leakage energy consumption of LLCs and hence, managing their energy consumption has become extremely important in modern processor design. The conventional cache energy saving techniques require offline profiling or provide only coarse granularity of cache allocation. We present FlexiWay, a cache energy saving technique which uses dynamic cache reconfiguration. FlexiWay logically divides the cache sets into multiple (e.g. 16) modules and dynamically turns off suitable and possibly different number of cache ways in each module. FlexiWay has very small implementationmore » overhead and it provides fine-grain cache allocation even with caches of typical associativity, e.g. an 8-way cache. Microarchitectural simulations have been performed using an x86-64 simulator and workloads from SPEC2006 suite. Also, FlexiWay has been compared with two conventional energy saving techniques. The results show that FlexiWay provides largest energy saving and incurs only small loss in performance. For single, dual and quad core systems, the average energy saving using FlexiWay are 26.2%, 25.7% and 22.4%, respectively.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Millar, A. P.; Baranova, T.; Behrmann, G.
For over a decade, dCache has been synonymous with large-capacity, fault-tolerant storage using commodity hardware that supports seamless data migration to and from tape. In this paper we provide some recent news of changes within dCache and the community surrounding it. We describe the flexible nature of dCache that allows both externally developed enhancements to dCache facilities and the adoption of new technologies. Finally, we present information about avenues the dCache team is exploring for possible future improvements in dCache.
A Distributed Cache Update Deployment Strategy in CDN
NASA Astrophysics Data System (ADS)
E, Xinhua; Zhu, Binjie
2018-04-01
The CDN management system distributes content objects to the edge of the internet to achieve the user's near access. Cache strategy is an important problem in network content distribution. A cache strategy was designed in which the content effective diffusion in the cache group, so more content was storage in the cache, and it improved the group hit rate.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clark, M. A.; Strelchenko, Alexei; Vaquero, Alejandro
Lattice quantum chromodynamics simulations in nuclear physics have benefited from a tremendous number of algorithmic advances such as multigrid and eigenvector deflation. These improve the time to solution but do not alleviate the intrinsic memory-bandwidth constraints of the matrix-vector operation dominating iterative solvers. Batching this operation for multiple vectors and exploiting cache and register blocking can yield a super-linear speed up. Block-Krylov solvers can naturally take advantage of such batched matrix-vector operations, further reducing the iterations to solution by sharing the Krylov space between solves. However, practical implementations typically suffer from the quadratic scaling in the number of vector-vector operations.more » Using the QUDA library, we present an implementation of a block-CG solver on NVIDIA GPUs which reduces the memory-bandwidth complexity of vector-vector operations from quadratic to linear. We present results for the HISQ discretization, showing a 5x speedup compared to highly-optimized independent Krylov solves on NVIDIA's SaturnV cluster.« less
Smart caching based on mobile agent of power WebGIS platform.
Wang, Xiaohui; Wu, Kehe; Chen, Fei
2013-01-01
Power information construction is developing towards intensive, platform, distributed direction with the expansion of power grid and improvement of information technology. In order to meet the trend, power WebGIS was designed and developed. In this paper, we first discuss the architecture and functionality of power WebGIS, and then we study caching technology in detail, which contains dynamic display cache model, caching structure based on mobile agent, and cache data model. We have designed experiments of different data capacity to contrast performance between WebGIS with the proposed caching model and traditional WebGIS. The experimental results showed that, with the same hardware environment, the response time of WebGIS with and without caching model increased as data capacity growing, while the larger the data was, the higher the performance of WebGIS with proposed caching model improved.
Elements of episodic-like memory in animals.
Clayton, N S; Griffiths, D P; Emery, N J; Dickinson, A
2001-09-29
A number of psychologists have suggested that episodic memory is a uniquely human phenomenon and, until recently, there was little evidence that animals could recall a unique past experience and respond appropriately. Experiments on food-caching memory in scrub jays question this assumption. On the basis of a single caching episode, scrub jays can remember when and where they cached a variety of foods that differ in the rate at which they degrade, in a way that is inexplicable by relative familiarity. They can update their memory of the contents of a cache depending on whether or not they have emptied the cache site, and can also remember where another bird has hidden caches, suggesting that they encode rich representations of the caching event. They make temporal generalizations about when perishable items should degrade and also remember the relative time since caching when the same food is cached in distinct sites at different times. These results show that jays form integrated memories for the location, content and time of caching. This memory capability fulfils Tulving's behavioural criteria for episodic memory and is thus termed 'episodic-like'. We suggest that several features of episodic memory may not be unique to humans.
Pravosudov, V V; Clayton, N S
2001-02-22
Birds rely, at least in part, on spatial memory for recovering previously hidden caches but accurate cache recovery may be more critical for birds that forage in harsh conditions where the food supply is limited and unpredictable. Failure to find caches in these conditions may potentially result in death from starvation. In order to test this hypothesis we compared the cache recovery behaviour of 24 wild-caught mountain chickadees (Poecile gambeli), half of which were maintained on a limited and unpredictable food supply while the rest were maintained on an ad libitum food supply for 60 days. We then tested their cache retrieval accuracy by allowing birds from both groups to cache seeds in the experimental room and recover them 5 hours later. Our results showed that birds maintained on a limited and unpredictable food supply made significantly fewer visits to non-cache sites when recovering their caches compared to birds maintained on ad libitum food. We found the same difference in performance in two versions of a one-trial associative learning task in which the birds had to rely on memory to find previously encountered hidden food. In a non-spatial memory version of the task, in which the baited feeder was clearly marked, there were no significant differences between the two groups. We therefore concluded that the two groups differed in their efficiency at cache retrieval. We suggest that this difference is more likely to be attributable to a difference in memory (encoding or recall) than to a difference in their motivation to search for hidden food, although the possibility of some motivational differences still exists. Overall, our results suggest that demanding foraging conditions favour more accurate cache retrieval in food-caching birds.
Determinants of seed removal distance by scatter-hoarding rodents in deciduous forests.
Moore, Jeffrey E; McEuen, Amy B; Swihart, Robert K; Contreras, Thomas A; Steele, Michael A
2007-10-01
Scatter-hoarding rodents should space food caches to maximize cache recovery rate (to minimize loss to pilferers) relative to the energetic cost of carrying food items greater distances. Optimization models of cache spacing make two predictions. First, spacing of caches should be greater for food items with greater energy content. Second, the mean distance between caches should increase with food abundance. However, the latter prediction fails to account for the effect of food abundance on the behavior of potential pilferers or on the ability of caching individuals to acquire food by means other than recovering their own caches. When considering these factors, shorter cache distances may be predicted in conditions of higher food abundance. We predicted that seed caching distances would be greater for food items of higher energy content and during lower ambient food abundance and that the effect of seed type on cache distance variation would be lower during higher food abundance. We recorded distances moved for 8636 seeds of five seed types at 15 locations in three forested sites in Pennsylvania, USA, and 29 forest fragments in Indiana, U.S.A., across five different years. Seed production was poor in three years and high in two years. Consistent with previous studies, seeds with greater energy content were moved farther than less profitable food items. Seeds were dispersed less far in seed-rich years than in seed-poor years, contrary to predictions of conventional models. Interactions were important, with seed type effects more evident in seed-poor years. These results suggest that, when food is superabundant, optimal cache distances are more strongly determined by minimizing energy cost of caching than by minimizing pilfering rates and that cache loss rates may be more strongly density-dependent in times of low seed abundance.
Addressing Inter-set Write-Variation for Improving Lifetime of Non-Volatile Caches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh; Vetter, Jeffrey S
We propose a technique which minimizes inter-set write variation in NVM caches for improving its lifetime. Our technique uses cache coloring scheme to add a software-controlled mapping layer between groups of physical pages (called memory regions) and cache sets. Periodically, the number of writes to different colors of the cache is computed and based on this result, the mapping of a few colors is changed to channel the write traffic to least utilized cache colors. This change helps to achieve wear-leveling.
NASA Astrophysics Data System (ADS)
Steele, Michael A.; Bugdal, Melissa; Yuan, Amy; Bartlow, Andrew; Buzalewski, Jarrod; Lichti, Nathan; Swihart, Robert
2011-11-01
Scatter-hoarding mammals are thought to rely on spatial memory to relocate food caches. Yet, we know little about how long these granivores (primarily rodents) recall specific cache locations or whether individual hoarders have an advantage when recovering their own caches. Indeed, a few recent studies suggest that high rates of pilferage are common and that individual hoarders may not have a retriever's advantage. We tested this hypothesis in a high-density (>7 animals/ha) population of eastern gray squirrels ( Sciurus carolinensis) by presenting individually marked animals (>20) with tagged acorns, mapping cache sites, and following the fate of seed caches. PIT tags allowed us to monitor individual seeds without disturbing cache sites. Acorns only remained in the caches for 12-119 h (0.5-5 d). However, when we live-trapped and removed some animals from the site immediately after they stored seeds (thus simulating predation), their seed caches remained intact for significantly longer periods (16-27 d). Cache duration corresponded roughly to the time at which squirrels were returned to the study area. These results suggest that squirrels have a retriever's advantage and may remember specific cache sites longer than previously thought. We further suggest that predation of scatter hoarders who store seeds for long periods and also possess a recovery advantage may be one important mechanism by which seed establishment is achieved.
Smart Caching Based on Mobile Agent of Power WebGIS Platform
Wang, Xiaohui; Wu, Kehe; Chen, Fei
2013-01-01
Power information construction is developing towards intensive, platform, distributed direction with the expansion of power grid and improvement of information technology. In order to meet the trend, power WebGIS was designed and developed. In this paper, we first discuss the architecture and functionality of power WebGIS, and then we study caching technology in detail, which contains dynamic display cache model, caching structure based on mobile agent, and cache data model. We have designed experiments of different data capacity to contrast performance between WebGIS with the proposed caching model and traditional WebGIS. The experimental results showed that, with the same hardware environment, the response time of WebGIS with and without caching model increased as data capacity growing, while the larger the data was, the higher the performance of WebGIS with proposed caching model improved. PMID:24288504
Integrating deliberative planning in a robot architecture
NASA Technical Reports Server (NTRS)
Elsaesser, Chris; Slack, Marc G.
1994-01-01
The role of planning and reactive control in an architecture for autonomous agents is discussed. The postulated architecture seperates the general robot intelligence problem into three interacting pieces: (1) robot reactive skills, i.e., grasping, object tracking, etc.; (2) a sequencing capability to differentially ativate the reactive skills; and (3) a delibrative planning capability to reason in depth about goals, preconditions, resources, and timing constraints. Within the sequencing module, caching techniques are used for handling routine activities. The planning system then builds on these cached solutions to routine tasks to build larger grain sized primitives. This eliminates large numbers of essentially linear planning problems. The architecture will be used in the future to incorporate in robots cognitive capabilites normally associated with intelligent behavior.
Pravosudov, Vladimir V
2003-12-22
It is widely assumed that chronic stress and corresponding chronic elevations of glucocorticoid levels have deleterious effects on animals' brain functions such as learning and memory. Some animals, however, appear to maintain moderately elevated levels of glucocorticoids over long periods of time under natural energetically demanding conditions, and it is not clear whether such chronic but moderate elevations may be adaptive. I implanted wild-caught food-caching mountain chickadees (Poecile gambeli), which rely at least in part on spatial memory to find their caches, with 90-day continuous time-release corticosterone pellets designed to approximately double the baseline corticosterone levels. Corticosterone-implanted birds cached and consumed significantly more food and showed more efficient cache recovery and superior spatial memory performance compared with placebo-implanted birds. Thus, contrary to prevailing assumptions, long-term moderate elevations of corticosterone appear to enhance spatial memory in food-caching mountain chickadees. These results suggest that moderate chronic elevation of corticosterone may serve as an adaptation to unpredictable environments by facilitating feeding and food-caching behaviour and by improving cache-retrieval efficiency in food-caching birds.
NASA Astrophysics Data System (ADS)
Li, Hao; Xie, Lunguo
2013-03-01
The design of cache system for Chip Multiprocessor (CMP) face many challenges because future CMPs will have more cores and greater on-chip cache capacity. There are two base design schemes about L2 cache: private scheme in which each L2 slice is treated as a private L2 cache and shared scheme in which all L2 slices are treated as a large L2 cache shared by all cores. Private caches provide the lowest hit latency but reduce the total effective cache capacity. A shared L2 cache increases the effective cache capacity but has long hit latencies when data is on a remote tile. This paper present a new Controlled Replication (CR) policy to reduce the capacities occupied by redundant shared replicas. the new CR policy increases the effective capacity than victim replication scheme and has lower hit latency than shared scheme. We evaluate the various schemes using full-system simulation of parallel applications. Results show that CR reduces the average memory access latency of shared scheme by an average of 13%, providing better overall performance than victim replication and shared schemes.
Zhang, Mingming; Steele, Michael A; Yi, Xianfeng
2013-11-01
The question of how tannin affects feeding and hoarding preferences of rodents still remains poorly understood, in part, because it is difficult to control for other seed traits when considering the sole effect of tannin. Here, we constructed a series of artificial 'seeds' with different tannin levels, made from wheat flour, peanut powder and hydrolysable tannins, to determine the direct effects of tannin on both feeding and hoarding preferences. We first presented 'seeds' to individual rodents of two species (Tamias sibiricus and Apodemus peninsulae) confined in semi-natural enclosures and then monitored patterns of seed dispersal and consumption by free-ranging animals in a temperate forest in the Xiaoxing'an Mountains, Heilongjiang Province of China. Our results showed that small rodents displayed a significant preference for low-tannin 'seeds' for both consumption and caching in both captive and field experiments. Moreover, our two-year study consistently showed that tannin concentration was significantly and negatively correlated with the number of cached 'seeds' at both the individual and population levels. Seed size, compared with tannin concentrations, appeared to have little effect on dispersal distances and the number of 'seeds' cached. Low-tannin 'seeds' tended to be dispersed greater distances by rodents in the field than those with higher levels of tannin. These results failed to support those of previous reports indicating that acorns containing higher tannins are more likely to be cached by food hoarding animals. Copyright © 2013 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boyle, Peter; Christ, Norman; Gara, Alan
A list prefetch engine improves a performance of a parallel computing system. The list prefetch engine receives a current cache miss address. The list prefetch engine evaluates whether the current cache miss address is valid. If the current cache miss address is valid, the list prefetch engine compares the current cache miss address and a list address. A list address represents an address in a list. A list describes an arbitrary sequence of prior cache miss addresses. The prefetch engine prefetches data according to the list, if there is a match between the current cache miss address and the listmore » address.« less
Boyle, Peter [Edinburgh, GB; Christ, Norman [Irvington, NY; Gara, Alan [Yorktown Heights, NY; Kim,; Changhoan, [San Jose, CA; Mawhinney, Robert [New York, NY; Ohmacht, Martin [Yorktown Heights, NY; Sugavanam, Krishnan [Yorktown Heights, NY
2012-08-28
A list prefetch engine improves a performance of a parallel computing system. The list prefetch engine receives a current cache miss address. The list prefetch engine evaluates whether the current cache miss address is valid. If the current cache miss address is valid, the list prefetch engine compares the current cache miss address and a list address. A list address represents an address in a list. A list describes an arbitrary sequence of prior cache miss addresses. The prefetch engine prefetches data according to the list, if there is a match between the current cache miss address and the list address.
Analysis of DNS Cache Effects on Query Distribution
2013-01-01
This paper studies the DNS cache effects that occur on query distribution at the CN top-level domain (TLD) server. We first filter out the malformed DNS queries to purify the log data pollution according to six categories. A model for DNS resolution, more specifically DNS caching, is presented. We demonstrate the presence and magnitude of DNS cache effects and the cache sharing effects on the request distribution through analytic model and simulation. CN TLD log data results are provided and analyzed based on the cache model. The approximate TTL distribution for domain name is inferred quantificationally. PMID:24396313
Analysis of DNS cache effects on query distribution.
Wang, Zheng
2013-01-01
This paper studies the DNS cache effects that occur on query distribution at the CN top-level domain (TLD) server. We first filter out the malformed DNS queries to purify the log data pollution according to six categories. A model for DNS resolution, more specifically DNS caching, is presented. We demonstrate the presence and magnitude of DNS cache effects and the cache sharing effects on the request distribution through analytic model and simulation. CN TLD log data results are provided and analyzed based on the cache model. The approximate TTL distribution for domain name is inferred quantificationally.
An Adaptive Insertion and Promotion Policy for Partitioned Shared Caches
NASA Astrophysics Data System (ADS)
Mahrom, Norfadila; Liebelt, Michael; Raof, Rafikha Aliana A.; Daud, Shuhaizar; Hafizah Ghazali, Nur
2018-03-01
Cache replacement policies in chip multiprocessors (CMP) have been investigated extensively and proven able to enhance shared cache management. However, competition among multiple processors executing different threads that require simultaneous access to a shared memory may cause cache contention and memory coherence problems on the chip. These issues also exist due to some drawbacks of the commonly used Least Recently Used (LRU) policy employed in multiprocessor systems, which are because of the cache lines residing in the cache longer than required. In image processing analysis of for example extra pulmonary tuberculosis (TB), an accurate diagnosis for tissue specimen is required. Therefore, a fast and reliable shared memory management system to execute algorithms for processing vast amount of specimen image is needed. In this paper, the effects of the cache replacement policy in a partitioned shared cache are investigated. The goal is to quantify whether better performance can be achieved by using less complex replacement strategies. This paper proposes a Middle Insertion 2 Positions Promotion (MI2PP) policy to eliminate cache misses that could adversely affect the access patterns and the throughput of the processors in the system. The policy employs a static predefined insertion point, near distance promotion, and the concept of ownership in the eviction policy to effectively improve cache thrashing and to avoid resource stealing among the processors.
Comparing NetCDF and SciDB on managing and querying 5D hydrologic dataset
NASA Astrophysics Data System (ADS)
Liu, Haicheng; Xiao, Xiao
2016-11-01
Efficiently extracting information from high dimensional hydro-meteorological modelling datasets requires smart solutions. Traditional methods are mostly based on files, which can be edited and accessed handily. But they have problems of efficiency due to contiguous storage structure. Others propose databases as an alternative for advantages such as native functionalities for manipulating multidimensional (MD) arrays, smart caching strategy and scalability. In this research, NetCDF file based solutions and the multidimensional array database management system (DBMS) SciDB applying chunked storage structure are benchmarked to determine the best solution for storing and querying 5D large hydrologic modelling dataset. The effect of data storage configurations including chunk size, dimension order and compression on query performance is explored. Results indicate that dimension order to organize storage of 5D data has significant influence on query performance if chunk size is very large. But the effect becomes insignificant when chunk size is properly set. Compression of SciDB mostly has negative influence on query performance. Caching is an advantage but may be influenced by execution of different query processes. On the whole, NetCDF solution without compression is in general more efficient than the SciDB DBMS.
Cache Energy Optimization Techniques For Modern Processors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh
2013-01-01
Modern multicore processors are employing large last-level caches, for example Intel's E7-8800 processor uses 24MB L3 cache. Further, with each CMOS technology generation, leakage energy has been dramatically increasing and hence, leakage energy is expected to become a major source of energy dissipation, especially in last-level caches (LLCs). The conventional schemes of cache energy saving either aim at saving dynamic energy or are based on properties specific to first-level caches, and thus these schemes have limited utility for last-level caches. Further, several other techniques require offline profiling or per-application tuning and hence are not suitable for product systems. In thismore » book, we present novel cache leakage energy saving schemes for single-core and multicore systems; desktop, QoS, real-time and server systems. Also, we present cache energy saving techniques for caches designed with both conventional SRAM devices and emerging non-volatile devices such as STT-RAM (spin-torque transfer RAM). We present software-controlled, hardware-assisted techniques which use dynamic cache reconfiguration to configure the cache to the most energy efficient configuration while keeping the performance loss bounded. To profile and test a large number of potential configurations, we utilize low-overhead, micro-architecture components, which can be easily integrated into modern processor chips. We adopt a system-wide approach to save energy to ensure that cache reconfiguration does not increase energy consumption of other components of the processor. We have compared our techniques with state-of-the-art techniques and have found that our techniques outperform them in terms of energy efficiency and other relevant metrics. The techniques presented in this book have important applications in improving energy-efficiency of higher-end embedded, desktop, QoS, real-time, server processors and multitasking systems. This book is intended to be a valuable guide for both newcomers and veterans in the field of cache power management. It will help graduate students, CAD tool developers and designers in understanding the need of energy efficiency in modern computing systems. Further, it will be useful for researchers in gaining insights into algorithms and techniques for micro-architectural and system-level energy optimization using dynamic cache reconfiguration. We sincerely believe that the ``food for thought'' presented in this book will inspire the readers to develop even better ideas for designing ``green'' processors of tomorrow.« less
Cache as point of coherence in multiprocessor system
Blumrich, Matthias A.; Ceze, Luis H.; Chen, Dong; Gara, Alan; Heidelberger, Phlip; Ohmacht, Martin; Steinmacher-Burow, Burkhard; Zhuang, Xiaotong
2016-11-29
In a multiprocessor system, a conflict checking mechanism is implemented in the L2 cache memory. Different versions of speculative writes are maintained in different ways of the cache. A record of speculative writes is maintained in the cache directory. Conflict checking occurs as part of directory lookup. Speculative versions that do not conflict are aggregated into an aggregated version in a different way of the cache. Speculative memory access requests do not go to main memory.
A Refreshable, On-line Cache for HST Data Retrieval
NASA Astrophysics Data System (ADS)
Fraquelli, Dorothy A.; Ellis, Tracy A.; Ridgaway, Michael; DPAS Team
2016-01-01
We discuss upgrades to the HST Data Processing System, with an emphasis on the changes Hubble Space Telescope (HST) Archive users will experience. In particular, data are now held on-line (in a cache) removing the need to reprocess the data every time they are requested from the Archive. OTFR (on the fly reprocessing) has been replaced by a reprocessing system, which runs in the background. Data in the cache are automatically placed in the reprocessing queue when updated calibration reference files are received or when an improved calibration algorithm is installed. Data in the on-line cache are expected to be the most up to date version. These changes were phased in throughout 2015 for all active instruments.The on-line cache was populated instrument by instrument over the course of 2015. As data were placed in the cache, the flag that triggers OTFR was reset so that OTFR no longer runs on these data. "Hybrid" requests to the Archive are handled transparently, with data not yet in the cache provided via OTFR and the remaining data provided from the cache. Users do not need to make separate requests.Users of the MAST Portal will be able to download data from the cache immediately. For data not in the cache, the Portal will send the user to the standard "Retrieval Options Page," allowing the user to direct the Archive to process and deliver the data.The classic MAST Search and Retrieval interface has the same look and feel as previously. Minor changes, unrelated to the cache, have been made to the format of the Retrieval Options Page.
An Intelligent Cloud Storage Gateway for Medical Imaging.
Viana-Ferreira, Carlos; Guerra, António; Silva, João F; Matos, Sérgio; Costa, Carlos
2017-09-01
Historically, medical imaging repositories have been supported by indoor infrastructures. However, the amount of diagnostic imaging procedures has continuously increased over the last decades, imposing several challenges associated with the storage volume, data redundancy and availability. Cloud platforms are focused on delivering hardware and software services over the Internet, becoming an appealing solution for repository outsourcing. Although this option may bring financial and technological benefits, it also presents new challenges. In medical imaging scenarios, communication latency is a critical issue that still hinders the adoption of this paradigm. This paper proposes an intelligent Cloud storage gateway that optimizes data access times. This is achieved through a new cache architecture that combines static rules and pattern recognition for eviction and prefetching. The evaluation results, obtained from experiments over a real-world dataset, show that cache hit ratios can reach around 80%, leading to reductions of image retrieval times by over 60%. The combined use of eviction and prefetching policies proposed can significantly reduce communication latency, even when using a small cache in comparison to the total size of the repository. Apart from the performance gains, the proposed system is capable of adjusting to specific workflows of different institutions.
WriteShield: A Pseudo Thin Client for Prevention of Information Leakage
NASA Astrophysics Data System (ADS)
Kirihata, Yasuhiro; Sameshima, Yoshiki; Onoyama, Takashi; Komoda, Norihisa
While thin-client systems are diffusing as an effective security method in enterprises and organizations, there is a new approach called pseudo thin-client system. In this system, local disks of clients are write-protected and user data is forced to save on the central file server to realize the same security effect of conventional thin-client systems. Since it takes purely the software-based simple approach, it does not require the hardware enhancement of network and servers to reduce the installation cost. However there are several problems such as no write control to external media, memory depletion possibility, and lower security because of the exceptional write permission to the system processes. In this paper, we propose WriteShield, a pseudo thin-client system which solves these issues. In this system, the local disks are write-protected with volume filter driver and it has a virtual cache mechanism to extend the memory cache size for the write protection. This paper presents design and implementation details of WriteShield. Besides we describe the security analysis and simulation evaluation of paging algorithms for virtual cache mechanism and measure the disk I/O performance to verify its feasibility in the actual environment.
Conditional load and store in a shared memory
Blumrich, Matthias A; Ohmacht, Martin
2015-02-03
A method, system and computer program product for implementing load-reserve and store-conditional instructions in a multi-processor computing system. The computing system includes a multitude of processor units and a shared memory cache, and each of the processor units has access to the memory cache. In one embodiment, the method comprises providing the memory cache with a series of reservation registers, and storing in these registers addresses reserved in the memory cache for the processor units as a result of issuing load-reserve requests. In this embodiment, when one of the processor units makes a request to store data in the memory cache using a store-conditional request, the reservation registers are checked to determine if an address in the memory cache is reserved for that processor unit. If an address in the memory cache is reserved for that processor, the data are stored at this address.
A Survey Of Techniques for Managing and Leveraging Caches in GPUs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh
2014-09-01
Initially introduced as special-purpose accelerators for graphics applications, graphics processing units (GPUs) have now emerged as general purpose computing platforms for a wide range of applications. To address the requirements of these applications, modern GPUs include sizable hardware-managed caches. However, several factors, such as unique architecture of GPU, rise of CPU–GPU heterogeneous computing, etc., demand effective management of caches to achieve high performance and energy efficiency. Recently, several techniques have been proposed for this purpose. In this paper, we survey several architectural and system-level techniques proposed for managing and leveraging GPU caches. We also discuss the importance and challenges ofmore » cache management in GPUs. The aim of this paper is to provide the readers insights into cache management techniques for GPUs and motivate them to propose even better techniques for leveraging the full potential of caches in the GPUs of tomorrow.« less
Toward Millions of File System IOPS on Low-Cost, Commodity Hardware
Zheng, Da; Burns, Randal; Szalay, Alexander S.
2013-01-01
We describe a storage system that removes I/O bottlenecks to achieve more than one million IOPS based on a user-space file abstraction for arrays of commodity SSDs. The file abstraction refactors I/O scheduling and placement for extreme parallelism and non-uniform memory and I/O. The system includes a set-associative, parallel page cache in the user space. We redesign page caching to eliminate CPU overhead and lock-contention in non-uniform memory architecture machines. We evaluate our design on a 32 core NUMA machine with four, eight-core processors. Experiments show that our design delivers 1.23 million 512-byte read IOPS. The page cache realizes the scalable IOPS of Linux asynchronous I/O (AIO) and increases user-perceived I/O performance linearly with cache hit rates. The parallel, set-associative cache matches the cache hit rates of the global Linux page cache under real workloads. PMID:24402052
Toward Millions of File System IOPS on Low-Cost, Commodity Hardware.
Zheng, Da; Burns, Randal; Szalay, Alexander S
2013-01-01
We describe a storage system that removes I/O bottlenecks to achieve more than one million IOPS based on a user-space file abstraction for arrays of commodity SSDs. The file abstraction refactors I/O scheduling and placement for extreme parallelism and non-uniform memory and I/O. The system includes a set-associative, parallel page cache in the user space. We redesign page caching to eliminate CPU overhead and lock-contention in non-uniform memory architecture machines. We evaluate our design on a 32 core NUMA machine with four, eight-core processors. Experiments show that our design delivers 1.23 million 512-byte read IOPS. The page cache realizes the scalable IOPS of Linux asynchronous I/O (AIO) and increases user-perceived I/O performance linearly with cache hit rates. The parallel, set-associative cache matches the cache hit rates of the global Linux page cache under real workloads.
Cache coherency without line exclusivity in MP systems having store-in caches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pomerene, J.H.; Puzak, T.R.; Rechtschaffen, R.N.
1983-11-01
By modifying the function of the storage control unit, a multiprocessor (MP) system having store-in caches is enabled to operate with the same versatility as an MP system having store-through caches, thereby eliminating the requirement for line exclusivity and greatly reducing the occurrence of cross-interrogates.
Nature as a treasure map! Teaching geoscience with the help of earth caches?!
NASA Astrophysics Data System (ADS)
Zecha, Stefanie; Schiller, Thomas
2015-04-01
This presentation looks at how earth caches are influence the learning process in the field of geo science in non-formal education. The development of mobile technologies using Global Positioning System (GPS) data to point geographical location together with the evolving Web 2.0 supporting the creation and consumption of content, suggest a potential for collaborative informal learning linked to location. With the help of the GIS in smartphones you can go directly in nature, search for information by your smartphone, and learn something about nature. Earth caches are a very good opportunity, which are organized and supervised geocaches with special information about physical geography high lights. Interested people can inform themselves about aspects in geoscience area by earth caches. The main question of this presentation is how these caches are created in relation to learning processes. As is not possible, to analyze all existing earth caches, there was focus on Bavaria and a certain feature of earth caches. At the end the authors show limits and potentials for the use of earth caches and give some remark for the future.
Explicit Content Caching at Mobile Edge Networks with Cross-Layer Sensing
Chen, Lingyu; Su, Youxing; Luo, Wenbin; Hong, Xuemin; Shi, Jianghong
2018-01-01
The deployment density and computational power of small base stations (BSs) are expected to increase significantly in the next generation mobile communication networks. These BSs form the mobile edge network, which is a pervasive and distributed infrastructure that can empower a variety of edge/fog computing applications. This paper proposes a novel edge-computing application called explicit caching, which stores selective contents at BSs and exposes such contents to local users for interactive browsing and download. We formulate the explicit caching problem as a joint content recommendation, caching, and delivery problem, which aims to maximize the expected user quality-of-experience (QoE) with varying degrees of cross-layer sensing capability. Optimal and effective heuristic algorithms are presented to solve the problem. The theoretical performance bounds of the explicit caching system are derived in simplified scenarios. The impacts of cache storage space, BS backhaul capacity, cross-layer information, and user mobility on the system performance are simulated and discussed in realistic scenarios. Results suggest that, compared with conventional implicit caching schemes, explicit caching can better exploit the mobile edge network infrastructure for personalized content dissemination. PMID:29565313
Explicit Content Caching at Mobile Edge Networks with Cross-Layer Sensing.
Chen, Lingyu; Su, Youxing; Luo, Wenbin; Hong, Xuemin; Shi, Jianghong
2018-03-22
The deployment density and computational power of small base stations (BSs) are expected to increase significantly in the next generation mobile communication networks. These BSs form the mobile edge network, which is a pervasive and distributed infrastructure that can empower a variety of edge/fog computing applications. This paper proposes a novel edge-computing application called explicit caching, which stores selective contents at BSs and exposes such contents to local users for interactive browsing and download. We formulate the explicit caching problem as a joint content recommendation, caching, and delivery problem, which aims to maximize the expected user quality-of-experience (QoE) with varying degrees of cross-layer sensing capability. Optimal and effective heuristic algorithms are presented to solve the problem. The theoretical performance bounds of the explicit caching system are derived in simplified scenarios. The impacts of cache storage space, BS backhaul capacity, cross-layer information, and user mobility on the system performance are simulated and discussed in realistic scenarios. Results suggest that, compared with conventional implicit caching schemes, explicit caching can better exploit the mobile edge network infrastructure for personalized content dissemination.
On the Feasibility of Prefetching and Caching for Online TV Services: A Measurement Study on Hulu
NASA Astrophysics Data System (ADS)
Krishnappa, Dilip Kumar; Khemmarat, Samamon; Gao, Lixin; Zink, Michael
Lately researchers are looking at ways to reduce the delay on video playback through mechanisms like prefetching and caching for Video-on-Demand (VoD) services. The usage of prefetching and caching also has the potential to reduce the amount of network bandwidth usage, as most popular requests are served from a local cache rather than the server containing the original content. In this paper, we investigate the advantages of having such a prefetching and caching scheme for a free hosting service of professionally created video (movies and TV shows) named "hulu". We look into the advantages of using a prefetching scheme where the most popular videos of the week, as provided by the hulu website, are prefetched and compare this approach with a conventional LRU caching scheme with limited storage space and a combined scheme of prefetching and caching. Results from our measurement and analysis shows that employing a basic caching scheme at the proxy yields a hit ratio of up to 77.69%, but requires storage of about 236GB. Further analysis shows that a prefetching scheme where the top-100 popular videos of the week are downloaded to the proxy yields a hit ratio of 44% with a storage requirement of 10GB. A LRU caching scheme with a storage limitation of 20GB can achieve a hit ratio of 55% but downloads 4713 videos to achieve such high hit ratio compared to 100 videos in prefetching scheme, whereas a scheme with both prefetching and caching with the same storage yields a hit ratio of 59% with download requirement of 4439 videos. We find that employing a scheme of prefetching along with caching with trade-off on the storage will yield a better hit ratio and bandwidth saving than individual caching or prefetching schemes.
Corvid re-caching without 'theory of mind': a model.
van der Vaart, Elske; Verbrugge, Rineke; Hemelrijk, Charlotte K
2012-01-01
Scrub jays are thought to use many tactics to protect their caches. For instance, they predominantly bury food far away from conspecifics, and if they must cache while being watched, they often re-cache their worms later, once they are in private. Two explanations have been offered for such observations, and they are intensely debated. First, the birds may reason about their competitors' mental states, with a 'theory of mind'; alternatively, they may apply behavioral rules learned in daily life. Although this second hypothesis is cognitively simpler, it does seem to require a different, ad-hoc behavioral rule for every caching and re-caching pattern exhibited by the birds. Our new theory avoids this drawback by explaining a large variety of patterns as side-effects of stress and the resulting memory errors. Inspired by experimental data, we assume that re-caching is not motivated by a deliberate effort to safeguard specific caches from theft, but by a general desire to cache more. This desire is brought on by stress, which is determined by the presence and dominance of onlookers, and by unsuccessful recovery attempts. We study this theory in two experiments similar to those done with real birds with a kind of 'virtual bird', whose behavior depends on a set of basic assumptions about corvid cognition, and a well-established model of human memory. Our results show that the 'virtual bird' acts as the real birds did; its re-caching reflects whether it has been watched, how dominant its onlooker was, and how close to that onlooker it has cached. This happens even though it cannot attribute mental states, and it has only a single behavioral rule assumed to be previously learned. Thus, our simulations indicate that corvid re-caching can be explained without sophisticated social cognition. Given our specific predictions, our theory can easily be tested empirically.
Corvid Re-Caching without ‘Theory of Mind’: A Model
van der Vaart, Elske; Verbrugge, Rineke; Hemelrijk, Charlotte K.
2012-01-01
Scrub jays are thought to use many tactics to protect their caches. For instance, they predominantly bury food far away from conspecifics, and if they must cache while being watched, they often re-cache their worms later, once they are in private. Two explanations have been offered for such observations, and they are intensely debated. First, the birds may reason about their competitors' mental states, with a ‘theory of mind’; alternatively, they may apply behavioral rules learned in daily life. Although this second hypothesis is cognitively simpler, it does seem to require a different, ad-hoc behavioral rule for every caching and re-caching pattern exhibited by the birds. Our new theory avoids this drawback by explaining a large variety of patterns as side-effects of stress and the resulting memory errors. Inspired by experimental data, we assume that re-caching is not motivated by a deliberate effort to safeguard specific caches from theft, but by a general desire to cache more. This desire is brought on by stress, which is determined by the presence and dominance of onlookers, and by unsuccessful recovery attempts. We study this theory in two experiments similar to those done with real birds with a kind of ‘virtual bird’, whose behavior depends on a set of basic assumptions about corvid cognition, and a well-established model of human memory. Our results show that the ‘virtual bird’ acts as the real birds did; its re-caching reflects whether it has been watched, how dominant its onlooker was, and how close to that onlooker it has cached. This happens even though it cannot attribute mental states, and it has only a single behavioral rule assumed to be previously learned. Thus, our simulations indicate that corvid re-caching can be explained without sophisticated social cognition. Given our specific predictions, our theory can easily be tested empirically. PMID:22396799
Performance Analysis of a Hybrid Overset Multi-Block Application on Multiple Architectures
NASA Technical Reports Server (NTRS)
Djomehri, M. Jahed; Biswas, Rupak
2003-01-01
This paper presents a detailed performance analysis of a multi-block overset grid compu- tational fluid dynamics app!ication on multiple state-of-the-art computer architectures. The application is implemented using a hybrid MPI+OpenMP programming paradigm that exploits both coarse and fine-grain parallelism; the former via MPI message passing and the latter via OpenMP directives. The hybrid model also extends the applicability of multi-block programs to large clusters of SNIP nodes by overcoming the restriction that the number of processors be less than the number of grid blocks. A key kernel of the application, namely the LU-SGS linear solver, had to be modified to enhance the performance of the hybrid approach on the target machines. Investigations were conducted on cacheless Cray SX6 vector processors, cache-based IBM Power3 and Power4 architectures, and single system image SGI Origin3000 platforms. Overall results for complex vortex dynamics simulations demonstrate that the SX6 achieves the highest performance and outperforms the RISC-based architectures; however, the best scaling performance was achieved on the Power3.
44 CFR 208.24 - Purchase and maintenance of items not listed on Equipment Cache List.
Code of Federal Regulations, 2011 CFR
2011-10-01
... items not listed on Equipment Cache List. 208.24 Section 208.24 Emergency Management and Assistance... of items not listed on Equipment Cache List. (a) Requests for purchase or maintenance of equipment and supplies not appearing on the Equipment Cache List, or that exceed the number specified in the...
Effects of cacheing on multitasking efficiency and programming strategy on an ELXSI 6400
DOE Office of Scientific and Technical Information (OSTI.GOV)
Montry, G.R.; Benner, R.E.
1985-12-01
The impact of a cache/shared memory architecture, and, in particular, the cache coherency problem, upon concurrent algorithm and program development is discussed. In this context, a simple set of programming strategies are proposed which streamline code development and improve code performance when multitasking in a cache/shared memory or distributed memory environment.
Cache-based error recovery for shared memory multiprocessor systems
NASA Technical Reports Server (NTRS)
Wu, Kun-Lung; Fuchs, W. Kent; Patel, Janak H.
1989-01-01
A multiprocessor cache-based checkpointing and recovery scheme for of recovering from transient processor errors in a shared-memory multiprocessor with private caches is presented. New implementation techniques that use checkpoint identifiers and recovery stacks to reduce performance degradation in processor utilization during normal execution are examined. This cache-based checkpointing technique prevents rollback propagation, provides for rapid recovery, and can be integrated into standard cache coherence protocols. An analytical model is used to estimate the relative performance of the scheme during normal execution. Extensions that take error latency into account are presented.
Computer Cache. Natural Disasters: Earth, Wind, and Fire
ERIC Educational Resources Information Center
Brodie, Carolyn S.; Byerly, Greg
2005-01-01
Natural disasters come in all shapes and sizes and affect all areas of the earth, and studying natural disasters may make children more aware of their physical environment and their place in it. This column provides a list of websites on different types of natural disasters, including earthquakes, landslides, tsunamis, volcanoes, floods,…
Forest resources of the Wasatch-Cache National Forest
Renee A. O' Brien; Jesse Pope
1997-01-01
The 1,215,219 acres in the Wasatch-Cache National Forest encompass 863,906 acres of forest land, made up of 90 percent (776,239 acres) "timberland" and 10 percent (87,667 acres) "woodland." The other 351,313 acres of the Wasatch-Cache are nonforest or water (fig. 1). This report discusses forest land only. In the Wasatch-Cache, 26 percent...
Problems faced by food-caching corvids and the evolution of cognitive solutions
Grodzinski, Uri; Clayton, Nicola S.
2010-01-01
The scatter hoarding of food, or caching, is a widespread and well-studied behaviour. Recent experiments with caching corvids have provided evidence for episodic-like memory, future planning and possibly mental attribution, all cognitive abilities that were thought to be unique to humans. In addition to the complexity of making flexible, informed decisions about caching and recovering, this behaviour is underpinned by a motivationally controlled compulsion to cache. In this review, we shall first discuss the compulsive side of caching both during ontogeny and in the caching behaviour of adult corvids. We then consider some of the problems that these birds face and review the evidence for the cognitive abilities they use to solve them. Thus, the emergence of episodic-like memory is viewed as a solution for coping with food perishability, while the various cache-protection and pilfering strategies may be sophisticated tools to deprive competitors of information, either by reducing the quality of information they can gather, or invalidating the information they already have. Finally, we shall examine whether such future-oriented behaviour involves future planning and ask why this and other cognitive abilities might have evolved in corvids. PMID:20156820
A Morphometric Assessment of the Intended Function of Cached Clovis Points
Buchanan, Briggs; Kilby, J. David; Huckell, Bruce B.; O'Brien, Michael J.; Collard, Mark
2012-01-01
A number of functions have been proposed for cached Clovis points. The least complicated hypothesis is that they were intended to arm hunting weapons. It has also been argued that they were produced for use in rituals or in connection with costly signaling displays. Lastly, it has been suggested that some cached Clovis points may have been used as saws. Here we report a study in which we morphometrically compared Clovis points from caches with Clovis points recovered from kill and camp sites to test two predictions of the hypothesis that cached Clovis points were intended to arm hunting weapons: 1) cached points should be the same shape as, but generally larger than, points from kill/camp sites, and 2) cached points and points from kill/camp sites should follow the same allometric trajectory. The results of the analyses are consistent with both predictions and therefore support the hypothesis. A follow-up review of the fit between the results of the analyses and the predictions of the other hypotheses indicates that the analyses support only the hunting equipment hypothesis. We conclude from this that cached Clovis points were likely produced with the intention of using them to arm hunting weapons. PMID:22348012
Do Clark's nutcrackers demonstrate what-where-when memory on a cache-recovery task?
Gould, Kristy L; Ort, Amy J; Kamil, Alan C
2012-01-01
What-where-when (WWW) memory during cache recovery was investigated in six Clark's nutcrackers. During caching, both red- and blue-colored pine seeds were cached by the birds in holes filled with sand. Either a short (3 day) retention interval (RI) or a long (9 day) RI was followed by a recovery session during which caches were replaced with either a single seed or wooden bead depending upon the color of the cache and length of the retention interval. Knowledge of what was in the cache (seed or bead), where it was located, and when the cache had been made (3 or 9 days ago) were the three WWW memory components under investigation. Birds recovered items (bead or seed) at above chance levels, demonstrating accurate spatial memory. They also recovered seeds more than beads after the long RI, but not after the short RI, when they recovered seeds and beads equally often. The differential recovery after the long RI demonstrates that nutcrackers may have the capacity for WWW memory during this task, but it is not clear why it was influenced by RI duration.
Problems faced by food-caching corvids and the evolution of cognitive solutions.
Grodzinski, Uri; Clayton, Nicola S
2010-03-27
The scatter hoarding of food, or caching, is a widespread and well-studied behaviour. Recent experiments with caching corvids have provided evidence for episodic-like memory, future planning and possibly mental attribution, all cognitive abilities that were thought to be unique to humans. In addition to the complexity of making flexible, informed decisions about caching and recovering, this behaviour is underpinned by a motivationally controlled compulsion to cache. In this review, we shall first discuss the compulsive side of caching both during ontogeny and in the caching behaviour of adult corvids. We then consider some of the problems that these birds face and review the evidence for the cognitive abilities they use to solve them. Thus, the emergence of episodic-like memory is viewed as a solution for coping with food perishability, while the various cache-protection and pilfering strategies may be sophisticated tools to deprive competitors of information, either by reducing the quality of information they can gather, or invalidating the information they already have. Finally, we shall examine whether such future-oriented behaviour involves future planning and ask why this and other cognitive abilities might have evolved in corvids.
Short-term observational spatial memory in Jackdaws (Corvus monedula) and Ravens (Corvus corax).
Scheid, Christelle; Bugnyar, Thomas
2008-10-01
Observational spatial memory (OSM) refers to the ability of remembering food caches made by other individuals, enabling observers to find and pilfer the others' caches. Within birds, OSM has only been demonstrated in corvids, with more social species such as Mexican jays (Aphelocoma ultramarine) showing a higher accuracy of finding conspecific' caches than less social species such as Clark's nutcrackers (Nucifraga columbiana). However, socially dynamic corvids such as ravens (Corvus corax) are capable of sophisticated pilfering manoeuvres based on OSM. We here compared the performance of ravens and jackdaws (Corvus monedula) in a short-term OSM task. In contrast to ravens, jackdaws are socially cohesive but hardly cache and compete over food caches. Birds had to recover food pieces after watching a human experimenter hiding them in 2, 4 or 6 out of 10 possible locations. Results showed that for tests with two, four and six caches, ravens performed more accurately than expected by chance whereas jackdaws did not. Moreover, ravens made fewer re-visits to already inspected cache sites than jackdaws. These findings suggest that the development of observational spatial memory skills is linked with the species' reliance on food caches rather than with a social life style per se.
Store-operate-coherence-on-value
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Dong; Heidelberger, Philip; Kumar, Sameer
A system, method and computer program product for performing various store-operate instructions in a parallel computing environment that includes a plurality of processors and at least one cache memory device. A queue in the system receives, from a processor, a store-operate instruction that specifies under which condition a cache coherence operation is to be invoked. A hardware unit in the system runs the received store-operate instruction. The hardware unit evaluates whether a result of the running the received store-operate instruction satisfies the condition. The hardware unit invokes a cache coherence operation on a cache memory address associated with the receivedmore » store-operate instruction if the result satisfies the condition. Otherwise, the hardware unit does not invoke the cache coherence operation on the cache memory device.« less
Multicast for savings in cache-based video distribution
NASA Astrophysics Data System (ADS)
Griwodz, Carsten; Zink, Michael; Liepert, Michael; On, Giwon; Steinmetz, Ralf
1999-12-01
Internet video-on-demand (VoD) today streams videos directly from server to clients, because re-distribution is not established yet. Intranet solutions exist but are typically managed centrally. Caching may overcome these management needs, however existing web caching strategies are not applicable because they work in different conditions. We propose movie distribution by means of caching, and study the feasibility from the service providers' point of view. We introduce the combination of our reliable multicast protocol LCRTP for caching hierarchies combined with our enhancement to the patching technique for bandwidth friendly True VoD, not depending on network resource guarantees.
Store operations to maintain cache coherence
Evangelinos, Constantinos; Nair, Ravi; Ohmacht, Martin
2017-08-01
In one embodiment, a computer-implemented method includes encountering a store operation during a compile-time of a program, where the store operation is applicable to a memory line. It is determined, by a computer processor, that no cache coherence action is necessary for the store operation. A store-without-coherence-action instruction is generated for the store operation, responsive to determining that no cache coherence action is necessary. The store-without-coherence-action instruction specifies that the store operation is to be performed without a cache coherence action, and cache coherence is maintained upon execution of the store-without-coherence-action instruction.
Fixed-Rate Compressed Floating-Point Arrays.
Lindstrom, Peter
2014-12-01
Current compression schemes for floating-point data commonly take fixed-precision values and compress them to a variable-length bit stream, complicating memory management and random access. We present a fixed-rate, near-lossless compression scheme that maps small blocks of 4(d) values in d dimensions to a fixed, user-specified number of bits per block, thereby allowing read and write random access to compressed floating-point data at block granularity. Our approach is inspired by fixed-rate texture compression methods widely adopted in graphics hardware, but has been tailored to the high dynamic range and precision demands of scientific applications. Our compressor is based on a new, lifted, orthogonal block transform and embedded coding, allowing each per-block bit stream to be truncated at any point if desired, thus facilitating bit rate selection using a single compression scheme. To avoid compression or decompression upon every data access, we employ a software write-back cache of uncompressed blocks. Our compressor has been designed with computational simplicity and speed in mind to allow for the possibility of a hardware implementation, and uses only a small number of fixed-point arithmetic operations per compressed value. We demonstrate the viability and benefits of lossy compression in several applications, including visualization, quantitative data analysis, and numerical simulation.
Urhan, A Utku; Emilsson, Ellen; Brodin, Anders
2017-01-01
Many species in the family Paridae, such as marsh tits Poecile palustris , are large-scale scatter hoarders of food that make cryptic caches and disperse these in large year-round territories. The perhaps most well-known species in the family, the great tit Parus major , does not store food itself but is skilled in stealing caches from the other species. We have previously demonstrated that great tits are able to memorise positions of caches they have observed marsh tits make and later return and steal the food. As great tits are explorative in nature and unusually good learners, it is possible that such "memorisation of caches from a distance" is a unique ability of theirs. The other possibility is that this ability is general in the parid family. Here, we tested marsh tits in the same experimental set-up as where we previously have tested great tits. We allowed caged marsh tits to observe a caching conspecific in a specially designed indoor arena. After a retention interval of 1 or 24 h, we allowed the observer to enter the arena and search for the caches. The marsh tits showed no evidence of such observational memorization ability, and we believe that such ability is more useful for a non-hoarding species. Why should a marsh tit that memorises hundreds of their own caches in the field bother with the difficult task of memorising other individuals' caches? We argue that the close-up memorisation procedure that marsh tits use at their own caches may be a different type of observational learning than memorisation of caches made by others. For example, the latter must be done from a distance and hence may require the ability to adopt an allocentric perspective, i.e. the ability to visualise the cache from the hoarder's perspective. Members of the Paridae family are known to possess foraging techniques that are cognitively advanced. Previously, we have demonstrated that a non-hoarding parid species, the great tit P. major , is able to memorise positions of caches that they have observed marsh tits P. palustris make. However, it is unknown whether this cognitively advanced foraging strategy is unique to great tits or if it occurs also in other parids. Here, we demonstrated that "pilfering by observational memorization strategy" is not a general strategy in parids. We believe that such ability is important for a non-hoarding species such as the great tit and, most likely, birds owning many caches do not need this foraging strategy.
Cache directory look-up re-use as conflict check mechanism for speculative memory requests
Ohmacht, Martin
2013-09-10
In a cache memory, energy and other efficiencies can be realized by saving a result of a cache directory lookup for sequential accesses to a same memory address. Where the cache is a point of coherence for speculative execution in a multiprocessor system, with directory lookups serving as the point of conflict detection, such saving becomes particularly advantageous.
Improved cache performance in Monte Carlo transport calculations using energy banding
NASA Astrophysics Data System (ADS)
Siegel, A.; Smith, K.; Felker, K.; Romano, P.; Forget, B.; Beckman, P.
2014-04-01
We present an energy banding algorithm for Monte Carlo (MC) neutral particle transport simulations which depend on large cross section lookup tables. In MC codes, read-only cross section data tables are accessed frequently, exhibit poor locality, and are typically too much large to fit in fast memory. Thus, performance is often limited by long latencies to RAM, or by off-node communication latencies when the data footprint is very large and must be decomposed on a distributed memory machine. The proposed energy banding algorithm allows maximal temporal reuse of data in band sizes that can flexibly accommodate different architectural features. The energy banding algorithm is general and has a number of benefits compared to the traditional approach. In the present analysis we explore its potential to achieve improvements in time-to-solution on modern cache-based architectures.
Samelius, Gustaf; Alisauskas, Ray T; Hobson, Keith A; Larivière, Serge
2007-09-01
1. Many ecosystems are characterized by pulses of dramatically higher than normal levels of foods (pulsed resources) to which animals often respond by caching foods for future use. However, the extent to which animals use cached foods and how this varies in relation to fluctuations in other foods is poorly understood in most animals. 2. Arctic foxes Alopex lagopus (L.) cache thousands of eggs annually at large goose colonies where eggs are often superabundant during the nesting period by geese. We estimated the contribution of cached eggs to arctic fox diets in spring and autumn, when geese were not present in the study area, by comparing stable isotope ratios (delta(13)C and delta(15)N) of fox tissues with those of their foods using a multisource mixing model in Program IsoSource. 3. The contribution of cached eggs to arctic fox diets was inversely related to collared lemming Dicrostonyx groenlandicus (Traill) abundance; the contribution of cached eggs to overall fox diets increased from < 28% in years when collared lemmings were abundant to 30-74% in years when collared lemmings were scarce. 4. Further, arctic foxes used cached eggs well into the following spring (almost 1 year after eggs were acquired) - a pattern that differs from that of carnivores generally storing foods for only a few days before consumption. 5. This study showed that long-term use of eggs that were cached when geese were superabundant at the colony in summer varied with fluctuations in collared lemming abundance (a key component in arctic fox diets throughout most of their range) and suggests that cached eggs functioned as a buffer when collared lemmings were scarce.
Zwolak, Rafał; Bogdziewicz, Michał; Wróbel, Aleksandra; Crone, Elizabeth E
2016-03-01
The predator satiation and predator dispersal hypotheses provide alternative explanations for masting. Both assume satiation of seed-eating vertebrates. They differ in whether satiation occurs before or after seed removal and caching by granivores (predator satiation and predator dispersal, respectively). This difference is largely unrecognized, but it is demographically important because cached seeds are dispersed and often have a microsite advantage over nondispersed seeds. We conducted rodent exclosure experiments in two mast and two nonmast years to test predictions of the predator dispersal hypothesis in our study system of yellow-necked mice (Apodemus flavicollis) and European beech (Fagus sylvatica). Specifically, we tested whether the fraction of seeds removed from the forest floor is similar during mast and nonmast years (i.e., lack of satiation before seed caching), whether masting decreases the removal of cached seeds (i.e., satiation after seed storage), and whether seed caching increases the probability of seedling emergence. We found that masting did not result in satiation at the seed removal stage. However, masting decreased the removal of cached seeds, and seed caching dramatically increased the probability of seedling emergence relative to noncached seeds. European beech thus benefits from masting through the satiation of scatterhoarders that occurs only after seeds are removed and cached. Although these findings do not exclude other evolutionary advantages of beech masting, they indicate that fitness benefits of masting extend beyond the most commonly considered advantages of predator satiation and increased pollination efficiency.
Freas, C A; Bingman, K; Ladage, L D; Pravosudov, V V
2013-01-01
Variation in environmental conditions associated with differential selection on spatial memory has been hypothesized to result in evolutionary changes in the morphology of the hippocampus, a brain region involved in spatial memory. At the same time, it is well known that the morphology of the hippocampus might also be directly affected by environmental conditions. Understanding the role of environment-based plasticity is therefore critical when investigating potential adaptive evolutionary changes in the hippocampus associated with environmental variation. We previously demonstrated large elevation-related variation in hippocampus morphology in mountain chickadees over an extremely small spatial scale. We hypothesized that this variation is related to differential selection pressures associated with differences in winter climate severity along an elevation gradient, which make different demands on spatial memory used for food cache retrieval. Here, we tested whether such variation is experience based, generated by potential differences in the environment, by comparing the hippocampus morphology of chickadees from different elevations maintained in a uniform captive environment in a laboratory with those sampled directly from the wild. In addition, we compared hippocampal neuron soma size in chickadees sampled directly from the wild with those maintained in laboratory conditions with restricted and unrestricted spatial memory use via manipulation of food-caching experiences to test whether memory use can affect neuron soma size. There were significant elevation-related differences in hippocampus volume and the total number of hippocampal neurons, but not in neuron soma size, in captive birds. Captive environmental conditions were associated with a large reduction in hippocampus volume and neuron soma size, but not in the total number of neurons or in neuron soma size in other telencephalic regions. Restriction of memory use while in laboratory conditions produced no significant effects on hippocampal neuron soma size. Overall our results showed that captivity has a strong effect on hippocampus volume, which could be due, at least partly, to a reduction in neuron soma size specifically in the hippocampus, but it did not override elevation-related differences in hippocampus volume or in the total number of hippocampal neurons. These data are consistent with the idea of the adaptive nature of the elevation-related differences associated with selection on spatial memory, while at the same time demonstrating additional environment-based plasticity in hippocampus volume, but not in neuron numbers. Our results, however, cannot rule out that the differences between elevations might still be driven by some developmental or early posthatching conditions/experiences. © 2013 S. Karger AG, Basel.
Corvid caching: Insights from a cognitive model.
van der Vaart, Elske; Verbrugge, Rineke; Hemelrijk, Charlotte K
2011-07-01
Caching and recovery of food by corvids is well-studied, but some ambiguous results remain. To help clarify these, we built a computational cognitive model. It is inspired by similar models built for humans, and it assumes that memory strength depends on frequency and recency of use. We compared our model's behavior to that of real birds in previously published experiments. Our model successfully replicated the outcomes of two experiments on recovery behavior and two experiments on cache site choice. Our "virtual birds" reproduced declines in recovery accuracy across sessions, revisits to previously emptied cache sites, a lack of correlation between caching and recovery order, and a preference for caching in safe locations. The model also produced two new explanations. First, that Clark's nutcrackers may become less accurate as recovery progresses not because of differential memory for different cache sites, as was once assumed, but because of chance effects. And second, that Western scrub jays may choose their cache sites not on the basis of negative recovery experiences only, as was previously thought, but on the basis of positive recovery experiences instead. Alternatively, both "punishment" and "reward" may be playing a role. We conclude with a set of new insights, a testable prediction, and directions for future work. PsycINFO Database Record (c) 2011 APA, all rights reserved
OS friendly microprocessor architecture: Hardware level computer security
NASA Astrophysics Data System (ADS)
Jungwirth, Patrick; La Fratta, Patrick
2016-05-01
We present an introduction to the patented OS Friendly Microprocessor Architecture (OSFA) and hardware level computer security. Conventional microprocessors have not tried to balance hardware performance and OS performance at the same time. Conventional microprocessors have depended on the Operating System for computer security and information assurance. The goal of the OS Friendly Architecture is to provide a high performance and secure microprocessor and OS system. We are interested in cyber security, information technology (IT), and SCADA control professionals reviewing the hardware level security features. The OS Friendly Architecture is a switched set of cache memory banks in a pipeline configuration. For light-weight threads, the memory pipeline configuration provides near instantaneous context switching times. The pipelining and parallelism provided by the cache memory pipeline provides for background cache read and write operations while the microprocessor's execution pipeline is running instructions. The cache bank selection controllers provide arbitration to prevent the memory pipeline and microprocessor's execution pipeline from accessing the same cache bank at the same time. This separation allows the cache memory pages to transfer to and from level 1 (L1) caching while the microprocessor pipeline is executing instructions. Computer security operations are implemented in hardware. By extending Unix file permissions bits to each cache memory bank and memory address, the OSFA provides hardware level computer security.
Experimental evaluation of multiprocessor cache-based error recovery
NASA Technical Reports Server (NTRS)
Janssens, Bob; Fuchs, W. K.
1991-01-01
Several variations of cache-based checkpointing for rollback error recovery in shared-memory multiprocessors have been recently developed. By modifying the cache replacement policy, these techniques use the inherent redundancy in the memory hierarchy to periodically checkpoint the computation state. Three schemes, different in the manner in which they avoid rollback propagation, are evaluated. By simulation with address traces from parallel applications running on an Encore Multimax shared-memory multiprocessor, the performance effect of integrating the recovery schemes in the cache coherence protocol are evaluated. The results indicate that the cache-based schemes can provide checkpointing capability with low performance overhead but uncontrollable high variability in the checkpoint interval.
Considering User's Access Pattern in Multimedia File Systems
NASA Astrophysics Data System (ADS)
Cho, KyoungWoon; Ryu, YeonSeung; Won, Youjip; Koh, Kern
2002-12-01
Legacy buffer cache management schemes for multimedia server are grounded at the assumption that the application sequentially accesses the multimedia file. However, user access pattern may not be sequential in some circumstances, for example, in distance learning application, where the user may exploit the VCR-like function(rewind and play) of the system and accesses the particular segments of video repeatedly in the middle of sequential playback. Such a looping reference can cause a significant performance degradation of interval-based caching algorithms. And thus an appropriate buffer cache management scheme is required in order to deliver desirable performance even under the workload that exhibits looping reference behavior. We propose Adaptive Buffer cache Management(ABM) scheme which intelligently adapts to the file access characteristics. For each opened file, ABM applies either the LRU replacement or the interval-based caching depending on the Looping Reference Indicator, which indicates that how strong temporally localized access pattern is. According to our experiment, ABM exhibits better buffer cache miss ratio than interval-based caching or LRU, especially when the workload exhibits not only sequential but also looping reference property.
Lucas, Jeffrey R.; Freeberg, Todd M.; Egbert, Jeremy; Schwabl, Hubert
2006-01-01
We tested for hormonal and behavioral differences between Carolina chickadees (Poecile carolinensis) taken from a disturbed (recently logged) forest, an undisturbed forest, or a residential site. We measured fecal corticosterone and body mass levels in the field, and fecal corticosterone, body mass, and caching behavior in an aviary experiment. In the field, birds from the disturbed forest exhibited significantly higher fecal corticosterone levels than birds from either the undisturbed forest or from the residential site. Birds from the disturbed forest also exhibited lower body mass than those from the undisturbed forest but higher body mass than those from the residential site. Our aviary results suggest that these physiological differences between field sites are the result of short-term responses to ecological factors: Neither body mass nor fecal corticosterone levels varied between birds captured at different sites. Aviary sample sizes were sufficient to detect seasonal variation in fecal corticosterone (lowest in summer), body mass (highest in spring), and rate of gain in body mass (highest in winter). Under “closed-economy” aviary conditions (all food available from a feeder in the aviary), there were no site differences in the percent of seeds taken from the feeder that were cached. However, under “open-economy” conditions (food occasionally available ad libitum), significantly fewer seeds were cached by birds from the disturbed forest compared to the undisturbed or residential sites. On average, there was only a two-fold difference in population-levels of fecal corticosterone. This difference is about the same as an increase in fecal corticosterone induced by a two-hour increase in food deprivation, and can not be considered to be an acute stress response to disturbance. PMID:16458312
Load Balancing in Distributed Web Caching: A Novel Clustering Approach
NASA Astrophysics Data System (ADS)
Tiwari, R.; Kumar, K.; Khan, G.
2010-11-01
The World Wide Web suffers from scaling and reliability problems due to overloaded and congested proxy servers. Caching at local proxy servers helps, but cannot satisfy more than a third to half of requests; more requests are still sent to original remote origin servers. In this paper we have developed an algorithm for Distributed Web Cache, which incorporates cooperation among proxy servers of one cluster. This algorithm uses Distributed Web Cache concepts along with static hierarchies with geographical based clusters of level one proxy server with dynamic mechanism of proxy server during the congestion of one cluster. Congestion and scalability problems are being dealt by clustering concept used in our approach. This results in higher hit ratio of caches, with lesser latency delay for requested pages. This algorithm also guarantees data consistency between the original server objects and the proxy cache objects.
Version pressure feedback mechanisms for speculative versioning caches
Eichenberger, Alexandre E.; Gara, Alan; O& #x27; Brien, Kathryn M.; Ohmacht, Martin; Zhuang, Xiaotong
2013-03-12
Mechanisms are provided for controlling version pressure on a speculative versioning cache. Raw version pressure data is collected based on one or more threads accessing cache lines of the speculative versioning cache. One or more statistical measures of version pressure are generated based on the collected raw version pressure data. A determination is made as to whether one or more modifications to an operation of a data processing system are to be performed based on the one or more statistical measures of version pressure, the one or more modifications affecting version pressure exerted on the speculative versioning cache. An operation of the data processing system is modified based on the one or more determined modifications, in response to a determination that one or more modifications to the operation of the data processing system are to be performed, to affect the version pressure exerted on the speculative versioning cache.
MELOC - Memory and Location Optimized Caching for Mobile Ad Hoc Networks
2011-01-01
required for such environments. Moreover, nodes located at centre have to be chosen as cache location, since it reduces the chance of being attacked...Figure 1.1. MANET Formed by Armed Forces 47 Example 3: Sharing of music and videos are famous among mobile users. Instead of downloading...The two tier caching scheme discussed in this paper is acoustic . The characteristics of two-tier caching are as follows, the content of data to be
Visual landmark-directed scatter-hoarding of Siberian chipmunks Tamias sibiricus.
Zhang, Dongyuan; Li, Jia; Wang, Zhenyu; Yi, Xianfeng
2016-05-01
Spatial memory of cached food items plays an important role in cache recovery by scatter-hoarding animals. However, whether scatter-hoarding animals intentionally select cache sites with respect to visual landmarks in the environment and then rely on them to recover their cached seeds for later use has not been extensively explored. Furthermore, there is a lack of evidence on whether there are sex differences in visual landmark-based food-hoarding behaviors in small rodents even though male and female animals exhibit different spatial abilities. In the present study, we used a scatter-hoarding animal, the Siberian chipmunk, Tamias sibiricus to explore these questions in semi-natural enclosures. Our results showed that T. sibiricus preferred to establish caches in the shallow pits labeled with visual landmarks (branches of Pinus sylvestris, leaves of Athyrium brevifrons and PVC tubes). In addition, visual landmarks of P. sylvestris facilitated cache recovery by T. sibiricus. We also found significant sex differences in visual landmark-based food-hoarding strategies in Siberian chipmunks. Males, rather than females, chipmunks tended to establish their caches with respect to the visual landmarks. Our studies show that T. sibiricus rely on visual landmarks to establish and recover their caches, and that sex differences exist in visual landmark-based food hoarding in Siberian chipmunks. © 2015 International Society of Zoological Sciences, Institute of Zoology/Chinese Academy of Sciences and John Wiley & Sons Australia, Ltd.
Cost aware cache replacement policy in shared last-level cache for hybrid memory based fog computing
NASA Astrophysics Data System (ADS)
Jia, Gangyong; Han, Guangjie; Wang, Hao; Wang, Feng
2018-04-01
Fog computing requires a large main memory capacity to decrease latency and increase the Quality of Service (QoS). However, dynamic random access memory (DRAM), the commonly used random access memory, cannot be included into a fog computing system due to its high consumption of power. In recent years, non-volatile memories (NVM) such as Phase-Change Memory (PCM) and Spin-transfer torque RAM (STT-RAM) with their low power consumption have emerged to replace DRAM. Moreover, the currently proposed hybrid main memory, consisting of both DRAM and NVM, have shown promising advantages in terms of scalability and power consumption. However, the drawbacks of NVM, such as long read/write latency give rise to potential problems leading to asymmetric cache misses in the hybrid main memory. Current last level cache (LLC) policies are based on the unified miss cost, and result in poor performance in LLC and add to the cost of using NVM. In order to minimize the cache miss cost in the hybrid main memory, we propose a cost aware cache replacement policy (CACRP) that reduces the number of cache misses from NVM and improves the cache performance for a hybrid memory system. Experimental results show that our CACRP behaves better in LLC performance, improving performance up to 43.6% (15.5% on average) compared to LRU.
Optimal and Scalable Caching for 5G Using Reinforcement Learning of Space-Time Popularities
NASA Astrophysics Data System (ADS)
Sadeghi, Alireza; Sheikholeslami, Fatemeh; Giannakis, Georgios B.
2018-02-01
Small basestations (SBs) equipped with caching units have potential to handle the unprecedented demand growth in heterogeneous networks. Through low-rate, backhaul connections with the backbone, SBs can prefetch popular files during off-peak traffic hours, and service them to the edge at peak periods. To intelligently prefetch, each SB must learn what and when to cache, while taking into account SB memory limitations, the massive number of available contents, the unknown popularity profiles, as well as the space-time popularity dynamics of user file requests. In this work, local and global Markov processes model user requests, and a reinforcement learning (RL) framework is put forth for finding the optimal caching policy when the transition probabilities involved are unknown. Joint consideration of global and local popularity demands along with cache-refreshing costs allow for a simple, yet practical asynchronous caching approach. The novel RL-based caching relies on a Q-learning algorithm to implement the optimal policy in an online fashion, thus enabling the cache control unit at the SB to learn, track, and possibly adapt to the underlying dynamics. To endow the algorithm with scalability, a linear function approximation of the proposed Q-learning scheme is introduced, offering faster convergence as well as reduced complexity and memory requirements. Numerical tests corroborate the merits of the proposed approach in various realistic settings.
Efficient Aho-Corasick String Matching on Emerging Multicore Architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tumeo, Antonino; Villa, Oreste; Secchi, Simone
String matching algorithms are critical to several scientific fields. Beside text processing and databases, emerging applications such as DNA protein sequence analysis, data mining, information security software, antivirus, ma- chine learning, all exploit string matching algorithms [3]. All these applica- tions usually process large quantity of textual data, require high performance and/or predictable execution times. Among all the string matching algorithms, one of the most studied, especially for text processing and security applica- tions, is the Aho-Corasick algorithm. 1 2 Book title goes here Aho-Corasick is an exact, multi-pattern string matching algorithm which performs the search in a time linearlymore » proportional to the length of the input text independently from pattern set size. However, depending on the imple- mentation, when the number of patterns increase, the memory occupation may raise drastically. In turn, this can lead to significant variability in the performance, due to the memory access times and the caching effects. This is a significant concern for many mission critical applications and modern high performance architectures. For example, security applications such as Network Intrusion Detection Systems (NIDS), must be able to scan network traffic against very large dictionaries in real time. Modern Ethernet links reach up to 10 Gbps, and malicious threats are already well over 1 million, and expo- nentially growing [28]. When performing the search, a NIDS should not slow down the network, or let network packets pass unchecked. Nevertheless, on the current state-of-the-art cache based processors, there may be a large per- formance variability when dealing with big dictionaries and inputs that have different frequencies of matching patterns. In particular, when few patterns are matched and they are all in the cache, the procedure is fast. Instead, when they are not in the cache, often because many patterns are matched and the caches are continuously thrashed, they should be retrieved from the system memory and the procedure is slowed down by the increased latency. Efficient implementations of string matching algorithms have been the fo- cus of several works, targeting Field Programmable Gate Arrays [4, 25, 15, 5], highly multi-threaded solutions like the Cray XMT [34], multicore proces- sors [19] or heterogeneous processors like the Cell Broadband Engine [35, 22]. Recently, several researchers have also started to investigate the use Graphic Processing Units (GPUs) for string matching algorithms in security applica- tions [20, 10, 32, 33]. Most of these approaches mainly focus on reaching high peak performance, or try to optimize the memory occupation, rather than looking at performance stability. However, hardware solutions supports only small dictionary sizes due to lack of memory and are difficult to customize, while platforms such as the Cell/B.E. are very complex to program.« less
Salwiczek, Lucie H.; Schlinger, Barney; Emery, Nathan J.; Clayton, Nicola S.
2010-01-01
Recent studies on the food-caching behavior of corvids have revealed complex physical and social skills, yet little is known about the ontogeny of food caching in relation to the development of cognitive capacities. Piagetian object permanence is the understanding that objects continue to exist even when they are no longer visible. Here, the authors focus on Piagetian Stages 3 and 4, because they are hallmarks in the cognitive development of both young children and animals. Our aim is to determine in a food-caching corvid, the Western scrub-jay, whether (1) Piagetian Stage 4 competence and tentative caching (i.e., hiding an item invisibly and retrieving it without delay), emerge concomitantly or consecutively; (2) whether experiencing the reappearance of hidden objects enhances the timing of the appearance of object permanence; and (3) discuss how the development of object permanence is related to behavioral development and sensorimotor intelligence. Our findings suggest that object permanence Stage 4 emerges before tentative caching, and independent of environmental influences, but that once the birds have developed simple object-permanence, then social learning might advance the interval after which tentative caching commences. PMID:19685971
Salwiczek, Lucie H; Emery, Nathan J; Schlinger, Barney; Clayton, Nicola S
2009-08-01
Recent studies on the food-caching behavior of corvids have revealed complex physical and social skills, yet little is known about the ontogeny of food caching in relation to the development of cognitive capacities. Piagetian object permanence is the understanding that objects continue to exist even when they are no longer visible. Here, the authors focus on Piagetian Stages 3 and 4, because they are hallmarks in the cognitive development of both young children and animals. Our aim is to determine in a food-caching corvid, the Western scrub-jay, whether (1) Piagetian Stage 4 competence and tentative caching (i.e., hiding an item invisibly and retrieving it without delay), emerge concomitantly or consecutively; (2) whether experiencing the reappearance of hidden objects enhances the timing of the appearance of object permanence; and (3) discuss how the development of object permanence is related to behavioral development and sensorimotor intelligence. Our findings suggest that object permanence Stage 4 emerges before tentative caching, and independent of environmental influences, but that once the birds have developed simple object-permanence, then social learning might advance the interval after which tentative caching commences. Copyright 2009 APA, all rights reserved.
Dynamic storage in resource-scarce browsing multimedia applications
NASA Astrophysics Data System (ADS)
Elenbaas, Herman; Dimitrova, Nevenka
1998-10-01
In the convergence of information and entertainment there is a conflict between the consumer's expectation of fast access to high quality multimedia content through narrow bandwidth channels versus the size of this content. During the retrieval and information presentation of a multimedia application there are two problems that have to be solved: the limited bandwidth during transmission of the retrieved multimedia content and the limited memory for temporary caching. In this paper we propose an approach for latency optimization in information browsing applications. We proposed a method for flattening hierarchically linked documents in a manner convenient for network transport over slow channels to minimize browsing latency. Flattening of the hierarchy involves linearization, compression and bundling of the document nodes. After the transfer, the compressed hierarchy is stored on a local device where it can be partly unbundled to fit the caching limits at the local site while giving the user availability to the content.
Hoarding patterns in the southern flying squirrel (Glaucomys volans)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clemmer, S.M.
1984-05-01
Southern flying squirrels, Glaucomys volans, were individually offered two size classes of pecans in a 1:1 ratio to establish preference. All but one squirrel preferred small pecans. As relative abundance of preferred food diminished to 0.10, squirrels switched preference. Absolute abundance of either food did not affect caching levels. A difficulty-of-retrieval experiment did not result in switching of preference. No effect of sex on hoarding was exhibited. There was an inverse correlation between individual storing effort and caching levels of the same squirrels tested as pairs, with individual non-storers showing increases in numbers of pecans stored. Animals that were activemore » storers as individuals showed decreases when paired. Total number stored did not decrease significantly when squirrels were offered previously marked pecans. When offered own-marked and other-marked pecans, squirrels did not discriminate. 43 references, 3 figures, 6 tables.« less
76 FR 26981 - Proposed Flood Elevation Determinations
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-10
... table provided here represents the flooding sources, location of referenced elevations, effective and.... Specifically, it addresses the following flooding sources: Cache Creek, Cache Creek Left Bank Overflow, and... ``Unincorporated Areas of Yolo County, California'' addressed the flooding source Cache Creek Settling Basin. That...
Xrootd in dCache - design and experiences
NASA Astrophysics Data System (ADS)
Behrmann, Gerd; Ozerov, Dmitry; Zangerl, Thomas
2011-12-01
dCache is a well established distributed storage solution used in both high energy physics computing and other disciplines. An overview of the implementation of the xrootd data access protocol within dCache is presented. The performance of various access mechanisms is studied and compared and it is concluded that our implementation is as perfomant as other protocols. This makes dCache a compelling alternative to the Scalla software suite implementation of xrootd, with added value from broad protocol support, including the IETF approved NFS 4.1 protocol.
Performance of defect-tolerant set-associative cache memories
NASA Technical Reports Server (NTRS)
Frenzel, J. F.
1991-01-01
The increased use of on-chip cache memories has led researchers to investigate their performance in the presence of manufacturing defects. Several techniques for yield improvement are discussed and results are presented which indicate that set-associativity may be used to provide defect tolerance as well as improve the cache performance. Tradeoffs between several cache organizations and replacement strategies are investigated and it is shown that token-based replacement may be a suitable alternative to the widely-used LRU strategy.
Pravosudov, Vladimir V; Mendoza, Sally P; Clayton, Nicola S
2003-08-01
It has been hypothesized that in avian social groups subordinate individuals should maintain more energy reserves than dominants, as an insurance against increased perceived risk of starvation. Subordinates might also have elevated baseline corticosterone levels because corticosterone is known to facilitate fattening in birds. Recent experiments showed that moderately elevated corticosterone levels resulting from unpredictable food supply are correlated with enhanced cache retrieval efficiency and more accurate performance on a spatial memory task. Given the correlation between corticosterone and memory, a further prediction is that subordinates might be more efficient at cache retrieval and show more accurate performance on spatial memory tasks. We tested these predictions in dominant-subordinate pairs of mountain chickadees (Poecile gambeli). Each pair was housed in the same cage but caching behavior was tested individually in an adjacent aviary to avoid the confounding effects of small spaces in which birds could unnaturally and directly influence each other's behavior. In sharp contrast to our hypothesis, we found that subordinate chickadees cached less food, showed less efficient cache retrieval, and performed significantly worse on the spatial memory task than dominants. Although the behavioral differences could have resulted from social stress of subordination, and dominant birds reached significantly higher levels of corticosterone during their response to acute stress compared to subordinates, there were no significant differences between dominants and subordinates in baseline levels or in the pattern of adrenocortical stress response. We find no evidence, therefore, to support the hypothesis that subordinate mountain chickadees maintain elevated baseline corticosterone levels whereas lower caching rates and inferior cache retrieval efficiency might contribute to reduced survival of subordinates commonly found in food-caching parids.
AYUSH: A Technique for Extending Lifetime of SRAM-NVM Hybrid Caches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh; Vetter, Jeffrey S
2014-01-01
Recently, researchers have explored way-based hybrid SRAM-NVM (non-volatile memory) last level caches (LLCs) to bring the best of SRAM and NVM together. However, the limited write endurance of NVMs restricts the lifetime of these hybrid caches. We present AYUSH, a technique to enhance the lifetime of hybrid caches, which works by using data-migration to preferentially use SRAM for storing frequently-reused data. Microarchitectural simulations confirm that AYUSH achieves larger improvement in lifetime than a previous technique and also maintains performance and energy efficiency. For single, dual and quad-core workloads, the average increase in cache lifetime with AYUSH is 6.90X, 24.06X andmore » 47.62X, respectively.« less
Performance Prediction Toolkit
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chennupati, Gopinath; Santhi, Nanadakishore; Eidenbenz, Stephen
The Performance Prediction Toolkit (PPT), is a scalable co-design tool that contains the hardware and middle-ware models, which accept proxy applications as input in runtime prediction. PPT relies on Simian, a parallel discrete event simulation engine in Python or Lua, that uses the process concept, where each computing unit (host, node, core) is a Simian entity. Processes perform their task through message exchanges to remain active, sleep, wake-up, begin and end. The PPT hardware model of a compute core (such as a Haswell core) consists of a set of parameters, such as clock speed, memory hierarchy levels, their respective sizes,more » cache-lines, access times for different cache levels, average cycle counts of ALU operations, etc. These parameters are ideally read off a spec sheet or are learned using regression models learned from hardware counters (PAPI) data. The compute core model offers an API to the software model, a function called time_compute(), which takes as input a tasklist. A tasklist is an unordered set of ALU, and other CPU-type operations (in particular virtual memory loads and stores). The PPT application model mimics the loop structure of the application and replaces the computational kernels with a call to the hardware model's time_compute() function giving tasklists as input that model the compute kernel. A PPT application model thus consists of tasklists representing kernels and the high-er level loop structure that we like to think of as pseudo code. The key challenge for the hardware model's time_compute-function is to translate virtual memory accesses into actual cache hierarchy level hits and misses.PPT also contains another CPU core level hardware model, Analytical Memory Model (AMM). The AMM solves this challenge soundly, where our previous alternatives explicitly include the L1,L2,L3 hit-rates as inputs to the tasklists. Explicit hit-rates inevitably only reflect the application modeler's best guess, perhaps informed by a few small test problems using hardware counters; also, hard-coded hit-rates make the hardware model insensitive to changes in cache sizes. Alternatively, we use reuse distance distributions in the tasklists. In general, reuse profiles require the application modeler to run a very expensive trace analysis on the real code that realistically can be done at best for small examples.« less
Locality in Search Engine Queries and Its Implications for Caching
2001-05-01
in the question of whether caching might be effective for search engines as well. They study two real search engine traces by examining query...locality and its implications for caching. The two search engines studied are Vivisimo and Excite. Their trace analysis results show that queries have
Predictive Caching Using the TDAG Algorithm
NASA Technical Reports Server (NTRS)
Laird, Philip; Saul, Ronald
1992-01-01
We describe how the TDAG algorithm for learning to predict symbol sequences can be used to design a predictive cache store. A model of a two-level mass storage system is developed and used to calculate the performance of the cache under various conditions. Experimental simulations provide good confirmation of the model.
Mammal caching of oak acorns in a red pine and a mixed oak stand
E.R. Thorn; W.M. Tzilkowski
1991-01-01
Small mammal caching of oak (Quercus spp.) acorns in adjacent red pine (Pinus resinosa) and mixed-oak stands was investigated at The Penn State Experimental Forest, Huntingdon Co., Pennsylvania. Gray squirrels (Sciurus carolinensis) and mice (Peromyscus spp.) were the most common acorn-caching...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Evangelinos, Constantinos; Nair, Ravi; Ohmacht, Martin
In one embodiment, a computer-implemented method includes encountering a store operation during a compile-time of a program, where the store operation is applicable to a memory line. It is determined, by a computer processor, that no cache coherence action is necessary for the store operation. A store-without-coherence-action instruction is generated for the store operation, responsive to determining that no cache coherence action is necessary. The store-without-coherence-action instruction specifies that the store operation is to be performed without a cache coherence action, and cache coherence is maintained upon execution of the store-without-coherence-action instruction.
Population substructure in Cache County, Utah: the Cache County study
2014-01-01
Background Population stratification is a key concern for genetic association analyses. In addition, extreme homogeneity of ethnic origins of a population can make it difficult to interpret how genetic associations in that population may translate into other populations. Here we have evaluated the genetic substructure of samples from the Cache County study relative to the HapMap Reference populations and data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Results Our findings show that the Cache County study is similar in ethnic diversity to the self-reported "Whites" in the ADNI sample and less homogenous than the HapMap CEU population. Conclusions We conclude that the Cache County study is genetically representative of the general European American population in the USA and is an appropriate population for conducting broadly applicable genetic studies. PMID:25078123
Preliminary report on deposit models for sand and gravel in the Cache la Poudre River valley
Langer, W.H.; Lindsey, D.A.
1999-01-01
The stratigraphy, sedimentary features, and physical characteristics of gravel deposits in the Cache la Poudre River valley were studied to establish geologic models for these deposits. Because most of the gravel mined in the valley is beneath the low terraces and floodplain, the quality of these deposits for aggregate was studied in detail at eight sites in a 25.5-mile reach between Fort Collins and Greeley, Colorado. Aggregate quality was determined by field and laboratory measurements on samples collected under a consistent sampling plan. The Broadway terrace is underlain by Pleistocene alluvium and, at some places, by fine-grained wind-blown deposits. The Piney Creek terrace, low terraces, and floodplain are primarily underlain by Holocene alluvium. Pleistocene alluvium may underlie these terraces at isolated locations along the river. Gravels beneath the Piney Creek terrace, low terraces, and floodplain are divisible into two units that are poorly distinguishable at the upstream end of the study area, but are readily distinguishable about 7 miles downstream. Where distinguished, the two gravel units are separated by a sharp, locally erosional, contact. The upper gravel is probably of Holocene age, but the lower gravel is considered to be Holocene and Pleistocene. The primary variation in particle size of the gravels beneath the floodplain and low terraces of the Cache la Poudre River valley is the downstream decrease in the proportion of particles measuring 3/4 inch and larger. Above Fort Collins, about 60 pct of the gravel collects on the 3/4 inch sieve, whereas about 50 pct of gravel collects on the same sieve size at Greeley. For 1.5-inch sieves, the corresponding values are about 50 pct for Fort Collins and only about 30 pct for Greeley. Local differences in particle size and sorting between the upper and lower gravel units were observed in the field, but only the coarsest particle sizes appear to have been concentrated in the lower unit. Field measurements of aggregate quality, pebble lithology, and shape show little significant downstream variation. Pebble lithology is about 25 percent granite; 48 percent pegmatite; 5-7 percent each of gneiss, quartz, and quartzite; and minor amounts of diabase, schist, volcanic porphyry, and sandstone. Among the rock types, only the volcanic porphyries might be reactive with Portland cement. Pebble shape is dominantly equidimensional with a tendency to form thick, disc-shaped particles. Disc-shaped and spherical particles comprise about 39 percent and 31 percent of the pebble-size fraction, respectively. Rod and blade shapes comprise about 18 and 12 percent of the pebble-size fraction, respectively. The relatively large proportion of equidimensional particles in the Cache la Poudre may be due to the small proportion of layered gneiss in gravel. Pebbles having axial ratios less than 0.5, which might be structurally weak, are rare. The two gravel units show subtle local differences and evidence for derivation of the younger gravel from the older gravel. At many sites, the upper gravel unit tends to contain more quartz plus quartzite, has poorer physical quality, and contains more angular pebbles than the lower gravel. Weathering, followed by transport in the river, might be expected to concentrate quartz and quartzite, degrade physical quality, and break pebbles into angular fragments. This conclusion is consistent with local evidence of an erosional contact between the two gravel units.
Improving Internet Archive Service through Proxy Cache.
ERIC Educational Resources Information Center
Yu, Hsiang-Fu; Chen, Yi-Ming; Wang, Shih-Yong; Tseng, Li-Ming
2003-01-01
Discusses file transfer protocol (FTP) servers for downloading archives (files with particular file extensions), and the change to HTTP (Hypertext transfer protocol) with increased Web use. Topics include the Archie server; proxy cache servers; and how to improve the hit rate of archives by a combination of caching and better searching mechanisms.…
Winter prey caching by northern hawk owls in Minnesota
Richard R. Schaefer; D. Craig Rudolph; Jesse F. Fagan
2007-01-01
Northern Hawk Owls (Surnia ulula) have been reported to cache prey during the breeding season for later consumption, but detailed reports of prey caching during the non-breeding season are comparatively rare. We provided prey to four individual Northern Hawk Owls in wintering areas in northeastern Minnesota during 2001 and 2005 and observed their...
Visits, Hits, Caching and Counting on the World Wide Web: Old Wine in New Bottles?
ERIC Educational Resources Information Center
Berthon, Pierre; Pitt, Leyland; Prendergast, Gerard
1997-01-01
Although web browser caching speeds up retrieval, reduces network traffic, and decreases the load on servers and browser's computers, an unintended consequence for marketing research is that Web servers undercount hits. This article explores counting problems, caching, proxy servers, trawler software and presents a series of correction factors…
A measurement-based study of concurrency in a multiprocessor
NASA Technical Reports Server (NTRS)
Mcguire, Patrick John
1987-01-01
A systematic measurement-based methodology for characterizing the amount of concurrency present in a workload, and the effect of concurrency on system performance indices such as cache miss rate and bus activity are developed. Hardware and software instrumentation of an Alliant FX/8 was used to obtain data from a real workload environment. Results show that 35% of the workload is concurrent, with the concurrent periods typically using all available processors. Measurements of periods of change in concurrency show uneven usage of processors during these times. Other system measures, including cache miss rate and processor bus activity, are analyzed with respect to the concurrency measures. Probability of a cache miss is seen to increase with concurrency. The change in cache miss rate is much more sensitive to the fraction of concurrent code in the worklaod than the number of processors active during concurrency. Regression models are developed to quantify the relationships between cache miss rate, bus activity, and the concurrency measures. The model for cache miss rate predicts an increase in the median miss rate value as much as 300% for a 100% increase in concurrency in the workload.
NASA Astrophysics Data System (ADS)
Fang, Juan; Hao, Xiaoting; Fan, Qingwen; Chang, Zeqing; Song, Shuying
2017-05-01
In the Heterogeneous multi-core architecture, CPU and GPU processor are integrated on the same chip, which poses a new challenge to the last-level cache management. In this architecture, the CPU application and the GPU application execute concurrently, accessing the last-level cache. CPU and GPU have different memory access characteristics, so that they have differences in the sensitivity of last-level cache (LLC) capacity. For many CPU applications, a reduced share of the LLC could lead to significant performance degradation. On the contrary, GPU applications can tolerate increase in memory access latency when there is sufficient thread-level parallelism. Taking into account the GPU program memory latency tolerance characteristics, this paper presents a method that let GPU applications can access to memory directly, leaving lots of LLC space for CPU applications, in improving the performance of CPU applications and does not affect the performance of GPU applications. When the CPU application is cache sensitive, and the GPU application is insensitive to the cache, the overall performance of the system is improved significantly.
Episodic-like memory during cache recovery by scrub jays.
Clayton, N S; Dickinson, A
1998-09-17
The recollection of past experiences allows us to recall what a particular event was, and where and when it occurred, a form of memory that is thought to be unique to humans. It is known, however, that food-storing birds remember the spatial location and contents of their caches. Furthermore, food-storing animals adapt their caching and recovery strategies to the perishability of food stores, which suggests that they are sensitive to temporal factors. Here we show that scrub jays (Aphelocoma coerulescens) remember 'when' food items are stored by allowing them to recover perishable 'wax worms' (wax-moth larvae) and non-perishable peanuts which they had previously cached in visuospatially distinct sites. Jays searched preferentially for fresh wax worms, their favoured food, when allowed to recover them shortly after caching. However, they rapidly learned to avoid searching for worms after a longer interval during which the worms had decayed. The recovery preference of jays demonstrates memory of where and when particular food items were cached, thereby fulfilling the behavioural criteria for episodic-like memory in non-human animals.
Ordering of guarded and unguarded stores for no-sync I/O
Gara, Alan; Ohmacht, Martin
2013-06-25
A parallel computing system processes at least one store instruction. A first processor core issues a store instruction. A first queue, associated with the first processor core, stores the store instruction. A second queue, associated with a first local cache memory device of the first processor core, stores the store instruction. The first processor core updates first data in the first local cache memory device according to the store instruction. The third queue, associated with at least one shared cache memory device, stores the store instruction. The first processor core invalidates second data, associated with the store instruction, in the at least one shared cache memory. The first processor core invalidates third data, associated with the store instruction, in other local cache memory devices of other processor cores. The first processor core flushing only the first queue.
Pravosudov, Vladimir V; Clayton, Nicola S
2002-08-01
To test the hypothesis that accurate cache recovery is more critical for birds that live in harsh conditions where the food supply is limited and unpredictable, the authors compared food caching, memory, and the hippocampus of black-capped chickadees (Poecile atricapilla) from Alaska and Colorado. Under identical laboratory conditions, Alaska chickadees (a) cached significantly more food; (b) were more efficient at cache recovery: (c) performed more accurately on one-trial associative learning tasks in which birds had to rely on spatial memory, but did not differ when tested on a nonspatial version of this task; and (d) had significantly larger hippocampal volumes containing more neurons compared with Colorado chickadees. The results support the hypothesis that these population differences may reflect adaptations to a harsh environment.
Accurate low-cost methods for performance evaluation of cache memory systems
NASA Technical Reports Server (NTRS)
Laha, Subhasis; Patel, Janak H.; Iyer, Ravishankar K.
1988-01-01
Methods of simulation based on statistical techniques are proposed to decrease the need for large trace measurements and for predicting true program behavior. Sampling techniques are applied while the address trace is collected from a workload. This drastically reduces the space and time needed to collect the trace. Simulation techniques are developed to use the sampled data not only to predict the mean miss rate of the cache, but also to provide an empirical estimate of its actual distribution. Finally, a concept of primed cache is introduced to simulate large caches by the sampling-based method.
The Optimization of In-Memory Space Partitioning Trees for Cache Utilization
NASA Astrophysics Data System (ADS)
Yeo, Myung Ho; Min, Young Soo; Bok, Kyoung Soo; Yoo, Jae Soo
In this paper, a novel cache conscious indexing technique based on space partitioning trees is proposed. Many researchers investigated efficient cache conscious indexing techniques which improve retrieval performance of in-memory database management system recently. However, most studies considered data partitioning and targeted fast information retrieval. Existing data partitioning-based index structures significantly degrade performance due to the redundant accesses of overlapped spaces. Specially, R-tree-based index structures suffer from the propagation of MBR (Minimum Bounding Rectangle) information by updating data frequently. In this paper, we propose an in-memory space partitioning index structure for optimal cache utilization. The proposed index structure is compared with the existing index structures in terms of update performance, insertion performance and cache-utilization rate in a variety of environments. The results demonstrate that the proposed index structure offers better performance than existing index structures.
Efficient Cache use for Stencil Operations on Structured Discretization Grids
NASA Technical Reports Server (NTRS)
Frumkin, Michael; VanderWijngaart, Rob F.
2001-01-01
We derive tight bounds on the cache misses for evaluation of explicit stencil operators on structured grids. Our lower bound is based on the isoperimetrical property of the discrete octahedron. Our upper bound is based on a good surface to volume ratio of a parallelepiped spanned by a reduced basis of the interference lattice of a grid. Measurements show that our algorithm typically reduces the number of cache misses by a factor of three, relative to a compiler optimized code. We show that stencil calculations on grids whose interference lattice have a short vector feature abnormally high numbers of cache misses. We call such grids unfavorable and suggest to avoid these in computations by appropriate padding. By direct measurements on a MIPS R10000 processor we show a good correlation between abnormally high numbers of cache misses and unfavorable three-dimensional grids.
Wienert, Stephan; Beil, Michael; Saeger, Kai; Hufnagl, Peter; Schrader, Thomas
2009-01-09
The virtual microscopy is widely accepted in Pathology for educational purposes and teleconsultation but is far from the routine use in surgical pathology due to the technical requirements and some limitations. A technical problem is the limited bandwidth of a usual network and the delayed transmission rate and presentation time on the screen. In this study the process of secondary diagnostic was evaluated using the "T.Konsult Pathologie" service of the Professional Association of German Pathologists within the German breast cancer screening program. The characteristics of the access to the WSI (Whole Slide Images) have been analyzed to explore the possibilities of prefetching and caching to reduce the presentation and transfer time with the goal to increase user acceptance. The log files of the web server were analyzed to reconstruct the movements of the pathologist on the WSI and to create the observation path. Using a specialized tool the observation paths were extracted automatically from the log files. The attributes linearity, 3-point-linearity, changes per request, and number of consecutive requests were calculated to design, develop and evaluate different caching and prefetching strategies. The analysis of the observation paths showed that a complete accordance of two image requests is a very rare event. But more frequently a partial covering of two requested image areas can be found. In total 257 diagnostic paths from 131 WSI have been extracted and analysed. On average a diagnostic path consists of 16 image requests and takes 189 seconds between first and last image request. The mean linearity was 0,41 and the mean 3-point-linearity 0,85. Three different caching algorithms have been compared with respect to hit rate and additional image requests on the WSI server. Tests demonstrated that 95% of the diagnostic paths could be loaded without any deletion of entries in the cache (cache size 12,2 Megapixel). If the image parts are stored after JPEG compression this complies with less than 2 MB. WSI telepathology is a technology which offers the possibility to break the limitations of conventional static telepathology. The complete histological slide may be investigated instead of sets of images of lesions sampled by the presenting pathologist. The benefit is demonstrated by the high diagnostic security of 95% accordance between first and second diagnosis.
Wienert, Stephan; Beil, Michael; Saeger, Kai; Hufnagl, Peter; Schrader, Thomas
2009-01-01
Background The virtual microscopy is widely accepted in Pathology for educational purposes and teleconsultation but is far from the routine use in surgical pathology due to the technical requirements and some limitations. A technical problem is the limited bandwidth of a usual network and the delayed transmission rate and presentation time on the screen. Methods In this study the process of secondary diagnostic was evaluated using the "T.Konsult Pathologie" service of the Professional Association of German Pathologists within the German breast cancer screening program. The characteristics of the access to the WSI (Whole Slide Images) have been analyzed to explore the possibilities of prefetching and caching to reduce the presentation and transfer time with the goal to increase user acceptance. The log files of the web server were analyzed to reconstruct the movements of the pathologist on the WSI and to create the observation path. Using a specialized tool the observation paths were extracted automatically from the log files. The attributes linearity, 3-point-linearity, changes per request, and number of consecutive requests were calculated to design, develop and evaluate different caching and prefetching strategies. Results The analysis of the observation paths showed that a complete accordance of two image requests is a very rare event. But more frequently a partial covering of two requested image areas can be found. In total 257 diagnostic paths from 131 WSI have been extracted and analysed. On average a diagnostic path consists of 16 image requests and takes 189 seconds between first and last image request. The mean linearity was 0,41 and the mean 3-point-linearity 0,85. Three different caching algorithms have been compared with respect to hit rate and additional image requests on the WSI server. Tests demonstrated that 95% of the diagnostic paths could be loaded without any deletion of entries in the cache (cache size 12,2 Megapixel). If the image parts are stored after JPEG compression this complies with less than 2 MB. Discussion WSI telepathology is a technology which offers the possibility to break the limitations of conventional static telepathology. The complete histological slide may be investigated instead of sets of images of lesions sampled by the presenting pathologist. The benefit is demonstrated by the high diagnostic security of 95% accordance between first and second diagnosis. PMID:19134181
Zimmer, Hubert D; Lehnert, Günther
2006-01-01
If configurations of objects are presented in a S1-S2 matching task for the identity of objects a spatial mismatch effect occurs. Changing the (irrelevant) spatial layout lengthens response times. We investigated what causes this effect. We observed a reliable mismatch effect that was not influenced by a secondary task during maintenance. Neither articulatory suppression (Experiment 1), nor unattended (Experiments 2 and 6) or attended visual material (Experiment 3) reduced the effect, and this was independent of the length of the retention interval (Experiment 6). The effect was also rather independent of the visual appearance of the local elements. It was of similar size with color patches (Experiment 4) and with completely different surface information when testing was cross modal (Experiment 5), and the name-ability of the global configuration was not relevant (Experiments 6 and 7). In contrast, the figurative similarity of the configurations of S1 and S2 systematically influenced the size of the spatial mismatch effect (Experiment 7). We conclude that the spatial mismatch effect is caused by a mismatch of the global shape of the configuration stored together with the objects of S1 and not by a mismatch of templates of perceptual records maintained in a visual cache.
On the relative contributions of wind vs. animals to seed dispersal of four Sierra Nevada pines.
Vander Wall, Stephen B
2008-07-01
Selective pressures that influence the form of seed dispersal syndromes are poorly understood. Morphology of plant propagules is often used to infer the means of dispersal, but morphology can be misleading. Several species of pines, for example, have winged seeds adapted for wind dispersal but owe much of their establishment to scatter-hoarding animals. Here the relative importance of wind vs. animal dispersal is assessed for four species of pines of the eastern Sierra Nevada that have winged seeds but differed in seed size: lodgepole pine (Pinus contorta murrayana, 8 mg); ponderosa pine (Pinus ponderosa ponderosa, 56 mg); Jeffrey pine (Pinus jeffreyi, 160 mg); and sugar pine (Pinus lambertiana, 231 mg). Pre-dispersal seed mortality eliminated much of the ponderosa pine seed crop (66%), but had much less effect on Jeffrey pine (32% of seeds destroyed), lodgepole pine (29%), and sugar pine (7%). When cones opened most filled seeds were dispersed by wind. Animals removed > 99% of wind-dispersed Jeffrey and sugar pine seeds from the ground within 60 days, but animals gathered only 93% of lodgepole pine seeds and 38% of ponderosa pine seeds during the same period. Animals gathered and scatter hoarded radioactively labeled ponderosa, Jeffrey, and sugar pine seeds, making a total of 2103 caches over three years of study. Only three lodgepole pine caches were found. Caches typically contained 1-4 seeds buried 5-20 mm deep, depths suitable for seedling emergence. Although Jeffrey and sugar pine seeds are initially wind dispersed, nearly all seedlings arise from animal caches. Lodgepole pine is almost exclusively wind dispersed, with animals acting as seed predators. Animals treated ponderosa pine in an intermediate fashion. Two-phased dispersal of large, winged pine seeds appears adaptive; initial wind dispersal helps to minimize pre-dispersal seed mortality whereas scatter hoarding by animals places seeds in sites with a higher probability of seedling establishment.
Cache directory lookup reader set encoding for partial cache line speculation support
Gara, Alan; Ohmacht, Martin
2014-10-21
In a multiprocessor system, with conflict checking implemented in a directory lookup of a shared cache memory, a reader set encoding permits dynamic recordation of read accesses. The reader set encoding includes an indication of a portion of a line read, for instance by indicating boundaries of read accesses. Different encodings may apply to different types of speculative execution.
Nelson, Michael E.
2011-01-01
A single Gray Wolf (Canis lupus) killed an adult male White-tailed Deer (Odocoileus virginianus) and cached the intact carcass in 76 cm of snow. The carcass was revisited and entirely consumed between four and seven days later. This is the first recorded observation of a Gray Wolf caching an entire adult deer.
Formal verification of an MMU and MMU cache
NASA Technical Reports Server (NTRS)
Schubert, E. T.
1991-01-01
We describe the formal verification of a hardware subsystem consisting of a memory management unit and a cache. These devices are verified independently and then shown to interact correctly when composed. The MMU authorizes memory requests and translates virtual addresses to real addresses. The cache improves performance by maintaining a LRU (least recently used) list from the memory resident segment table.
NASA Astrophysics Data System (ADS)
Pleros, Nikos; Maniotis, Pavlos; Alexoudi, Theonitsa; Fitsios, Dimitris; Vagionas, Christos; Papaioannou, Sotiris; Vyrsokinos, K.; Kanellos, George T.
2014-03-01
The processor-memory performance gap, commonly referred to as "Memory Wall" problem, owes to the speed mismatch between processor and electronic RAM clock frequencies, forcing current Chip Multiprocessor (CMP) configurations to consume more than 50% of the chip real-estate for caching purposes. In this article, we present our recent work spanning from Si-based integrated optical RAM cell architectures up to complete optical cache memory architectures for Chip Multiprocessor configurations. Moreover, we discuss on e/o router subsystems with up to Tb/s routing capacity for cache interconnection purposes within CMP configurations, currently pursued within the FP7 PhoxTrot project.
A search game model of the scatter hoarder's problem
Alpern, Steve; Fokkink, Robbert; Lidbetter, Thomas; Clayton, Nicola S.
2012-01-01
Scatter hoarders are animals (e.g. squirrels) who cache food (nuts) over a number of sites for later collection. A certain minimum amount of food must be recovered, possibly after pilfering by another animal, in order to survive the winter. An optimal caching strategy is one that maximizes the survival probability, given worst case behaviour of the pilferer. We modify certain ‘accumulation games’ studied by Kikuta & Ruckle (2000 J. Optim. Theory Appl.) and Kikuta & Ruckle (2001 Naval Res. Logist.), which modelled the problem of optimal diversification of resources against catastrophic loss, to include the depth at which the food is hidden at each caching site. Optimal caching strategies can then be determined as equilibria in a new ‘caching game’. We show how the distribution of food over sites and the site-depths of the optimal caching varies with the animal's survival requirements and the amount of pilfering. We show that in some cases, ‘decoy nuts’ are required to be placed above other nuts that are buried further down at the same site. Methods from the field of search games are used. Some empirically observed behaviour can be shown to be optimal in our model. PMID:22012971
Image matrix processor for fast multi-dimensional computations
Roberson, George P.; Skeate, Michael F.
1996-01-01
An apparatus for multi-dimensional computation which comprises a computation engine, including a plurality of processing modules. The processing modules are configured in parallel and compute respective contributions to a computed multi-dimensional image of respective two dimensional data sets. A high-speed, parallel access storage system is provided which stores the multi-dimensional data sets, and a switching circuit routes the data among the processing modules in the computation engine and the storage system. A data acquisition port receives the two dimensional data sets representing projections through an image, for reconstruction algorithms such as encountered in computerized tomography. The processing modules include a programmable local host, by which they may be configured to execute a plurality of different types of multi-dimensional algorithms. The processing modules thus include an image manipulation processor, which includes a source cache, a target cache, a coefficient table, and control software for executing image transformation routines using data in the source cache and the coefficient table and loading resulting data in the target cache. The local host processor operates to load the source cache with a two dimensional data set, loads the coefficient table, and transfers resulting data out of the target cache to the storage system, or to another destination.
Joshua tree (Yucca brevifolia) seeds are dispersed by seed-caching rodents
Vander Wall, S.B.; Esque, T.; Haines, D.; Garnett, M.; Waitman, B.A.
2006-01-01
Joshua tree (Yucca brevifolia) is a distinctive and charismatic plant of the Mojave Desert. Although floral biology and seed production of Joshua tree and other yuccas are well understood, the fate of Joshua tree seeds has never been studied. We tested the hypothesis that Joshua tree seeds are dispersed by seed-caching rodents. We radioactively labelled Joshua tree seeds and followed their fates at five source plants in Potosi Wash, Clark County, Nevada, USA. Rodents made a mean of 30.6 caches, usually within 30 m of the base of source plants. Caches contained a mean of 5.2 seeds buried 3-30 nun deep. A variety of rodent species appears to have prepared the caches. Three of the 836 Joshua tree seeds (0.4%) cached germinated the following spring. Seed germination using rodent exclosures was nearly 15%. More than 82% of seeds in open plots were removed by granivores, and neither microsite nor supplemental water significantly affected germination. Joshua tree produces seeds in indehiscent pods or capsules, which rodents dismantle to harvest seeds. Because there is no other known means of seed dispersal, it is possible that the Joshua tree-rodent seed dispersal interaction is an obligate mutualism for the plant.
Minimizing Cache Misses Using Minimum-Surface Bodies
NASA Technical Reports Server (NTRS)
Frumkin, Michael; VanderWijngaart, Rob; Biegel, Bryan (Technical Monitor)
2002-01-01
A number of known techniques for improving cache performance in scientific computations involve the reordering of the iteration space. Some of these reorderings can be considered as coverings of the iteration space with the sets having good surface-to-volume ratio. Use of such sets reduces the number of cache misses in computations of local operators having the iteration space as a domain. First, we derive lower bounds which any algorithm must suffer while computing a local operator on a grid. Then we explore coverings of iteration spaces represented by structured and unstructured grids which allow us to approach these lower bounds. For structured grids we introduce a covering by successive minima tiles of the interference lattice of the grid. We show that the covering has low surface-to-volume ratio and present a computer experiment showing actual reduction of the cache misses achieved by using these tiles. For planar unstructured grids we show existence of a covering which reduces the number of cache misses to the level of structured grids. On the other hand, we present a triangulation of a 3-dimensional cube such that any local operator on the corresponding grid has significantly larger number of cache misses than a similar operator on a structured grid.
NASA Technical Reports Server (NTRS)
Younse, Paulo J.; Dicicco, Matthew A.; Morgan, Albert R.
2012-01-01
A report describes the PLuto (programmable logic) Mars Technology Rover, a mid-sized FIDO (field integrated design and operations) class rover with six fully drivable and steerable cleated wheels, a rocker-bogey suspension, a pan-tilt mast with panorama and navigation stereo camera pairs, forward and rear stereo hazcam pairs, internal avionics with motor drivers and CPU, and a 5-degrees-of-freedom robotic arm. The technology rover was integrated with an arm-mounted percussive coring tool, microimager, and sample handling encapsulation containerization subsystem (SHEC). The turret of the arm contains a percussive coring drill and microimager. The SHEC sample caching system mounted to the rover body contains coring bits, sample tubes, and sample plugs. The coring activities performed in the field provide valuable data on drilling conditions for NASA tasks developing and studying coring technology. Caching of samples using the SHEC system provide insight to NASA tasks investigating techniques to store core samples in the future.
Parallelization of an Object-Oriented Unstructured Aeroacoustics Solver
NASA Technical Reports Server (NTRS)
Baggag, Abdelkader; Atkins, Harold; Oezturan, Can; Keyes, David
1999-01-01
A computational aeroacoustics code based on the discontinuous Galerkin method is ported to several parallel platforms using MPI. The discontinuous Galerkin method is a compact high-order method that retains its accuracy and robustness on non-smooth unstructured meshes. In its semi-discrete form, the discontinuous Galerkin method can be combined with explicit time marching methods making it well suited to time accurate computations. The compact nature of the discontinuous Galerkin method also makes it well suited for distributed memory parallel platforms. The original serial code was written using an object-oriented approach and was previously optimized for cache-based machines. The port to parallel platforms was achieved simply by treating partition boundaries as a type of boundary condition. Code modifications were minimal because boundary conditions were abstractions in the original program. Scalability results are presented for the SCI Origin, IBM SP2, and clusters of SGI and Sun workstations. Slightly superlinear speedup is achieved on a fixed-size problem on the Origin, due to cache effects.
NASA Astrophysics Data System (ADS)
Dykstra, D.; Bockelman, B.; Blomer, J.; Herner, K.; Levshina, T.; Slyz, M.
2015-12-01
A common use pattern in the computing models of particle physics experiments is running many distributed applications that read from a shared set of data files. We refer to this data is auxiliary data, to distinguish it from (a) event data from the detector (which tends to be different for every job), and (b) conditions data about the detector (which tends to be the same for each job in a batch of jobs). Relatively speaking, conditions data also tends to be relatively small per job where both event data and auxiliary data are larger per job. Unlike event data, auxiliary data comes from a limited working set of shared files. Since there is spatial locality of the auxiliary data access, the use case appears to be identical to that of the CernVM- Filesystem (CVMFS). However, we show that distributing auxiliary data through CVMFS causes the existing CVMFS infrastructure to perform poorly. We utilize a CVMFS client feature called "alien cache" to cache data on existing local high-bandwidth data servers that were engineered for storing event data. This cache is shared between the worker nodes at a site and replaces caching CVMFS files on both the worker node local disks and on the site's local squids. We have tested this alien cache with the dCache NFSv4.1 interface, Lustre, and the Hadoop Distributed File System (HDFS) FUSE interface, and measured performance. In addition, we use high-bandwidth data servers at central sites to perform the CVMFS Stratum 1 function instead of the low-bandwidth web servers deployed for the CVMFS software distribution function. We have tested this using the dCache HTTP interface. As a result, we have a design for an end-to-end high-bandwidth distributed caching read-only filesystem, using existing client software already widely deployed to grid worker nodes and existing file servers already widely installed at grid sites. Files are published in a central place and are soon available on demand throughout the grid and cached locally on the site with a convenient POSIX interface. This paper discusses the details of the architecture and reports performance measurements.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dykstra, D.; Bockelman, B.; Blomer, J.
A common use pattern in the computing models of particle physics experiments is running many distributed applications that read from a shared set of data files. We refer to this data is auxiliary data, to distinguish it from (a) event data from the detector (which tends to be different for every job), and (b) conditions data about the detector (which tends to be the same for each job in a batch of jobs). Relatively speaking, conditions data also tends to be relatively small per job where both event data and auxiliary data are larger per job. Unlike event data, auxiliarymore » data comes from a limited working set of shared files. Since there is spatial locality of the auxiliary data access, the use case appears to be identical to that of the CernVM- Filesystem (CVMFS). However, we show that distributing auxiliary data through CVMFS causes the existing CVMFS infrastructure to perform poorly. We utilize a CVMFS client feature called 'alien cache' to cache data on existing local high-bandwidth data servers that were engineered for storing event data. This cache is shared between the worker nodes at a site and replaces caching CVMFS files on both the worker node local disks and on the site's local squids. We have tested this alien cache with the dCache NFSv4.1 interface, Lustre, and the Hadoop Distributed File System (HDFS) FUSE interface, and measured performance. In addition, we use high-bandwidth data servers at central sites to perform the CVMFS Stratum 1 function instead of the low-bandwidth web servers deployed for the CVMFS software distribution function. We have tested this using the dCache HTTP interface. As a result, we have a design for an end-to-end high-bandwidth distributed caching read-only filesystem, using existing client software already widely deployed to grid worker nodes and existing file servers already widely installed at grid sites. Files are published in a central place and are soon available on demand throughout the grid and cached locally on the site with a convenient POSIX interface. This paper discusses the details of the architecture and reports performance measurements.« less
Agricultural Influences on Cache Valley, Utah Air Quality During a Wintertime Inversion Episode
NASA Astrophysics Data System (ADS)
Silva, P. J.
2017-12-01
Several of northern Utah's intermountain valleys are classified as non-attainment for fine particulate matter. Past data indicate that ammonium nitrate is the major contributor to fine particles and that the gas phase ammonia concentrations are among the highest in the United States. During the 2017 Utah Winter Fine Particulate Study, USDA brought a suite of online and real-time measurement methods to sample particulate matter and potential gaseous precursors from agricultural emissions in the Cache Valley. Instruments were co-located at the State of Utah monitoring site in Smithfield, Utah from January 21st through February 12th, 2017. A Scanning mobility particle sizer (SMPS) and aerodynamic particle sizer (APS) acquired size distributions of particles from 10 nm - 10 μm in 5-min intervals. A URG ambient ion monitor (AIM) gave hourly concentrations for gas and particulate ions and a Chromatotec Trsmedor gas chromatograph obtained 10 minute measurements of gaseous sulfur species. High ammonia concentrations were detected at the Smithfield site with concentrations above 100 ppb at times, indicating a significant influence from agriculture at the sampling site. Ammonia is not the only agricultural emission elevated in Cache Valley during winter, as reduced sulfur gas concentrations of up to 20 ppb were also detected. Dimethylsulfide was the major sulfur-containing gaseous species. Analysis indicates that particle growth and particle nucleation events were both observed by the SMPS. Relationships between gas and particulate concentrations and correlations between the two will be discussed.
An effective write policy for software coherence schemes
NASA Technical Reports Server (NTRS)
Chen, Yung-Chin; Veidenbaum, Alexander V.
1992-01-01
The authors study the write behavior and evaluate the performance of various write strategies and buffering techniques for a MIN-based multiprocessor system using the simple software coherence scheme. Hit ratios, memory latencies, total execution time, and total write traffic are used as the performance indices. The write-through write-allocate no-fetch cache using a write-back write buffer is shown to have a better performance than both write-through and write-back caches. This type of write buffer is effective in reducing the volume as well as bursts of write traffic. On average, the use of a write-back cache reduces by 60 percent the total write traffic generated by a write-through cache.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smith, Tyler Barratt; Urrea, Jorge Mario
2012-06-01
The aim of the Authenticating Cache architecture is to ensure that machine instructions in a Read Only Memory (ROM) are legitimate from the time the ROM image is signed (immediately after compilation) to the time they are placed in the cache for the processor to consume. The proposed architecture allows the detection of ROM image modifications during distribution or when it is loaded into memory. It also ensures that modified instructions will not execute in the processor-as the cache will not be loaded with a page that fails an integrity check. The authenticity of the instruction stream can also bemore » verified in this architecture. The combination of integrity and authenticity assurance greatly improves the security profile of a system.« less
Replication Strategy for Spatiotemporal Data Based on Distributed Caching System
Xiong, Lian; Tao, Yang; Xu, Juan; Zhao, Lun
2018-01-01
The replica strategy in distributed cache can effectively reduce user access delay and improve system performance. However, developing a replica strategy suitable for varied application scenarios is still quite challenging, owing to differences in user access behavior and preferences. In this paper, a replication strategy for spatiotemporal data (RSSD) based on a distributed caching system is proposed. By taking advantage of the spatiotemporal locality and correlation of user access, RSSD mines high popularity and associated files from historical user access information, and then generates replicas and selects appropriate cache node for placement. Experimental results show that the RSSD algorithm is simple and efficient, and succeeds in significantly reducing user access delay. PMID:29342897
Error recovery in shared memory multiprocessors using private caches
NASA Technical Reports Server (NTRS)
Wu, Kun-Lung; Fuchs, W. Kent; Patel, Janak H.
1990-01-01
The problem of recovering from processor transient faults in shared memory multiprocesses systems is examined. A user-transparent checkpointing and recovery scheme using private caches is presented. Processes can recover from errors due to faulty processors by restarting from the checkpointed computation state. Implementation techniques using checkpoint identifiers and recovery stacks are examined as a means of reducing performance degradation in processor utilization during normal execution. This cache-based checkpointing technique prevents rollback propagation, provides rapid recovery, and can be integrated into standard cache coherence protocols. An analytical model is used to estimate the relative performance of the scheme during normal execution. Extensions to take error latency into account are presented.
Magpies can use local cues to retrieve their food caches.
Feenders, Gesa; Smulders, Tom V
2011-03-01
Much importance has been placed on the use of spatial cues by food-hoarding birds in the retrieval of their caches. In this study, we investigate whether food-hoarding birds can be trained to use local cues ("beacons") in their cache retrieval. We test magpies (Pica pica) in an active hoarding-retrieval paradigm, where local cues are always reliable, while spatial cues are not. Our results show that the birds use the local cues to retrieve their caches, even when occasionally contradicting spatial information is available. The design of our study does not allow us to test rigorously whether the birds prefer using local over spatial cues, nor to investigate the process through which they learn to use local cues. We furthermore provide evidence that magpies develop landmark preferences, which improve their retrieval accuracy. Our findings support the hypothesis that birds are flexible in their use of memory information, using a combination of the most reliable or salient information to retrieve their caches. © Springer-Verlag 2010
Using Minimum-Surface Bodies for Iteration Space Partitioning
NASA Technical Reports Server (NTRS)
Frumlin, Michael; VanderWijngaart, Rob F.; Biegel, Bryan (Technical Monitor)
2001-01-01
A number of known techniques for improving cache performance in scientific computations involve the reordering of the iteration space. Some of these reorderings can be considered as coverings of the iteration space with the sets having good surface-to-volume ratio. Use of such sets reduces the number of cache misses in computations of local operators having the iteration space as a domain. We study coverings of iteration spaces represented by structured and unstructured grids. For structured grids we introduce a covering based on successive minima tiles of the interference lattice of the grid. We show that the covering has good surface-to-volume ratio and present a computer experiment showing actual reduction of the cache misses achieved by using these tiles. For unstructured grids no cache efficient covering can be guaranteed. We present a triangulation of a 3-dimensional cube such that any local operator on the corresponding grid has significantly larger number of cache misses than a similar operator on a structured grid.
Wiewiórka, Marek S; Messina, Antonio; Pacholewska, Alicja; Maffioletti, Sergio; Gawrysiak, Piotr; Okoniewski, Michał J
2014-09-15
Many time-consuming analyses of next -: generation sequencing data can be addressed with modern cloud computing. The Apache Hadoop-based solutions have become popular in genomics BECAUSE OF: their scalability in a cloud infrastructure. So far, most of these tools have been used for batch data processing rather than interactive data querying. The SparkSeq software has been created to take advantage of a new MapReduce framework, Apache Spark, for next-generation sequencing data. SparkSeq is a general-purpose, flexible and easily extendable library for genomic cloud computing. It can be used to build genomic analysis pipelines in Scala and run them in an interactive way. SparkSeq opens up the possibility of customized ad hoc secondary analyses and iterative machine learning algorithms. This article demonstrates its scalability and overall fast performance by running the analyses of sequencing datasets. Tests of SparkSeq also prove that the use of cache and HDFS block size can be tuned for the optimal performance on multiple worker nodes. Available under open source Apache 2.0 license: https://bitbucket.org/mwiewiorka/sparkseq/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
NASA Astrophysics Data System (ADS)
Hayakawa, Hitoshi; Ogawa, Makoto; Shibata, Tadashi
2005-04-01
A very large scale integrated circuit (VLSI) architecture for a multiple-instruction-stream multiple-data-stream (MIMD) associative processor has been proposed. The processor employs an architecture that enables seamless switching from associative operations to arithmetic operations. The MIMD element is convertible to a regular central processing unit (CPU) while maintaining its high performance as an associative processor. Therefore, the MIMD associative processor can perform not only on-chip perception, i.e., searching for the vector most similar to an input vector throughout the on-chip cache memory, but also arithmetic and logic operations similar to those in ordinary CPUs, both simultaneously in parallel processing. Three key technologies have been developed to generate the MIMD element: associative-operation-and-arithmetic-operation switchable calculation units, a versatile register control scheme within the MIMD element for flexible operations, and a short instruction set for minimizing the memory size for program storage. Key circuit blocks were designed and fabricated using 0.18 μm complementary metal-oxide-semiconductor (CMOS) technology. As a result, the full-featured MIMD element is estimated to be 3 mm2, showing the feasibility of an 8-parallel-MIMD-element associative processor in a single chip of 5 mm× 5 mm.
Software Exploit Prevention and Remediation via Software Memory Protection
2009-05-01
trampolines that are necessary. Trampolines are pieces of code emitted into the fragment cache to transfer con- trol back to Strata. Most control...transfer instructions (CTIs) are initially linked to trampolines (unless the transfer target already exists in the fragment cache). Once a CTI’s target...instruction becomes available in the fragment cache, the CTI is linked directly to the destination, avoiding future uses of the trampoline . This
Image matrix processor for fast multi-dimensional computations
Roberson, G.P.; Skeate, M.F.
1996-10-15
An apparatus for multi-dimensional computation is disclosed which comprises a computation engine, including a plurality of processing modules. The processing modules are configured in parallel and compute respective contributions to a computed multi-dimensional image of respective two dimensional data sets. A high-speed, parallel access storage system is provided which stores the multi-dimensional data sets, and a switching circuit routes the data among the processing modules in the computation engine and the storage system. A data acquisition port receives the two dimensional data sets representing projections through an image, for reconstruction algorithms such as encountered in computerized tomography. The processing modules include a programmable local host, by which they may be configured to execute a plurality of different types of multi-dimensional algorithms. The processing modules thus include an image manipulation processor, which includes a source cache, a target cache, a coefficient table, and control software for executing image transformation routines using data in the source cache and the coefficient table and loading resulting data in the target cache. The local host processor operates to load the source cache with a two dimensional data set, loads the coefficient table, and transfers resulting data out of the target cache to the storage system, or to another destination. 10 figs.
Zhou, ZhangBing; Zhao, Deng; Shu, Lei; Tsang, Kim-Fung
2015-01-01
Wireless sensor networks, serving as an important interface between physical environments and computational systems, have been used extensively for supporting domain applications, where multiple-attribute sensory data are queried from the network continuously and periodically. Usually, certain sensory data may not vary significantly within a certain time duration for certain applications. In this setting, sensory data gathered at a certain time slot can be used for answering concurrent queries and may be reused for answering the forthcoming queries when the variation of these data is within a certain threshold. To address this challenge, a popularity-based cooperative caching mechanism is proposed in this article, where the popularity of sensory data is calculated according to the queries issued in recent time slots. This popularity reflects the possibility that sensory data are interested in the forthcoming queries. Generally, sensory data with the highest popularity are cached at the sink node, while sensory data that may not be interested in the forthcoming queries are cached in the head nodes of divided grid cells. Leveraging these cooperatively cached sensory data, queries are answered through composing these two-tier cached data. Experimental evaluation shows that this approach can reduce the network communication cost significantly and increase the network capability. PMID:26131665
Analysis of power gating in different hierarchical levels of 2MB cache, considering variation
NASA Astrophysics Data System (ADS)
Jafari, Mohsen; Imani, Mohsen; Fathipour, Morteza
2015-09-01
This article reintroduces power gating technique in different hierarchical levels of static random-access memory (SRAM) design including cell, row, bank and entire cache memory in 16 nm Fin field effect transistor. Different structures of SRAM cells such as 6T, 8T, 9T and 10T are used in design of 2MB cache memory. The power reduction of the entire cache memory employing cell-level optimisation is 99.7% with the expense of area and other stability overheads. The power saving of the cell-level optimisation is 3× (1.2×) higher than power gating in cache (bank) level due to its superior selectivity. The access delay times are allowed to increase by 4% in the same energy delay product to achieve the best power reduction for each supply voltages and optimisation levels. The results show the row-level power gating is the best for optimising the power of the entire cache with lowest drawbacks. Comparisons of cells show that the cells whose bodies have higher power consumption are the best candidates for power gating technique in row-level optimisation. The technique has the lowest percentage of saving in minimum energy point (MEP) of the design. The power gating also improves the variation of power in all structures by at least 70%.
Respiratory hospital admissions associated with PM10 pollution in Utah, Salt Lake, and Cache Valleys
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pope CA, I.I.I.
This study assessed the association between respiratory hospital admissions and PM10 pollution in Utah, Salt Lake, and Cache valleys during April 1985 through March 1989. Utah and Salt Lake valleys had high levels of PM10 pollution that violated both the annual and 24-h standards issued by the Environmental Protection Agency (EPA). Much lower PM10 levels occurred in the Cache Valley. Utah Valley experienced the intermittent operation of its primary source of PM10 pollution: an integrated steel mill. Bronchitis and asthma admissions for preschool-age children were approximately twice as frequent in Utah Valley when the steel mill was operating versus whenmore » it was not. Similar differences were not observed in Salt Lake or Cache valleys. Even though Cache Valley had higher smoking rates and lower temperatures in winter than did Utah Valley, per capita bronchitis and asthma admissions for all ages were approximately twice as high in Utah Valley. During the period when the steel mill was closed, differences in per capita admissions between Utah and Cache valleys narrowed considerably. Regression analysis also demonstrated a statistical association between respiratory hospital admissions and PM10 pollution. The results suggest that PM10 pollution plays a role in the incidence and severity of respiratory disease.« less
Longland, William; Ostoja, Steven M.
2013-01-01
Seeds of Indian ricegrass (Achnatherum hymenoides), a native bunchgrass common to sandy soils on arid western rangelands, are naturally dispersed by seed-caching rodent species, particularly Dipodomys spp. (kangaroo rats). These animals cache large quantities of seeds when mature seeds are available on or beneath plants and recover most of their caches for consumption during the remainder of the year. Unrecovered seeds in caches account for the vast majority of Indian ricegrass seedling recruitment. We applied three different densities of white millet (Panicum miliaceum) seeds as “diversionary foods” to plots at three Great Basin study sites in an attempt to reduce rodents' over-winter cache recovery so that more Indian ricegrass seeds would remain in soil seedbanks and potentially establish new seedlings. One year after diversionary seed application, a moderate level of Indian ricegrass seedling recruitment occurred at two of our study sites in western Nevada, although there was no recruitment at the third site in eastern California. At both Nevada sites, the number of Indian ricegrass seedlings sampled along transects was significantly greater on all plots treated with diversionary seeds than on non-seeded control plots. However, the density of diversionary seeds applied to plots had a marginally non-significant effect on seedling recruitment, and it was not correlated with recruitment patterns among plots. Results suggest that application of a diversionary seed type that is preferred by seed-caching rodents provides a promising passive restoration strategy for target plant species that are dispersed by these rodents.
An efficient parallel-processing method for transposing large matrices in place.
Portnoff, M R
1999-01-01
We have developed an efficient algorithm for transposing large matrices in place. The algorithm is efficient because data are accessed either sequentially in blocks or randomly within blocks small enough to fit in cache, and because the same indexing calculations are shared among identical procedures operating on independent subsets of the data. This inherent parallelism makes the method well suited for a multiprocessor computing environment. The algorithm is easy to implement because the same two procedures are applied to the data in various groupings to carry out the complete transpose operation. Using only a single processor, we have demonstrated nearly an order of magnitude increase in speed over the previously published algorithm by Gate and Twigg for transposing a large rectangular matrix in place. With multiple processors operating in parallel, the processing speed increases almost linearly with the number of processors. A simplified version of the algorithm for square matrices is presented as well as an extension for matrices large enough to require virtual memory.
Security Enhancement Using Cache Based Reauthentication in WiMAX Based E-Learning System
Rajagopal, Chithra; Bhuvaneshwaran, Kalaavathi
2015-01-01
WiMAX networks are the most suitable for E-Learning through their Broadcast and Multicast Services at rural areas. Authentication of users is carried out by AAA server in WiMAX. In E-Learning systems the users must be forced to perform reauthentication to overcome the session hijacking problem. The reauthentication of users introduces frequent delay in the data access which is crucial in delaying sensitive applications such as E-Learning. In order to perform fast reauthentication caching mechanism known as Key Caching Based Authentication scheme is introduced in this paper. Even though the cache mechanism requires extra storage to keep the user credentials, this type of mechanism reduces the 50% of the delay occurring during reauthentication. PMID:26351658
Security Enhancement Using Cache Based Reauthentication in WiMAX Based E-Learning System.
Rajagopal, Chithra; Bhuvaneshwaran, Kalaavathi
2015-01-01
WiMAX networks are the most suitable for E-Learning through their Broadcast and Multicast Services at rural areas. Authentication of users is carried out by AAA server in WiMAX. In E-Learning systems the users must be forced to perform reauthentication to overcome the session hijacking problem. The reauthentication of users introduces frequent delay in the data access which is crucial in delaying sensitive applications such as E-Learning. In order to perform fast reauthentication caching mechanism known as Key Caching Based Authentication scheme is introduced in this paper. Even though the cache mechanism requires extra storage to keep the user credentials, this type of mechanism reduces the 50% of the delay occurring during reauthentication.
Clayton, Nicola S; Yu, Kara Shirley; Dickinson, Anthony
2003-01-01
When Western Scrub-Jays (Aphelocoma californica) cached and recovered perishable crickets, N. S. Clayton, K. S. Yu, and A. Dickinson (2001) reported that the jays rapidly learned to search for fresh crickets after a 1-day retention interval (RI) between caching and recovery but to avoid searching for perished crickets after a 4-day RI. In the present experiments, the jays generalized their search preference for crickets to intermediate RIs and used novel information about the rate of decay of crickets presented during the RI to reverse these search preferences at recovery. The authors interpret this reversal as evidence that the birds can integrate information about the caching episode with new information presented during the RI.
Improving Block-level Efficiency with scsi-mq
DOE Office of Scientific and Technical Information (OSTI.GOV)
Caldwell, Blake A
2015-01-01
Current generation solid-state storage devices are exposing a new bottlenecks in the SCSI and block layers of the Linux kernel, where IO throughput is limited by lock contention, inefficient interrupt handling, and poor memory locality. To address these limitations, the Linux kernel block layer underwent a major rewrite with the blk-mq project to move from a single request queue to a multi-queue model. The Linux SCSI subsystem rework to make use of this new model, known as scsi-mq, has been merged into the Linux kernel and work is underway for dm-multipath support in the upcoming Linux 4.0 kernel. These piecesmore » were necessary to make use of the multi-queue block layer in a Lustre parallel filesystem with high availability requirements. We undertook adding support of the 3.18 kernel to Lustre with scsi-mq and dm-multipath patches to evaluate the potential of these efficiency improvements. In this paper we evaluate the block-level performance of scsi-mq with backing storage hardware representative of a HPC-targerted Lustre filesystem. Our findings show that SCSI write request latency is reduced by as much as 13.6%. Additionally, when profiling the CPU usage of our prototype Lustre filesystem, we found that CPU idle time increased by a factor of 7 with Linux 3.18 and blk-mq as compared to a standard 2.6.32 Linux kernel. Our findings demonstrate increased efficiency of the multi-queue block layer even with disk-based caching storage arrays used in existing parallel filesystems.« less
Cache Sharing and Isolation Tradeoffs in Multicore Mixed-Criticality Systems
2015-05-01
of lockdown registers, to provide way-based partitioning. These alternatives are illustrated in Fig. 1 with respect to a quad-core ARM Cortex A9...presented a cache-partitioning scheme that allows multiple tasks to share the same cache partition on a single processor (as we do for Level-A and...sets and determined the fraction that were schedulable on our target hardware platform, the quad-core ARM Cortex A9 machine mentioned earlier, the LLC
Software-Controlled Caches in the VMP Multiprocessor
1986-03-01
programming system level that Processors is tuned for the VMP design. In this vein, we are interested in exploring how far the software support can go to ...handled in software, analogously to the handling agement of the shared program state is familiar and of virtual memory page faults. Hardware support for...ensure good behavior, as opposed to how Each cache miss results in bus traffic. Table 2 pro- vides the bus cost for the "average" cache miss. Fig
Constant time worker thread allocation via configuration caching
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eichenberger, Alexandre E; O'Brien, John K. P.
Mechanisms are provided for allocating threads for execution of a parallel region of code. A request for allocation of worker threads to execute the parallel region of code is received from a master thread. Cached thread allocation information identifying prior thread allocations that have been performed for the master thread are accessed. Worker threads are allocated to the master thread based on the cached thread allocation information. The parallel region of code is executed using the allocated worker threads.
PCM-Based Durable Write Cache for Fast Disk I/O
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Zhuo; Wang, Bin; Carpenter, Patrick
2012-01-01
Flash based solid-state devices (FSSDs) have been adopted within the memory hierarchy to improve the performance of hard disk drive (HDD) based storage system. However, with the fast development of storage-class memories, new storage technologies with better performance and higher write endurance than FSSDs are emerging, e.g., phase-change memory (PCM). Understanding how to leverage these state-of-the-art storage technologies for modern computing systems is important to solve challenging data intensive computing problems. In this paper, we propose to leverage PCM for a hybrid PCM-HDD storage architecture. We identify the limitations of traditional LRU caching algorithms for PCM-based caches, and develop amore » novel hash-based write caching scheme called HALO to improve random write performance of hard disks. To address the limited durability of PCM devices and solve the degraded spatial locality in traditional wear-leveling techniques, we further propose novel PCM management algorithms that provide effective wear-leveling while maximizing access parallelism. We have evaluated this PCM-based hybrid storage architecture using applications with a diverse set of I/O access patterns. Our experimental results demonstrate that the HALO caching scheme leads to an average reduction of 36.8% in execution time compared to the LRU caching scheme, and that the SFC wear leveling extends the lifetime of PCM by a factor of 21.6.« less
Cooperation and information replication in wireless networks.
Poularakis, Konstantinos; Tassiulas, Leandros
2016-03-06
A significant portion of today's network traffic is due to recurring downloads of a few popular contents. It has been observed that replicating the latter in caches installed at network edges-close to users-can drastically reduce network bandwidth usage and improve content access delay. Such caching architectures are gaining increasing interest in recent years as a way of dealing with the explosive traffic growth, fuelled further by the downward slope in storage space price. In this work, we provide an overview of caching with a particular emphasis on emerging network architectures that enable caching at the radio access network. In this context, novel challenges arise due to the broadcast nature of the wireless medium, which allows simultaneously serving multiple users tuned into a multicast stream, and the mobility of the users who may be frequently handed off from one cell tower to another. Existing results indicate that caching at the wireless edge has a great potential in removing bottlenecks on the wired backbone networks. Taking into consideration the schedule of multicast service and mobility profiles is crucial to extract maximum benefit in network performance. © 2016 The Author(s).
Turbidity and Total Suspended Solids on the Lower Cache River Watershed, AR.
Rosado-Berrios, Carlos A; Bouldin, Jennifer L
2016-06-01
The Cache River Watershed (CRW) in Arkansas is part of one of the largest remaining bottomland hardwood forests in the US. Although wetlands are known to improve water quality, the Cache River is listed as impaired due to sedimentation and turbidity. This study measured turbidity and total suspended solids (TSS) in seven sites of the lower CRW; six sites were located on the Bayou DeView tributary of the Cache River. Turbidity and TSS levels ranged from 1.21 to 896 NTU, and 0.17 to 386.33 mg/L respectively and had an increasing trend over the 3-year study. However, a decreasing trend from upstream to downstream in the Bayou DeView tributary was noted. Sediment loading calculated from high precipitation events and mean TSS values indicate that contributions from the Cache River main channel was approximately 6.6 times greater than contributions from Bayou DeView. Land use surrounding this river channel affects water quality as wetlands provide a filter for sediments in the Bayou DeView channel.
Integrating Cache Performance Modeling and Tuning Support in Parallelization Tools
NASA Technical Reports Server (NTRS)
Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)
1998-01-01
With the resurgence of distributed shared memory (DSM) systems based on cache-coherent Non Uniform Memory Access (ccNUMA) architectures and increasing disparity between memory and processors speeds, data locality overheads are becoming the greatest bottlenecks in the way of realizing potential high performance of these systems. While parallelization tools and compilers facilitate the users in porting their sequential applications to a DSM system, a lot of time and effort is needed to tune the memory performance of these applications to achieve reasonable speedup. In this paper, we show that integrating cache performance modeling and tuning support within a parallelization environment can alleviate this problem. The Cache Performance Modeling and Prediction Tool (CPMP), employs trace-driven simulation techniques without the overhead of generating and managing detailed address traces. CPMP predicts the cache performance impact of source code level "what-if" modifications in a program to assist a user in the tuning process. CPMP is built on top of a customized version of the Computer Aided Parallelization Tools (CAPTools) environment. Finally, we demonstrate how CPMP can be applied to tune a real Computational Fluid Dynamics (CFD) application.
Clark's nutcracker spatial memory: the importance of large, structural cues.
Bednekoff, Peter A; Balda, Russell P
2014-02-01
Clark's nutcrackers, Nucifraga columbiana, cache and recover stored seeds in high alpine areas including areas where snowfall, wind, and rockslides may frequently obscure or alter cues near the cache site. Previous work in the laboratory has established that Clark's nutcrackers use spatial memory to relocate cached food. Following from aspects of this work, we performed experiments to test the importance of large, structural cues for Clark's nutcracker spatial memory. Birds were no more accurate in recovering caches when more objects were on the floor of a large experimental room nor when this room was subdivided with a set of panels. However, nutcrackers were consistently less accurate in this large room than in a small experimental room. Clark's nutcrackers probably use structural features of experimental rooms as important landmarks during recovery of cached food. This use of large, extremely stable cues may reflect the imperfect reliability of smaller, closer cues in the natural habitat of Clark's nutcrackers. This article is part of a Special Issue entitled: CO3 2013. Copyright © 2013 Elsevier B.V. All rights reserved.
2014-06-06
Adaptive Management Plan NED national economic development NEPA National Environmental Policy Act NER National Ecosystem Restoration NFIP... management and flow maintenance (e.g., flood water height, channel and culvert sizing) are based on high water events (i.e., FEMA base flood – 1% or 100...Minimum 15 years of experience in economics X Minimum 15 years of experience in flood risk management analysis and benefits calculations X Direct
Data Resilience in the dCache Storage System
Rossi, A. L.; Adeyemi, F.; Ashish, A.; ...
2017-11-23
In this study we discuss design, implementation considerations, and performance of a new Resilience Service in the dCache storage system responsible for file availability and durability functionality.
Simplifying and speeding the management of intra-node cache coherence
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton on Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Phillip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Yorktown Heights, NY
2012-04-17
A method and apparatus for managing coherence between two processors of a two processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activated required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.
KOLAM: a cross-platform architecture for scalable visualization and tracking in wide-area imagery
NASA Astrophysics Data System (ADS)
Fraser, Joshua; Haridas, Anoop; Seetharaman, Guna; Rao, Raghuveer M.; Palaniappan, Kannappan
2013-05-01
KOLAM is an open, cross-platform, interoperable, scalable and extensible framework supporting a novel multi- scale spatiotemporal dual-cache data structure for big data visualization and visual analytics. This paper focuses on the use of KOLAM for target tracking in high-resolution, high throughput wide format video also known as wide-area motion imagery (WAMI). It was originally developed for the interactive visualization of extremely large geospatial imagery of high spatial and spectral resolution. KOLAM is platform, operating system and (graphics) hardware independent, and supports embedded datasets scalable from hundreds of gigabytes to feasibly petabytes in size on clusters, workstations, desktops and mobile computers. In addition to rapid roam, zoom and hyper- jump spatial operations, a large number of simultaneously viewable embedded pyramid layers (also referred to as multiscale or sparse imagery), interactive colormap and histogram enhancement, spherical projection and terrain maps are supported. The KOLAM software architecture was extended to support airborne wide-area motion imagery by organizing spatiotemporal tiles in very large format video frames using a temporal cache of tiled pyramid cached data structures. The current version supports WAMI animation, fast intelligent inspection, trajectory visualization and target tracking (digital tagging); the latter by interfacing with external automatic tracking software. One of the critical needs for working with WAMI is a supervised tracking and visualization tool that allows analysts to digitally tag multiple targets, quickly review and correct tracking results and apply geospatial visual analytic tools on the generated trajectories. One-click manual tracking combined with multiple automated tracking algorithms are available to assist the analyst and increase human effectiveness.
dCache: Big Data storage for HEP communities and beyond
NASA Astrophysics Data System (ADS)
Millar, A. P.; Behrmann, G.; Bernardt, C.; Fuhrmann, P.; Litvintsev, D.; Mkrtchyan, T.; Petersen, A.; Rossi, A.; Schwank, K.
2014-06-01
With over ten years in production use dCache data storage system has evolved to match ever changing lansdcape of continually evolving storage technologies with new solutions to both existing problems and new challenges. In this paper, we present three areas of innovation in dCache: providing efficient access to data with NFS v4.1 pNFS, adoption of CDMI and WebDAV as an alternative to SRM for managing data, and integration with alternative authentication mechanisms.
Nelson, Michael E.; Mech, L. David
2011-01-01
Wolves (Canis lupus) in northeastern Minnesota cached six radio-collars (four in winter, two in spring-summer) of 202 radio-collared White-tailed Deer (Odocoileus virginianus) they killed or consumed from 1975 to 2010. A Wolf bedded on top of one collar cached in snow. We found one collar each at a Wolf den and Wolf rendezvous site, 2.5 km and 0.5 km respectively, from each deer's previous locations.
Middleton, B.; Wu, X.B.
2008-01-01
Agricultural development on floodplains contributes to hydrologic alteration and forest fragmentation, which may alter landscape-level processes. These changes may be related to shifts in the seed bank composition of floodplain wetlands. We examined the patterns of seed bank composition across a floodplain watershed by looking at the number of seeds germinating per m2 by species in 60 farmed and intact forested wetlands along the Cache River watershed in Illinois. The seed bank composition was compared above and below a water diversion (position), which artificially subdivides the watershed. Position of these wetlands represented the most variability of Axis I in a Nonmetric Multidimensional Scaling (NMS) analysis of site environmental variables and their relationship to seed bank composition (coefficient of determination for Axis 1: r2 = 0.376; Pearson correlation of position to Axis 1: r = 0.223). The 3 primary axes were also represented by other site environmental variables, including farming status (farmed or unfarmed), distance from the mouth of the river, latitude, and longitude. Spatial analysis based on Mantel correlograms showed that both water-dispersed and wind/water-dispersed seed assemblages had strong spatial structure in the upper Cache (above the water diversion), bur the spatial structure of water-dispersed seed assemblage was diminished in the lower Cache (below the water diversion), which lost floodpulsing. Bearing analysis also Suggested that water-dispersal process had a stronger influence on the overall spatial pattern of seed assemblage in the upper Cache, while wind/water-dispersal process had a stronger influence in the lower Cache. An analysis of the landscapes along the river showed that the mid-lower Cache (below the water diversion) had undergone greater land cover changes associated with agriculture than did the upper Cache watershed. Thus, the combination of forest fragmentation and hydrologic changes in the surrounding landscape may have had an influence on the seed bank composition and spatial distribution of the seed banks of the Cache River watershed. Our study suggests that the spatial pattern of seed bank composition may be influenced by landscape-level factors and processes.
Forest rodents provide directed dispersal of Jeffrey pine seeds
Briggs, J.S.; Wall, S.B.V.; Jenkins, S.H.
2009-01-01
Some species of animals provide directed dispersal of plant seeds by transporting them nonrandomly to microsites where their chances of producing healthy seedlings are enhanced. We investigated whether this mutualistic interaction occurs between granivorous rodents and Jeffrey pine (Pinus jeffreyi) in the eastern Sierra Nevada by comparing the effectiveness of random abiotic seed dispersal with the dispersal performed by four species of rodents: deer mice (Peromyscus maniculatus), yellow-pine and long-eared chipmunks (Tamias amoenus and T. quadrimaculatus), and golden-mantled ground squirrels (Spermophilus lateralis). We conducted two caching studies using radio-labeled seeds, the first with individual animals in field enclosures and the second with a community of rodents in open forest. We used artificial caches to compare the fates of seeds placed at the range of microsites and depths used by animals with the fates of seeds dispersed abiotically. Finally, we examined the distribution and survival of naturally establishing seedlings over an eight-year period.Several lines of evidence suggested that this community of rodents provided directed dispersal. Animals preferred to cache seeds in microsites that were favorable for emergence or survival of seedlings and avoided caching in microsites in which seedlings fared worst. Seeds buried at depths typical of animal caches (5–25 mm) produced at least five times more seedlings than did seeds on the forest floor. The four species of rodents differed in the quality of dispersal they provided. Small, shallow caches made by deer mice most resembled seeds dispersed by abiotic processes, whereas many of the large caches made by ground squirrels were buried too deeply for successful emergence of seedlings. Chipmunks made the greatest number of caches within the range of depths and microsites favorable for establishment of pine seedlings. Directed dispersal is an important element of the population dynamics of Jeffrey pine, a dominant tree species in the eastern Sierra Nevada. Quantifying the occurrence and dynamics of directed dispersal in this and other cases will contribute to better understanding of mutualistic coevolution of plants and animals and to more effective management of ecosystems in which directed dispersal is a keystone process.
Corvids Outperform Pigeons and Primates in Learning a Basic Concept.
Wright, Anthony A; Magnotti, John F; Katz, Jeffrey S; Leonard, Kevin; Vernouillet, Alizée; Kelly, Debbie M
2017-04-01
Corvids (birds of the family Corvidae) display intelligent behavior previously ascribed only to primates, but such feats are not directly comparable across species. To make direct species comparisons, we used a same/different task in the laboratory to assess abstract-concept learning in black-billed magpies ( Pica hudsonia). Concept learning was tested with novel pictures after training. Concept learning improved with training-set size, and test accuracy eventually matched training accuracy-full concept learning-with a 128-picture set; this magpie performance was equivalent to that of Clark's nutcrackers (a species of corvid) and monkeys (rhesus, capuchin) and better than that of pigeons. Even with an initial 8-item picture set, both corvid species showed partial concept learning, outperforming both monkeys and pigeons. Similar corvid performance refutes the hypothesis that nutcrackers' prolific cache-location memory accounts for their superior concept learning, because magpies rely less on caching. That corvids with "primitive" neural architectures evolved to equal primates in full concept learning and even to outperform them on the initial 8-item picture test is a testament to the shared (convergent) survival importance of abstract-concept learning.
Optimal Padding for the Two-Dimensional Fast Fourier Transform
NASA Technical Reports Server (NTRS)
Dean, Bruce H.; Aronstein, David L.; Smith, Jeffrey S.
2011-01-01
One-dimensional Fast Fourier Transform (FFT) operations work fastest on grids whose size is divisible by a power of two. Because of this, padding grids (that are not already sized to a power of two) so that their size is the next highest power of two can speed up operations. While this works well for one-dimensional grids, it does not work well for two-dimensional grids. For a two-dimensional grid, there are certain pad sizes that work better than others. Therefore, the need exists to generalize a strategy for determining optimal pad sizes. There are three steps in the FFT algorithm. The first is to perform a one-dimensional transform on each row in the grid. The second step is to transpose the resulting matrix. The third step is to perform a one-dimensional transform on each row in the resulting grid. Steps one and three both benefit from padding the row to the next highest power of two, but the second step needs a novel approach. An algorithm was developed that struck a balance between optimizing the grid pad size with prime factors that are small (which are optimal for one-dimensional operations), and with prime factors that are large (which are optimal for two-dimensional operations). This algorithm optimizes based on average run times, and is not fine-tuned for any specific application. It increases the amount of times that processor-requested data is found in the set-associative processor cache. Cache retrievals are 4-10 times faster than conventional memory retrievals. The tested implementation of the algorithm resulted in faster execution times on all platforms tested, but with varying sized grids. This is because various computer architectures process commands differently. The test grid was 512 512. Using a 540 540 grid on a Pentium V processor, the code ran 30 percent faster. On a PowerPC, a 256x256 grid worked best. A Core2Duo computer preferred either a 1040x1040 (15 percent faster) or a 1008x1008 (30 percent faster) grid. There are many industries that can benefit from this algorithm, including optics, image-processing, signal-processing, and engineering applications.
Compiler-Directed File Layout Optimization for Hierarchical Storage Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ding, Wei; Zhang, Yuanrui; Kandemir, Mahmut
File layout of array data is a critical factor that effects the behavior of storage caches, and has so far taken not much attention in the context of hierarchical storage systems. The main contribution of this paper is a compiler-driven file layout optimization scheme for hierarchical storage caches. This approach, fully automated within an optimizing compiler, analyzes a multi-threaded application code and determines a file layout for each disk-resident array referenced by the code, such that the performance of the target storage cache hierarchy is maximized. We tested our approach using 16 I/O intensive application programs and compared its performancemore » against two previously proposed approaches under different cache space management schemes. Our experimental results show that the proposed approach improves the execution time of these parallel applications by 23.7% on average.« less
Compiler-Directed File Layout Optimization for Hierarchical Storage Systems
Ding, Wei; Zhang, Yuanrui; Kandemir, Mahmut; ...
2013-01-01
File layout of array data is a critical factor that effects the behavior of storage caches, and has so far taken not much attention in the context of hierarchical storage systems. The main contribution of this paper is a compiler-driven file layout optimization scheme for hierarchical storage caches. This approach, fully automated within an optimizing compiler, analyzes a multi-threaded application code and determines a file layout for each disk-resident array referenced by the code, such that the performance of the target storage cache hierarchy is maximized. We tested our approach using 16 I/O intensive application programs and compared its performancemore » against two previously proposed approaches under different cache space management schemes. Our experimental results show that the proposed approach improves the execution time of these parallel applications by 23.7% on average.« less
Efficacy of Code Optimization on Cache-Based Processors
NASA Technical Reports Server (NTRS)
VanderWijngaart, Rob F.; Saphir, William C.; Chancellor, Marisa K. (Technical Monitor)
1997-01-01
In this paper a number of techniques for improving the cache performance of a representative piece of numerical software is presented. Target machines are popular processors from several vendors: MIPS R5000 (SGI Indy), MIPS R8000 (SGI PowerChallenge), MIPS R10000 (SGI Origin), DEC Alpha EV4 + EV5 (Cray T3D & T3E), IBM RS6000 (SP Wide-node), Intel PentiumPro (Ames' Whitney), Sun UltraSparc (NERSC's NOW). The optimizations all attempt to increase the locality of memory accesses. But they meet with rather varied and often counterintuitive success on the different computing platforms. We conclude that it may be genuinely impossible to obtain portable performance on the current generation of cache-based machines. At the least, it appears that the performance of modern commodity processors cannot be described with parameters defining the cache alone.
SIDECACHE: Information access, management and dissemination framework for web services.
Doderer, Mark S; Burkhardt, Cory; Robbins, Kay A
2011-06-14
Many bioinformatics algorithms and data sets are deployed using web services so that the results can be explored via the Internet and easily integrated into other tools and services. These services often include data from other sites that is accessed either dynamically or through file downloads. Developers of these services face several problems because of the dynamic nature of the information from the upstream services. Many publicly available repositories of bioinformatics data frequently update their information. When such an update occurs, the developers of the downstream service may also need to update. For file downloads, this process is typically performed manually followed by web service restart. Requests for information obtained by dynamic access of upstream sources is sometimes subject to rate restrictions. SideCache provides a framework for deploying web services that integrate information extracted from other databases and from web sources that are periodically updated. This situation occurs frequently in biotechnology where new information is being continuously generated and the latest information is important. SideCache provides several types of services including proxy access and rate control, local caching, and automatic web service updating. We have used the SideCache framework to automate the deployment and updating of a number of bioinformatics web services and tools that extract information from remote primary sources such as NCBI, NCIBI, and Ensembl. The SideCache framework also has been used to share research results through the use of a SideCache derived web service.
NASA Astrophysics Data System (ADS)
Poat, M. D.; Lauret, J.
2017-10-01
As demand for widely accessible storage capacity increases and usage is on the rise, steady IO performance is desired but tends to suffer within multi-user environments. Typical deployments use standard hard drives as the cost per/GB is quite low. On the other hand, HDD based solutions for storage is not known to scale well with process concurrency and soon enough, high rate of IOPs create a “random access” pattern killing performance. Though not all SSDs are alike, SSDs are an established technology often used to address this exact “random access” problem. In this contribution, we will first discuss the IO performance of many different SSD drives (tested in a comparable and standalone manner). We will then be discussing the performance and integrity of at least three low-level disk caching techniques (Flashcache, dm-cache, and bcache) including individual policies, procedures, and IO performance. Furthermore, the STAR online computing infrastructure currently hosts a POSIX-compliant Ceph distributed storage cluster - while caching is not a native feature of CephFS (only exists in the Ceph Object store), we will show how one can implement a caching mechanism profiting from an implementation at a lower level. As our illustration, we will present our CephFS setup, IO performance tests, and overall experience from such configuration. We hope this work will service the community’s interest for using disk-caching mechanisms with applicable uses such as distributed storage systems and seeking an overall IO performance gain.
Frequency Dependence of Single-Event Upset in Highly Advanced PowerPC Microprocessors
NASA Technical Reports Server (NTRS)
Irom, Farokh; Farmanesh, Farhad; White, Mark; Kouba, Coy K.
2006-01-01
Single-event upset effects from heavy ions were measured for Motorola silicon-on-insulator (SOI) microprocessor with 90 nm feature sizes at three frequencies of 500, 1066 and 1600 MHz. Frequency dependence of single-event upsets is discussed. The results of our studies suggest the single-event upset in registers and D-Cache tend to increase with frequency. This might have important implications for the overall single-event upset trend as technology moves toward higher frequencies.
120-MHz BiCMOS superscalar RISC processor
NASA Astrophysics Data System (ADS)
Tanaka, Shigeya; Hotta, Takashi; Murabayashi, Fumio; Yamada, Hiromichi; Yoshida, Shoji; Shimamura, Kotaro; Katsura, Koyo; Bandoh, Tadaaki; Ikeda, Koichi; Matsubara, Kenji
1994-04-01
A superscalar RISC processor contains 2.8 million transistors in a die size of 16.2 mm x 16.5 mm, and utilizes 3.3 V/0.5 micron BiCMOS technology. In order to take advantage of superscalar performance without incurring penalties from a slower clock or a longer pipeline, a tag bit is implemented in the instruction cache to indicate dependency between two instructions. A performance gain of up to 37% is obtained with only a 3.5% area overhead from our superscalar design.
Horizontally scaling dChache SRM with the Terracotta platform
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perelmutov, T.; Crawford, M.; Moibenko, A.
2011-01-01
The dCache disk caching file system has been chosen by a majority of LHC experiments Tier 1 centers for their data storage needs. It is also deployed at many Tier 2 centers. The Storage Resource Manager (SRM) is a standardized grid storage interface and a single point of remote entry into dCache, and hence is a critical component. SRM must scale to increasing transaction rates and remain resilient against changing usage patterns. The initial implementation of the SRM service in dCache suffered from an inability to support clustered deployment, and its performance was limited by the hardware of a singlemore » node. Using the Terracotta platform, we added the ability to horizontally scale the dCache SRM service to run on multiple nodes in a cluster configuration, coupled with network load balancing. This gives site administrators the ability to increase the performance and reliability of SRM service to face the ever-increasing requirements of LHC data handling. In this paper we will describe the previous limitations of the architecture SRM server and how the Terracotta platform allowed us to readily convert single node service into a highly scalable clustered application.« less
Dispersal Mutualism Incorporated into Large-Scale, Infrequent Disturbances
Parker, V. Thomas
2015-01-01
Because of their influence on succession and other community interactions, large-scale, infrequent natural disturbances also should play a major role in mutualistic interactions. Using field data and experiments, I test whether mutualisms have been incorporated into large-scale wildfire by whether the outcomes of a mutualism depend on disturbance. In this study a seed dispersal mutualism is shown to depend on infrequent, large-scale disturbances. A dominant shrubland plant (Arctostaphylos species) produces seeds that make up a persistent soil seed bank and requires fire to germinate. In post-fire stands, I show that seedlings emerging from rodent caches dominate sites experiencing higher fire intensity. Field experiments show that rodents (Perimyscus californicus, P. boylii) do cache Arctostaphylos fruit and bury most seed caches to a sufficient depth to survive a killing heat pulse that a fire might drive into the soil. While the rodent dispersal and caching behavior itself has not changed compared to other habitats, the environmental transformation caused by wildfire converts the caching burial of seed from a dispersal process to a plant fire adaptive trait, and provides the context for stimulating subsequent life history evolution in the plant host. PMID:26151560
Reader set encoding for directory of shared cache memory in multiprocessor system
Ahn, Dnaiel; Ceze, Luis H.; Gara, Alan; Ohmacht, Martin; Xiaotong, Zhuang
2014-06-10
In a parallel processing system with speculative execution, conflict checking occurs in a directory lookup of a cache memory that is shared by all processors. In each case, the same physical memory address will map to the same set of that cache, no matter which processor originated that access. The directory includes a dynamic reader set encoding, indicating what speculative threads have read a particular line. This reader set encoding is used in conflict checking. A bitset encoding is used to specify particular threads that have read the line.
DSP code optimization based on cache
NASA Astrophysics Data System (ADS)
Xu, Chengfa; Li, Chengcheng; Tang, Bin
2013-03-01
DSP program's running efficiency on board is often lower than which via the software simulation during the program development, which is mainly resulted from the user's improper use and incomplete understanding of the cache-based memory. This paper took the TI TMS320C6455 DSP as an example, analyzed its two-level internal cache, and summarized the methods of code optimization. Processor can achieve its best performance when using these code optimization methods. At last, a specific algorithm application in radar signal processing is proposed. Experiment result shows that these optimization are efficient.
Memory and the hippocampus in food-storing birds: a comparative approach.
Clayton, N S
1998-01-01
Comparative studies provide a unique source of evidence for the role of the hippocampus in learning and memory. Within birds and mammals, the hippocampal volume of scatter-hoarding species that cache food in many different locations is enlarged, relative to the remainder of the telencephalon, when compared with than that of species which cache food in one larder, or do not cache at all. Do food-storing species show enhanced memory function in association with the volumetric enlargement of the hippocampus? Comparative studies within the parids (titmice and chickadees) and corvids (jays, nutcrackers and magpies), two families of birds which show natural variation in food-storing behavior, suggest that there may be two kinds of memory specialization associated with scatter-hoarding. First, in terms of spatial memory, several scatter-hoarding species have a more accurate and enduring spatial memory, and a preference to rely more heavily upon spatial cues, than that of closely related species which store less food, or none at all. Second, some scatter-hoarding parids and corvids are also more resistant to memory interference. While the most critical component about a cache site may be its spatial location, there is mounting evidence that food-storing birds remember additional information about the contents and status of cache sites. What is the underlying neural mechanism by which the hippocampus learns and remembers cache sites? The current mammalian dogma is that the neural mechanisms of learning and memory are achieved primarily by variations in synaptic number and efficacy. Recent work on the concomitant development of food-storing, memory and the avian hippocampus illustrates that the avian hippocampus may swell or shrivel by as much as 30% in response to presence or absence of food-storing experience. Memory for food caches triggers a dramatic increase in the total number of number of neurons within the avian hippocampus by altering the rate at which these cells are born and die.
Microphase separation in random multiblock copolymers
NASA Astrophysics Data System (ADS)
Govorun, E. N.; Chertovich, A. V.
2017-01-01
Microphase separation in random multiblock copolymers is studied with the mean-field theory assuming that long blocks of a copolymer are strongly segregated, whereas short blocks are able to penetrate into "alien" domains and exchange between the domains and interfacial layer. A bidisperse copolymer with blocks of only two sizes (long and short) is considered as a model of multiblock copolymers with high polydispersity in the block size. Short blocks of the copolymer play an important role in the microphase separation. First, their penetration into the "alien" domains leads to the formation of joint long blocks in their own domains. Second, short blocks localized at the interface considerably change the interfacial tension. The possibility of penetration of short blocks into the "alien" domains is controlled by the product χ Nsh (χ is the Flory-Huggins interaction parameter and Nsh is the short block length). At not very large χ Nsh , the domain size is larger than that for a regular copolymer consisting of the same long blocks as in the considered random copolymer. At a fixed mean block size, the domain size grows with an increase in the block size dispersity, the rate of the growth being dependent of the more detailed parameters of the block size distribution.
NIC atomic operation unit with caching and bandwidth mitigation
Hemmert, Karl Scott; Underwood, Keith D.; Levenhagen, Michael J.
2016-03-01
A network interface controller atomic operation unit and a network interface control method comprising, in an atomic operation unit of a network interface controller, using a write-through cache and employing a rate-limiting functional unit.
Sparse Partial Equilibrium Tables in Chemically Resolved Reactive Flow
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vitello, P; Fried, L E; Pudliner, B
2003-07-14
The detonation of an energetic material is the result of a complex interaction between kinetic chemical reactions and hydrodynamics. Unfortunately, little is known concerning the detailed chemical kinetics of detonations in energetic materials. CHEETAH uses rate laws to treat species with the slowest chemical reactions, while assuming other chemical species are in equilibrium. CHEETAH supports a wide range of elements and condensed detonation products and can also be applied to gas detonations. A sparse hash table of equation of state values, called the ''cache'' is used in CHEETAH to enhance the efficiency of kinetic reaction calculations. For large-scale parallel hydrodynamicmore » calculations, CHEETAH uses MPI communication to updates to the cache. We present here details of the sparse caching model used in the CHEETAH. To demonstrate the efficiency of modeling using a sparse cache model we consider detonations in energetic materials.« less
NASA Technical Reports Server (NTRS)
Liu, Yuan-Kwei
1991-01-01
The feasibility is analyzed of upgrading the Intel 386 microprocessor, which has been proposed as the baseline processor for the Space Station Freedom (SSF) Data Management System (DMS), to the more advanced i486 microprocessors. The items compared between the two processors include the instruction set architecture, power consumption, the MIL-STD-883C Class S (Space) qualification schedule, and performance. The advantages of the i486 over the 386 are (1) lower power consumption; and (2) higher floating point performance. The i486 on-chip cache does not have parity check or error detection and correction circuitry. The i486 with on-chip cache disabled, however, has lower integer performance than the 386 without cache, which is the current DMS design choice. Adding cache to the 386/386 DX memory hierachy appears to be the most beneficial change to the current DMS design at this time.
NASA Technical Reports Server (NTRS)
Liu, Yuan-Kwei
1991-01-01
The feasibility is analyzed of upgrading the Intel 386 microprocessor, which has been proposed as the baseline processor for the Space Station Freedom (SSF) Data Management System (DMS), to the more advanced i486 microprocessors. The items compared between the two processors include the instruction set architecture, power consumption, the MIL-STD-883C Class S (Space) qualification schedule, and performance. The advantages of the i486 over the 386 are (1) lower power consumption; and (2) higher floating point performance. The i486 on-chip cache does not have parity check or error detection and correction circuitry. The i486 with on-chip cache disabled, however, has lower integer performance than the 386 without cache, which is the current DMS design choice. Adding cache to the 386/387 DX memory hierarchy appears to be the most beneficial change to the current DMS design at this time.
Using Solid State Disk Array as a Cache for LHC ATLAS Data Analysis
NASA Astrophysics Data System (ADS)
Yang, W.; Hanushevsky, A. B.; Mount, R. P.; Atlas Collaboration
2014-06-01
User data analysis in high energy physics presents a challenge to spinning-disk based storage systems. The analysis is data intense, yet reads are small, sparse and cover a large volume of data files. It is also unpredictable due to users' response to storage performance. We describe here a system with an array of Solid State Disk as a non-conventional, standalone file level cache in front of the spinning disk storage to help improve the performance of LHC ATLAS user analysis at SLAC. The system uses several days of data access records to make caching decisions. It can also use information from other sources such as a work-flow management system. We evaluate the performance of the system both in terms of caching and its impact on user analysis jobs. The system currently uses Xrootd technology, but the technique can be applied to any storage system.
Haffenden, Angela M; Goodale, Melvyn A
2002-12-01
Previous findings have suggested that visuomotor programming can make use of learned size information in experimental paradigms where movement kinematics are quite consistent from trial to trial. The present experiment was designed to test whether or not this conclusion could be generalized to a different manipulation of kinematic variability. As in previous work, an association was established between the size and colour of square blocks (e.g. red = large; yellow = small, or vice versa). Associating size and colour in this fashion has been shown to reliably alter the perceived size of two test blocks halfway in size between the large and small blocks: estimations of the test block matched in colour to the group of large blocks are smaller than estimations of the test block matched to the group of small blocks. Subjects grasped the blocks, and on other trials estimated the size of the blocks. These changes in perceived block size were incorporated into grip scaling only when movement kinematics were highly consistent from trial to trial; that is, when the blocks were presented in the same location on each trial. When the blocks were presented in different locations grip scaling remained true to the metrics of the test blocks despite the changes in perceptual estimates of block size. These results support previous findings suggesting that kinematic consistency facilitates the incorporation of learned perceptual information into grip scaling.
Hadwiger, M; Beyer, J; Jeong, Won-Ki; Pfister, H
2012-12-01
This paper presents the first volume visualization system that scales to petascale volumes imaged as a continuous stream of high-resolution electron microscopy images. Our architecture scales to dense, anisotropic petascale volumes because it: (1) decouples construction of the 3D multi-resolution representation required for visualization from data acquisition, and (2) decouples sample access time during ray-casting from the size of the multi-resolution hierarchy. Our system is designed around a scalable multi-resolution virtual memory architecture that handles missing data naturally, does not pre-compute any 3D multi-resolution representation such as an octree, and can accept a constant stream of 2D image tiles from the microscopes. A novelty of our system design is that it is visualization-driven: we restrict most computations to the visible volume data. Leveraging the virtual memory architecture, missing data are detected during volume ray-casting as cache misses, which are propagated backwards for on-demand out-of-core processing. 3D blocks of volume data are only constructed from 2D microscope image tiles when they have actually been accessed during ray-casting. We extensively evaluate our system design choices with respect to scalability and performance, compare to previous best-of-breed systems, and illustrate the effectiveness of our system for real microscopy data from neuroscience.
Design issues and caching strategies for CD-ROM-based multimedia storage
NASA Astrophysics Data System (ADS)
Shastri, Vijnan; Rajaraman, V.; Jamadagni, H. S.; Venkat-Rangan, P.; Sampath-Kumar, Srihari
1996-03-01
CD-ROMs have proliferated as a distribution media for desktop machines for a large variety of multimedia applications (targeted for a single-user environment) like encyclopedias, magazines and games. With CD-ROM capacities up to 3 GB being available in the near future, they will form an integral part of Video on Demand (VoD) servers to store full-length movies and multimedia. In the first section of this paper we look at issues related to the single- user desktop environment. Since these multimedia applications are highly interactive in nature, we take a pragmatic approach, and have made a detailed study of the multimedia application behavior in terms of the I/O request patterns generated to the CD-ROM subsystem by tracing these patterns. We discuss prefetch buffer design and seek time characteristics in the context of the analysis of these traces. We also propose an adaptive main-memory hosted cache that receives caching hints from the application to reduce the latency when the user moves from one node of the hyper graph to another. In the second section we look at the use of CD-ROM in a VoD server and discuss the problem of scheduling multiple request streams and buffer management in this scenario. We adapt the C-SCAN (Circular SCAN) algorithm to suit the CD-ROM drive characteristics and prove that it is optimal in terms of buffer size management. We provide computationally inexpensive relations by which this algorithm can be implemented. We then propose an admission control algorithm which admits new request streams without disrupting the continuity of playback of the previous request streams. The algorithm also supports operations such as fast forward and replay. Finally, we discuss the problem of optimal placement of MPEG streams on CD-ROMs in the third section.
Efficient Maintenance and Update of Nonbonded Lists in Macromolecular Simulations.
Chowdhury, Rezaul; Beglov, Dmitri; Moghadasi, Mohammad; Paschalidis, Ioannis Ch; Vakili, Pirooz; Vajda, Sandor; Bajaj, Chandrajit; Kozakov, Dima
2014-10-14
Molecular mechanics and dynamics simulations use distance based cutoff approximations for faster computation of pairwise van der Waals and electrostatic energy terms. These approximations traditionally use a precalculated and periodically updated list of interacting atom pairs, known as the "nonbonded neighborhood lists" or nblists, in order to reduce the overhead of finding atom pairs that are within distance cutoff. The size of nblists grows linearly with the number of atoms in the system and superlinearly with the distance cutoff, and as a result, they require significant amount of memory for large molecular systems. The high space usage leads to poor cache performance, which slows computation for large distance cutoffs. Also, the high cost of updates means that one cannot afford to keep the data structure always synchronized with the configuration of the molecules when efficiency is at stake. We propose a dynamic octree data structure for implicit maintenance of nblists using space linear in the number of atoms but independent of the distance cutoff. The list can be updated very efficiently as the coordinates of atoms change during the simulation. Unlike explicit nblists, a single octree works for all distance cutoffs. In addition, octree is a cache-friendly data structure, and hence, it is less prone to cache miss slowdowns on modern memory hierarchies than nblists. Octrees use almost 2 orders of magnitude less memory, which is crucial for simulation of large systems, and while they are comparable in performance to nblists when the distance cutoff is small, they outperform nblists for larger systems and large cutoffs. Our tests show that octree implementation is approximately 1.5 times faster in practical use case scenarios as compared to nblists.
2015-06-10
This diagram, superimposed on a photo of Martian landscape, illustrates a concept called "adaptive caching," which is in development for NASA's 2020 Mars rover mission. In addition to the investigations that the Mars 2020 rover will conduct on Mars, the rover will collect carefully selected samples of Mars rock and soil and cache them to be available for possible return to Earth if a Mars sample-return mission is scheduled and flown. Each sample will be stored in a sealed tube. Adaptive caching would result in a set of samples, up to the maximum number of tubes carried on the rover, being placed on the surface at the discretion of the mission operators. The tubes holding the collected samples would not go into a surrounding container. In this illustration, green dots indicate "regions of interest," where samples might be collected. The green diamond indicates one region of interest serving as the depot for the cache. The green X at upper right represents the landing site. The solid black line indicates the rover's route during its prime mission, and the dashed black line indicates its route during an extension of the mission. The base image is a portion of the "Everest Panorama" taken by the panoramic camera on NASA's Mars Exploration Rover Spirit at the top of Husband Hill in 2005. http://photojournal.jpl.nasa.gov/catalog/PIA19150
Use of the sun as a heading indicator when caching and recovering in a wild rodent
Samson, Jamie; Manser, Marta B.
2016-01-01
A number of diurnal species have been shown to use directional information from the sun to orientate. The use of the sun in this way has been suggested to occur in either a time-dependent (relying on specific positional information) or a time-compensated manner (a compass that adjusts itself over time with the shifts in the sun’s position). However, some interplay may occur between the two where a species could also use the sun in a time-limited way, whereby animals acquire certain information about the change of position, but do not show full compensational abilities. We tested whether Cape ground squirrels (Xerus inauris) use the sun as an orientation marker to provide information for caching and recovery. This species is a social sciurid that inhabits arid, sparsely vegetated habitats in Southern Africa, where the sun is nearly always visible during the diurnal period. Due to the lack of obvious landmarks, we predicted that they might use positional cues from the sun in the sky as a reference point when caching and recovering food items. We provide evidence that Cape ground squirrels use information from the sun’s position while caching and reuse this information in a time-limited way when recovering these caches. PMID:27580797
Memory transfer optimization for a lattice Boltzmann solver on Kepler architecture nVidia GPUs
NASA Astrophysics Data System (ADS)
Mawson, Mark J.; Revell, Alistair J.
2014-10-01
The Lattice Boltzmann method (LBM) for solving fluid flow is naturally well suited to an efficient implementation for massively parallel computing, due to the prevalence of local operations in the algorithm. This paper presents and analyses the performance of a 3D lattice Boltzmann solver, optimized for third generation nVidia GPU hardware, also known as 'Kepler'. We provide a review of previous optimization strategies and analyse data read/write times for different memory types. In LBM, the time propagation step (known as streaming), involves shifting data to adjacent locations and is central to parallel performance; here we examine three approaches which make use of different hardware options. Two of which make use of 'performance enhancing' features of the GPU; shared memory and the new shuffle instruction found in Kepler based GPUs. These are compared to a standard transfer of data which relies instead on optimized storage to increase coalesced access. It is shown that the more simple approach is most efficient; since the need for large numbers of registers per thread in LBM limits the block size and thus the efficiency of these special features is reduced. Detailed results are obtained for a D3Q19 LBM solver, which is benchmarked on nVidia K5000M and K20C GPUs. In the latter case the use of a read-only data cache is explored, and peak performance of over 1036 Million Lattice Updates Per Second (MLUPS) is achieved. The appearance of a periodic bottleneck in the solver performance is also reported, believed to be hardware related; spikes in iteration-time occur with a frequency of around 11 Hz for both GPUs, independent of the size of the problem.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-03-29
..., Armstrong Hall, Room 201, 14 E. Cache La Poudre, Colorado Springs, CO 80903, telephone (719) 389-6201..., Room 201, 14 E. Cache La Poudre, Colorado Springs, CO 80903, telephone (719) 389-6201, before April 29...
Ironside, Kirsten E; Mattson, David J; Theimer, Tad; Jansen, Brian; Holton, Brandon; Arundel, Terence; Peters, Michael; Sexton, Joseph O; Edwards, Thomas C
2017-01-01
Many studies of animal movement have focused on directed versus area-restricted movement, which rely on correlations between step-length and turn-angles and on stationarity through time to define behavioral states. Although these approaches might apply well to grazing in patchy landscapes, species that either feed for short periods on large, concentrated food sources or cache food exhibit movements that are difficult to model using the traditional metrics of turn-angle and step-length alone. We used GPS telemetry collected from a prey-caching predator, the cougar ( Puma concolor, Linnaeus ), to test whether combining metrics of site recursion, spatiotemporal clustering, speed, and turning into an index of movement using partial sums, improves the ability to identify caching behavior. The index was used to identify changes in movement characteristics over time and segment paths into behavioral classes. The identification of behaviors from the Path Identification Index (PII) was evaluated using field investigations of cougar activities at GPS locations. We tested for statistical stationarity across behaviors for use of topographic view-sheds. Changes in the frequency and duration of PII were useful for identifying seasonal activities such as migration, gestation, and denning. The comparison of field investigations of cougar activities to behavioral PII classes resulted in an overall classification accuracy of 81%. Changes in behaviors were reflected in cougars' use of topographic view-sheds, resulting in statistical nonstationarity over time, and revealed important aspects of hunting behavior. Incorporating metrics of site recursion and spatiotemporal clustering revealed the temporal structure in movements of a caching forager. The movement index PII, shows promise for identifying behaviors in species that frequently return to specific locations such as food caches, watering holes, or dens, and highlights the potential role memory and cognitive abilities play in determining animal movements.
Urhan, A Utku; Brodin, Anders
2015-05-01
Scatter hoarding birds are known for their accurate spatial memory. In a previous experiment, we tested the retrieval accuracy in marsh tits in a typical laboratory set-up for this species. We also tested the performance of humans in this experimental set-up. Somewhat unexpectedly, humans performed much better than marsh tits. In the first five attempts, humans relocated almost 90 % of the caches they had hidden 5 h earlier. Marsh tits only relocated 25 % in the first five attempts and just above 40 % in the first ten attempts. Typically, in this type of experiment, the birds will be caching and retrieving many times in the same sites in the same experimental room. This is very different from the conditions in nature where hoarding parids only cache once in a caching site. Hence, it is possible that memories from previous sessions will disturb the formation of new memories. If there is such proactive interference, the prediction is that success should decay over sessions. Here, we have designed an experiment to investigate whether there is such memory interference in this type of experiment. We allowed marsh tits and humans to cache and retrieve in three repeated sessions without prior experience of the arena. The performance did not change over sessions, and on average, marsh tits correctly visited around 25 % of the caches in the first five attempts. The corresponding success in humans was constant across sessions, and it was around 90 % on average. We conclude that the somewhat poor performance of the marsh tits did not depend on proactive memory interference. We also discuss other possible reasons for why marsh tits in general do not perform better in laboratory experiments.
The history of scatter hoarding studies.
Brodin, Anders
2010-03-27
In this review, I will present an overview of the development of the field of scatter hoarding studies. Scatter hoarding is a conspicuous behaviour and it has been observed by humans for a long time. Apart from an exceptional experimental study already published in 1720, it started with observational field studies of scatter hoarding birds in the 1940s. Driven by a general interest in birds, several ornithologists made large-scale studies of hoarding behaviour in species such as nutcrackers and boreal titmice. Scatter hoarding birds seem to remember caching locations accurately, and it was shown in the 1960s that successful retrieval is dependent on a specific part of the brain, the hippocampus. The study of scatter hoarding, spatial memory and the hippocampus has since then developed into a study system for evolutionary studies of spatial memory. In 1978, a game theoretical paper started the era of modern studies by establishing that a recovery advantage is necessary for individual hoarders for the evolution of a hoarding strategy. The same year, a combined theoretical and empirical study on scatter hoarding squirrels investigated how caches should be spaced out in order to minimize cache loss, a phenomenon sometimes called optimal cache density theory. Since then, the scatter hoarding paradigm has branched into a number of different fields: (i) theoretical and empirical studies of the evolution of hoarding, (ii) field studies with modern sampling methods, (iii) studies of the precise nature of the caching memory, (iv) a variety of studies of caching memory and its relationship to the hippocampus. Scatter hoarding has also been the subject of studies of (v) coevolution between scatter hoarding animals and the plants that are dispersed by these.
Upadhyay, Amit A.; Fleetwood, Aaron D.; Adebali, Ogun; ...
2016-04-06
Cellular receptors usually contain a designated sensory domain that recognizes the signal. Per/Arnt/Sim (PAS) domains are ubiquitous sensors in thousands of species ranging from bacteria to humans. Although PAS domains were described as intracellular sensors, recent structural studies revealed PAS-like domains in extracytoplasmic regions in several transmembrane receptors. However, these structurally defined extracellular PAS-like domains do not match sequence-derived PAS domain models, and thus their distribution across the genomic landscape remains largely unknown. Here we show that structurally defined extracellular PAS-like domains belong to the Cache superfamily, which is homologous to, but distinct from the PAS superfamily. Our newly builtmore » computational models enabled identification of Cache domains in tens of thousands of signal transduction proteins including those from important pathogens and model organisms.Moreover, we show that Cache domains comprise the dominant mode of extracellular sensing in prokaryotes.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Upadhyay, Amit A.; Fleetwood, Aaron D.; Adebali, Ogun
Cellular receptors usually contain a designated sensory domain that recognizes the signal. Per/Arnt/Sim (PAS) domains are ubiquitous sensors in thousands of species ranging from bacteria to humans. Although PAS domains were described as intracellular sensors, recent structural studies revealed PAS-like domains in extracytoplasmic regions in several transmembrane receptors. However, these structurally defined extracellular PAS-like domains do not match sequence-derived PAS domain models, and thus their distribution across the genomic landscape remains largely unknown. Here we show that structurally defined extracellular PAS-like domains belong to the Cache superfamily, which is homologous to, but distinct from the PAS superfamily. Our newly builtmore » computational models enabled identification of Cache domains in tens of thousands of signal transduction proteins including those from important pathogens and model organisms.Moreover, we show that Cache domains comprise the dominant mode of extracellular sensing in prokaryotes.« less
Memory for Multiple Cache Locations and Prey Quantities in a Food-Hoarding Songbird
Armstrong, Nicola; Garland, Alexis; Burns, K. C.
2012-01-01
Most animals can discriminate between pairs of numbers that are each less than four without training. However, North Island robins (Petroica longipes), a food-hoarding songbird endemic to New Zealand, can discriminate between quantities of items as high as eight without training. Here we investigate whether robins are capable of other complex quantity discrimination tasks. We test whether their ability to discriminate between small quantities declines with (1) the number of cache sites containing prey rewards and (2) the length of time separating cache creation and retrieval (retention interval). Results showed that subjects generally performed above-chance expectations. They were equally able to discriminate between different combinations of prey quantities that were hidden from view in 2, 3, and 4 cache sites from between 1, 10, and 60 s. Overall results indicate that North Island robins can process complex quantity information involving more than two discrete quantities of items for up to 1 min long retention intervals without training. PMID:23293622
Memory for multiple cache locations and prey quantities in a food-hoarding songbird.
Armstrong, Nicola; Garland, Alexis; Burns, K C
2012-01-01
Most animals can discriminate between pairs of numbers that are each less than four without training. However, North Island robins (Petroica longipes), a food-hoarding songbird endemic to New Zealand, can discriminate between quantities of items as high as eight without training. Here we investigate whether robins are capable of other complex quantity discrimination tasks. We test whether their ability to discriminate between small quantities declines with (1) the number of cache sites containing prey rewards and (2) the length of time separating cache creation and retrieval (retention interval). Results showed that subjects generally performed above-chance expectations. They were equally able to discriminate between different combinations of prey quantities that were hidden from view in 2, 3, and 4 cache sites from between 1, 10, and 60 s. Overall results indicate that North Island robins can process complex quantity information involving more than two discrete quantities of items for up to 1 min long retention intervals without training.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Koskela, Tuomas S.; Lobet, Mathieu; Deslippe, Jack
In this session we show, in two case studies, how the roofline feature of Intel Advisor has been utilized to optimize the performance of kernels of the XGC1 and PICSAR codes in preparation for Intel Knights Landing architecture. The impact of the implemented optimizations and the benefits of using the automatic roofline feature of Intel Advisor to study performance of large applications will be presented. This demonstrates an effective optimization strategy that has enabled these science applications to achieve up to 4.6 times speed-up and prepare for future exascale architectures. # Goal/Relevance of Session The roofline model [1,2] is amore » powerful tool for analyzing the performance of applications with respect to the theoretical peak achievable on a given computer architecture. It allows one to graphically represent the performance of an application in terms of operational intensity, i.e. the ratio of flops performed and bytes moved from memory in order to guide optimization efforts. Given the scale and complexity of modern science applications, it can often be a tedious task for the user to perform the analysis on the level of functions or loops to identify where performance gains can be made. With new Intel tools, it is now possible to automate this task, as well as base the estimates of peak performance on measurements rather than vendor specifications. The goal of this session is to demonstrate how the roofline feature of Intel Advisor can be used to balance memory vs. computation related optimization efforts and effectively identify performance bottlenecks. A series of typical optimization techniques: cache blocking, structure refactoring, data alignment, and vectorization illustrated by the kernel cases will be addressed. # Description of the codes ## XGC1 The XGC1 code [3] is a magnetic fusion Particle-In-Cell code that uses an unstructured mesh for its Poisson solver that allows it to accurately resolve the edge plasma of a magnetic fusion device. After recent optimizations to its collision kernel [4], most of the computing time is spent in the electron push (pushe) kernel, where these optimization efforts have been focused. The kernel code scaled well with MPI+OpenMP but had almost no automatic compiler vectorization, in part due to indirect memory addresses and in part due to low trip counts of low-level loops that would be candidates for vectorization. Particle blocking and sorting have been implemented to increase trip counts of low-level loops and improve memory locality, and OpenMP directives have been added to vectorize compute-intensive loops that were identified by Advisor. The optimizations have improved the performance of the pushe kernel 2x on Haswell processors and 1.7x on KNL. The KNL node-for-node performance has been brought to within 30% of a NERSC Cori phase I Haswell node and we expect to bridge this gap by reducing the memory footprint of compute intensive routines to improve cache reuse. ## PICSAR is a Fortran/Python high-performance Particle-In-Cell library targeting at MIC architectures first designed to be coupled with the PIC code WARP for the simulation of laser-matter interaction and particle accelerators. PICSAR also contains a FORTRAN stand-alone kernel for performance studies and benchmarks. A MPI domain decomposition is used between NUMA domains and a tile decomposition (cache-blocking) handled by OpenMP has been added for shared-memory parallelism and better cache management. The so-called current deposition and field gathering steps that compose the PIC time loop constitute major hotspots that have been rewritten to enable more efficient vectorization. Particle communications between tiles and MPI domain has been merged and parallelized. All considered, these improvements provide speedups of 3.1 for order 1 and 4.6 for order 3 interpolation shape factors on KNL configured in SNC4 quadrant flat mode. Performance is similar between a node of cori phase 1 and KNL at order 1 and better on KNL by a factor 1.6 at order 3 with the considered test case (homogeneous thermal plasma).« less
An ultra-compact processor module based on the R3000
NASA Astrophysics Data System (ADS)
Mullenhoff, D. J.; Kaschmitter, J. L.; Lyke, J. C.; Forman, G. A.
1992-08-01
Viable high density packaging is of critical importance for future military systems, particularly space borne systems which require minimum weight and size and high mechanical integrity. A leading, emerging technology for high density packaging is multi-chip modules (MCM). During the 1980's, a number of different MCM technologies have emerged. In support of Strategic Defense Initiative Organization (SDIO) programs, Lawrence Livermore National Laboratory (LLNL) has developed, utilized, and evaluated several different MCM technologies. Prior LLNL efforts include modules developed in 1986, using hybrid wafer scale packaging, which are still operational in an Air Force satellite mission. More recent efforts have included very high density cache memory modules, developed using laser pantography. As part of the demonstration effort, LLNL and Phillips Laboratory began collaborating in 1990 in the Phase 3 Multi-Chip Module (MCM) technology demonstration project. The goal of this program was to demonstrate the feasibility of General Electric's (GE) High Density Interconnect (HDI) MCM technology. The design chosen for this demonstration was the processor core for a MIPS R3000 based reduced instruction set computer (RISC), which has been described previously. It consists of the R3000 microprocessor, R3010 floating point coprocessor and 128 Kbytes of cache memory.
Qadri, Muhammad A J; Leonard, Kevin; Cook, Robert G; Kelly, Debbie M
2018-02-15
Clark's nutcrackers exhibit remarkable cache recovery behavior, remembering thousands of seed locations over the winter. No direct laboratory test of their visual memory capacity, however, has yet been performed. Here, two nutcrackers were tested in an operant procedure used to measure different species' visual memory capacities. The nutcrackers were incrementally tested with an ever-expanding pool of pictorial stimuli in a two-alternative discrimination task. Each picture was randomly assigned to either a right or a left choice response, forcing the nutcrackers to memorize each picture-response association. The nutcrackers' visual memorization capacity was estimated at a little over 500 pictures, and the testing suggested effects of primacy, recency, and memory decay over time. The size of this long-term visual memory was less than the approximately 800-picture capacity established for pigeons. These results support the hypothesis that nutcrackers' spatial memory is a specialized adaptation tied to their natural history of food-caching and recovery, and not to a larger long-term, general memory capacity. Furthermore, despite millennia of separate and divergent evolution, the mechanisms of visual information retention seem to reflect common memory systems of differing capacities across the different species tested in this design.
Security in the CernVM File System and the Frontier Distributed Database Caching System
NASA Astrophysics Data System (ADS)
Dykstra, D.; Blomer, J.
2014-06-01
Both the CernVM File System (CVMFS) and the Frontier Distributed Database Caching System (Frontier) distribute centrally updated data worldwide for LHC experiments using http proxy caches. Neither system provides privacy or access control on reading the data, but both control access to updates of the data and can guarantee the authenticity and integrity of the data transferred to clients over the internet. CVMFS has since its early days required digital signatures and secure hashes on all distributed data, and recently Frontier has added X.509-based authenticity and integrity checking. In this paper we detail and compare the security models of CVMFS and Frontier.
ROPEC - ROtary PErcussive Coring Drill for Mars Sample Return
NASA Technical Reports Server (NTRS)
Chu, Philip; Spring, Justin; Zacny, Kris
2014-01-01
The ROtary Percussive Coring Drill is a light weight, flight-like, five-actuator drilling system prototype designed to acquire core material from rock targets for the purposes of Mars Sample Return. In addition to producing rock cores for sample caching, the ROPEC drill can be integrated with a number of end effectors to perform functions such as rock surface abrasion, dust and debris removal, powder and regolith acquisition, and viewing of potential cores prior to caching. The ROPEC drill and its suite of end effectors have been demonstrated with a five degree of freedom Robotic Arm mounted to a mobility system with a prototype sample cache and bit storage station.
USDA-ARS?s Scientific Manuscript database
Positive interactions among individual plants (facilitation) may often enhance seedling survival in stressful environments. Many granivorous small mammal species cache groups of seeds for future consumption in shallowly buried scatterhoards, and seeds of many plant species germinate and establish ag...
78 FR 2655 - Uinta-Wasatch-Cache National Forest; Utah; Ogden Travel Plan Project
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-14
...-Wasatch-Cache National Forest; Utah; Ogden Travel Plan Project AGENCY: Forest Service, USDA. ACTION... prepare a supplement to the Ogden Travel Plan Revision Final Supplemental Environmental Impact Statement (FSEIS). The Ogden Travel Plan Revision FSEIS evaluated six alternatives for possible travel management...
Cache-site selection in Clark's Nutcracker (Nucifraga columbiana)
Teresa J. Lorenz; Kimberly A. Sullivan; Amanda V. Bakian; Carol A. Aubry
2011-01-01
Clark's Nutcracker (Nucifraga Columbiana) is one of the most specialized scatter-hoarding birds, considered a seed disperser for four species of pines (Pinus spp.), as well as an obligate coevolved mutualist of White bark Pine (P. albicaulis). Cache-site selection has not been formally studied in Clark...
Sex, estradiol, and spatial memory in a food-caching corvid.
Rensel, Michelle A; Ellis, Jesse M S; Harvey, Brigit; Schlinger, Barney A
2015-09-01
Estrogens significantly impact spatial memory function in mammalian species. Songbirds express the estrogen synthetic enzyme aromatase at relatively high levels in the hippocampus and there is evidence from zebra finches that estrogens facilitate performance on spatial learning and/or memory tasks. It is unknown, however, whether estrogens influence hippocampal function in songbirds that naturally exhibit memory-intensive behaviors, such as cache recovery observed in many corvid species. To address this question, we examined the impact of estradiol on spatial memory in non-breeding Western scrub-jays, a species that routinely participates in food caching and retrieval in nature and in captivity. We also asked if there were sex differences in performance or responses to estradiol. Utilizing a combination of an aromatase inhibitor, fadrozole, with estradiol implants, we found that while overall cache recovery rates were unaffected by estradiol, several other indices of spatial memory, including searching efficiency and efficiency to retrieve the first item, were impaired in the presence of estradiol. In addition, males and females differed in some performance measures, although these differences appeared to be a consequence of the nature of the task as neither sex consistently out-performed the other. Overall, our data suggest that a sustained estradiol elevation in a food-caching bird impairs some, but not all, aspects of spatial memory on an innate behavioral task, at times in a sex-specific manner. Copyright © 2015 Elsevier Inc. All rights reserved.
SEX, ESTRADIOL, AND SPATIAL MEMORY IN A FOOD-CACHING CORVID
Rensel, Michelle A.; Ellis, Jesse M.S.; Harvey, Brigit; Schlinger, Barney A.
2015-01-01
Estrogens significantly impact spatial memory function in mammalian species. Songbirds express the estrogen synthetic enzyme aromatase at relatively high levels in the hippocampus and there is evidence from zebra finches that estrogens facilitate performance on spatial learning and/or memory tasks. It is unknown, however, whether estrogens influence hippocampal function in songbirds that naturally exhibit memory-intensive behaviors, such as cache recovery observed in many corvid species. To address this question, we examined the impact of estradiol on spatial memory in non-breeding Western scrub-jays, a species that routinely participates in food caching and retrieval in nature and in captivity. We also asked if there were sex differences in performance or responses to estradiol. Utilizing a combination of an aromatase inhibitor, fadrozole, with estradiol implants, we found that while overall cache recovery rates were unaffected by estradiol, several other indices of spatial memory, including searching efficiency and efficiency to retrieve the first item, were impaired in the presence of estradiol. In addition, males and females differed in some performance measures, although these differences appeared to be a consequence of the nature of the task as neither sex consistently out-performed the other. Overall, our data suggest that a sustained estradiol elevation in a food-caching bird impairs some, but not all, aspects of spatial memory on an innate behavioral task, at times in a sex-specific manner. PMID:26232613
Pravosudov, V V; Roth, T C; Forister, M L; Ladage, L D; Burg, T M; Braun, M J; Davidson, B S
2012-09-01
Food-caching birds rely on stored food to survive the winter, and spatial memory has been shown to be critical in successful cache recovery. Both spatial memory and the hippocampus, an area of the brain involved in spatial memory, exhibit significant geographic variation linked to climate-based environmental harshness and the potential reliance on food caches for survival. Such geographic variation has been suggested to have a heritable basis associated with differential selection. Here, we ask whether population genetic differentiation and potential isolation among multiple populations of food-caching black-capped chickadees is associated with differences in memory and hippocampal morphology by exploring population genetic structure within and among groups of populations that are divergent to different degrees in hippocampal morphology. Using mitochondrial DNA and 583 AFLP loci, we found that population divergence in hippocampal morphology is not significantly associated with neutral genetic divergence or geographic distance, but instead is significantly associated with differences in winter climate. These results are consistent with variation in a history of natural selection on memory and hippocampal morphology that creates and maintains differences in these traits regardless of population genetic structure and likely associated gene flow. Published 2012. This article is a US Government work and is in the public domain in the USA.
Claire A. Zugmeyer; John L. Koprowski
2009-01-01
Severe disturbance may alter or eliminate important habitat structure that helps preserve food caches of foodhoarding species. Recent recolonization of an insect-damaged forest by the endangered Mt. Graham red squirrel (Tamiasciurus hudsonicus grahamensis) provided an opportunity to examine habitat selection for midden (cache) sites following...
Random Fill Cache Architecture (Preprint)
2014-10-01
a concrete example, we show how the cache collision attack works to extract the AES encryption keys (e.g., in the OpenSSL implementation of AES). AES...each round are implemented as table lookups for performance reasons. OpenSSL uses ten 1-KB lookup tables, five for encryption and five for decryption
76 FR 14372 - Uinta-Wasatch-Cache National Forest Resource Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-16
... Street, Salt Lake City, Utah. Written comments should be sent to Loyal Clark, Uinta-Wasatch-Cache... open to the public. The following business will be conducted: (1) Review Forest Service project approval letter, (2) discuss travel budget, and (3) review new proposals. Persons who wish to bring related...
NASA Astrophysics Data System (ADS)
Morita, Kazuyo; Yamamoto, Kimiko
2017-03-01
Xylan, one of hemicellulose family, block copolymer was newly developed for wide-range directed self-assembly lithography (DSA). Xylan is higher hydrophilic material because of having many hydroxy groups in one molecule. It means that xylan block copolymer has a possibility of high-chi block copolymer. Generally, DSA is focused on microphase separation for smaller size with high-chi block copolymer and not well known for larger size. In this study, xylan block copolymer was confirmed enabling wider range of patterning size, from smaller size to larger size. The key of xylan block copolymer is a new molecular structure of block copolymer and sugar chain control technology. Sugar content is the important parameter for not only micro-phase separation property but also line edge roughness (LER) and defects. Based on the sugar control technology, wide-range (hp 8.3nm to 26nm L/S and CD 10nm to 51nm hole) DSA patterning was demonstrated. Additionally it was confirmed that xylan block copolymer is suitable for sequential infiltration synthesis (SIS) process.
Differentiated strategies for improving streaming service quality
NASA Astrophysics Data System (ADS)
An, Hui; Chen, Xin-Meng
2005-02-01
With the explosive growth of streaming services, users are becoming more and more sensitive to its quality of service. To handle these problems, the research community focuses of the application of caching and replication techniques. But most approaches try to find specific strategies of caching of replication that suit for streaming service characteristics and to design some kind of universal policy to deal with all streaming objects. This paper explores the combination of caching and replication for improving streaming service quality and demonstrates that it makes sense to incorporate two technologies. It provides a system model and discusses some related issues of how to determining a refreshable streaming object and which refreshment policies a refreshable object should use.
Hydrologic data for the Cache Creek-Bear Thrust environmental impact statement near Jackson, Wyoming
Craig, G.S.; Ringen, B.H.; Cox, E.R.
1981-01-01
Information on the quantity and quality of surface and ground water in an area of concern for the Cache Creek-Bear Thrust Environmental Impact Statement in northwestern Wyoming is presented without interpretation. The environmental impact statement is being prepared jointly by the U.S. Geological Survey and the U.S. Forest Service and concerns proposed exploration and development of oil and gas on leased Federal land near Jackson, Wyoming. Information includes data from a gaging station on Cache Creek and from wells, springs, and miscellaneous sites on streams. Data include streamflow, chemical and suspended-sediment quality of streams, and the occurrence and chemical quality of ground water. (USGS)
NASA Astrophysics Data System (ADS)
Zecha, Stefanie; Regelous, Anette
2017-04-01
National Geoparks are restricted areas incorporating educational resources of great importance in promoting education for sustainable development, mobilizing knowledge inherent to the EarthSciences. Different methods can be used to implement the education of sustainability. Here we present possibilities for National Geoparks to support sustainability focusing on new media and EarthCaches based on the data set of the "EarthCachers International EarthCaching" conference in Goslar in October 2015. Using an empirical study designed by ourselves we collected actual information about the environmental consciousness of Earthcachers. The data set was analyzed using SPSS and statistical methods. Here we present the results and their consequences for National Geoparks.
Tschanz, JoAnn T.; Norton, Maria C.; Zandi, Peter P.; Lyketsos, Constantine G.
2014-01-01
The Cache County Study on Memory in Aging is a longitudinal, population-based study of Alzheimer's disease (AD) and other dementias. Initiated in 1995 and extending to 2013, the study has followed over 5,000 elderly residents of Cache County, Utah (USA) for over twelve years. Achieving a 90% participation rate at enrollment, and spawning two ancillary projects, the study has contributed to the literature on genetic, psychosocial and environmental risk factors for AD, late life cognitive decline, and the clinical progression of dementia after its onset. This paper describes the major study contributions to the literature on AD and dementia. PMID:24423221
Tschanz, Joann T; Norton, Maria C; Zandi, Peter P; Lyketsos, Constantine G
2013-12-01
The Cache County Study on Memory in Aging is a longitudinal, population-based study of Alzheimer's disease (AD) and other dementias. Initiated in 1995 and extending to 2013, the study has followed over 5,000 elderly residents of Cache County, Utah (USA) for over twelve years. Achieving a 90% participation rate at enrolment, and spawning two ancillary projects, the study has contributed to the literature on genetic, psychosocial and environmental risk factors for AD, late-life cognitive decline, and the clinical progression of dementia after its onset. This paper describes the major study contributions to the literature on AD and dementia.
Resource-Efficient Data-Intensive System Designs for High Performance and Capacity
2015-09-01
76, 79, 80, and 81.] [9] Anirudh Badam, KyoungSoo Park, Vivek S. Pai, and Larry L. Peterson. HashCache: cache storage for the next billion. In Proc...Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows , Tushar Chandra, Andrew Fikes, and Robert E. Gruber. Bigtable: A
Retrospective Cognition by Food-Caching Western Scrub-Jays
ERIC Educational Resources Information Center
de Kort, S.R.; Dickinson, A.; Clayton, N.S.
2005-01-01
Episodic-like memory, the retrospective component of cognitive time travel in animals, needs to fulfil three criteria to meet the behavioral properties of episodic memory as defined for humans. Here, we review results obtained with the cache-recovery paradigm with western scrub-jays and conclude that they fulfil these three criteria. The jays…
76 FR 16640 - Petitions for Modification of Existing Mandatory Safety Standards
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-24
... standard to permit an alternative method of compliance to allow additional outby storage caches of Self.... The petitioner further states that: (a) Additional SCSR outby storage caches will be placed a maximum of 2,000 feet apart in beltlines and return air courses; (b) these additional SCSR outby storage...
Acorn Caching in Tree Squirrels: Teaching Hypothesis Testing in the Park
ERIC Educational Resources Information Center
McEuen, Amy B.; Steele, Michael A.
2012-01-01
We developed an exercise for a university-level ecology class that teaches hypothesis testing by examining acorn preferences and caching behavior of tree squirrels (Sciurus spp.). This exercise is easily modified to teach concepts of behavioral ecology for earlier grades, particularly high school, and provides students with a theoretical basis for…
USDA-ARS?s Scientific Manuscript database
Seeds of many plant species are dispersed by seed-caching rodents that place groups of seeds in superficially-buried scatterhoard caches. A case in point is provided by an important forage plant on arid western rangelands, Indian ricegrass (Oryzopsis hymenoides), for which seedling recruitment comes...
Exploitation of pocket gophers and their food caches by grizzly bears
Mattson, D.J.
2004-01-01
I investigated the exploitation of pocket gophers (Thomomys talpoides) by grizzly bears (Ursus arctos horribilis) in the Yellowstone region of the United States with the use of data collected during a study of radiomarked bears in 1977-1992. My analysis focused on the importance of pocket gophers as a source of energy and nutrients, effects of weather and site features, and importance of pocket gophers to grizzly bears in the western contiguous United States prior to historical extirpations. Pocket gophers and their food caches were infrequent in grizzly bear feces, although foraging for pocket gophers accounted for about 20-25% of all grizzly bear feeding activity during April and May. Compared with roots individually excavated by bears, pocket gopher food caches were less digestible but more easily dug out. Exploitation of gopher food caches by grizzly bears was highly sensitive to site and weather conditions and peaked during and shortly after snowmelt. This peak coincided with maximum success by bears in finding pocket gopher food caches. Exploitation was most frequent and extensive on gently sloping nonforested sites with abundant spring beauty (Claytonia lanceolata) and yampah (Perdieridia gairdneri). Pocket gophers are rare in forests, and spring beauty and yampah roots are known to be important foods of both grizzly bears and burrowing rodents. Although grizzly bears commonly exploit pocket gophers only in the Yellowstone region, this behavior was probably widespread in mountainous areas of the western contiguous United States prior to extirpations of grizzly bears within the last 150 years.
Strickland, Dan; Kielstra, Brian; Ryan Norris, D
2011-12-01
Variation in habitat quality can have important consequences for fitness and population dynamics. For food-caching species, a critical determinant of habitat quality is normally the density of storable food, but it is also possible that quality is driven by the ability of habitats to preserve food items. The food-caching gray jay (Perisoreus canadensis) occupies year-round territories in the coniferous boreal and subalpine forests of North America, but does not use conifer seed crops as a source of food. Over the last 33 years, we found that the occupancy rate of territories in Algonquin Park (ON, Canada) has declined at a higher rate in territories with a lower proportion of conifers compared to those with a higher proportion. Individuals occupying territories with a low proportion of conifers were also less likely to successfully fledge young. Using chambers to simulate food caches, we conducted an experiment to examine the hypothesis that coniferous trees are better able to preserve the perishable food items stored in summer and fall than deciduous trees due to their antibacterial and antifungal properties. Over a 1-4 month exposure period, we found that mealworms, blueberries, and raisins all lost less weight when stored on spruce and pine trees compared to deciduous and other coniferous trees. Our results indicate a novel mechanism to explain how habitat quality may influence the fitness and population dynamics of food-caching animals, and has important implications for understanding range limits for boreal breeding animals.
A Global User-Driven Model for Tile Prefetching in Web Geographical Information Systems.
Pan, Shaoming; Chong, Yanwen; Zhang, Hang; Tan, Xicheng
2017-01-01
A web geographical information system is a typical service-intensive application. Tile prefetching and cache replacement can improve cache hit ratios by proactively fetching tiles from storage and replacing the appropriate tiles from the high-speed cache buffer without waiting for a client's requests, which reduces disk latency and improves system access performance. Most popular prefetching strategies consider only the relative tile popularities to predict which tile should be prefetched or consider only a single individual user's access behavior to determine which neighbor tiles need to be prefetched. Some studies show that comprehensively considering all users' access behaviors and all tiles' relationships in the prediction process can achieve more significant improvements. Thus, this work proposes a new global user-driven model for tile prefetching and cache replacement. First, based on all users' access behaviors, a type of expression method for tile correlation is designed and implemented. Then, a conditional prefetching probability can be computed based on the proposed correlation expression mode. Thus, some tiles to be prefetched can be found by computing and comparing the conditional prefetching probability from the uncached tiles set and, similarly, some replacement tiles can be found in the cache buffer according to multi-step prefetching. Finally, some experiments are provided comparing the proposed model with other global user-driven models, other single user-driven models, and other client-side prefetching strategies. The results show that the proposed model can achieve a prefetching hit rate in approximately 10.6% ~ 110.5% higher than the compared methods.
Ironside, Kirsten E.; Mattson, David J.; Theimer, Tad; Jansen, Brian; Holton, Brandon; Arundel, Terry; Peters, Michael; Sexton, Joseph O.; Edwards, Thomas C.
2017-01-01
Relocation studies of animal movement have focused on directed versus area restricted movement, which rely on correlations between step-length and turn angles, along with a degree of stationarity through time to define behavioral states. Although these approaches may work well for grazing foraging strategies in a patchy landscape, species that do not spend a significant amount of time searching out and gathering small dispersed food items, but instead feed for short periods on large, concentrated sources or cache food result in movements that maybe difficult to analyze using turning and velocity alone. We use GPS telemetry collected from a prey-caching predator, the cougar (Puma concolor), to test whether adding additional movement metrics capturing site recursion, to the more traditional velocity and turning, improve the ability to identify behaviors. We evaluated our movement index’s ability to identify behaviors using field investigations. We further tested for statistical stationarity across behaviors for use of topographic view-sheds. We found little correlation between turn angle, velocity, tortuosity, and site fidelity and combined them into a movement index used to identify movement paths (temporally autocorrelated movements) related to fast directed movements (taxis), area restricted movements (search), and prey caching (foraging). Changes in the frequency and duration of these movements were helpful for identifying seasonal activities such as migration and denning in females. Comparison of field investigations of cougar activities to behavioral classes defined using the movement index and found an overall classification accuracy of 81%. Changes in behaviors resulted in changes in how cougars used topographic view-sheds, showing statistical non-stationarity over time. The movement index shows promise for identifying behaviors in species that frequently return to specific locations such as food caches, watering holes, or dens, and highlights the role memory and cognitive abilities may play in determining animal movements. With the addition of measures capturing site recursion the temporal structure in movements of a caching forager was revealed.
Cache and energy efficient algorithms for Nussinov's RNA Folding.
Zhao, Chunchun; Sahni, Sartaj
2017-12-06
An RNA folding/RNA secondary structure prediction algorithm determines the non-nested/pseudoknot-free structure by maximizing the number of complementary base pairs and minimizing the energy. Several implementations of Nussinov's classical RNA folding algorithm have been proposed. Our focus is to obtain run time and energy efficiency by reducing the number of cache misses. Three cache-efficient algorithms, ByRow, ByRowSegment and ByBox, for Nussinov's RNA folding are developed. Using a simple LRU cache model, we show that the Classical algorithm of Nussinov has the highest number of cache misses followed by the algorithms Transpose (Li et al.), ByRow, ByRowSegment, and ByBox (in this order). Extensive experiments conducted on four computational platforms-Xeon E5, AMD Athlon 64 X2, Intel I7 and PowerPC A2-using two programming languages-C and Java-show that our cache efficient algorithms are also efficient in terms of run time and energy. Our benchmarking shows that, depending on the computational platform and programming language, either ByRow or ByBox give best run time and energy performance. The C version of these algorithms reduce run time by as much as 97.2% and energy consumption by as much as 88.8% relative to Classical and by as much as 56.3% and 57.8% relative to Transpose. The Java versions reduce run time by as much as 98.3% relative to Classical and by as much as 75.2% relative to Transpose. Transpose achieves run time and energy efficiency at the expense of memory as it takes twice the memory required by Classical. The memory required by ByRow, ByRowSegment, and ByBox is the same as that of Classical. As a result, using the same amount of memory, the algorithms proposed by us can solve problems up to 40% larger than those solvable by Transpose.
Gonthier, G.J.; Kleiss, B.A.
1996-01-01
The U.S. Geological Survey, working in cooperation with the U.S. Army Corps of Engineers, Waterways Experiment Station, collected surface-water and ground-water data from 119 wells and 13 staff gages from September 1989 to September 1992 to describe ground-water flow patterns and water budget in the Black Swamp, a bottomland forested wetland in eastern Arkansas. The study area was between two streamflow gaging stations located about 30.5 river miles apart on the Cache River. Ground-water flow was from northwest to southeast with some diversion toward the Cache River. Hydraulic connection between the surface water and the alluvial aquifer is indicated by nearly equal changes in surface-water and ground-water levels near the Cache River. Diurnal fluctuations of hydraulic head ranged from more than 0 to 0.38 feet and were caused by evapotranspiration. Changes in hydraulic head of the alluvial aquifer beneath the wetland lagged behind stage fluctuations and created the potential for changes in ground-water movement. Differences between surface-water levels in the wetland and stage of the Cache River created a frequently occurring local ground-water flow condition in which surface water in the wetland seeped into the upper part of the alluvial aquifer and then seeped into the Cache River. When the Cache River flooded the wetland, ground water consistently seeped to the surface during falling surface-water stage and surface water seeped into the ground during rising surface-water stage. Ground-water flow was a minor component of the water budget, accounting for less than 1 percent of both inflow and outflow. Surface-water drainage from the study area through diversion canals was not accounted for in the water budget and may be the reason for a surplus of water in the budget. Even though ground-water flow volume is small compared to other water budget components, ground-water seepage to the wetland surface may still be vital to some wetland functions.
Effect of video server topology on contingency capacity requirements
NASA Astrophysics Data System (ADS)
Kienzle, Martin G.; Dan, Asit; Sitaram, Dinkar; Tetzlaff, William H.
1996-03-01
Video servers need to assign a fixed set of resources to each video stream in order to guarantee on-time delivery of the video data. If a server has insufficient resources to guarantee the delivery, it must reject the stream request rather than slowing down all existing streams. Large scale video servers are being built as clusters of smaller components, so as to be economical, scalable, and highly available. This paper uses a blocking model developed for telephone systems to evaluate video server cluster topologies. The goal is to achieve high utilization of the components and low per-stream cost combined with low blocking probability and high user satisfaction. The analysis shows substantial economies of scale achieved by larger server images. Simple distributed server architectures can result in partitioning of resources with low achievable resource utilization. By comparing achievable resource utilization of partitioned and monolithic servers, we quantify the cost of partitioning. Next, we present an architecture for a distributed server system that avoids resource partitioning and results in highly efficient server clusters. Finally, we show how, in these server clusters, further optimizations can be achieved through caching and batching of video streams.
Cooperative storage of shared files in a parallel computing system with dynamic block size
Bent, John M.; Faibish, Sorin; Grider, Gary
2015-11-10
Improved techniques are provided for parallel writing of data to a shared object in a parallel computing system. A method is provided for storing data generated by a plurality of parallel processes to a shared object in a parallel computing system. The method is performed by at least one of the processes and comprises: dynamically determining a block size for storing the data; exchanging a determined amount of the data with at least one additional process to achieve a block of the data having the dynamically determined block size; and writing the block of the data having the dynamically determined block size to a file system. The determined block size comprises, e.g., a total amount of the data to be stored divided by the number of parallel processes. The file system comprises, for example, a log structured virtual parallel file system, such as a Parallel Log-Structured File System (PLFS).
Predictive Cache Modeling and Analysis
2011-11-01
metaheuristic /bin-packing algorithm to optimize task placement based on task communication characterization. Our previous work on task allocation showed...Cache Miss Minimization Technology To efficiently explore combinations and discover nearly-optimal task-assignment algorithms , we extended to our...it was possible to use our algorithmic techniques to decrease network bandwidth consumption by ~25%. In this effort, we adapted these existing
Statistical Inference-Based Cache Management for Mobile Learning
ERIC Educational Resources Information Center
Li, Qing; Zhao, Jianmin; Zhu, Xinzhong
2009-01-01
Supporting efficient data access in the mobile learning environment is becoming a hot research problem in recent years, and the problem becomes tougher when the clients are using light-weight mobile devices such as cell phones whose limited storage space prevents the clients from holding a large cache. A practical solution is to store the cache…
Geo-Caching: Place-Based Discovery of Virginia State Parks and Museums
ERIC Educational Resources Information Center
Gray, Howard Richard
2007-01-01
The use of Global Positioning Systems (GPS) units has exploded in recent years along with the computer technology to access this data-based information. Geo-caching is an exciting game using GPS that provides place-based information regarding the public lands, facilities and cultural heritage programs within the Virginia Parks and Museum system.…
Code of Federal Regulations, 2011 CFR
2011-07-01
.../Attainment. The area surrounding Brigham City, as described by the following Townships or the portions of the following Townships in Box Elder County: T9N 2W, T9N R1W, T8N 2W Cache County, UT (part): Cache County.... The area surrounding Grantsville, as described by the following Townships or the portions of the...
Code of Federal Regulations, 2010 CFR
2010-07-01
.../Attainment. The area surrounding Brigham City, as described by the following Townships or the portions of the following Townships in Box Elder County: T9N 2W, T9N R1W, T8N 2W Cache County, UT (part): Cache County.... The area surrounding Grantsville, as described by the following Townships or the portions of the...
Managing coherence via put/get windows
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton on Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Yorktown Heights, NY
2011-01-11
A method and apparatus for managing coherence between two processors of a two processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activated required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.
Managing coherence via put/get windows
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton on Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Yorktown Heights, NY
2012-02-21
A method and apparatus for managing coherence between two processors of a two processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activated required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.
What-where-when memory in magpies (Pica pica).
Zinkivskay, Ann; Nazir, Farrah; Smulders, Tom V
2009-01-01
Some animals have been shown to be able to remember which type of food they hoarded or encountered in which location and how long ago (what-where-when memory). In this study, we test whether magpies (Pica pica) also show evidence of remembering these different aspects of a past episode. Magpies hid red- and blue-dyed pellets of scrambled eggs in a large tray containing wood shavings. They were allowed to make as many caches as they wanted. The birds were then returned either the same day or the next day to retrieve the pellets. If they returned the same day, one colour of pellets was replaced with wooden beads of similar size and colour, while if they returned the next day this would happen to the other colour. Over just a few trials, the birds learned to only search for the food pellets, and ignore the beads, of the appropriate colour for the given retention interval. A probe trial in which all items were removed showed that the birds persisted in searching for the pellets and not the beads. This shows that magpies can remember which food item they hoarded where, and when, even if the food items only differ from each other in their colour and are dispersed throughout a continuous caching substrate.
A Global User-Driven Model for Tile Prefetching in Web Geographical Information Systems
Pan, Shaoming; Chong, Yanwen; Zhang, Hang; Tan, Xicheng
2017-01-01
A web geographical information system is a typical service-intensive application. Tile prefetching and cache replacement can improve cache hit ratios by proactively fetching tiles from storage and replacing the appropriate tiles from the high-speed cache buffer without waiting for a client’s requests, which reduces disk latency and improves system access performance. Most popular prefetching strategies consider only the relative tile popularities to predict which tile should be prefetched or consider only a single individual user's access behavior to determine which neighbor tiles need to be prefetched. Some studies show that comprehensively considering all users’ access behaviors and all tiles’ relationships in the prediction process can achieve more significant improvements. Thus, this work proposes a new global user-driven model for tile prefetching and cache replacement. First, based on all users’ access behaviors, a type of expression method for tile correlation is designed and implemented. Then, a conditional prefetching probability can be computed based on the proposed correlation expression mode. Thus, some tiles to be prefetched can be found by computing and comparing the conditional prefetching probability from the uncached tiles set and, similarly, some replacement tiles can be found in the cache buffer according to multi-step prefetching. Finally, some experiments are provided comparing the proposed model with other global user-driven models, other single user-driven models, and other client-side prefetching strategies. The results show that the proposed model can achieve a prefetching hit rate in approximately 10.6% ~ 110.5% higher than the compared methods. PMID:28085937
Janine Rice; Tim Bardsley; Pete Gomben; Dustin Bambrough; Stacey Weems; Sarah Leahy; Christopher Plunkett; Charles Condrat; Linda A. Joyce
2017-01-01
Watersheds on the Uinta-Wasatch-Cache and Ashley National Forests provide many ecosystem services, and climate change poses a risk to these services. We developed a watershed vulnerability assessment to provide scientific information for land managers facing the challenge of managing these watersheds. Literature-based information and expert elicitation is used to...
Hide And Seek GPS And Geocaching In The Classroom
ERIC Educational Resources Information Center
Lary, Lynn M.
2004-01-01
In short, geocaching is a high-tech, worldwide treasure hunt (geocaches can now be found in more than 180 countries) where a person hides a cache for others to find. Generally, the cache is some type of waterproof container that contains a log book and an assortment of goodies, such as lottery tickets, toys, photo books for cachers to fill with…
Dynamically programmable cache
NASA Astrophysics Data System (ADS)
Nakkar, Mouna; Harding, John A.; Schwartz, David A.; Franzon, Paul D.; Conte, Thomas
1998-10-01
Reconfigurable machines have recently been used as co- processors to accelerate the execution of certain algorithms or program subroutines. The problems with the above approach include high reconfiguration time and limited partial reconfiguration. By far the most critical problems are: (1) the small on-chip memory which results in slower execution time, and (2) small FPGA areas that cannot implement large subroutines. Dynamically Programmable Cache (DPC) is a novel architecture for embedded processors which offers solutions to the above problems. To solve memory access problems, DPC processors merge reconfigurable arrays with the data cache at various cache levels to create a multi-level reconfigurable machines. As a result DPC machines have both higher data accessibility and FPGA memory bandwidth. To solve the limited FPGA resource problem, DPC processors implemented multi-context switching (Virtualization) concept. Virtualization allows implementation of large subroutines with fewer FPGA cells. Additionally, DPC processors can parallelize the execution of several operations resulting in faster execution time. In this paper, the speedup improvement for DPC machines are shown to be 5X faster than an Altera FLEX10K FPGA chip and 2X faster than a Sun Ultral SPARC station for two different algorithms (convolution and motion estimation).
Dynamic code block size for JPEG 2000
NASA Astrophysics Data System (ADS)
Tsai, Ping-Sing; LeCornec, Yann
2008-02-01
Since the standardization of the JPEG 2000, it has found its way into many different applications such as DICOM (digital imaging and communication in medicine), satellite photography, military surveillance, digital cinema initiative, professional video cameras, and so on. The unified framework of the JPEG 2000 architecture makes practical high quality real-time compression possible even in video mode, i.e. motion JPEG 2000. In this paper, we present a study of the compression impact using dynamic code block size instead of fixed code block size as specified in the JPEG 2000 standard. The simulation results show that there is no significant impact on compression if dynamic code block sizes are used. In this study, we also unveil the advantages of using dynamic code block sizes.
Rate-Compatible LDPC Codes with Linear Minimum Distance
NASA Technical Reports Server (NTRS)
Divsalar, Dariush; Jones, Christopher; Dolinar, Samuel
2009-01-01
A recently developed method of constructing protograph-based low-density parity-check (LDPC) codes provides for low iterative decoding thresholds and minimum distances proportional to block sizes, and can be used for various code rates. A code constructed by this method can have either fixed input block size or fixed output block size and, in either case, provides rate compatibility. The method comprises two submethods: one for fixed input block size and one for fixed output block size. The first mentioned submethod is useful for applications in which there are requirements for rate-compatible codes that have fixed input block sizes. These are codes in which only the numbers of parity bits are allowed to vary. The fixed-output-blocksize submethod is useful for applications in which framing constraints are imposed on the physical layers of affected communication systems. An example of such a system is one that conforms to one of many new wireless-communication standards that involve the use of orthogonal frequency-division modulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perelmutov, T.; Bakken, J.; Petravick, D.
Storage Resource Managers (SRMs) are middleware components whose function is to provide dynamic space allocation and file management on shared storage components on the Grid[1,2]. SRMs support protocol negotiation and reliable replication mechanism. The SRM standard supports independent SRM implementations, allowing for a uniform access to heterogeneous storage elements. SRMs allow site-specific policies at each location. Resource Reservations made through SRMs have limited lifetimes and allow for automatic collection of unused resources thus preventing clogging of storage systems with ''orphan'' files. At Fermilab, data handling systems use the SRM management interface to the dCache Distributed Disk Cache [5,6] and themore » Enstore Tape Storage System [15] as key components to satisfy current and future user requests [4]. The SAM project offers the SRM interface for its internal caches as well.« less
The Use of Proxy Caches for File Access in a Multi-Tier Grid Environment
NASA Astrophysics Data System (ADS)
Brun, R.; Duellmann, D.; Ganis, G.; Hanushevsky, A.; Janyst, L.; Peters, A. J.; Rademakers, F.; Sindrilaru, E.
2011-12-01
The use of proxy caches has been extensively studied in the HEP environment for efficient access of database data and showed significant performance with only very moderate operational effort at higher grid tiers (T2, T3). In this contribution we propose to apply the same concept to the area of file access and analyse the possible performance gains, operational impact on site services and applicability to different HEP use cases. Base on a proof-of-concept studies with a modified XROOT proxy server we review the cache efficiency and overheads for access patterns of typical ROOT based analysis programs. We conclude with a discussion of the potential role of this new component at the different tiers of a distributed computing grid.
Evolution of magnetic disk subsystems
NASA Astrophysics Data System (ADS)
Kaneko, Satoru
1994-06-01
The higher recording density of magnetic disk realized today has brought larger storage capacity per unit and smaller form factors. If the required access performance per MB is constant, the performance of large subsystems has to be several times better. This article describes mainly the technology for improving the performance of the magnetic disk subsystems and the prospects of their future evolution. Also considered are 'crosscall pathing' which makes the data transfer channel more effective, 'disk cache' which improves performance coupling with solid state memory technology, and 'RAID' which improves the availability and integrity of disk subsystems by organizing multiple disk drives in a subsystem. As a result, it is concluded that since the performance of the subsystem is dominated by that of the disk cache, maximation of the performance of the disk cache subsystems is very important.
The Use of Proxy Caches for File Access in a Multi-Tier Grid Environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brun, R.; Dullmann, D.; Ganis, G.
2012-04-19
The use of proxy caches has been extensively studied in the HEP environment for efficient access of database data and showed significant performance with only very moderate operational effort at higher grid tiers (T2, T3). In this contribution we propose to apply the same concept to the area of file access and analyze the possible performance gains, operational impact on site services and applicability to different HEP use cases. Base on a proof-of-concept studies with a modified XROOT proxy server we review the cache efficiency and overheads for access patterns of typical ROOT based analysis programs. We conclude with amore » discussion of the potential role of this new component at the different tiers of a distributed computing grid.« less
Managing coherence via put/get windows
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blumrich, Matthias A; Chen, Dong; Coteus, Paul W
A method and apparatus for managing coherence between two processors of a two processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activated required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an areamore » of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.« less
A Novel Method for Block Size Forensics Based on Morphological Operations
NASA Astrophysics Data System (ADS)
Luo, Weiqi; Huang, Jiwu; Qiu, Guoping
Passive forensics analysis aims to find out how multimedia data is acquired and processed without relying on pre-embedded or pre-registered information. Since most existing compression schemes for digital images are based on block processing, one of the fundamental steps for subsequent forensics analysis is to detect the presence of block artifacts and estimate the block size for a given image. In this paper, we propose a novel method for blind block size estimation. A 2×2 cross-differential filter is first applied to detect all possible block artifact boundaries, morphological operations are then used to remove the boundary effects caused by the edges of the actual image contents, and finally maximum-likelihood estimation (MLE) is employed to estimate the block size. The experimental results evaluated on over 1300 nature images show the effectiveness of our proposed method. Compared with existing gradient-based detection method, our method achieves over 39% accuracy improvement on average.
2002-07-01
our general model include: (1) service user (SU), (2) service manager (SM), and (3) service cache manager ( SCM ), where the SCM is an optional...maintained by SMs that satisfy specific requirements. Where employed, the SCM operates as an intermediary, matching advertised SDs of SMs to...Directory Service Agent (optional) not applicableLookup ServiceService Cache Manager ( SCM ) Service URL Service Type Service Attributes Template URL
Environmental Impact Analysis Process, Groom Mountain Range, Lincoln County, Nevada
1985-10-01
bases clustered around springs, temporary camps, rock shelters , quarries, lithic scatters, rock art, pinyon caches, pot drops, isolates, and historic...include pinyon caches and rock shelters with associated historic artifacts and many of the spring sites. These sites provide an unusual research...Management. (b) Proposed Action: Renewed Withdrawal of Groom Mountain Range Addition to Nellis Air Force Bombing and Gunnery Range, Lincoln County, Nevada. (c
Thomas L. Foti
2001-01-01
Relationships between forest vegetation and soil were reconstructed from field notes of the 1846 Public Land Survey (PLS) along a portion of the Cache River including Black Swamp. Locations of corners were digitized long with species,diameter,and distance from section or quarter-section corners. Trees were grouped for analysis according to occurrence on groups of...
Killing of a muskox, Ovibus moschatus, by two wolves, Canis lupis, and subsequent caching
Mech, L. David; Adams, Layne G.
1999-01-01
The killing of a cow Muskox (Ovibos moschatus) by two Wolves (Canis lupus) in 5 minutes during summer on Ellesmere Island is described. After two of the four feedings observed, one Wolf cached a leg and regurgitated food as far as 2.3 km away and probably farther. The implications of this behavior for deriving food-consumption estimates are discussed.
Minimizing End-to-End Interference in I/O Stacks Spanning Shared Multi-Level Buffer Caches
ERIC Educational Resources Information Center
Patrick, Christina M.
2011-01-01
This thesis presents an end-to-end interference minimizing uniquely designed high performance I/O stack that spans multi-level shared buffer cache hierarchies accessing shared I/O servers to deliver a seamless high performance I/O stack. In this thesis, I show that I can build a superior I/O stack which minimizes the inter-application interference…
Evaluating Fragment Construction Policies for SDT Systems
2006-01-01
allocates a fragment and begins translation. Once a termination condition is met, Strata emits any trampolines that are necessary. Trampolines are pieces... trampolines (unless its target previously exists in the fragment cache). Once a CTI’s target instruction becomes available in the fragment cache, the CTI is...linked directly to the destination, avoiding future uses of the trampoline . This mechanism is called Fragment Linking and avoids significant overhead
dCache, Sync-and-Share for Big Data
NASA Astrophysics Data System (ADS)
Millar, AP; Fuhrmann, P.; Mkrtchyan, T.; Behrmann, G.; Bernardt, C.; Buchholz, Q.; Guelzow, V.; Litvintsev, D.; Schwank, K.; Rossi, A.; van der Reest, P.
2015-12-01
The availability of cheap, easy-to-use sync-and-share cloud services has split the scientific storage world into the traditional big data management systems and the very attractive sync-and-share services. With the former, the location of data is well understood while the latter is mostly operated in the Cloud, resulting in a rather complex legal situation. Beside legal issues, those two worlds have little overlap in user authentication and access protocols. While traditional storage technologies, popular in HEP, are based on X.509, cloud services and sync-and-share software technologies are generally based on username/password authentication or mechanisms like SAML or Open ID Connect. Similarly, data access models offered by both are somewhat different, with sync-and-share services often using proprietary protocols. As both approaches are very attractive, dCache.org developed a hybrid system, providing the best of both worlds. To avoid reinventing the wheel, dCache.org decided to embed another Open Source project: OwnCloud. This offers the required modern access capabilities but does not support the managed data functionality needed for large capacity data storage. With this hybrid system, scientists can share files and synchronize their data with laptops or mobile devices as easy as with any other cloud storage service. On top of this, the same data can be accessed via established mechanisms, like GridFTP to serve the Globus Transfer Service or the WLCG FTS3 tool, or the data can be made available to worker nodes or HPC applications via a mounted filesystem. As dCache provides a flexible authentication module, the same user can access its storage via different authentication mechanisms; e.g., X.509 and SAML. Additionally, users can specify the desired quality of service or trigger media transitions as necessary, thus tuning data access latency to the planned access profile. Such features are a natural consequence of using dCache. We will describe the design of the hybrid dCache/OwnCloud system, report on several months of operations experience running it at DESY, and elucidate the future road-map.
From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation
Blazewicz, Marek; Hinder, Ian; Koppelman, David M.; ...
2013-01-01
Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization ismore » based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.« less
Fusion PIC code performance analysis on the Cori KNL system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Koskela, Tuomas S.; Deslippe, Jack; Friesen, Brian
We study the attainable performance of Particle-In-Cell codes on the Cori KNL system by analyzing a miniature particle push application based on the fusion PIC code XGC1. We start from the most basic building blocks of a PIC code and build up the complexity to identify the kernels that cost the most in performance and focus optimization efforts there. Particle push kernels operate at high AI and are not likely to be memory bandwidth or even cache bandwidth bound on KNL. Therefore, we see only minor benefits from the high bandwidth memory available on KNL, and achieving good vectorization ismore » shown to be the most beneficial optimization path with theoretical yield of up to 8x speedup on KNL. In practice we are able to obtain up to a 4x gain from vectorization due to limitations set by the data layout and memory latency.« less
CUDA Fortran acceleration for the finite-difference time-domain method
NASA Astrophysics Data System (ADS)
Hadi, Mohammed F.; Esmaeili, Seyed A.
2013-05-01
A detailed description of programming the three-dimensional finite-difference time-domain (FDTD) method to run on graphical processing units (GPUs) using CUDA Fortran is presented. Two FDTD-to-CUDA thread-block mapping designs are investigated and their performances compared. Comparative assessment of trade-offs between GPU's shared memory and L1 cache is also discussed. This presentation is for the benefit of FDTD programmers who work exclusively with Fortran and are reluctant to port their codes to C in order to utilize GPU computing. The derived CUDA Fortran code is compared with an optimized CPU version that runs on a workstation-class CPU to present a realistic GPU to CPU run time comparison and thus help in making better informed investment decisions on FDTD code redesigns and equipment upgrades. All analyses are mirrored with CUDA C simulations to put in perspective the present state of CUDA Fortran development.
Block-based scalable wavelet image codec
NASA Astrophysics Data System (ADS)
Bao, Yiliang; Kuo, C.-C. Jay
1999-10-01
This paper presents a high performance block-based wavelet image coder which is designed to be of very low implementational complexity yet with rich features. In this image coder, the Dual-Sliding Wavelet Transform (DSWT) is first applied to image data to generate wavelet coefficients in fixed-size blocks. Here, a block only consists of wavelet coefficients from a single subband. The coefficient blocks are directly coded with the Low Complexity Binary Description (LCBiD) coefficient coding algorithm. Each block is encoded using binary context-based bitplane coding. No parent-child correlation is exploited in the coding process. There is also no intermediate buffering needed in between DSWT and LCBiD. The compressed bit stream generated by the proposed coder is both SNR and resolution scalable, as well as highly resilient to transmission errors. Both DSWT and LCBiD process the data in blocks whose size is independent of the size of the original image. This gives more flexibility in the implementation. The codec has a very good coding performance even the block size is (16,16).
Collaborative video caching scheme over OFDM-based long-reach passive optical networks
NASA Astrophysics Data System (ADS)
Li, Yan; Dai, Shifang; Chang, Xiangmao
2018-07-01
Long-reach passive optical networks (LR-PONs) are now considered as a desirable access solution for cost-efficiently delivering broadband services by integrating metro network with access network, among which orthogonal frequency division multiplexing (OFDM)-based LR-PONs gain greater research interests due to their good robustness and high spectrum efficiency. In such attractive OFDM-based LR-PONs, however, it is still challenging to effectively provide video service, which is one of the most popular and profitable broadband services, for end users. Given that more video requesters (i.e., end users) far away from optical line terminal (OLT) are served in OFDM-based LR-PONs, it is efficiency-prohibitive to use traditional video delivery model, which relies on the OLT to transmit videos to requesters, for providing video service, due to the model will incur not only larger video playback delay but also higher downstream bandwidth consumption. In this paper, we propose a novel video caching scheme that to collaboratively cache videos on distributed optical network units (ONUs) which are closer to end users, and thus to timely and cost-efficiently provide videos for requesters by ONUs over OFDM-based LR-PONs. We firstly construct an OFDM-based LR-PON architecture to enable the cooperation among ONUs while caching videos. Given a limited storage capacity of each ONU, we then propose collaborative approaches to cache videos on ONUs with the aim to maximize the local video hit ratio (LVHR), i.e., the proportion of video requests that can be directly satisfied by ONUs, under diverse resources requirements and requests distributions of videos. Simulations are finally conducted to evaluate the efficiency of our proposed scheme.
Cao, Lin; Xiao, Zhishu; Guo, Cong; Chen, Jin
2011-09-01
Local extinction or population decline of large frugivorous vertebrates as primary seed dispersers, caused by human disturbance and habitat change, might lead to dispersal limitation of many large-seeded fruit trees. However, it is not known whether or not scatter-hoarding rodents as secondary seed dispersers can help maintain natural regeneration (e.g. seed dispersal) of these frugivore-dispersed trees in the face of the functional reduction or loss of primary seed dispersers. In the present study, we investigated how scatter-hoarding rodents affect the fate of tagged seeds of a large-seeded fruit tree (Scleropyrum wallichianum Arnott, 1838, Santalaceae) from seed fall to seedling establishment in a heavily defaunated tropical forest in the Xishuangbanna region of Yunnan Province, in southwest China, in 2007 and 2008. Our results show that: (i) rodents removed nearly all S. wallichianum seeds in both years; (ii) a large proportion (2007, 75%; 2008, 67.5%) of the tagged seeds were cached individually in the surface soil or under leaf litters; (iii) dispersal distance of primary caches was further in 2007 (19.6 ± 14.6 m) than that in 2008 (14.1 ± 11.6 m), and distance increased as rodents recovered and moved seeds from primary caches into subsequent caching sites; and (iv) part of the cached seeds (2007, 3.2%; 2008, 2%) survived to the seedling stage each year. Our study suggests that by taking roles of both primary and secondary seed dispersers, scatter-hoarding rodents can play a significant role in maintaining seedling establishment of S. wallichianum, and are able to at least partly compensate for the loss of large frugivorous vertebrates in seed dispersal. © 2011 ISZS, Blackwell Publishing and IOZ/CAS.
Tschanz, Joann T; Treiber, Katherine; Norton, Maria C; Welsh-Bohmer, Kathleen A; Toone, Leslie; Zandi, Peter P; Szekely, Christine A; Lyketsos, Constantine; Breitner, John C S
2005-01-01
There are several population-based studies of aging, memory, and dementia being conducted worldwide. Of these, the Cache County Study on Memory, Health and Aging is noteworthy for its large number of "oldest-old" members. This study, which has been following an initial cohort of 5,092 seniors since 1995, has reported among its major findings the role of the Apolipoprotein E gene on modifying the risk for Alzheimer's disease (AD) in males and females and identifying pharmacologic compounds that may act to reduce AD risk. This article summarizes the major findings of the Cache County study to date, describes ongoing investigations, and reports preliminary analyses on the outcome of the oldest-old in this population, the subgroup of participants who were over age 84 at the study's inception.
A path-oriented knowledge representation system: Defusing the combinatorial system
NASA Technical Reports Server (NTRS)
Karamouzis, Stamos T.; Barry, John S.; Smith, Steven L.; Feyock, Stefan
1995-01-01
LIMAP is a programming system oriented toward efficient information manipulation over fixed finite domains, and quantification over paths and predicates. A generalization of Warshall's Algorithm to precompute paths in a sparse matrix representation of semantic nets is employed to allow questions involving paths between components to be posed and answered easily. LIMAP's ability to cache all paths between two components in a matrix cell proved to be a computational obstacle, however, when the semantic net grew to realistic size. The present paper describes a means of mitigating this combinatorial explosion to an extent that makes the use of the LIMAP representation feasible for problems of significant size. The technique we describe radically reduces the size of the search space in which LIMAP must operate; semantic nets of more than 500 nodes have been attacked successfully. Furthermore, it appears that the procedure described is applicable not only to LIMAP, but to a number of other combinatorially explosive search space problems found in AI as well.
Understanding Consistency Maintenance in Service Discovery Architectures in Response to Message Loss
2002-07-01
manager (SM), and (3) service cache manager ( SCM ). The SCM is an optional element not supported by all discovery protocols. These components participate...the SCM operates as an intermediary, matching advertised SDs of SMs to requirements provided by SUs. Table 1 shows how these general concepts map...Service DescriptionService ItemService Description (SD) Directory Service Agent (optional) not applicableLookup ServiceService Cache Manager ( SCM
DOE Office of Scientific and Technical Information (OSTI.GOV)
Millar, A. P.; Behrmann, G.; Bernardt, C.
With over ten years in production use dCache data storage system has evolved to match ever changing lansdcape of continually evolving storage technologies with new solutions to both existing problems and new challenges. In this paper, we present three areas of innovation in dCache: providing efficient access to data with NFS v4.1 pNFS, adoption of CDMI and WebDAV as an alternative to SRM for managing data, and integration with alternative authentication mechanisms.
Algorithms for Data Intensive Applications on Intelligent and Smart Memories
2003-03-01
editors). Parallel Algorithms and Architectures. North Holland, 1986. [8] P. Diniz . USC ISI, Personal Communication, March, 2001. [9] M. Frigo, C. E ...hierarchy as well as the Translation Lookaside Buer TLB aect the e ectiveness of cache friendly optimizations These penalties vary among...processors and cause large variations in the e ectiveness of cache performance optimizations The area of graph problems is fundamental in a wide variety of
Effect of Spatial Locality Prefetching on Structural Locality
1991-12-01
Pollution module calculates the SLC and CAM cache pollution percentages. And finally, the Generate Reference Frequency List module produces the output...3.2.5 Generate Reference Frequency List 3.2.6 Each program module in the structure chart is mapped into an Ada package. By performing this encapsulation...call routine to generate reference -- frequency list -- end if -- end loop -- close input, output, and reference files end Cache Simulator Figure 3.5
NASA Technical Reports Server (NTRS)
Clement, Bradley J.; Estlin, Tara A.; Bornstein, Benjamin J.
2013-01-01
The Mobile Thread Task Manager (MTTM) is being applied to parallelizing existing flight software to understand the benefits and to develop new techniques and architectural concepts for adapting software to multicore architectures. It allocates and load-balances tasks for a group of threads that migrate across processors to improve cache performance. In order to balance-load across threads, the MTTM augments a basic map-reduce strategy to draw jobs from a global queue. In a multicore processor, memory may be "homed" to the cache of a specific processor and must be accessed from that processor. The MTTB architecture wraps access to data with thread management to move threads to the home processor for that data so that the computation follows the data in an attempt to avoid L2 cache misses. Cache homing is also handled by a memory manager that translates identifiers to processor IDs where the data will be homed (according to rules defined by the user). The user can also specify the number of threads and processors separately, which is important for tuning performance for different patterns of computation and memory access. MTTM efficiently processes tasks in parallel on a multiprocessor computer. It also provides an interface to make it easier to adapt existing software to a multiprocessor environment.
Graded Mirror Self-Recognition by Clark's Nutcrackers.
Clary, Dawson; Kelly, Debbie M
2016-11-04
The traditional 'mark test' has shown some large-brained species are capable of mirror self-recognition. During this test a mark is inconspicuously placed on an animal's body where it can only be seen with the aid of a mirror. If the animal increases the number of actions directed to the mark region when presented with a mirror, the animal is presumed to have recognized the mirror image as its reflection. However, the pass/fail nature of the mark test presupposes self-recognition exists in entirety or not at all. We developed a novel mirror-recognition task, to supplement the mark test, which revealed gradation in the self-recognition of Clark's nutcrackers, a large-brained corvid. To do so, nutcrackers cached food alone, observed by another nutcracker, or with a regular or blurry mirror. The nutcrackers suppressed caching with a regular mirror, a behavioural response to prevent cache theft by conspecifics, but did not suppress caching with a blurry mirror. Likewise, during the mark test, most nutcrackers made more self-directed actions to the mark with a blurry mirror than a regular mirror. Both results suggest self-recognition was more readily achieved with the blurry mirror and that self-recognition may be more broadly present among animals than currently thought.
Fox Squirrels Match Food Assessment and Cache Effort to Value and Scarcity
Delgado, Mikel M.; Nicholas, Molly; Petrie, Daniel J.; Jacobs, Lucia F.
2014-01-01
Scatter hoarders must allocate time to assess items for caching, and to carry and bury each cache. Such decisions should be driven by economic variables, such as the value of the individual food items, the scarcity of these items, competition for food items and risk of pilferage by conspecifics. The fox squirrel, an obligate scatter-hoarder, assesses cacheable food items using two overt movements, head flicks and paw manipulations. These behaviors allow an examination of squirrel decision processes when storing food for winter survival. We measured wild squirrels' time allocations and frequencies of assessment and investment behaviors during periods of food scarcity (summer) and abundance (fall), giving the squirrels a series of 15 items (alternating five hazelnuts and five peanuts). Assessment and investment per cache increased when resource value was higher (hazelnuts) or resources were scarcer (summer), but decreased as scarcity declined (end of sessions). This is the first study to show that assessment behaviors change in response to factors that indicate daily and seasonal resource abundance, and that these factors may interact in complex ways to affect food storing decisions. Food-storing tree squirrels may be a useful and important model species to understand the complex economic decisions made under natural conditions. PMID:24671221
A random rule model of surface growth
NASA Astrophysics Data System (ADS)
Mello, Bernardo A.
2015-02-01
Stochastic models of surface growth are usually based on randomly choosing a substrate site to perform iterative steps, as in the etching model, Mello et al. (2001) [5]. In this paper I modify the etching model to perform sequential, instead of random, substrate scan. The randomicity is introduced not in the site selection but in the choice of the rule to be followed in each site. The change positively affects the study of dynamic and asymptotic properties, by reducing the finite size effect and the short-time anomaly and by increasing the saturation time. It also has computational benefits: better use of the cache memory and the possibility of parallel implementation.
Inferred Lunar Boulder Distributions at Decimeter Scales
NASA Technical Reports Server (NTRS)
Baloga, S. M.; Glaze, L. S.; Spudis, P. D.
2012-01-01
Block size distributions of impact deposits on the Moon are diagnostic of the impact process and environmental effects, such as target lithology and weathering. Block size distributions are also important factors in trafficability, habitability, and possibly the identification of indigenous resources. Lunar block sizes have been investigated for many years for many purposes [e.g., 1-3]. An unresolved issue is the extent to which lunar block size distributions can be extrapolated to scales smaller than limits of resolution of direct measurement. This would seem to be a straightforward statistical application, but it is complicated by two issues. First, the cumulative size frequency distribution of observable boulders rolls over due to resolution limitations at the small end. Second, statistical regression provides the best fit only around the centroid of the data [4]. Confidence and prediction limits splay away from the best fit at the endpoints resulting in inferences in the boulder density at the CPR scale that can differ by many orders of magnitude [4]. These issues were originally investigated by Cintala and McBride [2] using Surveyor data. The objective of this study was to determine whether the measured block size distributions from Lunar Reconnaissance Orbiter Camera - Narrow Angle Camera (LROC-NAC) images (m-scale resolution) can be used to infer the block size distribution at length scales comparable to Mini-RF Circular Polarization Ratio (CPR) scales, nominally taken as 10 cm. This would set the stage for assessing correlations of inferred block size distributions with CPR returns [6].
Motivation and Design of the Sirocco Storage System Version 1.0.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Curry, Matthew Leon; Ward, H. Lee; Danielson, Geoffrey Charles
Sirocco is a massively parallel, high performance storage system for the exascale era. It emphasizes client-to-client coordination, low server-side coupling, and free data movement to improve resilience and performance. Its architecture is inspired by peer-to-peer and victim- cache architectures. By leveraging these ideas, Sirocco natively supports several media types, including RAM, flash, disk, and archival storage, with automatic migration between levels. Sirocco also includes storage interfaces and support that are more advanced than typical block storage. Sirocco enables clients to efficiently use key-value storage or block-based storage with the same interface. It also provides several levels of transactional data updatesmore » within a single storage command, including full ACID-compliant updates. This transaction support extends to updating several objects within a single transaction. Further support is provided for con- currency control, enabling greater performance for workloads while providing safe concurrent modification. By pioneering these and other technologies and techniques in the storage system, Sirocco is poised to fulfill a need for a massively scalable, write-optimized storage system for exascale systems. This is version 1.0 of a document reflecting the current and planned state of Sirocco. Further versions of this document will be accessible at http://www.cs.sandia.gov/Scalable_IO/ sirocco .« less
The storage system of PCM based on random access file system
NASA Astrophysics Data System (ADS)
Han, Wenbing; Chen, Xiaogang; Zhou, Mi; Li, Shunfen; Li, Gezi; Song, Zhitang
2016-10-01
Emerging memory technologies such as Phase change memory (PCM) tend to offer fast, random access to persistent storage with better scalability. It's a hot topic of academic and industrial research to establish PCM in storage hierarchy to narrow the performance gap. However, the existing file systems do not perform well with the emerging PCM storage, which access storage medium via a slow, block-based interface. In this paper, we propose a novel file system, RAFS, to bring about good performance of PCM, which is built in the embedded platform. We attach PCM chips to the memory bus and build RAFS on the physical address space. In the proposed file system, we simplify traditional system architecture to eliminate block-related operations and layers. Furthermore, we adopt memory mapping and bypassed page cache to reduce copy overhead between the process address space and storage device. XIP mechanisms are also supported in RAFS. To the best of our knowledge, we are among the first to implement file system on real PCM chips. We have analyzed and evaluated its performance with IOZONE benchmark tools. Our experimental results show that the RAFS on PCM outperforms Ext4fs on SDRAM with small record lengths. Based on DRAM, RAFS is significantly faster than Ext4fs by 18% to 250%.
NASA Astrophysics Data System (ADS)
Li, Miao; Lin, Zaiping; Long, Yunli; An, Wei; Zhou, Yiyu
2016-05-01
The high variability of target size makes small target detection in Infrared Search and Track (IRST) a challenging task. A joint detection and tracking method based on block-wise sparse decomposition is proposed to address this problem. For detection, the infrared image is divided into overlapped blocks, and each block is weighted on the local image complexity and target existence probabilities. Target-background decomposition is solved by block-wise inexact augmented Lagrange multipliers. For tracking, label multi-Bernoulli (LMB) tracker tracks multiple targets taking the result of single-frame detection as input, and provides corresponding target existence probabilities for detection. Unlike fixed-size methods, the proposed method can accommodate size-varying targets, due to no special assumption for the size and shape of small targets. Because of exact decomposition, classical target measurements are extended and additional direction information is provided to improve tracking performance. The experimental results show that the proposed method can effectively suppress background clutters, detect and track size-varying targets in infrared images.
Simulation of flow and habitat conditions under ice, Cache la Poudre River - January 2006
Waddle, Terry
2007-01-01
The objectives of this study are (1) to describe the extent and thickness of ice cover, (2) simulate depth and velocity under ice at the study site for observed and reduced flows, and (3) to quantify fish habitat in this portion of the mainstem Cache la Poudre River for the current winter release schedule as well as for similar conditions without the 0.283 m3/s winter release.
1993-08-01
on the Lempel - Ziv [44] algo- rithm. Zip is compressing a single 8,017 byte file. " RTLSim An register transfer language simulator for the Message...package. gordoni@cs.adelaide.edu.au, Wynn Vale, 5127, Australia, 1.0 edition, October 1991. [44] Ziv J. and Lempel A. "A universal algorithm for...fixed hardware algorithm . Some data caches allow the program to explicitly allocate cache lines [68]. This allocation is only useful in writing new data
Side Channel Attacks on STTRAM and Low Overhead Countermeasures
2017-03-20
introduce security vulnerabilities and expose the cache memory to side channel attacks. In this paper, we propose a side channel attack (SCA) model...where the adversary can monitor the supply current of the memory array to partially identify the sensi- tive cache data that is being read or written. We...propose solutions such as short retention STTRAM, obfuscation of SCA using 1-bit parity, multi-bit random write, and, neutral- izing the SCA using
1993-03-01
CLUSTER A CLUSTER B .UDP D "Orequeqes ProxyDistribute 0 Figure 4-4: HOSTALL Implementation HOST_ALL is implemented as follows. The kernel looks up the...it includes the HOSTALL request as an argument. The generic CronusHost object is managed by the Cronus Kernel. A kernel that receives a ProxyDistnbute...request uses its cached service information to send the HOSTALL request to each host in its cluster via UDP. If the kernel has no cached information
NASA Technical Reports Server (NTRS)
Wheeler, D. J.; Ridd, M. K.; Merola, J. A.
1984-01-01
A basic geographic information system (GIS) for the North Cache Soil Conservation District (SCD) was sought for selected resource problems. Since the resource management issues in the North Cache SCD are very complex, it is not feasible in the initial phase to generate all the physical, socioeconomic, and political baseline data needed for resolving all management issues. A selection of critical varables becomes essential. Thus, there are foud specific objectives: (1) assess resource management needs and determine which resource factors ae most fundamental for building a beginning data base; (2) evaluate the variety of data gathering and analysis techniques for the resource factors selected; (3) incorporate the resulting data into a useful and efficient digital data base; and (4) demonstrate the application of the data base to selected real world resoource management issues.
Sparse Partial Equilibrium Tables in Chemically Resolved Reactive Flow
NASA Astrophysics Data System (ADS)
Vitello, Peter; Fried, Laurence E.; Pudliner, Brian; McAbee, Tom
2004-07-01
The detonation of an energetic material is the result of a complex interaction between kinetic chemical reactions and hydrodynamics. Unfortunately, little is known concerning the detailed chemical kinetics of detonations in energetic materials. CHEETAH uses rate laws to treat species with the slowest chemical reactions, while assuming other chemical species are in equilibrium. CHEETAH supports a wide range of elements and condensed detonation products and can also be applied to gas detonations. A sparse hash table of equation of state values is used in CHEETAH to enhance the efficiency of kinetic reaction calculations. For large-scale parallel hydrodynamic calculations, CHEETAH uses parallel communication to updates to the cache. We present here details of the sparse caching model used in the CHEETAH coupled to an ALE hydrocode. To demonstrate the efficiency of modeling using a sparse cache model we consider detonations in energetic materials.
Cache Locality Optimization for Recursive Programs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lifflander, Jonathan; Krishnamoorthy, Sriram
We present an approach to optimize the cache locality for recursive programs by dynamically splicing--recursively interleaving--the execution of distinct function invocations. By utilizing data effect annotations, we identify concurrency and data reuse opportunities across function invocations and interleave them to reduce reuse distance. We present algorithms that efficiently track effects in recursive programs, detect interference and dependencies, and interleave execution of function invocations using user-level (non-kernel) lightweight threads. To enable multi-core execution, a program is parallelized using a nested fork/join programming model. Our cache optimization strategy is designed to work in the context of a random work stealing scheduler. Wemore » present an implementation using the MIT Cilk framework that demonstrates significant improvements in sequential and parallel performance, competitive with a state-of-the-art compile-time optimizer for loop programs and a domain- specific optimizer for stencil programs.« less
Josephson 4 K-bit cache memory design for a prototype signal processor. I - General overview
NASA Astrophysics Data System (ADS)
Henkels, W. H.; Geppert, L. M.; Kadlec, J.; Epperlein, P. W.; Beha, H.
1985-09-01
In the early stages of thg Josephson computer project conducted at an American computer company, it was recognized that a very fast cache memory was needed to complement Josephson logic. A subnanosecond access time memory was implemented experimentally on the basis of a 2.5-micron Pb-alloy technology. It was then decided to switch over to a Nb-base-electrode technology with the objective to alleviate problems with the long-term reliability and aging of Pb-based junctions. The present paper provides a general overview of the status of a 4 x 1 K-bit Josephson cache design employing a 2.5-micron Nb-edge-junction technology. Attention is given to the fabrication process and its implications, aspects of circuit design methodology, an overview of system environment and chip components, design changes and status, and various difficulties and uncertainties.
Study on data acquisition system based on reconfigurable cache technology
NASA Astrophysics Data System (ADS)
Zhang, Qinchuan; Li, Min; Jiang, Jun
2018-03-01
Waveform capture rate is one of the key features of digital acquisition systems, which represents the waveform processing capability of the system in a unit time. The higher the waveform capture rate is, the larger the chance to capture elusive events is and the more reliable the test result is. First, this paper analyzes the impact of several factors on the waveform capture rate of the system, then the novel technology based on reconfigurable cache is further proposed to optimize system architecture, and the simulation results show that the signal-to-noise ratio of signal, capacity, and structure of cache have significant effects on the waveform capture rate. Finally, the technology is demonstrated by the engineering practice, and the results show that the waveform capture rate of the system is improved substantially without significant increase of system's cost, and the technology proposed has a broad application prospect.
NASA Astrophysics Data System (ADS)
Graham, Jim; Jarnevich, Catherine S.; Simpson, Annie; Newman, Gregory J.; Stohlgren, Thomas J.
2011-06-01
Invasive species are a universal global problem, but the information to identify them, manage them, and prevent invasions is stored around the globe in a variety of formats. The Global Invasive Species Information Network is a consortium of organizations working toward providing seamless access to these disparate databases via the Internet. A distributed network of databases can be created using the Internet and a standard web service protocol. There are two options to provide this integration. First, federated searches are being proposed to allow users to search "deep" web documents such as databases for invasive species. A second method is to create a cache of data from the databases for searching. We compare these two methods, and show that federated searches will not provide the performance and flexibility required from users and a central cache of the datum are required to improve performance.
Graham, Jim; Jarnevich, Catherine S.; Simpson, Annie; Newman, Gregory J.; Stohlgren, Thomas J.
2011-01-01
Invasive species are a universal global problem, but the information to identify them, manage them, and prevent invasions is stored around the globe in a variety of formats. The Global Invasive Species Information Network is a consortium of organizations working toward providing seamless access to these disparate databases via the Internet. A distributed network of databases can be created using the Internet and a standard web service protocol. There are two options to provide this integration. First, federated searches are being proposed to allow users to search “deep” web documents such as databases for invasive species. A second method is to create a cache of data from the databases for searching. We compare these two methods, and show that federated searches will not provide the performance and flexibility required from users and a central cache of the datum are required to improve performance.
Welsh-Bohmer, Kathleen A; Breitner, John C S; Hayden, Kathleen M; Lyketsos, Constantine; Zandi, Peter P; Tschanz, Joann T; Norton, Maria C; Munger, Ron
2006-07-01
The Cache County Study of Memory, Health, and Aging, more commonly referred to as the "Cache County Memory Study (CCMS)" is a longitudinal investigation of aging and Alzheimer's disease (AD) based in an exceptionally long-lived population residing in northern Utah. The study begun in 1994 has followed an initial cohort of 5,092 older individuals (many over age 84) and has examined the development of cognitive impairment and dementia in relation to genetic and environmental antecedents. This article summarizes the major contributions of the CCMS towards the understanding of mild cognitive disorders and AD across the lifespan, underscoring the role of common health exposures in modifying dementia risk and trajectories of cognitive change. The study now in its fourth wave of ascertainment illustrates the role of population-based approaches in informing testable models of cognitive aging and Alzheimer's disease.
NASA Astrophysics Data System (ADS)
Barnaś, Dawid; Bieniasz, Lesław K.
2017-07-01
We have recently developed a vectorized Thomas solver for quasi-block tridiagonal linear algebraic equation systems using Streaming SIMD Extensions (SSE) and Advanced Vector Extensions (AVX) in operations on dense blocks [D. Barnaś and L. K. Bieniasz, Int. J. Comput. Meth., accepted]. The acceleration caused by vectorization was observed for large block sizes, but was less satisfactory for small blocks. In this communication we report on another version of the solver, optimized for small blocks of size up to four rows and/or columns.
Neradovic, D; Soga, O; Van Nostrum, C F; Hennink, W E
2004-05-01
Block copolymers of poly(ethylene glycol) (PEG) as a hydrophilic block and N-isopropylacrylamide (PNIPAAm) or poly (NIPAAm-co-N-(2-hydroxypropyl) methacrylamide-dilactate) (poly(NIPAAm-co-HPMAm-dilactate)) as a thermosensitive block, are able to self-assemble in water into nanoparticles above the cloud point (CP) of the thermosensitive block. The influence of processing and the formulation parameters on the size of the nanoparticles was studied using dynamic light scattering. PNIPAAm-b-PEG 2000 polymers were not suitable for the formation of small and stable particles. Block copolymers with PEG 5000 and 10000 formed relatively small and stable particles in aqueous solutions at temperatures above the CP of the thermosensitive block. Their size decreased with increasing molecular weight of the thermosensitive block, decreasing polymer concentration and using water instead of phosphate buffered saline as solvent. Extrusion and ultrasonication were inefficient methods to size down the polymeric nanoparticles. The heating rate of the polymer solutions was a dominant factor for the size of the nanoparticles. When an aqueous polymer solution was slowly heated through the CP, rather large particles (> or = 200 nm) were formed. Regardless the polymer composition, small nanoparticles (50-70 nm) with a narrow size distribution were formed, when a small volume of an aqueous polymer solution below the CP was added to a large volume of heated water. In this way the thermosensitive block copolymers rapidly pass their CP ('heat shock' procedure), resulting in small and stable nanoparticles.
Ibrahim, Mohamed; Wickenhauser, Patrick; Rautek, Peter; Reina, Guido; Hadwiger, Markus
2018-01-01
Molecular dynamics (MD) simulations are crucial to investigating important processes in physics and thermodynamics. The simulated atoms are usually visualized as hard spheres with Phong shading, where individual particles and their local density can be perceived well in close-up views. However, for large-scale simulations with 10 million particles or more, the visualization of large fields-of-view usually suffers from strong aliasing artifacts, because the mismatch between data size and output resolution leads to severe under-sampling of the geometry. Excessive super-sampling can alleviate this problem, but is prohibitively expensive. This paper presents a novel visualization method for large-scale particle data that addresses aliasing while enabling interactive high-quality rendering. We introduce the novel concept of screen-space normal distribution functions (S-NDFs) for particle data. S-NDFs represent the distribution of surface normals that map to a given pixel in screen space, which enables high-quality re-lighting without re-rendering particles. In order to facilitate interactive zooming, we cache S-NDFs in a screen-space mipmap (S-MIP). Together, these two concepts enable interactive, scale-consistent re-lighting and shading changes, as well as zooming, without having to re-sample the particle data. We show how our method facilitates the interactive exploration of real-world large-scale MD simulation data in different scenarios.
Automatic extraction of blocks from 3D point clouds of fractured rock
NASA Astrophysics Data System (ADS)
Chen, Na; Kemeny, John; Jiang, Qinghui; Pan, Zhiwen
2017-12-01
This paper presents a new method for extracting blocks and calculating block size automatically from rock surface 3D point clouds. Block size is an important rock mass characteristic and forms the basis for several rock mass classification schemes. The proposed method consists of four steps: 1) the automatic extraction of discontinuities using an improved Ransac Shape Detection method, 2) the calculation of discontinuity intersections based on plane geometry, 3) the extraction of block candidates based on three discontinuities intersecting one another to form corners, and 4) the identification of "true" blocks using an improved Floodfill algorithm. The calculated block sizes were compared with manual measurements in two case studies, one with fabricated cardboard blocks and the other from an actual rock mass outcrop. The results demonstrate that the proposed method is accurate and overcomes the inaccuracies, safety hazards, and biases of traditional techniques.
Convolutional neural network for road extraction
NASA Astrophysics Data System (ADS)
Li, Junping; Ding, Yazhou; Feng, Fajie; Xiong, Baoyu; Cui, Weihong
2017-11-01
In this paper, the convolution neural network with large block input and small block output was used to extract road. To reflect the complex road characteristics in the study area, a deep convolution neural network VGG19 was conducted for road extraction. Based on the analysis of the characteristics of different sizes of input block, output block and the extraction effect, the votes of deep convolutional neural networks was used as the final road prediction. The study image was from GF-2 panchromatic and multi-spectral fusion in Yinchuan. The precision of road extraction was 91%. The experiments showed that model averaging can improve the accuracy to some extent. At the same time, this paper gave some advice about the choice of input block size and output block size.
Urban Geocaching: what Happened in Lisbon during the Last Decade?
NASA Astrophysics Data System (ADS)
Nogueira Mendes, R.; Rodrigues, T.; Rodrigues, A. M.
2013-05-01
Created in 2000 in the United States of America, Geocaching has become a major phenomenon all around the world, counting actually with millions of Geocaches (or caches) that work as a recreational motivation for millions of users, called Geocachers. During the last 30 days over 5,000,000 new logs have been submitted worldwide, disseminating individual experiences, motivations, emotions and photos through the official Geocaching website (www.geocaching.com), and several official or informal national web forums. The activity itself can be compared with modern treasure hunting that uses handheld GPS, Smartphones or Tablets, WEB 2.0, wiki features and technologies to keep Geocachers engaged with their activity, in a strong social-network. All these characteristics make Geocaching an activity with a strong geographic component that deals closely with the surrounding environment where each cache has been hidden. From previous work, significance correlation has been found regarding hides and natural/rural environments, but metropolitan and urban areas like Lisbon municipality (that holds 3.23% of the total 27534 Portuguese caches), still registers the higher density of Geocaches, and logs numbers. Lacking "natural/rural" environment, Geocaching in cities tend to happen in symbolic areas, like public parks and places, sightseeing spots and historical neighborhoods. The present study looks to Geocaching within the city of Lisbon, in order to understand how it works, and if this activity reflects the city itself, promoting its image and cultural heritage. From a freely available dataset that includes all Geocaches that have been placed in Lisbon since February 2001, spatial analysis has been conducted, showing the informal preferences of this activity. Results show a non-random distribution of caches within the study area, similar to the land use distribution. Preferable locations tend to be in iconic places of the city, usually close to the Tagus River, that concentrates 25% of the total caches. Since most of these places are known to be touristic destinations, the TOP15 logged Caches were also analyzed regarding their description and logs in order to understand if Geocaching reflects tourism and if it works as a tourist promotion tool within urban environments. Final results also reflect the Geocaching performance and major trends within urban environments providing new insights regarding this activity impacts and implications.
Caching strategies for improving performance of web-based Geographic applications
NASA Astrophysics Data System (ADS)
Liu, M.; Brodzik, M.; Collins, J. A.; Lewis, S.; Oldenburg, J.
2012-12-01
The NASA Operation IceBridge mission collects airborne remote sensing measurements to bridge the gap between NASA's Ice, Cloud and Land Elevation Satellite (ICESat) mission and the upcoming ICESat-2 mission. The IceBridge Data Portal from the National Snow and Ice Data Center provides an intuitive web interface for accessing IceBridge mission observations and measurements. Scientists and users usually do not have knowledge about the individual campaigns but are interested in data collected in a specific place. We have developed a high-performance map interface to allow users to quickly zoom to an area of interest and see any Operation IceBridge overflights. The map interface consists of two layers: the user can pan and zoom on the base map layer; the flight line layer that overlays the base layer provides all the campaign missions that intersect with the current map view. The user can click on the flight campaigns and download the data as needed. The OpenGIS® Web Map Service Interface Standard (WMS) provides a simple HTTP interface for requesting geo-registered map images from one or more distributed geospatial databases. Web Feature Service (WFS) provides an interface allowing requests for geographical features across the web using platform-independent calls. OpenLayers provides vector support (points, polylines and polygons) to build a WMS/WFS client for displaying both layers on the screen. Map Server, an open source development environment for building spatially enabled internet applications, is serving the WMS and WFS spatial data to OpenLayers. Early releases of the portal displayed unacceptably poor load time performance for flight lines and the base map tiles. This issue was caused by long response times from the map server in generating all map tiles and flight line vectors. We resolved the issue by implementing various caching strategies on top of the WMS and WFS services, including the use of Squid (www.squid-cache.org) to cache frequently-used content. Our presentation includes the architectural design of the application, and how we use OpenLayers, WMS and WFS with Squid to build a responsive web application capable of efficiently displaying geospatial data to allow the user to quickly interact with the displayed information. We describe the design, implementation and performance improvement of our caching strategies, and the tools and techniques developed to assist our data caching strategies.
EarthCache as a Tool to Promote Earth-Science in Public School Classrooms
NASA Astrophysics Data System (ADS)
Gochis, E. E.; Rose, W. I.; Klawiter, M.; Vye, E. C.; Engelmann, C. A.
2011-12-01
Geoscientists often find it difficult to bridge the gap in communication between university research and what is learned in the public schools. Today's schools operate in a high stakes environment that only allow instruction based on State and National Earth Science curriculum standards. These standards are often unknown by academics or are written in a style that obfuscates the transfer of emerging scientific research to students in the classroom. Earth Science teachers are in an ideal position to make this link because they have a background in science as well as a solid understanding of the required curriculum standards for their grade and the pedagogical expertise to pass on new information to their students. As part of the Michigan Teacher Excellence Program (MiTEP), teachers from Grand Rapids, Kalamazoo, and Jackson school districts participate in 2 week field courses with Michigan Tech University to learn from earth science experts about how the earth works. This course connects Earth Science Literacy Principles' Big Ideas and common student misconceptions with standards-based education. During the 2011 field course, we developed and began to implement a three-phase EarthCache model that will provide a geospatial interactive medium for teachers to translate the material they learn in the field to the students in their standards based classrooms. MiTEP participants use GPS and Google Earth to navigate to Michigan sites of geo-significance. At each location academic experts aide participants in making scientific observations about the locations' geologic features, and "reading the rocks" methodology to interpret the area's geologic history. The participants are then expected to develop their own EarthCache site to be used as pedagogical tool bridging the gap between standards-based classroom learning, contemporary research and unique outdoor field experiences. The final phase supports teachers in integrating inquiry based, higher-level learning student activities to EarthCache sites near their own urban communities, or in regional areas such as nature preserves and National Parks. By working together, MiTEP participants are developing a network of regional EarthCache sites and shared lesson plans which explore places that are meaningful to students while simultaneously connecting them to geologic concepts they are learning in school. We believe that the MiTEP EarthCaching model will help participants emerge as leaders of inquiry style, and virtual place-based educators within their districts.
1979-02-01
classified as Porno , Lake Miwok, and Patwin. Recent surveys within the Clear Lake-Cache Creek Basin have located 28 archeological sites, some of which...additional 8,400 acre-feet annually to the Lakeport area. Porno Reservoir on Kelsey Creek, being studied by Lake County, also would supplement M&l water...project on Scotts Creek could provide 9,100 acre- feet annually of irrigation water. Also, as previously discussed, Porno Reservoir would furnish
A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations
Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel; ...
2017-06-01
As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM T by using themore » compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.« less
A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel
As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM T by using themore » compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.« less
The iMars WebGIS - Spatio-Temporal Data Queries and Single Image Map Web Services
NASA Astrophysics Data System (ADS)
Walter, Sebastian; Steikert, Ralf; Schreiner, Bjoern; Muller, Jan-Peter; van Gasselt, Stephan; Sidiropoulos, Panagiotis; Lanz-Kroechert, Julia
2017-04-01
Introduction: Web-based planetary image dissemination platforms usually show outline coverages of the data and offer querying for metadata as well as preview and download, e.g. the HRSC Mapserver (Walter & van Gasselt, 2014). Here we introduce a new approach for a system dedicated to change detection by simultanous visualisation of single-image time series in a multi-temporal context. While the usual form of presenting multi-orbit datasets is the merge of the data into a larger mosaic, we want to stay with the single image as an important snapshot of the planetary surface at a specific time. In the context of the EU FP-7 iMars project we process and ingest vast amounts of automatically co-registered (ACRO) images. The base of the co-registration are the high precision HRSC multi-orbit quadrangle image mosaics, which are based on bundle-block-adjusted multi-orbit HRSC DTMs. Additionally we make use of the existing bundle-adjusted HRSC single images available at the PDS archives. A prototype demonstrating the presented features is available at http://imars.planet.fu-berlin.de. Multi-temporal database: In order to locate multiple coverage of images and select images based on spatio-temporal queries, we converge available coverage catalogs for various NASA imaging missions into a relational database management system with geometry support. We harvest available metadata entries during our processing pipeline using the Integrated Software for Imagers and Spectrometers (ISIS) software. Currently, this database contains image outlines from the MGS/MOC, MRO/CTX and the MO/THEMIS instruments with imaging dates ranging from 1996 to the present. For the MEx/HRSC data, we already maintain a database which we automatically update with custom software based on the VICAR environment. Web Map Service with time support: The MapServer software is connected to the database and provides Web Map Services (WMS) with time support based on the START_TIME image attribute. It allows temporal WMS GetMap requests by setting additional TIME parameter values in the request. The values for the parameter represent an interval defined by its lower and upper bounds. As the WMS time standard only supports one time variable, only the start times of the images are considered. If no time values are submitted with the request, the full time range of all images is assumed as the default. Dynamic single image WMS: To compare images from different acquisition times at sites of multiple coverage, we have to load every image as a single WMS layer. Due to the vast amount of single images we need a way to set up the layers in a dynamic way - the map server does not know the images to be served beforehand. We use the MapScript interface to dynamically access MapServer's objects and configure the file name and path of the requested image in the map configuration. The layers are created on-the-fly each representing only one single image. On the frontend side, the vendor-specific WMS request parameter (PRODUCTID) has to be appended to the regular set of WMS parameters. The request is then passed on to the MapScript instance. Web Map Tile Cache: In order to speed up access of the WMS requests, a MapCache instance has been integrated in the pipeline. As it is not aware of the available PDS product IDs which will be queried, the PRODUCTID parameter is configured as an additional dimension of the cache. The WMS request is received by the Apache webserver configured with the MapCache module. If the tile is available in the tile cache, it is immediately commited to the client. If not available, the tile request is forwarded to Apache and the MapScript module. The Python script intercepts the WMS request and extracts the product ID from the parameter chain. It loads the layer object from the map file and appends the file name and path of the inquired image. After some possible further image processing inside the script (stretching, color matching), the request is submitted to the MapServer backend which in turn delivers the response back to the MapCache instance. Web frontend: We have implemented a web-GIS frontend based on various OpenLayers components. The basemap is a global color-hillshaded HRSC bundle-adjusted DTM mosaic with a resolution of 50 m per pixel. The new bundle-block-adjusted qudrangle mosaics of the MC-11 quadrangle, both image and DTM, are included with opacity slider options. The layer user interface has been adapted on the base of the ol3-layerswitcher and extended by foldable and switchable groups, layer sorting (by resolution, by time and alphabeticallly) and reordering (drag-and-drop). A collapsible time panel accomodates a time slider interface where the user can filter the visible data by a range of Mars or Earth dates and/or by solar longitudes. The visualisation of time-series of single images is controlled by a specific toolbar enabling the workflow of image selection (by point or bounding box), dynamic image loading and playback of single images in a video player-like environment. During a stress-test campaign we could demonstrate that the system is capable of serving up to 10 simultaneous users on its current lightweight development hardware. It is planned to relocate the software to more powerful hardware by the time of this conference. Conclusions/Outlook: The iMars webGIS is an expert tool for the detection and visualization of surface changes. We demonstrate a technique to dynamically retrieve and display single images based on the time-series structure of the data. Together with the multi-temporal database and its MapServer/MapCache backend it provides a stable and high performance environment for the dissemination of the various iMars products. Acknowledgements: This research has received funding from the EU's FP7 Programme under iMars 607379 and by the German Space Agency (DLR Bonn), grant 50 QM 1301 (HRSC on Mars Express).
A performance model for GPUs with caches
Dao, Thanh Tuan; Kim, Jungwon; Seo, Sangmin; ...
2014-06-24
To exploit the abundant computational power of the world's fastest supercomputers, an even workload distribution to the typically heterogeneous compute devices is necessary. While relatively accurate performance models exist for conventional CPUs, accurate performance estimation models for modern GPUs do not exist. This paper presents two accurate models for modern GPUs: a sampling-based linear model, and a model based on machine-learning (ML) techniques which improves the accuracy of the linear model and is applicable to modern GPUs with and without caches. We first construct the sampling-based linear model to predict the runtime of an arbitrary OpenCL kernel. Based on anmore » analysis of NVIDIA GPUs' scheduling policies we determine the earliest sampling points that allow an accurate estimation. The linear model cannot capture well the significant effects that memory coalescing or caching as implemented in modern GPUs have on performance. We therefore propose a model based on ML techniques that takes several compiler-generated statistics about the kernel as well as the GPU's hardware performance counters as additional inputs to obtain a more accurate runtime performance estimation for modern GPUs. We demonstrate the effectiveness and broad applicability of the model by applying it to three different NVIDIA GPU architectures and one AMD GPU architecture. On an extensive set of OpenCL benchmarks, on average, the proposed model estimates the runtime performance with less than 7 percent error for a second-generation GTX 280 with no on-chip caches and less than 5 percent for the Fermi-based GTX 580 with hardware caches. On the Kepler-based GTX 680, the linear model has an error of less than 10 percent. On an AMD GPU architecture, Radeon HD 6970, the model estimates with 8 percent of error rates. As a result, the proposed technique outperforms existing models by a factor of 5 to 6 in terms of accuracy.« less
NASA Astrophysics Data System (ADS)
Jerbi, Chahir; Fourno, André; Noetinger, Benoit; Delay, Frederick
2017-05-01
Single and multiphase flows in fractured porous media at the scale of natural reservoirs are often handled by resorting to homogenized models that avoid the heavy computations associated with a complete discretization of both fractures and matrix blocks. For example, the two overlapping continua (fractures and matrix) of a dual porosity system are coupled by way of fluid flux exchanges that deeply condition flow at the large scale. This characteristic is a key to realistic flow simulations, especially for multiphase flow as capillary forces and contrasts of fluid mobility compete in the extraction of a fluid from a capacitive matrix then conveyed through the fractures. The exchange rate between fractures and matrix is conditioned by the so-called mean matrix block size which can be viewed as the size of a single matrix block neighboring a single fracture within a mesh of a dual porosity model. We propose a new evaluation of this matrix block size based on the analysis of discrete fracture networks. The fundaments rely upon establishing at the scale of a fractured block the equivalence between the actual fracture network and a Warren and Root network only made of three regularly spaced fracture families parallel to the facets of the fractured block. The resulting matrix block sizes are then compared via geometrical considerations and two-phase flow simulations to the few other available methods. It is shown that the new method is stable in the sense it provides accurate sizes irrespective of the type of fracture network investigated. The method also results in two-phase flow simulations from dual porosity models very close to that from references calculated in finely discretized networks. Finally, calculations of matrix block sizes by this new technique reveal very rapid, which opens the way to cumbersome applications such as preconditioning a dual porosity approach applied to regional fractured reservoirs.
Vleugels, Leo F W; Pollet, Jennifer; Tuinier, Remco
2015-05-21
Polyelectrolyte-surfactant complexes (PESC) are a class of materials which form spontaneously by self-assembly driven by electrostatic and hydrophobic interactions. PESC containing sodium lauryl ether sulfates (SLES) have found wide application in hair care products like shampoo. Typically, SLES with only one or two ethylene oxide (EO) groups are used for this application. We have studied the influence of the size of the EO block (ranging from 0 to 30 EO groups) on complexation with two model polycations: linear polyDADMAC and branched PEI. PESC size and electrostatic properties were determined during stepwise titration of buffered polycation solutions. The critical aggregation concentration (CAC) of PESC was determined by surface tension measurements and fluorescence spectroscopy. For polyDADMAC, there is no influence of the size of the EO block on the complexation behavior; the stiff polycation governs the structure formation. For PEI, it was seen that the EO block size does affect the structure of the complexes. The CAC value of the investigated complexes turns out to be rather independent of the EO block size; however, the CMC/CAC ratio decreases with increasing size of the EO block. This latter observation explains why the Lochhead-Goddard effect is most effective for small EO blocks.
Parallel Implementation of the Discontinuous Galerkin Method
NASA Technical Reports Server (NTRS)
Baggag, Abdalkader; Atkins, Harold; Keyes, David
1999-01-01
This paper describes a parallel implementation of the discontinuous Galerkin method. Discontinuous Galerkin is a spatially compact method that retains its accuracy and robustness on non-smooth unstructured grids and is well suited for time dependent simulations. Several parallelization approaches are studied and evaluated. The most natural and symmetric of the approaches has been implemented in all object-oriented code used to simulate aeroacoustic scattering. The parallel implementation is MPI-based and has been tested on various parallel platforms such as the SGI Origin, IBM SP2, and clusters of SGI and Sun workstations. The scalability results presented for the SGI Origin show slightly superlinear speedup on a fixed-size problem due to cache effects.
NASA Astrophysics Data System (ADS)
Liu, Fenglai; Kong, Jing
2018-07-01
Unique technical challenges and their solutions for implementing semi-numerical Hartree-Fock exchange on the Phil Processor are discussed, especially concerning the single- instruction-multiple-data type of processing and small cache size. Benchmark calculations on a series of buckyball molecules with various Gaussian basis sets on a Phi processor and a six-core CPU show that the Phi processor provides as much as 12 times of speedup with large basis sets compared with the conventional four-center electron repulsion integration approach performed on the CPU. The accuracy of the semi-numerical scheme is also evaluated and found to be comparable to that of the resolution-of-identity approach.
Cao, Lin; Wang, Zhenyu; Yan, Chuan; Chen, Jin; Guo, Cong; Zhang, Zhibin
2016-11-01
Rodent preference for scatter-hoarding large seeds has been widely considered to favor the evolution of large seeds. Previous studies supporting this conclusion were primarily based on observations at earlier stages of seed dispersal, or on a limited sample of successfully established seedlings. Because seed dispersal comprises multiple dispersal stages, we hypothesized that differential foraging preference on seed size by animal dispersers at different dispersal stages would ultimately result in medium-sized seeds having the highest dispersal success rates. In this study, by tracking a large number of seeds for 5 yr, we investigated the effects of seed size on seed fates from seed removal to seedling establishment of a dominant plant Pittosporopsis kerrii (Icacinaceae) dispersed by scatter-hoarding rodents in tropical forest in southwest China. We found that small seeds had a lower survival rate at the early dispersal stage where more small seeds were predated at seed stations and after removal; large seeds had a lower survival rate at the late dispersal stage, more large seeds were recovered, predated after being cached, or larder-hoarded. Medium-sized seeds experienced the highest dispersal success. Our study suggests that differential foraging preferences by scatter-hoarding rodents at different stages of seed dispersal could result in conflicting selective pressures on seed size and higher dispersal success of medium-sized seeds. © 2016 by the Ecological Society of America.
A High-Precision Counter Using the DSP Technique
2004-09-01
DSP is not good enough to process all the 1-second samples. The cache memory is also not sufficient to store all the sampling data. So we cut the...sampling number in a cycle is not good enough to achieve an accuracy less than 2×10-11. For this reason, a correlation operation is performed for... not good enough to process all the 1-second samples. The cache memory is also not sufficient to store all the sampling data. We will solve this
Compiler-directed cache management in multiprocessors
NASA Technical Reports Server (NTRS)
Cheong, Hoichi; Veidenbaum, Alexander V.
1990-01-01
The necessity of finding alternatives to hardware-based cache coherence strategies for large-scale multiprocessor systems is discussed. Three different software-based strategies sharing the same goals and general approach are presented. They consist of a simple invalidation approach, a fast selective invalidation scheme, and a version control scheme. The strategies are suitable for shared-memory multiprocessor systems with interconnection networks and a large number of processors. Results of trace-driven simulations conducted on numerical benchmark routines to compare the performance of the three schemes are presented.
Developing Tools and Technologies to Meet MSR Planetary Protection Requirements
NASA Technical Reports Server (NTRS)
Lin, Ying
2013-01-01
This paper describes the tools and technologies that need to be developed for a Caching Rover mission in order to meet the overall Planetary Protection requirements for future Mars Sample Return (MSR) campaign. This is the result of an eight-month study sponsored by the Mars Exploration Program Office. The goal of this study is to provide a future MSR project with a focused technology development plan for achieving the necessary planetary protection and sample integrity capabilities for a Mars Caching Rover mission.
Evaluating small-body landing hazards due to blocks
NASA Astrophysics Data System (ADS)
Ernst, C.; Rodgers, D.; Barnouin, O.; Murchie, S.; Chabot, N.
2014-07-01
Introduction: Landed missions represent a vital stage of spacecraft exploration of planetary bodies. Landed science allows for a wide variety of measurements essential to unraveling the origin and evolution of a body that are not possible remotely, including but not limited to compositional measurements, microscopic grain characterization, and the physical properties of the regolith. To date, two spacecraft have performed soft landings on the surface of a small body. In 2001, the Near Earth Asteroid Rendezvous (NEAR) mission performed a controlled descent and landing on (433) Eros following the completion of its mission [1]; in 2005, the Hayabusa spacecraft performed two touch-and-go maneuvers at (25143) Itokawa [2]. Both landings were preceded by rendezvous spacecraft reconnaissance, which enabled selection of a safe landing site. Three current missions have plans to land on small bodies (Rosetta, Hayabusa 2, and OSIRIS-REx); several other mission concepts also include small-body landings. Small-body landers need to land at sites having slopes and block abundances within spacecraft design limits. Due to the small scale of the potential hazards, it can be difficult or impossible to fully characterize a landing surface before the arrival of the spacecraft at the body. Although a rendezvous mission phase can provide global reconnaissance from which a landing site can be chosen, reasonable a priori assurance that a safe landing site exists is needed to validate the design approach for the spacecraft. Method: Many robotic spacecraft have landed safely on the Moon and Mars. Images of these landing sites, as well as more recent, extremely high-resolution orbital datasets, have enabled the comparison of orbital block observations to the smaller blocks that pose hazards to landers. Analyses of the Surveyor [3], Viking 1 and 2, Mars Pathfinder, Phoenix, Spirit, Opportunity, and Curiosity landing sites [4--8] have indicated that for a reasonable difference in size (a factor of several to ten), the size-frequency distribution of blocks can be modeled, allowing extrapolation from large block distributions to estimate small block densities. From that estimate, the probability of a lander encountering hazardous blocks can be calculated for a given lander design. Such calculations are used routinely to vet candidate sites for Mars landers [5--8]. Application to Small Bodies: To determine whether a similar approach will work for small bodies, we must determine if the large and small block populations can be linked. To do so, we analyze the comprehensive block datasets for the intermediate-sized Eros [9,10] and the small Itokawa [11,12]. Global and local block size-frequency distributions for Eros and Itokawa have power-law slopes on the order of -3 and match reasonably well between larger block sizes (from lower-resolution images) and smaller block sizes (from higher-resolution images). Although absolute block densities differ regionally on each asteroid, the slopes match reasonably well between Itokawa and Eros, with the geologic implications of this result discussed in [10]. For Eros and Itokawa, the approach of extending the size-frequency distribution from large, tens-of-meter-sized blocks down to small, tens-of-centimeter-sized blocks using a power-law fit to the large population yields reasonable estimates of small block populations. It is important to note that geologic context matters for the absolute block density --- if the global counts include multiple geologic settings, they will not directly extend to local areas containing only one setting [10]. A small number of high-resolution images of Phobos are sufficient for measuring blocks. These images are concentrated in the area outside of Stickney crater, which is thought to be the source of most of the observed blocks [13]. Block counts by Thomas et al. [13] suggest a power-law slope similar to those of Eros [9] and Itokawa global counts, with the absolute density of blocks similar to that of global Eros. Because blocks tend to be more numerous proximal to large, young craters (e.g., Stickney on Phobos, Shoemaker on Eros), the block density across most of Phobos is likely to be lower than that observed in the available high-resolution images. We suggest that a power-law extrapolation of Eros or Phobos large-block distributions provides upper limits for assessing the block landing hazards faced by a Phobos lander.
An effective detection algorithm for region duplication forgery in digital images
NASA Astrophysics Data System (ADS)
Yavuz, Fatih; Bal, Abdullah; Cukur, Huseyin
2016-04-01
Powerful image editing tools are very common and easy to use these days. This situation may cause some forgeries by adding or removing some information on the digital images. In order to detect these types of forgeries such as region duplication, we present an effective algorithm based on fixed-size block computation and discrete wavelet transform (DWT). In this approach, the original image is divided into fixed-size blocks, and then wavelet transform is applied for dimension reduction. Each block is processed by Fourier Transform and represented by circle regions. Four features are extracted from each block. Finally, the feature vectors are lexicographically sorted, and duplicated image blocks are detected according to comparison metric results. The experimental results show that the proposed algorithm presents computational efficiency due to fixed-size circle block architecture.
Chaotic Mountain Blocks in Pluto’s Sputnik Planitia
NASA Astrophysics Data System (ADS)
Singer, Kelsi N.; Knight, Katherine I.; Stern, S. Alan; Olkin, Catherine; Grundy, William M.; McKinnon, William B.; Moore, Jeffrey M.; Schenk, Paul M.; Spencer, John R.; Weaver, Harold A.; Young, Leslie; Ennico, Kimberly; New Horizons Geology, Geophysics and Imaging Science Theme Team, The New Horizons Surface Composition Science Theme Team
2017-10-01
One of the first high-resolution Pluto images returned by New Horizons displayed a collection of tall, jagged peaks rising out of the large nitrogen ice sheet informally known as Sputnik Planitia (SP). This mountain range was later revealed to be one of several along the western edge of SP. The mountains are several hundred broken-up blocks of Pluto’s primarily water ice lithosphere and some retain surface terrains similar to the nearby intact crust surrounding SP. Water ice with some fractures or porosity is likely >5% less dense than solid N2 ice at Pluto’s temperatures. Thus it is possible the blocks are, or were, floating icebergs or at least partially suspended to the point that some blocks appear to be tilted as if they have faltered (Moore et al., 2016, Science, 351, 1284-1293).We analyze four mountain ranges on the western edge of SP and compare to chaotic terrains on Europa and Mars. The blocks on Pluto have angular planforms but we characterize their size using block surface area converted to an equivalent circular diameter. Topography was used to define block extents. The blocks range in size from 3-30 km in diameter, with a mode of ~8-10 km. Blocks range from 0.2-3.8 km in height, and block height generally increases with block diameter. One or more dark layers can be identified in a few scarp faces, and are at a similar depth to each other and to layers seen in fault and crater walls elsewhere on Pluto. A large N-S trending fault system runs tangential to SP and may be the source of crustal disruption on the western side.On Europa and Mars block sizes vary greatly between different chaos regions, but Conamara Chaos has an average block size of ~5 km in diameter, smaller than that typically seen on Pluto. Also the blocks often transition into fractured terrain still connected to the surround lithosphere at the periphery of the chaos regions. The source regions for the blocks are more obvious on Europa and Mars. Additionally the block heights on Europa and Mars generally do not increase with block size. Thus, the main mechanism of crustal breakup is likely different between these bodies.
NASA Astrophysics Data System (ADS)
Kennett, Shane C.
Three low-carbon ASTM A514 microalloyed steels were used to assess the effects of austenite conditioning on the microstructure and mechanical properties of martensite. A range of prior austenite grain sizes with and without thermomechanical processing were produced in a Gleeble RTM 3500 and direct-quenched. Samples in the as-quenched, low temperature tempered, and high temperature tempered conditions were studied. The microstructure was characterized with scanning electron microscopy, electron backscattered diffraction, transmission electron microscopy, and x-ray diffraction. The uniaxial tensile properties and Charpy V-notch properties were measured and compared with the microstructural features (prior austenite grain size, packet size, block size, lath boundaries, and dislocation density). For the equiaxed prior austenite grain conditions, prior austenite grain size refinement decreases the packet size, decreases the block size, and increases the dislocation density of as-quenched martensite. However, after high temperature tempering the dislocation density decreases with prior austenite grain size refinement. Thermomechanical processing increases the low angle substructure, increases the dislocation density, and decreases the block size of as-quenched martensite. The dislocation density increase and block size refinement is sensitive to the austenite grain size before ausforming. The larger prior austenite grain size conditions have a larger increase in dislocation density, but the small prior austenite grain size conditions have the largest refinement in block size. Additionally, for the large prior austenite grain size conditions, the packet size increases with thermomechanical processing. The strength of martensite is often related to an effective grain size or carbon concentration. For the current work, it was concluded that the strength of martensite is primarily controlled by the dislocation density and dislocation substructure; which is related to a grain size and carbon concentration. In the microyielding regime, the strength and work hardening is related to the motion of unpinned dislocation segments. However, with tensile strain, a dislocation cell structure is developed and the flow strength (greater than 1% offset) is controlled by the dislocation density following a Taylor hardening model, thereby ruling out any grain size effects on the flow strength. Additionally, it is proposed that lath boundaries contribute to strength. It is shown that the strength differences associated with thermomechanically processed steels can be fully accounted for by dislocation density differences and the effect of lath boundaries. The low temperature ductile to brittle transition of martensite is controlled by the martensite block size, packet size, and prior austenite grain size. However, the effect of block size is likely small in comparison. The ductile to brittle transition temperature is best correlated to the inverse square root of the martensite packet size because large crack deflections are typical at packet boundaries.
Early post-tsunami disaster medical assistance to Banda Aceh: a personal account.
Garner, Alan A; Harrison, Ken
2006-02-01
The south Asian tsunami on 26 December, 2004, saw Australia deploy civilian teams to an international disaster in large numbers for the first time. The logistics of supporting such teams in both a self sustainability capacity and medical equipment had not previously been planned for or tested. For the first Australian team deployed to Banda Aceh, which arrived on the fourth day after the tsunami, equipment sourced from the New South Wales Fire Brigades Urban Search and Rescue (US&R) cache supplied all food, water, tents, generators and sleeping equipment. The medical equipment was largely sourced from the CareFlight US&R medical cache. There were significant deficits in surgical equipment as the medical cache had not been designed to provide a stand alone surgical capability. This resulted in the need for substantial improvisation by the surgical teams during the deployment. Despite this, the team performed nearly 140 major procedures in austere circumstances and significantly contributed to the early international response to this major humanitarian disaster.
Multicore and GPU algorithms for Nussinov RNA folding
2014-01-01
Background One segment of a RNA sequence might be paired with another segment of the same RNA sequence due to the force of hydrogen bonds. This two-dimensional structure is called the RNA sequence's secondary structure. Several algorithms have been proposed to predict an RNA sequence's secondary structure. These algorithms are referred to as RNA folding algorithms. Results We develop cache efficient, multicore, and GPU algorithms for RNA folding using Nussinov's algorithm. Conclusions Our cache efficient algorithm provides a speedup between 1.6 and 3.0 relative to a naive straightforward single core code. The multicore version of the cache efficient single core algorithm provides a speedup, relative to the naive single core algorithm, between 7.5 and 14.0 on a 6 core hyperthreaded CPU. Our GPU algorithm for the NVIDIA C2050 is up to 1582 times as fast as the naive single core algorithm and between 5.1 and 11.2 times as fast as the fastest previously known GPU algorithm for Nussinov RNA folding. PMID:25082539
EMR Database Upgrade from MUMPS to CACHE: Lessons Learned.
Alotaibi, Abduallah; Emshary, Mshary; Househ, Mowafa
2014-01-01
Over the past few years, Saudi hospitals have been implementing and upgrading Electronic Medical Record Systems (EMRs) to ensure secure data transfer and exchange between EMRs.This paper focuses on the process and lessons learned in upgrading the MUMPS database to a the newer Caché database to ensure the integrity of electronic data transfer within a local Saudi hospital. This paper examines the steps taken by the departments concerned, their action plans and how the change process was managed. Results show that user satisfaction was achieved after the upgrade was completed. The system was stable and offered better healthcare quality to patients as a result of the data exchange. Hardware infrastructure upgrades improved scalability and software upgrades to Caché improved stability. The overall performance was enhanced and new functions were added (CPOE) during the upgrades. The essons learned were: 1) Involve higher management; 2) Research multiple solutions available in the market; 3) Plan for a variety of implementation scenarios.
The Science of Computing: Virtual Memory
NASA Technical Reports Server (NTRS)
Denning, Peter J.
1986-01-01
In the March-April issue, I described how a computer's storage system is organized as a hierarchy consisting of cache, main memory, and secondary memory (e.g., disk). The cache and main memory form a subsystem that functions like main memory but attains speeds approaching cache. What happens if a program and its data are too large for the main memory? This is not a frivolous question. Every generation of computer users has been frustrated by insufficient memory. A new line of computers may have sufficient storage for the computations of its predecessor, but new programs will soon exhaust its capacity. In 1960, a longrange planning committee at MIT dared to dream of a computer with 1 million words of main memory. In 1985, the Cray-2 was delivered with 256 million words. Computational physicists dream of computers with 1 billion words. Computer architects have done an outstanding job of enlarging main memories yet they have never kept up with demand. Only the shortsighted believe they can.
Gonzalez Murcia, Josue D; Schmutz, Cameron; Munger, Caitlin; Perkes, Ammon; Gustin, Aaron; Peterson, Michael; Ebbert, Mark T W; Norton, Maria C; Tschanz, Joann T; Munger, Ronald G; Corcoran, Christopher D; Kauwe, John S K
2013-12-01
Recent studies have identified the rs75932628 (R47H) variant in TREM2 as an Alzheimer's disease risk factor with estimated odds ratio ranging from 2.9 to 5.1. The Cache County Memory Study is a large, population-based sample designed for the study of memory and aging. We genotyped R47H in 2974 samples (427 cases and 2540 control subjects) from the Cache County study using a custom TaqMan assay. We observed 7 heterozygous cases and 12 heterozygous control subjects with an odds ratio of 3.5 (95% confidence interval, 1.3-8.8; p = 0.0076). The minor allele frequency and population attributable fraction for R47H were 0.0029 and 0.004, respectively. This study replicates the association between R47H and Alzheimer's disease risk in a large, population-based sample, and estimates the population frequency and attributable risk of this rare variant. Copyright © 2013 Elsevier Inc. All rights reserved.
Multiphase complete exchange on a circuit switched hypercube
NASA Technical Reports Server (NTRS)
Bokhari, Shahid H.
1991-01-01
On a distributed memory parallel computer, the complete exchange (all-to-all personalized) communication pattern requires each of n processors to send a different block of data to each of the remaining n - 1 processors. This pattern is at the heart of many important algorithms, most notably the matrix transpose. For a circuit switched hypercube of dimension d(n = 2(sup d)), two algorithms for achieving complete exchange are known. These are (1) the Standard Exchange approach that employs d transmissions of size 2(sup d-1) blocks each and is useful for small block sizes, and (2) the Optimal Circuit Switched algorithm that employs 2(sup d) - 1 transmissions of 1 block each and is best for large block sizes. A unified multiphase algorithm is described that includes these two algorithms as special cases. The complete exchange on a hypercube of dimension d and block size m is achieved by carrying out k partial exchange on subcubes of dimension d(sub i) Sigma(sup k)(sub i=1) d(sub i) = d and effective block size m(sub i) = m2(sup d-di). When k = d and all d(sub i) = 1, this corresponds to algorithm (1) above. For the case of k = 1 and d(sub i) = d, this becomes the circuit switched algorithm (2). Changing the subcube dimensions d, varies the effective block size and permits a compromise between the data permutation and block transmission overhead of (1) and the startup overhead of (2). For a hypercube of dimension d, the number of possible combinations of subcubes is p(d), the number of partitions of the integer d. This is an exponential but very slowly growing function and it is feasible over these partitions to discover the best combination for a given message size. The approach was analyzed for, and implemented on, the Intel iPSC-860 circuit switched hypercube. Measurements show good agreement with predictions and demonstrate that the multiphase approach can substantially improve performance for block sizes in the 0 to 160 byte range. This range, which corresponds to 0 to 40 floating point numbers per processor, is commonly encountered in practical numeric applications. The multiphase technique is applicable to all circuit-switched hypercubes that use the common e-cube routing strategy.
Parallel Geospatial Data Management for Multi-Scale Environmental Data Analysis on GPUs
NASA Astrophysics Data System (ADS)
Wang, D.; Zhang, J.; Wei, Y.
2013-12-01
As the spatial and temporal resolutions of Earth observatory data and Earth system simulation outputs are getting higher, in-situ and/or post- processing such large amount of geospatial data increasingly becomes a bottleneck in scientific inquires of Earth systems and their human impacts. Existing geospatial techniques that are based on outdated computing models (e.g., serial algorithms and disk-resident systems), as have been implemented in many commercial and open source packages, are incapable of processing large-scale geospatial data and achieve desired level of performance. In this study, we have developed a set of parallel data structures and algorithms that are capable of utilizing massively data parallel computing power available on commodity Graphics Processing Units (GPUs) for a popular geospatial technique called Zonal Statistics. Given two input datasets with one representing measurements (e.g., temperature or precipitation) and the other one represent polygonal zones (e.g., ecological or administrative zones), Zonal Statistics computes major statistics (or complete distribution histograms) of the measurements in all regions. Our technique has four steps and each step can be mapped to GPU hardware by identifying its inherent data parallelisms. First, a raster is divided into blocks and per-block histograms are derived. Second, the Minimum Bounding Boxes (MBRs) of polygons are computed and are spatially matched with raster blocks; matched polygon-block pairs are tested and blocks that are either inside or intersect with polygons are identified. Third, per-block histograms are aggregated to polygons for blocks that are completely within polygons. Finally, for blocks that intersect with polygon boundaries, all the raster cells within the blocks are examined using point-in-polygon-test and cells that are within polygons are used to update corresponding histograms. As the task becomes I/O bound after applying spatial indexing and GPU hardware acceleration, we have developed a GPU-based data compression technique by reusing our previous work on Bitplane Quadtree (or BPQ-Tree) based indexing of binary bitmaps. Results have shown that our GPU-based parallel Zonal Statistic technique on 3000+ US counties over 20+ billion NASA SRTM 30 meter resolution Digital Elevation (DEM) raster cells has achieved impressive end-to-end runtimes: 101 seconds and 46 seconds a low-end workstation equipped with a Nvidia GTX Titan GPU using cold and hot cache, respectively; and, 60-70 seconds using a single OLCF TITAN computing node and 10-15 seconds using 8 nodes. Our experiment results clearly show the potentials of using high-end computing facilities for large-scale geospatial processing.
Thieving rodents as substitute dispersers of megafaunal seeds.
Jansen, Patrick A; Hirsch, Ben T; Emsens, Willem-Jan; Zamora-Gutierrez, Veronica; Wikelski, Martin; Kays, Roland
2012-07-31
The Neotropics have many plant species that seem to be adapted for seed dispersal by megafauna that went extinct in the late Pleistocene. Given the crucial importance of seed dispersal for plant persistence, it remains a mystery how these plants have survived more than 10,000 y without their mutualist dispersers. Here we present support for the hypothesis that secondary seed dispersal by scatter-hoarding rodents has facilitated the persistence of these large-seeded species. We used miniature radio transmitters to track the dispersal of reputedly megafaunal seeds by Central American agoutis, which scatter-hoard seeds in shallow caches in the soil throughout the forest. We found that seeds were initially cached at mostly short distances and then quickly dug up again. However, rather than eating the recovered seeds, agoutis continued to move and recache the seeds, up to 36 times. Agoutis dispersed an estimated 35% of seeds for >100 m. An estimated 14% of the cached seeds survived to the next year, when a new fruit crop became available to the rodents. Serial video-monitoring of cached seeds revealed that the stepwise dispersal was caused by agoutis repeatedly stealing and recaching each other's buried seeds. Although previous studies suggest that rodents are poor dispersers, we demonstrate that communities of rodents can in fact provide highly effective long-distance seed dispersal. Our findings suggest that thieving scatter-hoarding rodents could substitute for extinct megafaunal seed dispersers of tropical large-seeded trees.
Storageless and caching Tier-2 models in the UK context
NASA Astrophysics Data System (ADS)
Cadellin Skipsey, Samuel; Dewhurst, Alastair; Crooks, David; MacMahon, Ewan; Roy, Gareth; Smith, Oliver; Mohammed, Kashif; Brew, Chris; Britton, David
2017-10-01
Operational and other pressures have lead to WLCG experiments moving increasingly to a stratified model for Tier-2 resources, where “fat” Tier-2s (“T2Ds”) and “thin” Tier-2s (“T2Cs”) provide different levels of service. In the UK, this distinction is also encouraged by the terms of the current GridPP5 funding model. In anticipation of this, testing has been performed on the implications, and potential implementation, of such a distinction in our resources. In particular, this presentation presents the results of testing of storage T2Cs, where the “thin” nature is expressed by the site having either no local data storage, or only a thin caching layer; data is streamed or copied from a “nearby” T2D when needed by jobs. In OSG, this model has been adopted successfully for CMS AAA sites; but the network topology and capacity in the USA is significantly different to that in the UK (and much of Europe). We present the result of several operational tests: the in-production University College London (UCL) site, which runs ATLAS workloads using storage at the Queen Mary University of London (QMUL) site; the Oxford site, which has had scaling tests performed against T2Ds in various locations in the UK (to test network effects); and the Durham site, which has been testing the specific ATLAS caching solution of “Rucio Cache” integration with ARC’s caching layer.
On the Efficacy of Source Code Optimizations for Cache-Based Systems
NASA Technical Reports Server (NTRS)
VanderWijngaart, Rob F.; Saphir, William C.
1998-01-01
Obtaining high performance without machine-specific tuning is an important goal of scientific application programmers. Since most scientific processing is done on commodity microprocessors with hierarchical memory systems, this goal of "portable performance" can be achieved if a common set of optimization principles is effective for all such systems. It is widely believed, or at least hoped, that portable performance can be realized. The rule of thumb for optimization on hierarchical memory systems is to maximize temporal and spatial locality of memory references by reusing data and minimizing memory access stride. We investigate the effects of a number of optimizations on the performance of three related kernels taken from a computational fluid dynamics application. Timing the kernels on a range of processors, we observe an inconsistent and often counterintuitive impact of the optimizations on performance. In particular, code variations that have a positive impact on one architecture can have a negative impact on another, and variations expected to be unimportant can produce large effects. Moreover, we find that cache miss rates - as reported by a cache simulation tool, and confirmed by hardware counters - only partially explain the results. By contrast, the compiler-generated assembly code provides more insight by revealing the importance of processor-specific instructions and of compiler maturity, both of which strongly, and sometimes unexpectedly, influence performance. We conclude that it is difficult to obtain performance portability on modern cache-based computers, and comment on the implications of this result.
On the Efficacy of Source Code Optimizations for Cache-Based Systems
NASA Technical Reports Server (NTRS)
VanderWijngaart, Rob F.; Saphir, William C.; Saini, Subhash (Technical Monitor)
1998-01-01
Obtaining high performance without machine-specific tuning is an important goal of scientific application programmers. Since most scientific processing is done on commodity microprocessors with hierarchical memory systems, this goal of "portable performance" can be achieved if a common set of optimization principles is effective for all such systems. It is widely believed, or at least hoped, that portable performance can be realized. The rule of thumb for optimization on hierarchical memory systems is to maximize temporal and spatial locality of memory references by reusing data and minimizing memory access stride. We investigate the effects of a number of optimizations on the performance of three related kernels taken from a computational fluid dynamics application. Timing the kernels on a range of processors, we observe an inconsistent and often counterintuitive impact of the optimizations on performance. In particular, code variations that have a positive impact on one architecture can have a negative impact on another, and variations expected to be unimportant can produce large effects. Moreover, we find that cache miss rates-as reported by a cache simulation tool, and confirmed by hardware counters-only partially explain the results. By contrast, the compiler-generated assembly code provides more insight by revealing the importance of processor-specific instructions and of compiler maturity, both of which strongly, and sometimes unexpectedly, influence performance. We conclude that it is difficult to obtain performance portability on modern cache-based computers, and comment on the implications of this result.
Use of diuretics is associated with reduced risk of Alzheimer's disease: the Cache County Study.
Chuang, Yi-Fang; Breitner, John C S; Chiu, Yen-Ling; Khachaturian, Ara; Hayden, Kathleen; Corcoran, Chris; Tschanz, JoAnn; Norton, Maria; Munger, Ron; Welsh-Bohmer, Kathleen; Zandi, Peter P
2014-11-01
Although the use of antihypertensive medications has been associated with reduced risk of Alzheimer's disease (AD), it remains unclear which class provides the most benefit. The Cache County Study of Memory Health and Aging is a prospective longitudinal cohort study of dementing illnesses among the elderly population of Cache County, Utah. Using waves I to IV data of the Cache County Study, 3417 participants had a mean of 7.1 years of follow-up. Time-varying use of antihypertensive medications including different class of diuretics, angiotensin converting enzyme inhibitors, β-blockers, and calcium channel blockers was used to predict the incidence of AD using Cox proportional hazards analyses. During follow-up, 325 AD cases were ascertained with a total of 23,590 person-years. Use of any antihypertensive medication was associated with lower incidence of AD (adjusted hazard ratio [aHR], 0.77; 95% confidence interval [CI], 0.61-0.97). Among different classes of antihypertensive medications, thiazide (aHR, 0.7; 95% CI, 0.53-0.93), and potassium-sparing diuretics (aHR, 0.69; 95% CI, 0.48-0.99) were associated with the greatest reduction of AD risk. Thiazide and potassium-sparing diuretics were associated with decreased risk of AD. The inverse association of potassium-sparing diuretics confirms an earlier finding in this cohort, now with longer follow-up, and merits further investigation. Copyright © 2014 Elsevier Inc. All rights reserved.
Use of diuretics is associated with reduced risk of Alzheimer’s disease: the Cache County Study
Chuang, Yi-Fang; Breitner, John C.S.; Chiu, Yen-Ling; Khachaturian, Ara; Hayden, Kathleen; Corcoran, Chris; Tschanz, JoAnn; Norton, Maria; Munger, Ron; Welsh-Bohmer, Kathleen; Zandi, Peter P.
2015-01-01
Although the use of antihypertensive medications has been associated with reduced risk of Alzheimer’s disease (AD), it remains unclear which class provides the most benefit. The Cache County Study of Memory Health and Aging is a prospective longitudinal cohort study of dementing illnesses among the elderly population of Cache County, Utah. Using waves I to IV data of the Cache County Study, 3417 participants had a mean of 7.1 years of follow-up. Time-varying use of antihypertensive medications including different class of diuretics, angiotensin converting enzyme inhibitors, β-blockers, and calcium channel blockers was used to predict the incidence of AD using Cox proportional hazards analyses. During follow-up, 325 AD cases were ascertained with a total of 23,590 person-years. Use of any anti-hypertensive medication was associated with lower incidence of AD (adjusted hazard ratio [aHR], 0.77; 95% confidence interval [CI], 0.61–0.97). Among different classes of antihypertensive medications, thiazide (aHR, 0.7; 95% CI, 0.53–0.93), and potassium-sparing diuretics (aHR, 0.69; 95% CI, 0.48–0.99) were associated with the greatest reduction of AD risk. Thiazide and potassium-sparing diuretics were associated with decreased risk of AD. The inverse association of potassium-sparing diuretics confirms an earlier finding in this cohort, now with longer follow-up, and merits further investigation. PMID:24910391
NASA Astrophysics Data System (ADS)
Sakaguchi, Hidetsugu; Kadowaki, Shuntaro
2017-07-01
We study slowly pulling block-spring models in random media. Second-order phase transitions exist in a model pulled by a constant force in the case of velocity-strengthening friction. If external forces are slowly increased, nearly critical states are self-organized. Slips of various sizes occur, and the probability distributions of slip size roughly obey power laws. The exponent is close to that in the quenched Edwards-Wilkinson model. Furthermore, the slip-size distributions are investigated in cases of Coulomb friction, velocity-weakening friction, and two-dimensional block-spring models.
NASA Technical Reports Server (NTRS)
Zell, Peter
2012-01-01
A document describes a new way to integrate thermal protection materials on external surfaces of vehicles that experience the severe heating environments of atmospheric entry from space. Cured blocks of thermal protection materials are bonded into a compatible, large-cell honeycomb matrix that can be applied on the external surfaces of the vehicles. The honeycomb matrix cell size, and corresponding thermal protection material block size, is envisioned to be between 1 and 4 in. (.2.5 and 10 cm) on a side, with a depth required to protect the vehicle. The cell wall thickness is thin, between 0.01 and 0.10 in. (.0.025 and 0.25 cm). A key feature is that the honeycomb matrix is attached to the vehicle fs unprotected external surface prior to insertion of the thermal protection material blocks. The attachment integrity of the honeycomb can then be confirmed over the full range of temperature and loads that the vehicle will experience. Another key feature of the innovation is the use of uniform-sized thermal protection material blocks. This feature allows for the mass production of these blocks at a size that is convenient for quality control inspection. The honeycomb that receives the blocks must have cells with a compatible set of internal dimensions. The innovation involves the use of a faceted subsurface under the honeycomb. This provides a predictable surface with perpendicular cell walls for the majority of the blocks. Some cells will have positive tapers to accommodate mitered joints between honeycomb panels on each facet of the subsurface. These tapered cells have dimensions that may fall within the boundaries of the uniform-sized blocks.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seal, Sudip K; Perumalla, Kalyan S; Hirshman, Steven Paul
2013-01-01
Simulations that require solutions of block tridiagonal systems of equations rely on fast parallel solvers for runtime efficiency. Leading parallel solvers that are highly effective for general systems of equations, dense or sparse, are limited in scalability when applied to block tridiagonal systems. This paper presents scalability results as well as detailed analyses of two parallel solvers that exploit the special structure of block tridiagonal matrices to deliver superior performance, often by orders of magnitude. A rigorous analysis of their relative parallel runtimes is shown to reveal the existence of a critical block size that separates the parameter space spannedmore » by the number of block rows, the block size and the processor count, into distinct regions that favor one or the other of the two solvers. Dependence of this critical block size on the above parameters as well as on machine-specific constants is established. These formal insights are supported by empirical results on up to 2,048 cores of a Cray XT4 system. To the best of our knowledge, this is the highest reported scalability for parallel block tridiagonal solvers to date.« less
2006-06-14
Robert Graybill . A Raw hoard for the use of this project was provided by the Computer Architecture Croup at the Massachusetts Institute of Technology...simulator is presented by MIT as being an accurate model of the Raw chip, we have found that it does not accurately model the board. Our comparison...G4 processor, model 7410. with a 32 kbyte level-1 cache on-chip and a 2 Mbyte L2 cache connected through a 250 MH/ bus [12]. Each node has 256 Mbyte
Effects of block copolymer properties on nanocarrier protection from in vivo clearance
D’Addio, Suzanne M.; Saad, Walid; Ansell, Steven M.; Squiers, John J.; Adamson, Douglas; Herrera-Alonso, Margarita; Wohl, Adam R.; Hoye, Thomas R.; Macosko, Christopher W.; Mayer, Lawrence D.; Vauthier, Christine; Prud’homme, Robert K.
2012-01-01
Drug nanocarrier clearance by the immune system must be minimized to achieve targeted delivery to pathological tissues. There is considerable interest in finding in vitro tests that can predict in vivo clearance outcomes. In this work, we produce nanocarriers with dense PEG layers resulting from block copolymer-directed assembly during rapid precipitation. Nanocarriers are formed using block copolymers with hydrophobic blocks of polystyrene (PS), poly-ε-caprolactone (PCL), poly-D,L-lactide (PLA), or poly-lactide-co-glycolide (PLGA), and hydrophilic blocks of polyethylene glycol (PEG) with molecular weights from 1.5 kg/mol to 9 kg/mol. Nanocarriers with paclitaxel prodrugs are evaluated in vivo in Foxn1nu mice to determine relative rates of clearance. The amount of nanocarrier in circulation after 4 h varies from 10% to 85% of initial dose, depending on the block copolymer. In vitro complement activation assays are conducted in an effort to correlate the protection of the nanocarrier surface from complement binding and activation and in vivo circulation. Guidelines for optimizing block copolymer structure to maximize circulation of nanocarriers formed by rapid precipitation and directed assembly are proposed, relating to the relative size of the hydrophilic and hydrophobic block, the hydrophobicity of the anchoring block, the absolute size of the PEG block, and polymer crystallinity. The in vitro results distinguish between the poorly circulating PEG5k-PCL9k and the better circulating nanocarriers, but could not rank the better circulating nanocarriers in order of circulation time. Analysis of PEG surface packing on monodisperse 200 nm latex spheres indicates that the sizes of the hydrophobic PCL, PS, and PLA blocks are correlated with the PEG blob size, and possibly the clearance from circulation. Suggestions for next step in vitro measurements are made. PMID:22732478
Controlling the Pore Size of Mesoporous Carbon Thin Films through Thermal and Solvent Annealing.
Zhou, Zhengping; Liu, Guoliang
2017-04-01
Herein an approach to controlling the pore size of mesoporous carbon thin films from metal-free polyacrylonitrile-containing block copolymers is described. A high-molecular-weight poly(acrylonitrile-block-methyl methacrylate) (PAN-b-PMMA) is synthesized via reversible addition-fragmentation chain transfer (RAFT) polymerization. The authors systematically investigate the self-assembly behavior of PAN-b-PMMA thin films during thermal and solvent annealing, as well as the pore size of mesoporous carbon thin films after pyrolysis. The as-spin-coated PAN-b-PMMA is microphase-separated into uniformly spaced globular nanostructures, and these globular nanostructures evolve into various morphologies after thermal or solvent annealing. Surprisingly, through thermal annealing and subsequent pyrolysis of PAN-b-PMMA into mesoporous carbon thin films, the pore size and center-to-center spacing increase significantly with thermal annealing temperature, different from most block copolymers. In addition, the choice of solvent in solvent annealing strongly influences the block copolymer nanostructure and the pore size of mesoporous carbon thin films. The discoveries herein provide a simple strategy to control the pore size of mesoporous carbon thin films by tuning thermal or solvent annealing conditions, instead of synthesizing a series of block copolymers of various molecular weights and compositions. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Domagalski, Joseph L.; Alpers, Charles N.; Slotton, Darrell G.; Suchanek, Thomas H.; Ayers, Shaun M.
2004-01-01
Concentrations and mass loads of total mercury and methylmercury in streams draining abandoned mercury mines and near geothermal discharge in Cache Creek Basin, California, were measured during a 17-month period from January 2000 through May 2001. Rainfall and runoff averages during the study period were lower than long-term averages. Mass loads of mercury and methylmercury from upstream sources to downstream receiving waters, such as San Francisco Bay, were generally the highest during or after winter rainfall events. During the study period, mass loads of mercury and methylmercury from geothermal sources tended to be greater than those from abandoned mining areas because of a lack of large precipitation events capable of mobilizing significant amounts of either mercury-laden sediment or dissolved mercury and methylmercury from mine waste. Streambed sediments of Cache Creek are a source of mercury and methylmercury to downstream receiving bodies of water such as the Delta of the San Joaquin and Sacramento Rivers. Much of the mercury in these sediments was deposited over the last 150 years by erosion and stream discharge from abandoned mines or by continuous discharges from geothermal areas. Several geochemical constituents were useful as natural tracers for mining and geothermal areas. These constituents included aqueous concentrations of boron, chloride, lithium, and sulfate, and the stable isotopes of hydrogen and oxygen in water. Stable isotopes of water in areas draining geothermal discharges were enriched with more oxygen-18 relative to oxygen-16 than meteoric waters, whereas the enrichment by stable isotopes of water from much of the runoff from abandoned mines was similar to that of meteoric water. Geochemical signatures from stable isotopes and trace-element concentrations may be useful as tracers of total mercury or methylmercury from specific locations; however, mercury and methylmercury are not conservatively transported. A distinct mixing trend of trace elements and stable isotopes of hydrogen and oxygen from geothermal waters was apparent in Sulphur Creek and lower Bear Creek (tributaries to Cache Creek), but the signals are lost upon mixing with Cache Creek because of dilution.
Power generation technology options for a Mars mission
NASA Technical Reports Server (NTRS)
Bozek, John M.; Cataldo, Robert L.
1994-01-01
The power requirements and resultant power system performances of an aggressive Mars mission are characterized. The power system technologies discussed will support both cargo and piloted space transport vehicles as well as a six-person crew on the Martian surface for 600 days. The mission uses materials transported by cargo vehicles and materials produced using in-situ planetary feed stock to establish a life-support cache and infrastructure for the follow-on piloted lander. Numerous power system technical options are sized to meet the mission power requirements using conventional and solar, nuclear, and wireless power transmission technologies for stationary, mobile surface, and space applications. Technology selections will depend on key criteria such as mass, volume, area, maturity, and application flexibility.
Experimental scheme and restoration algorithm of block compression sensing
NASA Astrophysics Data System (ADS)
Zhang, Linxia; Zhou, Qun; Ke, Jun
2018-01-01
Compressed Sensing (CS) can use the sparseness of a target to obtain its image with much less data than that defined by the Nyquist sampling theorem. In this paper, we study the hardware implementation of a block compression sensing system and its reconstruction algorithms. Different block sizes are used. Two algorithms, the orthogonal matching algorithm (OMP) and the full variation minimum algorithm (TV) are used to obtain good reconstructions. The influence of block size on reconstruction is also discussed.
What makes specialized food-caching mountain chickadees successful city slickers?
Kozlovsky, Dovid Y; Weissgerber, Emily A; Pravosudov, Vladimir V
2017-05-31
Anthropogenic environments are a dominant feature of the modern world; therefore, understanding which traits allow animals to succeed in these urban environments is especially important. Overall, generalist species are thought to be most successful in urban environments, with better general cognition and less neophobia as suggested critical traits. It is less clear, however, which traits would be favoured in urban environments in highly specialized species. Here, we compared highly specialized food-caching mountain chickadees living in an urban environment (Reno, NV, USA) with those living in their natural environment to investigate what makes this species successful in the city. Using a 'common garden' paradigm, we found that urban mountain chickadees tended to explore a novel environment faster and moved more frequently, were better at novel problem-solving, had better long-term spatial memory retention and had a larger telencephalon volume compared with forest chickadees. There were no significant differences between urban and forest chickadees in neophobia, food-caching rates, spatial memory acquisition, hippocampus volume, or the total number of hippocampal neurons. Our results partially support the idea that some traits associated with behavioural flexibility and innovation are associated with successful establishment in urban environments, but differences in long-term spatial memory retention suggest that even this trait specialized for food-caching may be advantageous. Our results highlight the importance of environmental context, species biology, and temporal aspects of invasion in understanding how urban environments are associated with behavioural and cognitive phenotypes and suggest that there is likely no one suite of traits that makes urban animals successful. © 2017 The Author(s).
PEM public key certificate cache server
NASA Astrophysics Data System (ADS)
Cheung, T.
1993-12-01
Privacy Enhanced Mail (PEM) provides privacy enhancement services to users of Internet electronic mail. Confidentiality, authentication, message integrity, and non-repudiation of origin are provided by applying cryptographic measures to messages transferred between end systems by the Message Transfer System. PEM supports both symmetric and asymmetric key distribution. However, the prevalent implementation uses a public key certificate-based strategy, modeled after the X.509 directory authentication framework. This scheme provides an infrastructure compatible with X.509. According to RFC 1422, public key certificates can be stored in directory servers, transmitted via non-secure message exchanges, or distributed via other means. Directory services provide a specialized distributed database for OSI applications. The directory contains information about objects and then provides structured mechanisms for accessing that information. Since directory services are not widely available now, a good approach is to manage certificates in a centralized certificate server. This document describes the detailed design of a centralized certificate cache serve. This server manages a cache of certificates and a cache of Certificate Revocation Lists (CRL's) for PEM applications. PEMapplications contact the server to obtain/store certificates and CRL's. The server software is programmed in C and ELROS. To use this server, ISODE has to be configured and installed properly. The ISODE library 'libisode.a' has to be linked together with this library because ELROS uses the transport layer functions provided by 'libisode.a.' The X.500 DAP library that is included with the ELROS distribution has to be linked in also, since the server uses the DAP library functions to communicate with directory servers.
Initial Performance Results on IBM POWER6
NASA Technical Reports Server (NTRS)
Saini, Subbash; Talcott, Dale; Jespersen, Dennis; Djomehri, Jahed; Jin, Haoqiang; Mehrotra, Piysuh
2008-01-01
The POWER5+ processor has a faster memory bus than that of the previous generation POWER5 processor (533 MHz vs. 400 MHz), but the measured per-core memory bandwidth of the latter is better than that of the former (5.7 GB/s vs. 4.3 GB/s). The reason for this is that in the POWER5+, the two cores on the chip share the L2 cache, L3 cache and memory bus. The memory controller is also on the chip and is shared by the two cores. This serializes the path to memory. For consistently good performance on a wide range of applications, the performance of the processor, the memory subsystem, and the interconnects (both latency and bandwidth) should be balanced. Recognizing this, IBM has designed the Power6 processor so as to avoid the bottlenecks due to the L2 cache, memory controller and buffer chips of the POWER5+. Unlike the POWER5+, each core in the POWER6 has its own L2 cache (4 MB - double that of the Power5+), memory controller and buffer chips. Each core in the POWER6 runs at 4.7 GHz instead of 1.9 GHz in POWER5+. In this paper, we evaluate the performance of a dual-core Power6 based IBM p6-570 system, and we compare its performance with that of a dual-core Power5+ based IBM p575+ system. In this evaluation, we have used the High- Performance Computing Challenge (HPCC) benchmarks, NAS Parallel Benchmarks (NPB), and four real-world applications--three from computational fluid dynamics and one from climate modeling.
Hothem, Roger L.; Trejo, Bonnie S.; Bauer, Marissa L.; Crayon, John J.
2008-01-01
To evaluate mercury (Hg) and other element exposure in cliff swallows (Petrochelidon pyrrhonota), eggs were collected from 16 sites within the mining-impacted Cache Creek watershed, Colusa, Lake, and Yolo counties, California, USA, in 1997-1998. Nestlings were collected from seven sites in 1998. Geometric mean total Hg (THg) concentrations ranged from 0.013 to 0.208 ??g/g wet weight (ww) in cliff swallow eggs and from 0.047 to 0.347 ??g/g ww in nestlings. Mercury detected in eggs generally followed the spatial distribution of Hg in the watershed based on proximity to both anthropogenic and natural sources. Mean Hg concentrations in samples of eggs and nestlings collected from sites near Hg sources were up to five and seven times higher, respectively, than in samples from reference sites within the watershed. Concentrations of other detected elements, including aluminum, beryllium, boron, calcium, manganese, strontium, and vanadium, were more frequently elevated at sites near Hg sources. Overall, Hg concentrations in eggs from Cache Creek were lower than those reported in eggs of tree swallows (Tachycineta bicolor) from highly contaminated locations in North America. Total Hg concentrations were lower in all Cache Creek egg samples than adverse effects levels established for other species. Total Hg concentrations in bullfrogs (Rana catesbeiana) and foothill yellow-legged frogs (Rana boylii) collected from 10 of the study sites were both positively correlated with THg concentrations in cliff swallow eggs. Our data suggest that cliff swallows are reliable bioindicators of environmental Hg. ?? Springer Science+Business Media, LLC 2007.
Engineering aqueous fiber assembly into silk-elastin-like protein polymers.
Zeng, Like; Jiang, Linan; Teng, Weibing; Cappello, Joseph; Zohar, Yitshak; Wu, Xiaoyi
2014-07-01
Self-assembled peptide/protein nanofibers are valuable 1D building blocks for creating complex structures with designed properties and functions. It is reported that the self-assembly of silk-elastin-like protein polymers into nanofibers or globular aggregates in aqueous solutions can be modulated by tuning the temperature of the protein solutions, the size of the silk blocks, and the charge of the elastin blocks. A core-sheath model is proposed for nanofiber formation, with the silk blocks in the cores and the hydrated elastin blocks in the sheaths. The folding of the silk blocks into stable cores--affected by the size of the silk blocks and the charge of the elastin blocks--plays a critical role in the assembly of silk-elastin nanofibers. Furthermore, enhanced hydrophobic interactions between the elastin blocks at elevated temperatures greatly influence the nanoscale features of silk-elastin nanofibers. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Magnacca, Giuliana; Jadhav, Sushilkumar A; Scalarone, Dominique
2016-01-01
Summary Large-mesopore silica films with a narrow pore size distribution and high porosity have been obtained by a sol–gel reaction of a silicon oxide precursor (TEOS) and using polystyrene-block-poly(ethylene oxide) (PS-b-PEO) copolymers as templates in an acidic environment. PS-b-PEO copolymers with different molecular weight and composition have been studied in order to assess the effects of the block length on the pore size of the templated silica films. The changes in the morphology of the porous systems have been investigated by transmission electron microscopy and a systematic analysis has been carried out, evidencing the dependence between the hydrophilic/hydrophobic ratio of the two polymer blocks and the size of the final silica pores. The obtained results prove that by tuning the PS/PEO ratio, the pore size of the templated silica films can be easily and finely predicted. PMID:27826520
Planetary Sample Caching System Design Options
NASA Technical Reports Server (NTRS)
Collins, Curtis; Younse, Paulo; Backes, Paul
2009-01-01
Potential Mars Sample Return missions would aspire to collect small core and regolith samples using a rover with a sample acquisition tool and sample caching system. Samples would need to be stored in individual sealed tubes in a canister that could be transfered to a Mars ascent vehicle and returned to Earth. A sample handling, encapsulation and containerization system (SHEC) has been developed as part of an integrated system for acquiring and storing core samples for application to future potential MSR and other potential sample return missions. Requirements and design options for the SHEC system were studied and a recommended design concept developed. Two families of solutions were explored: 1)transfer of a raw sample from the tool to the SHEC subsystem and 2)transfer of a tube containing the sample to the SHEC subsystem. The recommended design utilizes sample tool bit change out as the mechanism for transferring tubes to and samples in tubes from the tool. The SHEC subsystem design, called the Bit Changeout Caching(BiCC) design, is intended for operations on a MER class rover.
Architectural Techniques For Managing Non-volatile Caches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh
As chip power dissipation becomes a critical challenge in scaling processor performance, computer architects are forced to fundamentally rethink the design of modern processors and hence, the chip-design industry is now at a major inflection point in its hardware roadmap. The high leakage power and low density of SRAM poses serious obstacles in its use for designing large on-chip caches and for this reason, researchers are exploring non-volatile memory (NVM) devices, such as spin torque transfer RAM, phase change RAM and resistive RAM. However, since NVMs are not strictly superior to SRAM, effective architectural techniques are required for making themmore » a universal memory solution. This book discusses techniques for designing processor caches using NVM devices. It presents algorithms and architectures for improving their energy efficiency, performance and lifetime. It also provides both qualitative and quantitative evaluation to help the reader gain insights and motivate them to explore further. This book will be highly useful for beginners as well as veterans in computer architecture, chip designers, product managers and technical marketing professionals.« less
[PACS: storage and retrieval of digital radiological image data].
Wirth, S; Treitl, M; Villain, S; Lucke, A; Nissen-Meyer, S; Mittermaier, I; Pfeifer, K-J; Reiser, M
2005-08-01
Efficient handling of both picture archiving and retrieval is a crucial factor when new PACS installations as well as technical upgrades are planned. For a large PACS installation for 200 actual studies, the number, modality,and body region of available priors were evaluated. In addition, image access time of 100 CT studies from hard disk (RAID), magneto-optic disk (MOD), and tape archives (TAPE) were accessed. For current examinations priors existed in 61.1% with an averaged quantity of 7.7 studies. Thereof 56.3% were within 0-3 months, 84.9% within 12 months, 91.7% within 24 months, and 96.2% within 36 months. On average, access to images from the hard disk cache was more than 100 times faster then from MOD or TAPE. Since only PACS RAID provides online image access, at least current imaging of the past 12 months should be available from cache. An accurate prefetching mechanism facilitates effective use of the expensive online cache area. For that, however, close interaction of PACS, RIS, and KIS is an indispensable prerequisite.
Experience with HEP analysis on mounted filesystems.
NASA Astrophysics Data System (ADS)
Fuhrmann, Patrick; Gasthuber, Martin; Kemp, Yves; Ozerov, Dmitry
2012-12-01
We present results on different approaches on mounted filesystems in use or under investigation at DESY. dCache, established since long as a storage system for physics data has implemented the NFS v4.1/pNFS protocol. New performance results will be shown with the most current version of the dCache server. In addition to the native usage of the mounted filesystem in a LAN environment, the results are given for the performance of the dCache NFS v4.1/pNFS in WAN case. Several commercial vendors are currently in alpha or beta phase of adding the NFS v4.1/pNFS protocol to their storage appliances. We will test some of these vendor solutions for their readiness for HEP analysis. DESY has recently purchased an IBM Sonas system. We will present the result of a thorough performance evaluation using the native protocols NFS (v3 or v4) and GPFS. As the emphasis is on the usability for end user analysis, we will use latest ROOT versions and current end user analysis code for benchmark scenarios.
Multiple channel data acquisition system
Crawley, H. Bert; Rosenberg, Eli I.; Meyer, W. Thomas; Gorbics, Mark S.; Thomas, William D.; McKay, Roy L.; Homer, Jr., John F.
1990-05-22
A multiple channel data acquisition system for the transfer of large amounts of data from a multiplicity of data channels has a plurality of modules which operate in parallel to convert analog signals to digital data and transfer that data to a communications host via a FASTBUS. Each module has a plurality of submodules which include a front end buffer (FEB) connected to input circuitry having an analog to digital converter with cache memory for each of a plurality of channels. The submodules are interfaced with the FASTBUS via a FASTBUS coupler which controls a module bus and a module memory. The system is triggered to effect rapid parallel data samplings which are stored to the cache memories. The cache memories are uploaded to the FEBs during which zero suppression occurs. The data in the FEBs is reformatted and compressed by a local processor during transfer to the module memory. The FASTBUS coupler is used by the communications host to upload the compressed and formatted data from the module memory. The local processor executes programs which are downloaded to the module memory through the FASTBUS coupler.
Multiple channel data acquisition system
Crawley, H.B.; Rosenberg, E.I.; Meyer, W.T.; Gorbics, M.S.; Thomas, W.D.; McKay, R.L.; Homer, J.F. Jr.
1990-05-22
A multiple channel data acquisition system for the transfer of large amounts of data from a multiplicity of data channels has a plurality of modules which operate in parallel to convert analog signals to digital data and transfer that data to a communications host via a FASTBUS. Each module has a plurality of submodules which include a front end buffer (FEB) connected to input circuitry having an analog to digital converter with cache memory for each of a plurality of channels. The submodules are interfaced with the FASTBUS via a FASTBUS coupler which controls a module bus and a module memory. The system is triggered to effect rapid parallel data samplings which are stored to the cache memories. The cache memories are uploaded to the FEBs during which zero suppression occurs. The data in the FEBs is reformatted and compressed by a local processor during transfer to the module memory. The FASTBUS coupler is used by the communications host to upload the compressed and formatted data from the module memory. The local processor executes programs which are downloaded to the module memory through the FASTBUS coupler. 25 figs.
Small domain-size multiblock copolymer electrolytes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pistorino, Jonathan; Eitouni, Hany Basam
2016-09-20
New block polymer electrolytes have been developed which have higher conductivities than previously reported for other block copolymer electrolytes. The new materials are constructed of multiple blocks (>5) of relatively low domain size. The small domain size provides greater protection against formation of dendrites during cycling against lithium in an electrochemical cell, while the large total molecular weight insures poor long range alignment, which leads to higher conductivity. In addition to higher conductivity, these materials can be more easily synthesized because of reduced requirements on the purity level of the reagents.
NASA Astrophysics Data System (ADS)
Zhao, Tao; Crosta, Giovanni Battista; Dattola, Giuseppe; Utili, Stefano
2018-04-01
The dynamic fragmentation of jointed rock blocks during rockslide avalanches has been investigated by discrete element method simulations for a multiple arrangement of a rock block sliding over a simple slope geometry. The rock blocks are released along an inclined sliding plane and subsequently collide onto a flat horizontal plane at a sharp kink point. The contact force chains generated by the impact appear initially at the bottom frontal corner of the rock block and then propagate radially upward to the top rear part of the block. The jointed rock blocks exhibit evident contact force concentration and discontinuity of force wave propagation near the joint, associating with high energy dissipation of granular dynamics. The corresponding force wave propagation velocity can be less than 200 m/s, which is much smaller than that of an intact rock (1,316 m/s). The concentration of contact forces at the bottom leads to high rock fragmentation intensity and momentum boosts, facilitating the spreading of many fine fragments to the distal ends. However, the upper rock block exhibits very low rock fragmentation intensity but high energy dissipation due to intensive friction and damping, resulting in the deposition of large fragments near the slope toe. The size and shape of large fragments are closely related to the orientation and distribution of the block joints. The cumulative fragment size distribution can be well fitted by the Weibull's distribution function, with very gentle and steep curvatures at the fine and coarse size ranges, respectively. The numerical results of fragment size distribution can match well some experimental and field observations.
Bayesian block-diagonal variable selection and model averaging
Papaspiliopoulos, O.; Rossell, D.
2018-01-01
Summary We propose a scalable algorithmic framework for exact Bayesian variable selection and model averaging in linear models under the assumption that the Gram matrix is block-diagonal, and as a heuristic for exploring the model space for general designs. In block-diagonal designs our approach returns the most probable model of any given size without resorting to numerical integration. The algorithm also provides a novel and efficient solution to the frequentist best subset selection problem for block-diagonal designs. Posterior probabilities for any number of models are obtained by evaluating a single one-dimensional integral, and other quantities of interest such as variable inclusion probabilities and model-averaged regression estimates are obtained by an adaptive, deterministic one-dimensional numerical integration. The overall computational cost scales linearly with the number of blocks, which can be processed in parallel, and exponentially with the block size, rendering it most adequate in situations where predictors are organized in many moderately-sized blocks. For general designs, we approximate the Gram matrix by a block-diagonal matrix using spectral clustering and propose an iterative algorithm that capitalizes on the block-diagonal algorithms to explore efficiently the model space. All methods proposed in this paper are implemented in the R library mombf. PMID:29861501
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dykstra, D.; Blomer, J.
Both the CernVM File System (CVMFS) and the Frontier Distributed Database Caching System (Frontier) distribute centrally updated data worldwide for LHC experiments using http proxy caches. Neither system provides privacy or access control on reading the data, but both control access to updates of the data and can guarantee the authenticity and integrity of the data transferred to clients over the internet. CVMFS has since its early days required digital signatures and secure hashes on all distributed data, and recently Frontier has added X.509-based authenticity and integrity checking. In this paper we detail and compare the security models of CVMFSmore » and Frontier.« less
2016-11-09
Total Number: Sub Contractors (DD882) Names of Personnel receiving masters degrees Names of personnel receiving PHDs Names of other research staff...Broadcom 5720 QP 1Gb Network Daughter Card (2) Intel Xeon E5-2680 v3 2.5GHz, 30M Cache, 9.60GT/s QPI, Turbo, HT , 12C/24T (120W...Broadcom 5720 QP 1Gb Network Daughter Card (2) Intel Xeon E5-2680 v3 2.5GHz, 30M Cache, 9.60GT/s QPI, Turbo, HT , 12C/24T (120W
Rytuba, James J.; Hothem, Roger L.; Brussee, Brianne E.; Goldstein, Daniel; May, Jason T.
2015-01-01
The Cache Creek watershed lies within California's North Coast Range, an area with abundant geologic sources of mercury (Hg) and a long history of Hg contamination (Rytuba, 2000). Bear Creek, Cache Creek, and the North Fork of Cache Creek are the major streams of the Cache Creek watershed, encompassing 2978 km2. The Cache Creek watershed contains soils naturally enriched in Hg as well as natural springs (both hot and cold) with varying levels of aqueous Hg (Domagalski and others, 2004, Suchanek and others, 2004, Holloway and others 2009). All three tributaries are known to be significant sources of anthropogenically derived Hg from historic mines, both Hg and gold (Au), and associated ore storage/processing sites and facilities (Slotton and others, 1995, 2004; CVRWQCB, 2003; Schwarzbach and others, 2001; Gassel and others, 2005; Suchanek and others., 2004, 2008a, 2009). Historically, two of the primary sources of mercury contamination in the upper part of Bear Creek have been the Rathburn and Petray Hg Mines. The Rathburn Hg mine was discovered and initially mined in the early 1890s. The Rathburn and the more recently developed Petray open pit mines are localized along fault zones in serpentinite that has been altered and cut by quartz and chalcedony veins. Cold saline-carbonate springs are located perepheral to the Hg deposits and effluent from the springs locally has high concentrations of Hg (Slowey and Rytuba, 2008). Several ephemeral tributaries to Bear Creek drain the mine area which is located on federal land managed by the U.S. Bureau of Land Management (USBLM). The USBLM requested that the U.S. Geological Survey (USGS) measure and characterize Hg and other geochemical constituents in sediment, water, and biota to establish baseline information prior to remediation of the Rathburn and Petray mines. Samples sites were established in Bear Creek upstream and downstream from the mine area. This report is made in response to the USBLM request, the lead agency mandated to conduct a Comprehensive Environmental Response, Compensation, and Liability Act (CERCLA) - Removal Site Investigation (RSI). The RSI applies to the possible removal of Hg-contaminated mine waste from Bear Creek. This report summarizes data obtained from field sampling of water, sediment, and biota in Bear Creek, above input from the mine area and downstream from the Rathburn-Petray mine area to the confluence with Cache Creek. Our results permit a preliminary assessment of the chemical constituents that could elevate levels of monomethyl Hg (MMeHg) in Bear Creek and its uptake by biota and provide baseline information for comparison to conditions after mine remediation is completed.
Codestream-Based Identification of JPEG 2000 Images with Different Coding Parameters
NASA Astrophysics Data System (ADS)
Watanabe, Osamu; Fukuhara, Takahiro; Kiya, Hitoshi
A method of identifying JPEG 2000 images with different coding parameters, such as code-block sizes, quantization-step sizes, and resolution levels, is presented. It does not produce false-negative matches regardless of different coding parameters (compression rate, code-block size, and discrete wavelet transform (DWT) resolutions levels) or quantization step sizes. This feature is not provided by conventional methods. Moreover, the proposed approach is fast because it uses the number of zero-bit-planes that can be extracted from the JPEG 2000 codestream by only parsing the header information without embedded block coding with optimized truncation (EBCOT) decoding. The experimental results revealed the effectiveness of image identification based on the new method.
Presenilin E318G variant and Alzheimer's disease risk: the Cache County study.
Hippen, Ariel A; Ebbert, Mark T W; Norton, Maria C; Tschanz, JoAnn T; Munger, Ronald G; Corcoran, Christopher D; Kauwe, John S K
2016-06-29
Alzheimer's disease is the leading cause of dementia in the elderly and the third most common cause of death in the United States. A vast number of genes regulate Alzheimer's disease, including Presenilin 1 (PSEN1). Multiple studies have attempted to locate novel variants in the PSEN1 gene that affect Alzheimer's disease status. A recent study suggested that one of these variants, PSEN1 E318G (rs17125721), significantly affects Alzheimer's disease status in a large case-control dataset, particularly in connection with the APOEε4 allele. Our study looks at the same variant in the Cache County Study on Memory and Aging, a large population-based dataset. We tested for association between E318G genotype and Alzheimer's disease status by running a series of Fisher's exact tests. We also performed logistic regression to test for an additive effect of E318G genotype on Alzheimer's disease status and for the existence of an interaction between E318G and APOEε4. In our Fisher's exact test, it appeared that APOEε4 carriers with an E318G allele have slightly higher risk for AD than those without the allele (3.3 vs. 3.8); however, the 95 % confidence intervals of those estimates overlapped completely, indicating non-significance. Our logistic regression model found a positive but non-significant main effect for E318G (p = 0.895). The interaction term between E318G and APOEε4 was also non-significant (p = 0.689). Our findings do not provide significant support for E318G as a risk factor for AD in APOEε4 carriers. Our calculations indicated that the overall sample used in the logistic regression models was adequately powered to detect the sort of effect sizes observed previously. However, the power analyses of our Fisher's exact tests indicate that our partitioned data was underpowered, particularly in regards to the low number of E318G carriers, both AD cases and controls, in the Cache county dataset. Thus, the differences in types of datasets used may help to explain the difference in effect magnitudes seen. Analyses in additional case-control datasets will be required to understand fully the effect of E318G on Alzheimer's disease status.
MEPAG Recommendations for a 2018 Mars Sample Return Caching Lander - Sample Types, Number, and Sizes
NASA Technical Reports Server (NTRS)
Allen, Carlton C.
2011-01-01
The return to Earth of geological and atmospheric samples from the surface of Mars is among the highest priority objectives of planetary science. The MEPAG Mars Sample Return (MSR) End-to-End International Science Analysis Group (MEPAG E2E-iSAG) was chartered to propose scientific objectives and priorities for returned sample science, and to map out the implications of these priorities, including for the proposed joint ESA-NASA 2018 mission that would be tasked with the crucial job of collecting and caching the samples. The E2E-iSAG identified four overarching scientific aims that relate to understanding: (A) the potential for life and its pre-biotic context, (B) the geologic processes that have affected the martian surface, (C) planetary evolution of Mars and its atmosphere, (D) potential for future human exploration. The types of samples deemed most likely to achieve the science objectives are, in priority order: (1A). Subaqueous or hydrothermal sediments (1B). Hydrothermally altered rocks or low temperature fluid-altered rocks (equal priority) (2). Unaltered igneous rocks (3). Regolith, including airfall dust (4). Present-day atmosphere and samples of sedimentary-igneous rocks containing ancient trapped atmosphere Collection of geologically well-characterized sample suites would add considerable value to interpretations of all collected rocks. To achieve this, the total number of rock samples should be about 30-40. In order to evaluate the size of individual samples required to meet the science objectives, the E2E-iSAG reviewed the analytical methods that would likely be applied to the returned samples by preliminary examination teams, for planetary protection (i.e., life detection, biohazard assessment) and, after distribution, by individual investigators. It was concluded that sample size should be sufficient to perform all high-priority analyses in triplicate. In keeping with long-established curatorial practice of extraterrestrial material, at least 40% by mass of each sample should be preserved to support future scientific investigations. Samples of 15-16 grams are considered optimal. The total mass of returned rocks, soils, blanks and standards should be approximately 500 grams. Atmospheric gas samples should be the equivalent of 50 cubic cm at 20 times Mars ambient atmospheric pressure.
Loayza, Andrea P.; Squeo, Francisco A.
2016-01-01
Scatter-hoarding rodents can act as both predators and dispersers for many large-seeded plants because they cache seeds for future use, but occasionally forget them in sites with high survival and establishment probabilities. The most important fruit or seed trait influencing rodent foraging behavior is seed size; rodents prefer large seeds because they have higher nutritional content, but this preference can be counterbalanced by the higher costs of handling larger seeds. We designed a cafeteria experiment to assess whether fruit and seed size of Myrcianthes coquimbensis, an endangered desert shrub, influence the decision-making process during foraging by three species of scatter-hoarding rodents differing in body size: Abrothrix olivaceus, Phyllotis darwini and Octodon degus. We found that the size of fruits and seeds influenced foraging behavior in the three rodent species; the probability of a fruit being harvested and hoarded was higher for larger fruits than for smaller ones. Patterns of fruit size preference were not affected by rodent size; all species were able to hoard fruits within the entire range of sizes offered. Finally, fruit and seed size had no effect on the probability of seed predation, rodents typically ate only the fleshy pulp of the fruits offered and discarded whole, intact seeds. In conclusion, our results reveal that larger M. coquimbensis fruits have higher probabilities of being harvested, and ultimately of its seeds being hoarded and dispersed by scatter-hoarding rodents. As this plant has no other dispersers, rodents play an important role in its recruitment dynamics. PMID:27861550
IDEAS and App Development Internship in Hardware and Software Design
NASA Technical Reports Server (NTRS)
Alrayes, Rabab D.
2016-01-01
In this report, I will discuss the tasks and projects I have completed while working as an electrical engineering intern during the spring semester of 2016 at NASA Kennedy Space Center. In the field of software development, I completed tasks for the G-O Caching Mobile App and the Asbestos Management Information System (AMIS) Web App. The G-O Caching Mobile App was written in HTML, CSS, and JavaScript on the Cordova framework, while the AMIS Web App is written in HTML, CSS, JavaScript, and C# on the AngularJS framework. My goals and objectives on these two projects were to produce an app with an eye-catching and intuitive User Interface (UI), which will attract more employees to participate; to produce a fully-tested, fully functional app which supports workforce engagement and exploration; to produce a fully-tested, fully functional web app that assists technicians working in asbestos management. I also worked in hardware development on the Integrated Display and Environmental Awareness System (IDEAS) wearable technology project. My tasks on this project were focused in PCB design and camera integration. My goals and objectives for this project were to successfully integrate fully functioning custom hardware extenders on the wearable technology headset to minimize the size of hardware on the smart glasses headset for maximum user comfort; to successfully integrate fully functioning camera onto the headset. By the end of this semester, I was able to successfully develop four extender boards to minimize hardware on the headset, and assisted in integrating a fully-functioning camera into the system.
Efficient Graph Based Assembly of Short-Read Sequences on Hybrid Core Architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sczyrba, Alex; Pratap, Abhishek; Canon, Shane
2011-03-22
Advanced architectures can deliver dramatically increased throughput for genomics and proteomics applications, reducing time-to-completion in some cases from days to minutes. One such architecture, hybrid-core computing, marries a traditional x86 environment with a reconfigurable coprocessor, based on field programmable gate array (FPGA) technology. In addition to higher throughput, increased performance can fundamentally improve research quality by allowing more accurate, previously impractical approaches. We will discuss the approach used by Convey?s de Bruijn graph constructor for short-read, de-novo assembly. Bioinformatics applications that have random access patterns to large memory spaces, such as graph-based algorithms, experience memory performance limitations on cache-based x86more » servers. Convey?s highly parallel memory subsystem allows application-specific logic to simultaneously access 8192 individual words in memory, significantly increasing effective memory bandwidth over cache-based memory systems. Many algorithms, such as Velvet and other de Bruijn graph based, short-read, de-novo assemblers, can greatly benefit from this type of memory architecture. Furthermore, small data type operations (four nucleotides can be represented in two bits) make more efficient use of logic gates than the data types dictated by conventional programming models.JGI is comparing the performance of Convey?s graph constructor and Velvet on both synthetic and real data. We will present preliminary results on memory usage and run time metrics for various data sets with different sizes, from small microbial and fungal genomes to very large cow rumen metagenome. For genomes with references we will also present assembly quality comparisons between the two assemblers.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yan, Rui; Praggastis, Brenda L.; Smith, William P.
While streaming data have become increasingly more popular in business and research communities, semantic models and processing software for streaming data have not kept pace. Traditional semantic solutions have not addressed transient data streams. Semantic web languages (e.g., RDF, OWL) have typically addressed static data settings and linked data approaches have predominantly addressed static or growing data repositories. Streaming data settings have some fundamental differences; in particular, data are consumed on the fly and data may expire. Stream reasoning, a combination of stream processing and semantic reasoning, has emerged with the vision of providing "smart" processing of streaming data. C-SPARQLmore » is a prominent stream reasoning system that handles semantic (RDF) data streams. Many stream reasoning systems including C-SPARQL use a sliding window and use data arrival time to evict data. For data streams that include expiration times, a simple arrival time scheme is inadequate if the window size does not match the expiration period. In this paper, we propose a cache-enabled, order-aware, ontology-based stream reasoning framework. This framework consumes RDF streams with expiration timestamps assigned by the streaming source. Our framework utilizes both arrival and expiration timestamps in its cache eviction policies. In addition, we introduce the notion of "semantic importance" which aims to address the relevance of data to the expected reasoning, thus enabling the eviction algorithms to be more context- and reasoning-aware when choosing what data to maintain for question answering. We evaluate this framework by implementing three different prototypes and utilizing five metrics. The trade-offs of deploying the proposed framework are also discussed.« less
Holloway, J.M.; Goldhaber, M.B.; Morrison, J.M.
2009-01-01
Historic Hg mining in the Cache Creek watershed in the Central California Coast Range has contributed to the downstream transport of Hg to the San Francisco Bay-Delta. Different aspects of Hg mobilization in soils, including pedogenesis, fluvial redistribution of sediment, volatilization and eolian transport were considered. The greatest soil concentrations (>30 mg Hg kg-1) in Cache Creek are associated with mineralized serpentinite, the host rock for Hg deposits. Upland soils with non-mineralized serpentine and sedimentary parent material also had elevated concentrations (0.9-3.7 mg Hg kg-1) relative to the average concentration in the region and throughout the conterminous United States (0.06 mg kg-1). Erosion of soil and destabilized rock and mobilization of tailings and calcines into surrounding streams have contributed to Hg-rich alluvial soil forming in wetlands and floodplains. The concentration of Hg in floodplain sediment shows sediment dispersion from low-order catchments (5.6-9.6 mg Hg kg-1 in Sulphur Creek; 0.5-61 mg Hg kg-1 in Davis Creek) to Cache Creek (0.1-0.4 mg Hg kg-1). These sediments, deposited onto the floodplain during high-flow storm events, yield elevated Hg concentrations (0.2-55 mg Hg kg-1) in alluvial soils in upland watersheds. Alluvial soils within the Cache Creek watershed accumulate Hg from upstream mining areas, with concentrations between 0.06 and 0.22 mg Hg kg-1 measured in soils ~90 km downstream from Hg mining areas. Alluvial soils have accumulated Hg released through historic mining activities, remobilizing this Hg to streams as the soils erode.
Kuprewicz, Erin K.
2015-01-01
Scatter hoarding of seeds by animals contributes significantly to forest-level processes, including plant recruitment and forest community composition. However, the potential positive and negative effects of caching on seed survival, germination success, and seedling survival have rarely been assessed through experimental studies. Here, I tested the hypothesis that seed burial mimicking caches made by scatter hoarding Central American agoutis (Dasyprocta punctate) enhances seed survival, germination, and growth by protecting seeds from seed predators and providing favorable microhabitats for germination. In a series of experiments, I used simulated agouti seed caches to assess how hoarding affects seed predation by ground-dwelling invertebrates and vertebrates for four plant species. I tracked germination and seedling growth of intact and beetle-infested seeds and, using exclosures, monitored the effects of mammals on seedling survival through time. All experiments were conducted over three years in a lowland wet forest in Costa Rica. The majority of hoarded palm seeds escaped predation by both invertebrates and vertebrates while exposed seeds suffered high levels of infestation and removal. Hoarding had no effect on infestation rates of D. panamensis, but burial negatively affected germination success by preventing endocarp dehiscence. Non-infested palm seeds had higher germination success and produced larger seedlings than infested seeds. Seedlings of A. alatum and I. deltoidea suffered high mortality by seed-eating mammals. Hoarding protected most seeds from predators and enhanced germination success (except for D. panamensis) and seedling growth, although mammals killed many seedlings of two plant species; all seedling deaths were due to seed removal from the plant base. Using experimental caches, this study shows that scatter hoarding is beneficial to most seeds and may positively affect plant propagation in tropical forests, although tradeoffs in seed survival do exist. PMID:25970832
Buck, Harleah G; Harkness, Karen; Ali, Muhammad Usman; Carroll, Sandra L; Kryworuchko, Jennifer; McGillion, Michael
2017-04-01
Caregivers (CGs) contribute important assistance with heart failure (HF) self-care, including daily maintenance, symptom monitoring, and management. Until CGs' contributions to self-care can be quantified, it is impossible to characterize it, account for its impact on patient outcomes, or perform meaningful cost analyses. The purpose of this study was to conduct psychometric testing and item reduction on the recently developed 34-item Caregiver Contribution to Heart Failure Self-care (CACHS) instrument using classical and item response theory methods. Fifty CGs (mean age 63 years ±12.84; 70% female) recruited from a HF clinic completed the CACHS in 2014 and results evaluated using classical test theory and item response theory. Items would be deleted for low (<.05) or high (>.95) endorsement, low (<.3) or high (>.7) corrected item-total correlations, significant pairwise correlation coefficients, floor or ceiling effects, relatively low latent trait and item information function levels (<1.5 and p > .5), and differential item functioning. After analysis, 14 items were excluded, resulting in a 20-item instrument (self-care maintenance eight items; monitoring seven items; and management five items). Most items demonstrated moderate to high discrimination (median 2.13, minimum .77, maximum 5.05), and appropriate item difficulty (-2.7 to 1.4). Internal consistency reliability was excellent (Cronbach α = .94, average inter-item correlation = .41) with no ceiling effects. The newly developed 20-item version of the CACHS is supported by rigorous instrument development and represents a novel instrument to measure CGs' contribution to HF self-care. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Malek, Esmaiel; Davis, Tess; Martin, Randal S.; Silva, Philip J.
2006-02-01
Logan, Utah, USA, had the nation's worst air pollution on 15 January, 2004. The high concentration of PM 2.5 (particulates smaller than 2.5 μm in diameter) in the air resulted from geographical, meteorological, and environmental aspects of Cache Valley. A strong inversion (increase of temperature with height) and light precipitation and/or wind were the major causes for trapping pollutants in the air. Other meteorological factors enhancing the inversion were: the prolonged high atmospheric surface pressure, a snow-covered surface which plunged temperatures to as low as - 23.6 °C on January 23rd and high reflection of solar radiation (up to about 80%), which caused less solar radiation absorption during the day throughout the most part of January 2004. Among non-meteorological factors are Cache Valley's small-basin geographical structure which traps air, with no big body of water to help the air circulation (as a result of differential heating and cooling rates for land and water), motor vehicle emissions, and existence of excess ammonia gas as a byproduct of livestock manure and urine. Concentration of PM 2.5 was monitored in downtown Logan. On January 15, 2004, the 24-h, filter-based concentration reached about 132.5 μg per cubic meter of air, an astonishingly high value compared to the values of 65 μg m - 3 and over, indicating a health alert for everyone. These tiny particles in the air have an enormous impact on health, aggravating heart and lung disease, triggering asthma and even death. The causes of this inversion and some suggestions to alleviate the wintertime particle concentration in Cache Valley will be addressed in this article.
WENGREEN, H.; NELSON, C.; MUNGER, R.G.; CORCORAN, C.
2013-01-01
Objective To examine associations between frequency of ready-to-eat-cereal (RTEC) consumption and cognitive function among elderly men and women of the Cache County Study on Memory and Aging in Utah. Design A population-based prospective cohort study established in Cache County, Utah in 1995. Setting and Participants 3831 men and women > 65 years of age who were living in Cache County, Utah in 1995. Measurement Diet was assessed using a 142-item food frequency questionnaire at baseline. Cognitive function was assessed using an adapted version of the Modified Mini-Mental State examination (3MS) at baseline and three subsequent interviews over 11 years. RTEC consumption was defined as daily, weekly, or infrequent use. Results In multivariable models, more frequent RTEC consumption was not associated with a cognitive benefit. Those consuming RTEC weekly but less than daily scored higher on their baseline 3MS than did those consuming RTEC more or less frequently (91.7, 90.6, 90.6, respectively; p-value <0.001). This association was maintained across 11 years of observation such that those consuming RTEC weekly but less than daily declined on average 3.96 points compared to an average 5.13 and 4.57 point decline for those consuming cereal more or less frequently (p-value = 0.0009). Conclusion Those consuming RTEC at least daily had poorer cognitive performance at baseline and over 11 years of follow-up compared to those who consumed cereal more or less frequently. RTEC is a nutrient dense food, but should not replace the consumption of other healthy foods in the diets’ of elderly people. Associations between RTEC consumption, dietary patterns, and cognitive function deserve further study. PMID:21369668
Performance implications from sizing a VM on multi-core systems: A Data analytic application s view
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lim, Seung-Hwan; Horey, James L; Begoli, Edmon
In this paper, we present a quantitative performance analysis of data analytics applications running on multi-core virtual machines. Such environments form the core of cloud computing. In addition, data analytics applications, such as Cassandra and Hadoop, are becoming increasingly popular on cloud computing platforms. This convergence necessitates a better understanding of the performance and cost implications of such hybrid systems. For example, the very rst step in hosting applications in virtualized environments, requires the user to con gure the number of virtual processors and the size of memory. To understand performance implications of this step, we benchmarked three Yahoo Cloudmore » Serving Benchmark (YCSB) workloads in a virtualized multi-core environment. Our measurements indicate that the performance of Cassandra for YCSB workloads does not heavily depend on the processing capacity of a system, while the size of the data set is critical to performance relative to allocated memory. We also identi ed a strong relationship between the running time of workloads and various hardware events (last level cache loads, misses, and CPU migrations). From this analysis, we provide several suggestions to improve the performance of data analytics applications running on cloud computing environments.« less
Johnson, Alfred A.; Carleton, John T.
1978-05-02
A graphite-moderated, water-cooled nuclear reactor including graphite blocks disposed in transverse alternate layers, one set of alternate layers consisting of alternate full size blocks and smaller blocks through which cooling tubes containing fuel extend, said smaller blocks consisting alternately of tube bearing blocks and support block, the support blocks being smaller than the tube bearing blocks, the aperture of each support block being tapered so as to provide the tube extending therethrough with a narrow region of support while being elsewhere spaced therefrom.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gylenhaal, J.; Bronevetsky, G.
2007-05-25
CLOMP is the C version of the Livermore OpenMP benchmark deeloped to measure OpenMP overheads and other performance impacts due to threading (like NUMA memory layouts, memory contention, cache effects, etc.) in order to influence future system design. Current best-in-class implementations of OpenMP have overheads at least ten times larger than is required by many of our applications for effective use of OpenMP. This benchmark shows the significant negative performance impact of these relatively large overheads and of other thread effects. The CLOMP benchmark highly configurable to allow a variety of problem sizes and threading effects to be studied andmore » it carefully checks its results to catch many common threading errors. This benchmark is expected to be included as part of the Sequoia Benchmark suite for the Sequoia procurement.« less
Parallel 3D-TLM algorithm for simulation of the Earth-ionosphere cavity
NASA Astrophysics Data System (ADS)
Toledo-Redondo, Sergio; Salinas, Alfonso; Morente-Molinera, Juan Antonio; Méndez, Antonio; Fornieles, Jesús; Portí, Jorge; Morente, Juan Antonio
2013-03-01
A parallel 3D algorithm for solving time-domain electromagnetic problems with arbitrary geometries is presented. The technique employed is the Transmission Line Modeling (TLM) method implemented in Shared Memory (SM) environments. The benchmarking performed reveals that the maximum speedup depends on the memory size of the problem as well as multiple hardware factors, like the disposition of CPUs, cache, or memory. A maximum speedup of 15 has been measured for the largest problem. In certain circumstances of low memory requirements, superlinear speedup is achieved using our algorithm. The model is employed to model the Earth-ionosphere cavity, thus enabling a study of the natural electromagnetic phenomena that occur in it. The algorithm allows complete 3D simulations of the cavity with a resolution of 10 km, within a reasonable timescale.
Change Detection of Mobile LIDAR Data Using Cloud Computing
NASA Astrophysics Data System (ADS)
Liu, Kun; Boehm, Jan; Alis, Christian
2016-06-01
Change detection has long been a challenging problem although a lot of research has been conducted in different fields such as remote sensing and photogrammetry, computer vision, and robotics. In this paper, we blend voxel grid and Apache Spark together to propose an efficient method to address the problem in the context of big data. Voxel grid is a regular geometry representation consisting of the voxels with the same size, which fairly suites parallel computation. Apache Spark is a popular distributed parallel computing platform which allows fault tolerance and memory cache. These features can significantly enhance the performance of Apache Spark and results in an efficient and robust implementation. In our experiments, both synthetic and real point cloud data are employed to demonstrate the quality of our method.
Mechanical analysis of the dry stone walls built by the Incas
NASA Astrophysics Data System (ADS)
Castro, Jaime; Vallejo, Luis E.; Estrada, Nicolas
2017-06-01
In this paper, the retaining walls in the agricultural terraces built by the Incas are analyzed from a mechanical point of view. In order to do so, ten different walls from the Lower Agricultural Sector of Machu Picchu, Perú, were selected using images from Google Street View and Google Earth Pro. Then, these walls were digitalized and their mechanical stability was evaluated. Firstly, it was found that these retaining walls are characterized by two distinctive features: disorder and a block size distribution with a large size span, i.e., the particle size varies from blocks that can be carried by one person to large blocks weighing several tons. Secondly, it was found that, thanks to the large span of the block size distribution, the factor of safety of the Inca retaining walls is remarkably close to those that are recommended in modern geotechnical design standards. This suggests that these structures were not only functional but also highly optimized, probably as a result of a careful trial and error procedure.
Accelerating a Particle-in-Cell Simulation Using a Hybrid Counting Sort
NASA Astrophysics Data System (ADS)
Bowers, K. J.
2001-11-01
In this article, performance limitations of the particle advance in a particle-in-cell (PIC) simulation are discussed. It is shown that the memory subsystem and cache-thrashing severely limit the speed of such simulations. Methods to implement a PIC simulation under such conditions are explored. An algorithm based on a counting sort is developed which effectively eliminates PIC simulation cache thrashing. Sustained performance gains of 40 to 70 percent are measured on commodity workstations for a minimal 2d2v electrostatic PIC simulation. More complete simulations are expected to have even better results as larger simulations are usually even more memory subsystem limited.
Introduction of Virtualization Technology to Multi-Process Model Checking
NASA Technical Reports Server (NTRS)
Leungwattanakit, Watcharin; Artho, Cyrille; Hagiya, Masami; Tanabe, Yoshinori; Yamamoto, Mitsuharu
2009-01-01
Model checkers find failures in software by exploring every possible execution schedule. Java PathFinder (JPF), a Java model checker, has been extended recently to cover networked applications by caching data transferred in a communication channel. A target process is executed by JPF, whereas its peer process runs on a regular virtual machine outside. However, non-deterministic target programs may produce different output data in each schedule, causing the cache to restart the peer process to handle the different set of data. Virtualization tools could help us restore previous states of peers, eliminating peer restart. This paper proposes the application of virtualization technology to networked model checking, concentrating on JPF.
Performance of hashed cache data migration schemes on multicomputers
NASA Technical Reports Server (NTRS)
Hiranandani, Seema; Saltz, Joel; Mehrotra, Piyush; Berryman, Harry
1991-01-01
After conducting an examination of several data-migration mechanisms which permit an explicit and controlled mapping of data to memory, a set of schemes for storage and retrieval of off-processor array elements is experimentally evaluated and modeled. All schemes considered have their basis in the use of hash tables for efficient access of nonlocal data. The techniques in question are those of hashed cache, partial enumeration, and full enumeration; in these, nonlocal data are stored in hash tables, so that the operative difference lies in the amount of memory used by each scheme and in the retrieval mechanism used for nonlocal data.