Twin-tailed fail-over for fileservers maintaining full performance in the presence of a failure
Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Heidelberger, Philip; Steinmacher-Burow, Burkhard D.
2008-02-12
A method for maintaining full performance of a file system in the presence of a failure is provided. The file system has N storage devices, where N is an integer greater than zero, and N primary file servers, each operatively connected to a corresponding storage device for accessing files therein. The file system further has a secondary file server operatively connected to at least one of the N storage devices. The method includes: switching the connection of one of the N storage devices to the secondary file server upon a failure of one of the N primary file servers; and switching the connections of one or more of the remaining storage devices to a primary file server other than the failed file server as necessary, so as to prevent a loss in performance and to provide each storage device with an operating file server.
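A minimal Python sketch of the cascading reassignment the abstract describes, under one plausible wiring assumption (device k is twin-tailed to primary servers k and k+1, and the last device also reaches the secondary); all names are illustrative, not taken from the patent:

def reassign_after_failure(n, failed_index):
    # Device k starts on primary Pk. After primary `failed_index` fails,
    # every device from that slot onward shifts to the next server in the
    # chain, and the last device lands on the secondary "S", so every
    # device keeps an operating server and no server drives two devices.
    assignment = {f"D{k}": f"P{k}" for k in range(n)}
    for k in range(failed_index, n - 1):
        assignment[f"D{k}"] = f"P{k + 1}"
    assignment[f"D{n - 1}"] = "S"
    return assignment

print(reassign_after_failure(4, failed_index=1))
# {'D0': 'P0', 'D1': 'P2', 'D2': 'P3', 'D3': 'S'}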
NASA Technical Reports Server (NTRS)
Soltis, Steven R.; Ruwart, Thomas M.; O'Keefe, Matthew T.
1996-01-01
The Global File System (GFS) is a prototype design for a distributed file system in which cluster nodes physically share storage devices connected via networks like Fibre Channel. Networks and network-attached storage devices have advanced to a level of performance and extensibility such that the previous disadvantages of shared disk architectures are no longer valid. This shared storage architecture attempts to exploit the sophistication of storage device technologies, whereas a server architecture diminishes a device's role to that of a simple component. GFS distributes the file system responsibilities across processing nodes, storage across the devices, and file system resources across the entire storage pool. GFS caches data on the storage devices instead of in the main memories of the machines. Consistency is established by using a locking mechanism maintained by the storage devices to facilitate atomic read-modify-write operations. The locking mechanism is being prototyped in the Silicon Graphics IRIX operating system and is accessed using standard Unix commands and modules.
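The device-maintained locking GFS relies on can be pictured with a toy stand-in: here a threading.Lock plays the role of the lock a network-attached device would hold on a block (the real design uses device-level lock operations, not host-side locks, so this is purely illustrative):

import threading

class SharedDevice:
    # Toy stand-in for a network-attached device that, as in the GFS
    # design, maintains its own locks so any cluster node can perform an
    # atomic read-modify-write without a central file server.
    def __init__(self):
        self.blocks = {}
        self._locks = {}                    # one device-held lock per block
        self._guard = threading.Lock()

    def _lock_for(self, block_id):
        with self._guard:
            return self._locks.setdefault(block_id, threading.Lock())

    def read_modify_write(self, block_id, update):
        with self._lock_for(block_id):      # device grants the lock ...
            value = self.blocks.get(block_id, b"")
            self.blocks[block_id] = update(value)
            return self.blocks[block_id]    # ... and releases it atomically

dev = SharedDevice()
print(dev.read_modify_write(7, lambda old: old + b"+append"))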
Bent, John M.; Faibish, Sorin; Grider, Gary
2015-06-30
Cloud object storage is enabled for archived data, such as checkpoints and results, of high performance computing applications using a middleware process. A plurality of archived files, such as checkpoint files and results, generated by a plurality of processes in a parallel computing system are stored by obtaining the plurality of archived files from the parallel computing system; converting the plurality of archived files to objects using a log structured file system middleware process; and providing the objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.
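A hedged sketch of the conversion step this abstract outlines: archived files become objects handed to an object store, with a manifest mapping file names to object keys. The SHA-256 content keys and the put_object callback are assumptions for illustration; the actual PLFS-based middleware is far more involved:

import hashlib, pathlib, tempfile

def files_to_objects(paths, put_object):
    # Turn each archived file into a content-addressed object and hand it
    # to the object store via put_object(key, data); return the manifest.
    manifest = {}
    for path in map(pathlib.Path, paths):
        data = path.read_bytes()
        key = hashlib.sha256(data).hexdigest()    # content-addressed key
        put_object(key, data)                     # e.g. an S3 PUT in practice
        manifest[str(path)] = key
    return manifest

tmp = pathlib.Path(tempfile.gettempdir()) / "ckpt_demo.bin"
tmp.write_bytes(b"checkpoint bytes")
store = {}                                        # dict stands in for the cloud
print(files_to_objects([tmp], store.__setitem__))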
Storing files in a parallel computing system based on user or application specification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Faibish, Sorin; Bent, John M.; Nick, Jeffrey M.
2016-03-29
Techniques are provided for storing files in a parallel computing system based on a user specification. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a specification from the distributed application indicating how the plurality of files should be stored; and storing one or more of the plurality of files in one or more storage nodes of a multi-tier storage system based on the specification. The plurality of files comprise a plurality of complete files and/or a plurality of sub-files. The specification can optionally be processed by a daemon executing on one or more nodes in a multi-tier storage system. The specification indicates how the plurality of files should be stored, for example, identifying one or more storage nodes where the plurality of files should be stored.
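One way to picture the specification-driven placement: a per-file hint selects the storage tier, with a default for unlisted files. The spec format, tier names, and paths below are invented for illustration:

def place_files(files, spec, tiers):
    # Map each (name, size) file to the mount point of the tier its
    # specification requests, defaulting to ordinary disk.
    placement = {}
    for name, size in files:
        tier = spec.get(name, "disk")      # application-supplied hint
        placement[name] = tiers[tier]
    return placement

tiers = {"burst-buffer": "/bb", "disk": "/lustre", "archive": "/tape"}
spec = {"ckpt_0001.dat": "burst-buffer", "results.h5": "archive"}
print(place_files([("ckpt_0001.dat", 2**30), ("results.h5", 2**20)],
                  spec, tiers))
# {'ckpt_0001.dat': '/bb', 'results.h5': '/tape'}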
Bent, John M.; Faibish, Sorin; Grider, Gary
2016-04-19
Cloud object storage is enabled for checkpoints of high performance computing applications using a middleware process. A plurality of files, such as checkpoint files, generated by a plurality of processes in a parallel computing system are stored by obtaining said plurality of files from said parallel computing system; converting said plurality of files to objects using a log structured file system middleware process; and providing said objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.
High-performance metadata indexing and search in petascale data storage systems
NASA Astrophysics Data System (ADS)
Leung, A. W.; Shao, M.; Bisson, T.; Pasupathy, S.; Miller, E. L.
2008-07-01
Large-scale storage systems used for scientific applications can store petabytes of data and billions of files, making the organization and management of data in these systems a difficult, time-consuming task. The ability to search file metadata in a storage system can address this problem by allowing scientists to quickly navigate experiment data and code while allowing storage administrators to gather the information they need to properly manage the system. In this paper, we present Spyglass, a file metadata search system that achieves scalability by exploiting storage system properties, providing the scalability that existing file metadata search tools lack. In doing so, Spyglass can achieve search performance up to several thousand times faster than existing database solutions. We show that Spyglass enables important functionality that can aid data management for scientists and storage administrators.
LVFS: A Big Data File Storage Bridge for the HPC Community
NASA Astrophysics Data System (ADS)
Golpayegani, N.; Halem, M.; Masuoka, E.; Fonseca, L. F.
2015-12-01
Merging Big Data capabilities into High Performance Computing architecture starts at the file storage level. Heterogeneous storage systems are emerging which offer enhanced features for dealing with Big Data, such as the IBM GPFS storage system's integration into Hadoop Map-Reduce. Taking advantage of these capabilities requires file storage systems to be adaptive and accommodate these new storage technologies. We present the extension of the Lightweight Virtual File System (LVFS), currently running as the production system for the MODIS Level 1 and Atmosphere Archive and Distribution System (LAADS), to incorporate a flexible plugin architecture which allows easy integration of new HPC hardware and/or software storage technologies without disrupting workflows or system architectures, and with only minimal impact on existing tools. We consider two essential aspects provided by the LVFS plugin architecture needed by the future HPC community. First, it allows the seamless integration of new and emerging hardware technologies which are significantly different from existing technologies, such as Seagate's Kinetic disks and Intel's 3D XPoint non-volatile storage. Second is the transparent and instantaneous conversion between new software technologies and various file formats. With most current storage systems, a switch in file format would require costly reprocessing and a near doubling of storage requirements. We will install LVFS on UMBC's IBM iDataPlex cluster with a heterogeneous storage architecture utilizing local, remote, and Seagate Kinetic storage as a case study. LVFS merges different kinds of storage architectures to show users a uniform layout and therefore prevents any disruption in workflows, architecture design, or tool usage. We will show how LVFS converts HDF data, produced by applying machine learning algorithms to XCO2 Level 2 data from the OCO-2 satellite to derive CO2 surface fluxes, into GeoTIFF for visualization.
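The plugin idea can be sketched as a virtual file system that routes each path to a mounted back-end. The method names and prefix-based routing are assumptions, since the abstract does not spell out the LVFS interface:

class StoragePlugin:
    # Minimal shape of a storage back-end plugin; a Kinetic or S3 plugin
    # would implement the same two methods against its own protocol.
    def read(self, path): raise NotImplementedError
    def write(self, path, data): raise NotImplementedError

class LocalDisk(StoragePlugin):
    def __init__(self): self.data = {}
    def read(self, path): return self.data[path]
    def write(self, path, data): self.data[path] = data

class VirtualFS:
    def __init__(self): self.mounts = {}          # path prefix -> plugin
    def mount(self, prefix, plugin): self.mounts[prefix] = plugin
    def _resolve(self, path):
        for prefix, plugin in self.mounts.items():
            if path.startswith(prefix):
                return plugin
        raise FileNotFoundError(path)
    def read(self, path): return self._resolve(path).read(path)
    def write(self, path, data): self._resolve(path).write(path, data)

fs = VirtualFS()
fs.mount("/local/", LocalDisk())   # new back-ends mount without tool changes
fs.write("/local/granule.hdf", b"...")
print(fs.read("/local/granule.hdf"))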
An analysis of image storage systems for scalable training of deep neural networks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lim, Seung-Hwan; Young, Steven R; Patton, Robert M
This study presents a principled empirical evaluation of image storage systems for training deep neural networks. We employ the Caffe deep learning framework to train neural network models for three different data sets, MNIST, CIFAR-10, and ImageNet. While training the models, we evaluate five different options to retrieve training image data: (1) PNG-formatted image files on local file system; (2) pushing pixel arrays from image files into a single HDF5 file on local file system; (3) in-memory arrays to hold the pixel arrays in Python and C++; (4) loading the training data into LevelDB, a log-structured merge tree based key-value storage; and (5) loading the training data into LMDB, a B+tree based key-value storage. The experimental results quantitatively highlight the disadvantage of using normal image files on local file systems to train deep neural networks and demonstrate reliable performance with key-value storage based storage systems. When training a model on the ImageNet dataset, the image file option was more than 17 times slower than the key-value storage option. Along with measurements on training time, this study provides in-depth analysis on the cause of performance advantages/disadvantages of each back-end to train deep neural networks. We envision the provided measurements and analysis will shed light on the optimal way to architect systems for training neural networks in a scalable manner.
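For reference, the key-value option the study found fastest looks roughly like this with the py-lmdb bindings (an external package, pip install lmdb); the path, key, and map size are placeholders:

import lmdb                                       # py-lmdb bindings

# Pack training samples into LMDB once, then read them back without
# per-image file-system overhead.
env = lmdb.open("train_lmdb", map_size=1 << 30)   # reserve a 1 GiB map

with env.begin(write=True) as txn:                # one write transaction
    txn.put(b"img-00000001", b"<encoded pixel array>")

with env.begin() as txn:                          # fast random-access reads
    sample = txn.get(b"img-00000001")
print(sample)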
Long-Term file activity patterns in a UNIX workstation environment
NASA Technical Reports Server (NTRS)
Gibson, Timothy J.; Miller, Ethan L.
1998-01-01
As mass storage technology becomes more affordable for sites smaller than supercomputer centers, understanding their file access patterns becomes crucial for developing systems to store rarely used data on tertiary storage devices such as tapes and optical disks. This paper presents a new way to collect and analyze file system statistics for UNIX-based file systems. The collection system runs in user-space and requires no modification of the operating system kernel. The statistics package provides details about file system operations at the file level: creations, deletions, modifications, etc. The paper analyzes four months of file system activity on a university file system. The results confirm previously published results gathered from supercomputer file systems, but differ in several important areas. Files in this study were considerably smaller than those at supercomputer centers, and they were accessed less frequently. Additionally, the long-term creation rate on workstation file systems is sufficiently low so that all data more than a day old could be cheaply saved on a mass storage device, allowing the integration of time travel into every file system.
48 CFR 904.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 48 Federal Acquisition Regulations System 5 2012-10-01 2012-10-01 false Storage, handling, and disposal of contract files. 904.805 Section 904.805 Federal Acquisition Regulations System DEPARTMENT OF ENERGY GENERAL ADMINISTRATIVE MATTERS Government Contract Files 904.805 Storage, handling, and disposal...
48 CFR 2804.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 48 Federal Acquisition Regulations System 6 2012-10-01 2012-10-01 false Storage, handling, and disposal of contract files. 2804.805 Section 2804.805 Federal Acquisition Regulations System DEPARTMENT OF JUSTICE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 2804.805 Storage, handling, and disposal...
48 CFR 2804.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 48 Federal Acquisition Regulations System 6 2013-10-01 2013-10-01 false Storage, handling, and disposal of contract files. 2804.805 Section 2804.805 Federal Acquisition Regulations System DEPARTMENT OF JUSTICE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 2804.805 Storage, handling, and disposal...
48 CFR 904.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 48 Federal Acquisition Regulations System 5 2011-10-01 2011-10-01 false Storage, handling, and disposal of contract files. 904.805 Section 904.805 Federal Acquisition Regulations System DEPARTMENT OF ENERGY GENERAL ADMINISTRATIVE MATTERS Government Contract Files 904.805 Storage, handling, and disposal...
48 CFR 904.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 48 Federal Acquisition Regulations System 5 2014-10-01 2014-10-01 false Storage, handling, and disposal of contract files. 904.805 Section 904.805 Federal Acquisition Regulations System DEPARTMENT OF ENERGY GENERAL ADMINISTRATIVE MATTERS Government Contract Files 904.805 Storage, handling, and disposal...
48 CFR 904.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 48 Federal Acquisition Regulations System 5 2010-10-01 2010-10-01 false Storage, handling, and disposal of contract files. 904.805 Section 904.805 Federal Acquisition Regulations System DEPARTMENT OF ENERGY GENERAL ADMINISTRATIVE MATTERS Government Contract Files 904.805 Storage, handling, and disposal...
48 CFR 2804.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 48 Federal Acquisition Regulations System 6 2014-10-01 2014-10-01 false Storage, handling, and disposal of contract files. 2804.805 Section 2804.805 Federal Acquisition Regulations System DEPARTMENT OF JUSTICE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 2804.805 Storage, handling, and disposal...
48 CFR 2804.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 48 Federal Acquisition Regulations System 6 2011-10-01 2011-10-01 false Storage, handling, and disposal of contract files. 2804.805 Section 2804.805 Federal Acquisition Regulations System DEPARTMENT OF JUSTICE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 2804.805 Storage, handling, and disposal...
48 CFR 2804.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 48 Federal Acquisition Regulations System 6 2010-10-01 2010-10-01 true Storage, handling, and disposal of contract files. 2804.805 Section 2804.805 Federal Acquisition Regulations System DEPARTMENT OF JUSTICE General ADMINISTRATIVE MATTERS Government Contract Files 2804.805 Storage, handling, and disposal...
48 CFR 1304.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 48 Federal Acquisition Regulations System 5 2014-10-01 2014-10-01 false Storage, handling, and disposal of contract files. 1304.805 Section 1304.805 Federal Acquisition Regulations System DEPARTMENT OF COMMERCE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 1304.805 Storage, handling, and disposal...
48 CFR 1304.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 48 Federal Acquisition Regulations System 5 2012-10-01 2012-10-01 false Storage, handling, and disposal of contract files. 1304.805 Section 1304.805 Federal Acquisition Regulations System DEPARTMENT OF COMMERCE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 1304.805 Storage, handling, and disposal...
48 CFR 1304.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 48 Federal Acquisition Regulations System 5 2013-10-01 2013-10-01 false Storage, handling, and disposal of contract files. 1304.805 Section 1304.805 Federal Acquisition Regulations System DEPARTMENT OF COMMERCE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 1304.805 Storage, handling, and disposal...
48 CFR 904.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 48 Federal Acquisition Regulations System 5 2013-10-01 2013-10-01 false Storage, handling, and disposal of contract files. 904.805 Section 904.805 Federal Acquisition Regulations System DEPARTMENT OF ENERGY GENERAL ADMINISTRATIVE MATTERS Government Contract Files 904.805 Storage, handling, and disposal...
48 CFR 1304.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 48 Federal Acquisition Regulations System 5 2010-10-01 2010-10-01 false Storage, handling, and disposal of contract files. 1304.805 Section 1304.805 Federal Acquisition Regulations System DEPARTMENT OF COMMERCE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 1304.805 Storage, handling, and disposal...
48 CFR 1304.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 48 Federal Acquisition Regulations System 5 2011-10-01 2011-10-01 false Storage, handling, and disposal of contract files. 1304.805 Section 1304.805 Federal Acquisition Regulations System DEPARTMENT OF COMMERCE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 1304.805 Storage, handling, and disposal...
Storing files in a parallel computing system using list-based index to identify replica files
DOE Office of Scientific and Technical Information (OSTI.GOV)
Faibish, Sorin; Bent, John M.; Tzelnic, Percy
Improved techniques are provided for storing files in a parallel computing system using a list-based index to identify file replicas. A file and at least one replica of the file are stored in one or more storage nodes of the parallel computing system. An index for the file comprises at least one list comprising a pointer to a storage location of the file and a storage location of the at least one replica of the file. The file comprises one or more of a complete file and one or more sub-files. The index may also comprise a checksum value for one or more of the file and the replica(s) of the file. The checksum value can be evaluated to validate the file and/or the file replica(s). A query can be processed using the list.
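A sketch of the list-based index with checksum validation described here; the dict layout and the fetch callback are illustrative, not the patented format:

import hashlib

def build_index(primary_loc, replica_locs, data):
    # One list of storage locations (primary first, then replicas) plus a
    # checksum that lets readers validate whichever copy they fetch.
    return {"locations": [primary_loc, *replica_locs],
            "checksum": hashlib.sha256(data).hexdigest()}

def read_validated(index, fetch):
    # Try each listed location until one returns data matching the checksum.
    for loc in index["locations"]:
        data = fetch(loc)
        if data is not None and hashlib.sha256(data).hexdigest() == index["checksum"]:
            return data
    raise IOError("no valid replica found")

blob = b"simulation output"
idx = build_index("node3:/f", ["node7:/f"], blob)
copies = {"node3:/f": None, "node7:/f": blob}     # primary lost, replica intact
print(read_validated(idx, copies.get))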
48 CFR 4.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 48 Federal Acquisition Regulations System 1 2013-10-01 2013-10-01 false Storage, handling, and disposal of contract files. 4.805 Section 4.805 Federal Acquisition Regulations System FEDERAL ACQUISITION REGULATION GENERAL ADMINISTRATIVE MATTERS Government Contract Files 4.805 Storage, handling, and disposal of...
48 CFR 4.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 48 Federal Acquisition Regulations System 1 2011-10-01 2011-10-01 false Storage, handling, and disposal of contract files. 4.805 Section 4.805 Federal Acquisition Regulations System FEDERAL ACQUISITION REGULATION GENERAL ADMINISTRATIVE MATTERS Government Contract Files 4.805 Storage, handling, and disposal of...
48 CFR 4.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 48 Federal Acquisition Regulations System 1 2014-10-01 2014-10-01 false Storage, handling, and disposal of contract files. 4.805 Section 4.805 Federal Acquisition Regulations System FEDERAL ACQUISITION REGULATION GENERAL ADMINISTRATIVE MATTERS Government Contract Files 4.805 Storage, handling, and disposal of...
48 CFR 1304.805-70 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 48 Federal Acquisition Regulations System 5 2012-10-01 2012-10-01 false Storage, handling, and disposal of contract files. 1304.805-70 Section 1304.805-70 Federal Acquisition Regulations System DEPARTMENT OF COMMERCE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 1304.805-70 Storage, handling...
48 CFR 4.805 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 48 Federal Acquisition Regulations System 1 2012-10-01 2012-10-01 false Storage, handling, and disposal of contract files. 4.805 Section 4.805 Federal Acquisition Regulations System FEDERAL ACQUISITION REGULATION GENERAL ADMINISTRATIVE MATTERS Government Contract Files 4.805 Storage, handling, and disposal of...
48 CFR 1304.805-70 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 48 Federal Acquisition Regulations System 5 2011-10-01 2011-10-01 false Storage, handling, and disposal of contract files. 1304.805-70 Section 1304.805-70 Federal Acquisition Regulations System DEPARTMENT OF COMMERCE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 1304.805-70 Storage, handling...
48 CFR 1304.805-70 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 48 Federal Acquisition Regulations System 5 2014-10-01 2014-10-01 false Storage, handling, and disposal of contract files. 1304.805-70 Section 1304.805-70 Federal Acquisition Regulations System DEPARTMENT OF COMMERCE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 1304.805-70 Storage, handling...
48 CFR 1304.805-70 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 48 Federal Acquisition Regulations System 5 2010-10-01 2010-10-01 false Storage, handling, and disposal of contract files. 1304.805-70 Section 1304.805-70 Federal Acquisition Regulations System DEPARTMENT OF COMMERCE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 1304.805-70 Storage, handling...
48 CFR 1304.805-70 - Storage, handling, and disposal of contract files.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 48 Federal Acquisition Regulations System 5 2013-10-01 2013-10-01 false Storage, handling, and disposal of contract files. 1304.805-70 Section 1304.805-70 Federal Acquisition Regulations System DEPARTMENT OF COMMERCE GENERAL ADMINISTRATIVE MATTERS Government Contract Files 1304.805-70 Storage, handling...
The Design and Application of Data Storage System in Miyun Satellite Ground Station
NASA Astrophysics Data System (ADS)
Xue, Xiping; Su, Yan; Zhang, Hongbo; Liu, Bin; Yao, Meijuan; Zhao, Shu
2015-04-01
China launched the Chang'E-3 satellite in 2013, achieving the first soft landing on the Moon by a Chinese lunar probe. The Miyun satellite ground station first used a SAN storage network system based on the Stornext sharing software in the Chang'E-3 mission, and system performance fully meets the data storage requirements of the station. The Stornext file system is a high-performance shared file system that allows multiple servers running different operating systems to access the file system at the same time, and it supports access to data over a variety of topologies, such as SAN and LAN. Stornext focuses on data protection and big data management; Quantum has announced more than 70,000 Stornext licenses sold worldwide, with a growing customer base, which marks its leading position in big data management. The responsibilities of the Miyun ground station are the reception of Chang'E-3 downlink data and the management of local data storage. The station mainly handles exploration mission management and the receiving and management of observation data, and it provides comprehensive, centralized monitoring and control of the data receiving equipment. The station applied the Stornext-based SAN storage network system to receive and manage data reliably. The computer system at the Miyun station is composed of operational servers, application workstations, and storage equipment, so the storage system requires a shared file system that supports heterogeneous operating systems. In practical applications, 10 nodes simultaneously write data to the file system through 16 channels, the maximum data transfer rate of each channel is up to 15 MB/s, and thus the network throughput of the file system must be no less than 240 MB/s; the maximum size of a single data file is up to 810 GB. As integrated, the sharing system provides 1020 MB/s of simultaneous write bandwidth. When the master storage server fails, the backup storage server takes over; client reads and writes are unaffected, and the switchover time is less than 5 s. The design and the integrated storage system meet user requirements. Nevertheless, an all-fiber SAN is expensive, and SCSI hard disk transfer rates may still become the bottleneck of the entire storage system. Stornext provides users with efficient sharing, management, and automatic archiving of large numbers of files, together with hardware solutions, and it occupies a leading position in big data management; it is popular sharing software, but it has drawbacks. First, Stornext software is expensive and licensed per site, so when the network scale is large the purchase cost becomes very high. Second, tuning Stornext's parameters places high demands on the skills of technical staff, and when a problem arises it is difficult to diagnose.
RAMA: A file system for massively parallel computers
NASA Technical Reports Server (NTRS)
Miller, Ethan L.; Katz, Randy H.
1993-01-01
This paper describes a file system design for massively parallel computers which makes very efficient use of a few disks per processor. This overcomes the traditional I/O bottleneck of massively parallel machines by storing the data on disks within the high-speed interconnection network. In addition, the file system, called RAMA, requires little inter-node synchronization, removing another common bottleneck in parallel processor file systems. Support for a large tertiary storage system can easily be integrated into the file system; in fact, RAMA runs most efficiently when tertiary storage is used.
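The hashed placement that lets RAMA avoid inter-node synchronization can be pictured in a few lines: each (file, block) pair hashes to the node whose disks hold that block, so no central allocation map is consulted. The hash choice and identifiers are illustrative:

import hashlib

def block_location(file_id, block_no, n_nodes):
    # Hash (file, block) to pick the node holding that block; any client
    # can compute the location independently, with no coordination.
    digest = hashlib.md5(f"{file_id}:{block_no}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_nodes

# Consecutive blocks of one file scatter across the machine:
print([block_location("ckpt.dat", b, n_nodes=128) for b in range(6)])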
Performance Modeling of Network-Attached Storage Device Based Hierarchical Mass Storage Systems
NASA Technical Reports Server (NTRS)
Menasce, Daniel A.; Pentakalos, Odysseas I.
1995-01-01
Network attached storage devices improve I/O performance by separating control and data paths and eliminating host intervention during the data transfer phase. Devices are attached to both a high speed network for data transfer and to a slower network for control messages. Hierarchical mass storage systems use disks to cache the most recently used files and a combination of robotic and manually mounted tapes to store the bulk of the files in the file system. This paper shows how queuing network models can be used to assess the performance of hierarchical mass storage systems that use network attached storage devices as opposed to host attached storage devices. Simulation was used to validate the model. The analytic model presented here can be used, among other things, to evaluate the protocols involved in I/O over network attached devices.
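The flavor of analytic model used here, exact mean-value analysis (MVA) for a closed queueing network of load-independent stations, fits in a few lines. The three service demands in the example (control network, data network, tape robot) are invented numbers, not the paper's parameters:

def mva(service_demands, n_jobs):
    # Exact MVA: R_k(n) = D_k * (1 + Q_k(n-1)); X(n) = n / sum_k R_k(n);
    # Q_k(n) = X(n) * R_k(n). Returns throughput and mean queue lengths.
    q = [0.0] * len(service_demands)
    throughput = 0.0
    for n in range(1, n_jobs + 1):
        r = [d * (1 + q[k]) for k, d in enumerate(service_demands)]
        throughput = n / sum(r)
        q = [throughput * rk for rk in r]
    return throughput, q

x, queues = mva([0.002, 0.010, 0.150], n_jobs=8)   # seconds per job
print(f"throughput {x:.2f} jobs/s, queue lengths {queues}")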
The storage system of PCM based on random access file system
NASA Astrophysics Data System (ADS)
Han, Wenbing; Chen, Xiaogang; Zhou, Mi; Li, Shunfen; Li, Gezi; Song, Zhitang
2016-10-01
Emerging memory technologies such as Phase change memory (PCM) tend to offer fast, random access to persistent storage with better scalability. Establishing PCM in the storage hierarchy to narrow the performance gap is a hot topic of academic and industrial research. However, existing file systems, which access the storage medium via a slow, block-based interface, do not perform well with emerging PCM storage. In this paper, we propose a novel file system, RAFS, built on an embedded platform, to bring out the full performance of PCM. We attach PCM chips to the memory bus and build RAFS on the physical address space. In the proposed file system, we simplify the traditional system architecture to eliminate block-related operations and layers. Furthermore, we adopt memory mapping and a bypassed page cache to reduce copy overhead between the process address space and the storage device. XIP mechanisms are also supported in RAFS. To the best of our knowledge, we are among the first to implement a file system on real PCM chips. We have analyzed and evaluated its performance with the IOZONE benchmark tools. Our experimental results show that RAFS on PCM outperforms Ext4fs on SDRAM for small record lengths. On DRAM, RAFS is significantly faster than Ext4fs, by 18% to 250%.
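The copy-avoidance idea behind RAFS, mapping storage into the process address space instead of read()-ing through block layers, can be approximated in ordinary Python with mmap; on real PCM the mapping would reach the device itself rather than DRAM, so this sketch is only an analogy:

import mmap, tempfile

def xip_read(path, offset, length):
    # Map the file and slice bytes straight out of the mapping, avoiding
    # an extra copy through an intermediate read buffer.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return bytes(m[offset:offset + length])

path = tempfile.gettempdir() + "/rafs_demo.bin"
with open(path, "wb") as f:
    f.write(b"0123456789")
print(xip_read(path, 2, 4))   # b'2345'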
Striped tertiary storage arrays
NASA Technical Reports Server (NTRS)
Drapeau, Ann L.
1993-01-01
Data striping is a technique for increasing the throughput and reducing the response time of large accesses to a storage system. In striped magnetic or optical disk arrays, a single file is striped, or interleaved, across several disks; in a striped tape system, files are interleaved across tape cartridges. Because a striped file can be accessed by several disk drives or tape recorders in parallel, the sustained bandwidth to the file is greater than in non-striped systems, where accesses to the file are restricted to a single device. It is argued that applying striping to tertiary storage systems will provide needed performance and reliability benefits. The performance benefits of striping for applications using large tertiary storage systems are discussed. Commonly available tape drives and libraries are introduced and their performance limitations discussed, with particular focus on the long latency of tape accesses; an event-driven tertiary storage array simulator that is being used to understand the best ways of configuring these storage arrays is also described. The reliability problems of magnetic tape devices are discussed, and plans for modeling the overall reliability of striped tertiary storage arrays to identify the amount of error correction required are described. Finally, work being done by other members of the Sequoia group to address latency of accesses, optimizing tertiary storage arrays that perform mostly writes, and compression is discussed.
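The core mapping of striping is tiny: block i of a file lives on device i mod N, so N drives (or tape recorders) can stream one file in parallel. Block counts and device counts below are illustrative:

def stripe_map(n_blocks, n_devices):
    # Round-robin interleave: block i -> device i mod n_devices.
    return {i: i % n_devices for i in range(n_blocks)}

# 8 blocks over 3 tape drives: each drive serves roughly a third of the file.
print(stripe_map(8, 3))   # {0: 0, 1: 1, 2: 2, 3: 0, 4: 1, 5: 2, 6: 0, 7: 1}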
Simulating storage part of application with Simgrid
NASA Astrophysics Data System (ADS)
Wang, Cong
2017-10-01
We present the design of a file system simulation and visualization system that uses the Simgrid API and visualization techniques to help users understand and improve the file system portion of their applications. The core of the simulator is the API provided by Simgrid; cluefs tracks and captures the I/O operations of the application. Running the simulator on an application generates an output visualization file, which visualizes the proportion of I/O actions and their time series. Users can change parameters in the configuration file to alter properties of the storage system, such as read and write bandwidth, and can also adjust the storage strategy and test its performance, which makes optimizing the storage system much easier. We have tested all aspects of the simulator, and the results suggest that the simulator's output is believable.
Uncoupling File System Components for Bridging Legacy and Modern Storage Architectures
NASA Astrophysics Data System (ADS)
Golpayegani, N.; Halem, M.; Tilmes, C.; Prathapan, S.; Earp, D. N.; Ashkar, J. S.
2016-12-01
Long-running Earth Science projects can span decades of architectural changes in both processing and storage environments. As storage architecture designs change over the decades, such projects need to adjust their tools, systems, and expertise to properly integrate the new technologies with their legacy systems. Traditional file systems lack the support necessary to accommodate such hybrid storage infrastructures, forcing more complex tool development to encompass all the storage architectures used by a project. The MODIS Adaptive Processing System (MODAPS) and the Level 1 and Atmospheres Archive and Distribution System (LAADS) are an example of a project spanning several decades that has evolved into a hybrid storage architecture. MODAPS/LAADS has developed the Lightweight Virtual File System (LVFS), which ensures seamless integration of all the different storage architectures, from standard block-based POSIX-compliant storage disks, to object-based architectures such as the S3-compliant HGST Active Archive System, to Seagate Kinetic disks utilizing the Kinetic protocol. With LVFS, all analysis and processing tools used for the project continue to function unmodified regardless of the underlying storage architecture, enabling MODAPS/LAADS to easily integrate any new storage architecture without the costly need to modify existing tools. Most file systems are designed as a single application responsible for using metadata to organize the data into a tree, determining where data is stored, and providing a method of data retrieval. We will show how LVFS' unique approach of treating these components in a loosely coupled fashion enables it to merge different storage architectures into a single uniform storage system that bridges the underlying hybrid architecture.
Tuning HDF5 subfiling performance on parallel file systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Byna, Suren; Chaarawi, Mohamad; Koziol, Quincey
Subfiling is a technique used on parallel file systems to reduce locking and contention issues when multiple compute nodes interact with the same storage target node. Subfiling provides a compromise between the single shared file approach, which instigates lock contention problems on parallel file systems, and having one file per process, which generates a massive and unmanageable number of files. In this paper, we evaluate and tune the performance of the recently implemented subfiling feature in HDF5. Specifically, we explain the implementation strategy of the subfiling feature in HDF5, provide examples of using the feature, and evaluate and tune parallel I/O performance of this feature on the parallel file systems of the Cray XC40 system at NERSC (Cori), which include a burst buffer storage and a Lustre disk-based storage. We also evaluate I/O performance on the Cray XC30 system, Edison, at NERSC. Our results show a 1.2X to 6X performance advantage with subfiling compared to writing a single shared HDF5 file. We present our exploration of configurations, such as the number of subfiles and the number of Lustre storage targets for storing files, as optimization parameters to obtain superior I/O performance. Based on this exploration, we discuss recommendations for achieving good I/O performance as well as limitations of the subfiling feature.
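The compromise subfiling strikes can be shown with a rank-to-subfile mapping: N ranks share K subfiles instead of one file or N files. This particular formula is an assumption for illustration, not necessarily the mapping HDF5 uses:

def subfile_of(rank, n_ranks, n_subfiles):
    # Contiguous assignment of ranks to subfiles, between the extremes of
    # one shared file (n_subfiles=1) and file-per-process (n_subfiles=n_ranks).
    return rank * n_subfiles // n_ranks

# 8 ranks over 3 subfiles: ranks 0-2 -> subfile 0, 3-5 -> 1, 6-7 -> 2.
print([subfile_of(r, 8, 3) for r in range(8)])   # [0, 0, 0, 1, 1, 1, 2, 2]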
SAM-FS: LSC's New Solaris-Based Storage Management Product
NASA Technical Reports Server (NTRS)
Angell, Kent
1996-01-01
SAM-FS is a full-featured hierarchical storage management (HSM) system that operates as a file system on Solaris-based machines. The SAM-FS file system provides the user with all of the standard UNIX system utilities and calls, and adds some new commands, i.e., archive, release, stage, sls, sfind, and a family of maintenance commands. The system also offers enhancements such as high performance virtual disk read and write, control of the disk through an extent array, and the ability to dynamically allocate block size. SAM-FS provides 'archive sets', which are groupings of data to be copied to secondary storage. In practice, as soon as a file is written to disk, SAM-FS will make copies onto secondary media. SAM-FS is a scalable storage management system. The system can manage millions of files per system, though this is limited today by the speed of UNIX and its utilities. In the future, a new search algorithm will be implemented that will remove logical and performance restrictions on the number of files managed.
The medium is NOT the message or Indefinitely long-term file storage at Leeds University
NASA Technical Reports Server (NTRS)
Holdsworth, David
1996-01-01
Approximately 3 years ago we implemented an archive file storage system which embodies experiences gained over more than 25 years of using and writing file storage systems. It is the third in-house system that we have written, and all three systems have been adopted by other institutions. This paper discusses the requirements for long-term data storage in a university environment, and describes how our present system is designed to meet these requirements indefinitely. Particular emphasis is laid on experiences from past systems, and their influence on current system design. We also look at the influence of the IEEE-MSS standard. We currently have the system operating in five UK universities. The system operates in a multi-server environment, and is currently operational with UNIX (SunOS4, Solaris2, SGI-IRIX, HP-UX), NetWare3 and NetWare4. PCs logged on to NetWare can also archive and recover files that live on their hard disks.
LOGISTIC MANAGEMENT INFORMATION SYSTEM - MANUAL DATA STORAGE AND RETRIEVAL SYSTEM.
Logistics Management Information System. The procedures are applicable to manual storage and retrieval of all data used in the Logistics Management Information System (LMIS) and include the following: (1) Action Officer data source file. (2) Action Officer presentation format file. (3) LMI Coordination
MarFS, a Near-POSIX Interface to Cloud Objects
DOE Office of Scientific and Technical Information (OSTI.GOV)
Inman, Jeffrey Thornton; Vining, William Flynn; Ransom, Garrett Wilson
The engineering forces driving development of “cloud” storage have produced resilient, cost-effective storage systems that can scale to 100s of petabytes, with good parallel access and bandwidth. These features would make a good match for the vast storage needs of High-Performance Computing datacenters, but cloud storage gains some of its capability from its use of HTTP-style Representational State Transfer (REST) semantics, whereas most large datacenters have legacy applications that rely on POSIX file-system semantics. MarFS is an open-source project at Los Alamos National Laboratory that allows us to present cloud-style object-storage as a scalable near-POSIX file system. We have also developed a new storage architecture to improve bandwidth and scalability beyond what’s available in commodity object stores, while retaining their resilience and economy. Additionally, we present a scheme for scaling the POSIX interface to allow billions of files in a single directory and trillions of files in total.
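One plausible way to let a single directory hold billions of entries is to hash each entry across many metadata shards so no single server owns the directory. This is a guess at the flavor of the approach, not MarFS's documented scheme:

import hashlib

N_SHARDS = 4096   # number of metadata shards; the value is illustrative

def shard_of(dirpath, name):
    # Hash the full entry path so entries of one huge logical directory
    # spread evenly over the metadata shards.
    h = hashlib.sha1(f"{dirpath}/{name}".encode()).digest()
    return int.from_bytes(h[:4], "big") % N_SHARDS

print(shard_of("/project/run42", "out.000017"))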
Could Blobs Fuel Storage-Based Convergence between HPC and Big Data?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matri, Pierre; Alforov, Yevhen; Brandon, Alvaro
The increasingly large data sets processed on HPC platforms raise major challenges for the underlying storage layer. A promising alternative to POSIX-IO-compliant file systems are simpler blobs (binary large objects), or object storage systems. Such systems offer lower overhead and better performance at the cost of largely unused features such as file hierarchies or permissions. Similarly, blobs are increasingly considered for replacing distributed file systems for big data analytics or as a base for storage abstractions such as key-value stores or time-series databases. This growing interest in such object storage on HPC and big data platforms raises the question: Are blobs the right level of abstraction to enable storage-based convergence between HPC and Big Data? In this paper we study the impact of blob-based storage for real-world applications on HPC and cloud environments. The results show that blob-based storage convergence is possible, leading to a significant performance improvement on both platforms.
Storing files in a parallel computing system based on user-specified parser function
Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Manzanares, Adam; Torres, Aaron
2014-10-21
Techniques are provided for storing files in a parallel computing system based on a user-specified parser function. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a parser from the distributed application for processing the plurality of files prior to storage; and storing one or more of the plurality of files in one or more storage nodes of the parallel computing system based on the processing by the parser. The plurality of files comprise one or more of a plurality of complete files and a plurality of sub-files. The parser can optionally store only those files that satisfy one or more semantic requirements of the parser. The parser can also extract metadata from one or more of the files and the extracted metadata can be stored with one or more of the plurality of files and used for searching for files.
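A sketch of the user-supplied parser hook this abstract describes: the application's parser filters files and extracts metadata before the system stores them. The function shapes and names are assumptions for illustration:

def store_with_parser(files, parser, store):
    # The parser returns metadata for files worth keeping, or None for
    # files that fail its semantic requirements.
    for name, data in files:
        meta = parser(name, data)
        if meta is None:
            continue                      # parser rejected the file
        store[name] = {"data": data, "meta": meta}

def keep_checkpoints(name, data):
    # Example application parser: keep checkpoints, record their size.
    return {"bytes": len(data)} if name.startswith("ckpt") else None

store = {}
store_with_parser([("ckpt.7", b"x" * 10), ("scratch.tmp", b"y")],
                  keep_checkpoints, store)
print(list(store))                        # ['ckpt.7']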
Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Torres, Aaron
2015-02-03
Techniques are provided for storing files in a parallel computing system using sub-files with semantically meaningful boundaries. A method is provided for storing at least one file generated by a distributed application in a parallel computing system. The file comprises one or more of a complete file and a plurality of sub-files. The method comprises the steps of obtaining a user specification of semantic information related to the file; providing the semantic information as a data structure description to a data formatting library write function; and storing the semantic information related to the file with one or more of the sub-files in one or more storage nodes of the parallel computing system. The semantic information provides a description of data in the file. The sub-files can be replicated based on semantically meaningful boundaries.
Efficient proof of ownership for cloud storage systems
NASA Astrophysics Data System (ADS)
Zhong, Weiwei; Liu, Zhusong
2017-08-01
Cloud storage systems use deduplication technology to save disk space and bandwidth, but the use of this technology has attracted targeted security attacks: an attacker can deceive the server into granting ownership of a file merely by obtaining the hash value of the original file. In order to solve the above security problems and address the differing security requirements of files in a cloud storage system, an efficient and information-theoretically secure proof-of-ownership scheme that supports file rating is proposed. File rating is implemented with the K-means algorithm, and random seed technology and a pre-calculation method are used to achieve a safe and efficient proof of ownership. The scheme is information-theoretically secure and achieves better performance in the most sensitive areas, client-side I/O and computation.
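A generic proof-of-ownership sketch: the server probes random byte offsets so the client must hold the actual file, not merely its hash. The paper's rated, K-means-based scheme is more elaborate; everything below, including the probe count, is illustrative:

import hashlib, os, random

def challenge(file_len, n_probes=8):
    # Server side: derive probe offsets from a fresh random seed.
    rng = random.Random(os.urandom(16))
    return [rng.randrange(file_len) for _ in range(n_probes)]

def respond(data, offsets):
    # Prover side: hash the probed bytes as evidence of possession.
    return hashlib.sha256(bytes(data[o] for o in offsets)).hexdigest()

server_copy = os.urandom(4096)            # server already stores the file
client_copy = bytes(server_copy)          # honest client really has it
offsets = challenge(len(server_copy))
assert respond(client_copy, offsets) == respond(server_copy, offsets)
print("ownership accepted")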
Compiler-Directed File Layout Optimization for Hierarchical Storage Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ding, Wei; Zhang, Yuanrui; Kandemir, Mahmut
File layout of array data is a critical factor that affects the behavior of storage caches, and it has so far received little attention in the context of hierarchical storage systems. The main contribution of this paper is a compiler-driven file layout optimization scheme for hierarchical storage caches. This approach, fully automated within an optimizing compiler, analyzes a multi-threaded application code and determines a file layout for each disk-resident array referenced by the code, such that the performance of the target storage cache hierarchy is maximized. We tested our approach using 16 I/O intensive application programs and compared its performance against two previously proposed approaches under different cache space management schemes. Our experimental results show that the proposed approach improves the execution time of these parallel applications by 23.7% on average.
Parallel checksumming of data chunks of a shared data object using a log-structured file system
Bent, John M.; Faibish, Sorin; Grider, Gary
2016-09-06
Checksum values are generated and used to verify the data integrity. A client executing in a parallel computing system stores a data chunk to a shared data object on a storage node in the parallel computing system. The client determines a checksum value for the data chunk; and provides the checksum value with the data chunk to the storage node that stores the shared object. The data chunk can be stored on the storage node with the corresponding checksum value as part of the shared object. The storage node may be part of a Parallel Log-Structured File System (PLFS), and the client may comprise, for example, a Log-Structured File System client on a compute node or burst buffer. The checksum value can be evaluated when the data chunk is read from the storage node to verify the integrity of the data that is read.
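The per-chunk checksum flow the abstract describes can be sketched as hashing chunks in parallel and storing each checksum alongside its chunk; CRC32 and the thread pool here stand in for whatever checksum and parallelism the real system uses:

import zlib
from concurrent.futures import ThreadPoolExecutor

def checksummed_chunks(data, chunk_size=1 << 20):
    # Split a shared object into chunks and checksum them in parallel;
    # each checksum travels with its chunk to the storage node.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor() as pool:
        sums = list(pool.map(zlib.crc32, chunks))
    return list(zip(chunks, sums))

for chunk, crc in checksummed_chunks(b"x" * (3 << 20)):
    assert zlib.crc32(chunk) == crc       # read-side integrity check
print("all chunks verified")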
NASA Technical Reports Server (NTRS)
Fanselow, J. L.; Vavrus, J. L.
1984-01-01
ARCH, file archival system for DEC VAX, provides for easy offline storage and retrieval of arbitrary files on DEC VAX system. System designed to eliminate situations that tie up disk space and lead to confusion when different programmers develop different versions of same programs and associated files.
76 FR 66695 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2011-10-27
.... DWHS P04 System name: Reduction-In-Force Case Files (February 11, 2011, 76 FR 7825). Changes....'' * * * * * DWHS P04 System name: Reduction-In-Force Case Files. System location: Human Resources Directorate... system: Storage: Paper file folders. Retrievability: Filed alphabetically by last name. Safeguards...
High-performance mass storage system for workstations
NASA Technical Reports Server (NTRS)
Chiang, T.; Tang, Y.; Gupta, L.; Cooperman, S.
1993-01-01
Reduced Instruction Set Computer (RISC) workstations and Personal Computers (PCs) are very popular tools for office automation, command and control, scientific analysis, database management, and many other applications. However, when using Input/Output (I/O) intensive applications, the RISC workstations and PCs are often overburdened with the tasks of collecting, staging, storing, and distributing data. Also, by using standard high-performance peripherals and storage devices, the I/O function can still be a common bottleneck process. Therefore, the high-performance mass storage system, developed by Loral AeroSys' Independent Research and Development (IR&D) engineers, can offload I/O-related functions from a RISC workstation and provide high-performance I/O functions and external interfaces. The high-performance mass storage system has the capabilities to ingest high-speed real-time data, perform signal or image processing, and stage, archive, and distribute the data. This mass storage system uses a hierarchical storage structure, thus reducing the total data storage cost while maintaining high I/O performance. The high-performance mass storage system is a network of low-cost parallel processors and storage devices. The nodes in the network have special I/O functions such as: SCSI controller, Ethernet controller, gateway controller, RS232 controller, IEEE488 controller, and digital/analog converter. The nodes are interconnected through high-speed direct memory access links to form a network. The topology of the network is easily reconfigurable to maximize system throughput for various applications. This high-performance mass storage system takes advantage of a 'busless' architecture for maximum expandability. The mass storage system consists of magnetic disks, a WORM optical disk jukebox, and an 8mm helical scan tape to form a hierarchical storage structure. Commonly used files are kept on the magnetic disks for fast retrieval. The optical disks are used as archive media, and the tapes are used as backup media. The storage system is managed by the IEEE mass storage reference model-based UniTree software package. UniTree software will keep track of all files in the system, will automatically migrate the lesser used files to archive media, and will stage the files when needed by the system. The user can access the files without knowledge of their physical location. The high-performance mass storage system developed by Loral AeroSys will significantly boost the system I/O performance and reduce the overall data storage cost. This storage system provides a highly flexible and cost-effective architecture for a variety of applications (e.g., real-time data acquisition with a signal and image processing requirement, long-term data archiving and distribution, and image analysis and enhancement).
78 FR 73509 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-06
... entering into CENTCOM's theater of operation. DATES: This proposed action will be effective on January 7... files and Manpower Authorization files, including name; grade/rank; Social Security Number (SSN); DoD ID..., retaining, and disposing of records in the system: Storage: Electronic storage media. Retrievability...
NASA Technical Reports Server (NTRS)
Berg, R. F.; Holcomb, J. E.; Kelroy, E. A.; Levine, D. A.; Mee, C., III
1970-01-01
Generalized information storage and retrieval system capable of generating and maintaining a file, gathering statistics, sorting output, and generating final reports for output is reviewed. File generation and file maintenance programs written for the system are general purpose routines.
Storage Optimization of Educational System Data
ERIC Educational Resources Information Center
Boja, Catalin
2006-01-01
Methods used to minimize the size of data files are described. Indicators for measuring the size of files and databases are defined. The storage optimization process is based on selecting, from a multitude of data storage models, the one that satisfies the proposed problem objective: maximization or minimization of the optimum criterion that is…
LVFS: A Scalable Petabyte/Exabyte Data Storage System
NASA Astrophysics Data System (ADS)
Golpayegani, N.; Halem, M.; Masuoka, E. J.; Ye, G.; Devine, N. K.
2013-12-01
Managing petabytes of data with hundreds of millions of files is the first step necessary towards an effective big data computing and collaboration environment in a distributed system. We describe here the MODAPS LAADS Virtual File System (LVFS), a new storage architecture which replaces the previous MODAPS operational Level 1 Land Atmosphere Archive Distribution System (LAADS) NFS based approach to storing and distributing datasets from several instruments, such as MODIS, MERIS, and VIIRS. LAADS is responsible for the distribution of over 4 petabytes of data and over 300 million files across more than 500 disks. We present here the first LVFS big data comparative performance results and new capabilities not previously possible with the LAADS system. We consider two aspects in addressing inefficiencies of massive scales of data. First is dealing in a reliable and resilient manner with the volume and quantity of files in such a dataset, and second is minimizing the discovery and lookup times for accessing files in such large datasets. There are several popular file systems that successfully deal with the first aspect of the problem. Their solution, in general, is through distribution, replication, and parallelism of the storage architecture. The Hadoop Distributed File System (HDFS), Parallel Virtual File System (PVFS), and Lustre are examples of such file systems that deal with petabyte data volumes. The second aspect deals with data discovery among billions of files, the largest bottleneck in reducing access time. The metadata of a file, generally represented in a directory layout, is stored in ways that are not readily scalable. This is true for HDFS, PVFS, and Lustre as well. Recent experimental file systems, such as Spyglass or Pantheon, have attempted to address this problem through redesign of the metadata directory architecture. LVFS takes a radically different architectural approach by eliminating the need for a separate directory within the file system. The LVFS system replaces the NFS disk mounting approach of LAADS and utilizes the already existing highly optimized metadata database server, which is applicable to most scientific big data intensive compute systems. Thus, LVFS ties the existing storage system to the existing metadata infrastructure system, which we believe leads to a scalable exabyte virtual file system. The uniqueness of the implemented design is not limited to LAADS but can be employed with most scientific data processing systems. By utilizing Filesystem in Userspace (FUSE), a kernel module available in many operating systems, LVFS was able to replace the NFS system while staying POSIX compliant. As a result, the LVFS system becomes scalable to exabyte sizes owing to the use of highly scalable database servers optimized for metadata storage. The flexibility of the LVFS design allows it to organize data on the fly in different ways, such as by region, date, instrument, or product, without the need for duplication, symbolic links, or any other replication methods. We propose here a strategic reference architecture that addresses the inefficiencies of scientific petabyte/exabyte file system access through the dynamic integration of the observing system's large metadata file.
Integration experiences and performance studies of a COTS parallel archive system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Hsing-bung; Scott, Cody; Grider, Gary
2010-01-01
Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interface, and (f) utilize commercial-off-the-shelf (COTS) hardware. Parallel file systems have been asked to do the same thing but at one or more orders of magnitude faster in performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially metadata searching speeds such as more caching and less robust semantics. Currently the number of extreme highly scalable parallel archive solutions is very small especially those that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products including (a) doing parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, ls, tar, etc. We have successfully applied our working COTS Parallel Archive System to the current world's first petaflop/s computing system, LANL's Roadrunner, and demonstrated its capability to address requirements of future archival storage systems.
Integration experiments and performance studies of a COTS parallel archive system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Hsing-bung; Scott, Cody; Grider, Gary
2010-06-16
Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interfaces, and (f) utilize commercial-off-the-shelf (COTS) hardware. Parallel file systems have been asked to do the same things, but at one or more orders of magnitude higher performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially in metadata searching, adopting techniques such as more caching and less robust semantics. Currently the number of extremely scalable parallel archive solutions is very small, especially ones that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products, including (a) parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high-volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux, such as copy, move, ls, tar, etc. We have successfully applied our working COTS Parallel Archive System to the current world's first petaflop/s computing system, LANL's Roadrunner machine, and demonstrated its capability to address requirements of future archival storage systems.
Securing the AliEn File Catalogue - Enforcing authorization with accountable file operations
NASA Astrophysics Data System (ADS)
Schreiner, Steffen; Bagnasco, Stefano; Sankar Banerjee, Subho; Betev, Latchezar; Carminati, Federico; Vladimirovna Datskova, Olga; Furano, Fabrizio; Grigoras, Alina; Grigoras, Costin; Mendez Lorenzo, Patricia; Peters, Andreas Joachim; Saiz, Pablo; Zhu, Jianlin
2011-12-01
The AliEn Grid Services, as operated by the ALICE Collaboration in its global physics analysis grid framework, are based on a central File Catalogue together with a distributed set of storage systems and the possibility to register links to external data resources. This paper describes several identified vulnerabilities in the AliEn File Catalogue access protocol regarding fraud and unauthorized file alteration, and presents a more secure, revised design: a new mechanism, called the LFN Booking Table, is introduced in order to keep track of access authorization in the transient state of files entering or leaving the File Catalogue. Due to a simplification of the original Access Envelope mechanism for xrootd-protocol-based storage systems, fundamental computational improvements of the mechanism were achieved, as well as an up to 50% reduction of the credential's size. By extending the access protocol with signed status messages from the underlying storage system, the File Catalogue receives trusted information about a file's size and checksum, and the protocol is no longer dependent on client trust. Altogether, the revised design complies with atomic and consistent transactions and allows for accountable, authentic, and traceable file operations. This paper describes these changes as part of and beyond the development of AliEn version 2.19.
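The signed-status-message idea — the storage element reporting a file's size and checksum in a form the File Catalogue can verify without trusting the client — can be sketched with a keyed MAC. This is a deliberate simplification under assumed names; AliEn's actual Access Envelopes use a different signature scheme and the xrootd protocol.

    import hashlib, hmac, json

    SHARED_KEY = b"storage-element-key"   # hypothetical key shared with the catalogue

    def storage_report(lfn, size, checksum):
        """Storage element: emit a status message the catalogue can verify."""
        body = json.dumps({"lfn": lfn, "size": size, "md5": checksum}, sort_keys=True)
        sig = hmac.new(SHARED_KEY, body.encode(), hashlib.sha256).hexdigest()
        return body, sig

    def catalogue_accept(body, sig):
        """File Catalogue: trust size/checksum only if the signature verifies,
        so a client cannot forge what the storage system actually holds."""
        expected = hmac.new(SHARED_KEY, body.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expected):
            raise ValueError("untrusted status message")
        return json.loads(body)

    body, sig = storage_report("/alice/data/run1234/f.root", 1048576, "9e107d9d...")
    print(catalogue_accept(body, sig))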
Checkpoint-Restart in User Space
DOE Office of Scientific and Technical Information (OSTI.GOV)
CRUISE implements a user-space file system that stores data in main memory and transparently spills over to other storage, such as local flash memory or the parallel file system, as needed. CRUISE also exposes file contents for remote direct memory access, allowing external tools to copy files to the parallel file system in the background with reduced CPU interruption.
Jefferson Lab Mass Storage and File Replication Services
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ian Bird; Ying Chen; Bryan Hess
Jefferson Lab has implemented a scalable, distributed, high performance mass storage system - JASMine. The system is entirely implemented in Java, provides access to robotic tape storage, and includes disk cache and stage manager components. The disk manager subsystem may be used independently to manage stand-alone disk pools. The system includes a scheduler to provide policy-based access to the storage systems. Security is provided by pluggable authentication modules and is implemented at the network socket level. The tape and disk cache systems have well-defined interfaces in order to provide integration with grid-based services. The system is in production and being used to archive 1 TB per day from the experiments, and currently moves over 2 TB per day in total. This paper describes the architecture of JASMine, discusses the rationale for building the system, and presents a transparent third-party file replication service that moves data to collaborating institutes using JASMine, XML, and servlet technology interfacing to grid-based file transfer mechanisms.
DICOM implementation on online tape library storage system
NASA Astrophysics Data System (ADS)
Komo, Darmadi; Dai, Hailei L.; Elghammer, David; Levine, Betty A.; Mun, Seong K.
1998-07-01
The main purpose of this project is to implement a Digital Imaging and Communications in Medicine (DICOM) compliant online tape library system over the Internet. Once finished, the system will be used to store medical exams generated from the U.S. Army Mobile Army Surgical Hospital (MASH) in Tuzla, Bosnia. A modified UC Davis implementation of the DICOM storage class is used for this project. DICOM storage class user and provider are implemented as the system's interface to the Internet. The DICOM software provides flexible configuration options, such as types of modalities and trusted remote DICOM hosts. Metadata is extracted from each exam and indexed in a relational database for query and retrieve purposes. The medical images are stored inside the Wolfcreek-9360 tape library system from StorageTek Corporation. The tape library system has nearline access to more than 1000 tapes. Each tape has a capacity of 800 megabytes, making the total nearline tape capacity around 1 terabyte. The tape library uses the Application Storage Manager (ASM), which provides cost-effective file management, storage, archival, and retrieval services. ASM automatically and transparently copies files from expensive magnetic disk to less expensive nearline tape, and restores the files when they are needed. The ASM also provides a crash recovery tool, which enables an entire file system restore in a short time. A graphical user interface (GUI) is used to view the contents of the storage systems. This GUI also allows users to retrieve the stored exams and send them anywhere on the Internet using DICOM protocols. With the integration of the different components of the system, we have implemented a high capacity online tape library storage system that is flexible and easy to use. Using tape as an alternative storage medium, as opposed to magnetic disk, has great potential for cost savings in terms of dollars per megabyte of storage. As this system matures, Hospital Information System/Radiology Information System (HIS/RIS) or other components can be developed as interfaces to the outside world, thus widening the usage of the tape library system.
Emerging Cyber Infrastructure for NASA's Large-Scale Climate Data Analytics
NASA Astrophysics Data System (ADS)
Duffy, D.; Spear, C.; Bowen, M. K.; Thompson, J. H.; Hu, F.; Yang, C. P.; Pierce, D.
2016-12-01
The resolution of NASA climate and weather simulations has grown dramatically over the past few years, with the highest-fidelity models reaching down to 1.5 km global resolution. With each doubling of the resolution, the resulting data sets grow by a factor of eight in size. As the climate and weather models push the envelope even further, a new infrastructure to store data and provide large-scale data analytics is necessary. The NASA Center for Climate Simulation (NCCS) has deployed the Data Analytics Storage Service (DASS), which combines scalable storage with the ability to perform in-situ analytics. Within this system, large, commonly used data sets are stored in a POSIX file system (write once/read many); examples of data stored include Landsat, MERRA2, observing system simulation experiments, and high-resolution downscaled reanalysis. The total size of this repository is on the order of 15 petabytes of storage. In addition to the POSIX file system, the NCCS has deployed file system connectors to enable emerging analytics built on top of the Hadoop File System (HDFS) to run on the same storage servers within the DASS. Coupled with a custom spatiotemporal indexing approach, users can now run emerging analytical operations built on MapReduce and Spark on the same data files stored within the POSIX file system without having to make additional copies. This presentation will discuss the architecture of this system and present benchmark performance measurements ranging from traditional TeraSort and WordCount to large-scale climate analytical operations on NetCDF data.
ERIC Educational Resources Information Center
Ranade, Sanjay; Schraeder, Jeff
1991-01-01
Presents an overview of the mass storage market and discusses mass storage systems as part of computer networks. Systems for personal computers, workstations, minicomputers, and mainframe computers are described; file servers are explained; system integration issues are raised; and future possibilities are suggested. (LRW)
Architecture and method for a burst buffer using flash technology
Tzelnic, Percy; Faibish, Sorin; Gupta, Uday K.; Bent, John; Grider, Gary Alan; Chen, Hsing-bung
2016-03-15
A parallel supercomputing cluster includes compute nodes interconnected in a mesh of data links for executing an MPI job, and solid-state storage nodes each linked to a respective group of the compute nodes for receiving checkpoint data from the respective compute nodes, and magnetic disk storage linked to each of the solid-state storage nodes for asynchronous migration of the checkpoint data from the solid-state storage nodes to the magnetic disk storage. Each solid-state storage node presents a file system interface to the MPI job, and multiple MPI processes of the MPI job write the checkpoint data to a shared file in the solid-state storage in a strided fashion, and the solid-state storage node asynchronously migrates the checkpoint data from the shared file in the solid-state storage to the magnetic disk storage and writes the checkpoint data to the magnetic disk storage in a sequential fashion.
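The strided write pattern this patent describes — N processes interleaving their checkpoint blocks into one shared file on the solid-state node, which is later drained sequentially to disk — can be sketched as follows. Plain os.pwrite stands in for the file-system interface the storage node presents; the block size, process count, and file name are illustrative only.

    import os

    BLOCK = 4096          # illustrative checkpoint block size
    NPROCS = 4            # number of MPI processes sharing the file

    fd = os.open("checkpoint.shared", os.O_CREAT | os.O_RDWR, 0o644)

    def write_strided(rank, blocks):
        """Each rank writes its i-th block at offset (i*NPROCS + rank)*BLOCK,
        so blocks from all ranks interleave in the shared checkpoint file."""
        for i, data in enumerate(blocks):
            assert len(data) == BLOCK
            os.pwrite(fd, data, (i * NPROCS + rank) * BLOCK)

    # Simulate four ranks, two blocks each (a real job would run in parallel).
    for rank in range(NPROCS):
        write_strided(rank, [bytes([rank]) * BLOCK for _ in range(2)])
    os.close(fd)

    # A migration daemon on the solid-state node would later read this file
    # and stream it sequentially to magnetic disk, as the patent describes.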
Proof of cipher text ownership based on convergence encryption
NASA Astrophysics Data System (ADS)
Zhong, Weiwei; Liu, Zhusong
2017-08-01
Cloud storage systems save disk space and bandwidth through deduplication technology, but the use of this technology has been the target of security attacks: an attacker can obtain an entire file merely by presenting its hash value, deceiving the server into granting ownership of the file. In order to solve these security problems and address the differing security requirements of cloud storage system files, an efficient, information-theoretically secure proof-of-ownership scheme is proposed. This scheme protects the data through convergent encryption and uses an improved block-level proof-of-ownership scheme, enabling block-level client-side deduplication to achieve an efficient and secure cloud storage deduplication scheme.
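Convergent encryption, the building block used here, derives the key from the plaintext itself so identical files encrypt to identical ciphertexts and remain deduplicable. A minimal sketch follows, using the third-party cryptography package; the paper's block-level ownership proofs are not reproduced, and the deterministic nonce is safe only because the content-derived key is unique to each plaintext.

    import hashlib
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def convergent_encrypt(plaintext: bytes):
        """Key and nonce are derived deterministically from the content,
        so two users uploading the same file produce the same ciphertext."""
        key = hashlib.sha256(plaintext).digest()      # 32-byte AES key
        nonce = hashlib.sha256(key).digest()[:12]     # deterministic by design
        ciphertext = AESGCM(key).encrypt(nonce, plaintext, None)
        return key, ciphertext

    def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
        nonce = hashlib.sha256(key).digest()[:12]
        return AESGCM(key).decrypt(nonce, ciphertext, None)

    k1, c1 = convergent_encrypt(b"same file contents")
    k2, c2 = convergent_encrypt(b"same file contents")
    assert c1 == c2     # the server can deduplicate without seeing plaintext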
Storage of sparse files using parallel log-structured file system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bent, John M.; Faibish, Sorin; Grider, Gary
A sparse file is stored without holes by storing a data portion of the sparse file using a parallel log-structured file system; and generating an index entry for the data portion, the index entry comprising a logical offset, physical offset and length of the data portion. The holes can be restored to the sparse file upon a reading of the sparse file. The data portion can be stored at a logical end of the sparse file. Additional storage efficiency can optionally be achieved by (i) detecting a write pattern for a plurality of the data portions and generating a single patterned index entry for the plurality of the patterned data portions; and/or (ii) storing the patterned index entries for a plurality of the sparse files in a single directory, wherein each entry in the single directory comprises an identifier of a corresponding sparse file.
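A sketch of the index mechanism described here: data portions are appended hole-free to a log, and each portion gets an index entry recording its logical offset, physical offset, and length so the holes can be reconstructed as zeros on read. Field names are illustrative, not PLFS's actual on-disk format, and the patterned-entry optimization is omitted.

    from dataclasses import dataclass

    @dataclass
    class IndexEntry:
        logical_offset: int    # where the data lives in the sparse file
        physical_offset: int   # where it was appended in the log
        length: int

    log = bytearray()          # hole-free log of data portions
    index = []                 # one entry per stored portion

    def write_portion(logical_offset, data):
        index.append(IndexEntry(logical_offset, len(log), len(data)))
        log.extend(data)       # appended at the log's end, no hole stored

    def read_sparse(total_size):
        """Restore holes (as zeros) when the sparse file is read back."""
        out = bytearray(total_size)
        for e in index:
            out[e.logical_offset:e.logical_offset + e.length] = \
                log[e.physical_offset:e.physical_offset + e.length]
        return bytes(out)

    write_portion(0, b"header")
    write_portion(1 << 20, b"tail")      # a 1 MiB hole precedes this portion
    assert len(log) == 10                # only real data occupies storage
    assert read_sparse((1 << 20) + 4)[-4:] == b"tail"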
Volume serving and media management in a networked, distributed client/server environment
NASA Technical Reports Server (NTRS)
Herring, Ralph H.; Tefend, Linda L.
1993-01-01
The E-Systems Modular Automated Storage System (EMASS) is a family of hierarchical mass storage systems providing complete storage/'file space' management. The EMASS volume server provides the flexibility to work with different clients (file servers), different platforms, and different archives with a 'mix and match' capability. The EMASS design considers all file management programs as clients of the volume server system. System storage capacities are tailored to customer needs ranging from small data centers to large central libraries serving multiple users simultaneously. All EMASS hardware is commercial off the shelf (COTS), selected to provide the performance and reliability needed in current and future mass storage solutions. All interfaces use standard commercial protocols and networks suitable to service multiple hosts. EMASS is designed to efficiently store and retrieve in excess of 10,000 terabytes of data. Current clients include CRAY's YMP Model E based Data Migration Facility (DMF), IBM's RS/6000 based Unitree, and CONVEX based EMASS File Server software. The VolSer software provides the capability to accept client or graphical user interface (GUI) commands from the operator's console and translate them to the commands needed to control any configured archive. The VolSer system offers advanced features to enhance media handling and particularly media mounting such as: automated media migration, preferred media placement, drive load leveling, registered MediaClass groupings, and drive pooling.
NASA Astrophysics Data System (ADS)
Poat, M. D.; Lauret, J.; Betts, W.
2015-12-01
The STAR online computing infrastructure has become an intensive, dynamic system used for first-hand data collection and analysis, resulting in a dense collection of data output. As we have transitioned to our current state, inefficient, limited storage systems have become an impediment to fast feedback to online shift crews. Motivation for a centrally accessible, scalable and redundant distributed storage system had become a necessity in this environment. OpenStack Swift Object Storage and Ceph Object Storage are two promising technologies whose community use and development have led to success elsewhere. In this contribution, OpenStack Swift and Ceph have been put to the test with single and parallel I/O tests, emulating real-world scenarios for data processing and workflows. The Ceph file system storage, offering a POSIX-compliant file system mounted similarly to an NFS share, was of particular interest as it aligned with our requirements and was retained as our solution. I/O performance tests were run against the Ceph POSIX file system and presented surprising results indicating true potential for fast I/O and reliability. STAR's online compute farm has historically been used for job submission and first-hand data analysis. Reusing the online compute farm to maintain a storage cluster alongside job submission will be an efficient use of the current infrastructure.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perelmutov, T.; Bakken, J.; Petravick, D.
Storage Resource Managers (SRMs) are middleware components whose function is to provide dynamic space allocation and file management on shared storage components on the Grid [1,2]. SRMs support protocol negotiation and a reliable replication mechanism. The SRM standard supports independent SRM implementations, allowing for uniform access to heterogeneous storage elements. SRMs allow site-specific policies at each location. Resource reservations made through SRMs have limited lifetimes and allow for automatic collection of unused resources, thus preventing the clogging of storage systems with ''orphan'' files. At Fermilab, data handling systems use the SRM management interface to the dCache Distributed Disk Cache [5,6] and the Enstore Tape Storage System [15] as key components to satisfy current and future user requests [4]. The SAM project offers the SRM interface for its internal caches as well.
NASA Langley Research Center's distributed mass storage system
NASA Technical Reports Server (NTRS)
Pao, Juliet Z.; Humes, D. Creig
1993-01-01
There is a trend in institutions with high performance computing and data management requirements to explore mass storage systems with peripherals directly attached to a high speed network. The Distributed Mass Storage System (DMSS) Project at NASA LaRC is building such a system and expects to put it into production use by the end of 1993. This paper presents the design of the DMSS, some experiences in its development and use, and a performance analysis of its capabilities. The special features of this system are: (1) workstation class file servers running UniTree software; (2) third party I/O; (3) HIPPI network; (4) HIPPI/IPI3 disk array systems; (5) Storage Technology Corporation (STK) ACS 4400 automatic cartridge system; (6) CRAY Research Incorporated (CRI) CRAY Y-MP and CRAY-2 clients; (7) file server redundancy provision; and (8) a transition mechanism from the existent mass storage system to the DMSS.
Understanding Customer Dissatisfaction with Underutilized Distributed File Servers
NASA Technical Reports Server (NTRS)
Riedel, Erik; Gibson, Garth
1996-01-01
An important trend in the design of storage subsystems is a move toward direct network attachment. Network-attached storage offers the opportunity to off-load distributed file system functionality from dedicated file server machines and execute many requests directly at the storage devices. For this strategy to lead to better performance, as perceived by users, the response time of distributed operations must improve. In this paper we analyze measurements of an Andrew file system (AFS) server that we recently upgraded in an effort to improve client performance in our laboratory. While the original server's overall utilization was only about 3%, we show how burst loads were sufficiently intense to lead to periods of poor response time significant enough to trigger customer dissatisfaction. In particular, we show how, after adjusting for network load and traffic to non-project servers, 50% of the variation in client response time was explained by variation in server central processing unit (CPU) use. That is, clients saw long response times in large part because the server was often over-utilized when it was used at all. Using these measures, we see that off-loading file server work in a network-attached storage architecture has the potential to benefit user response time. Computational power in such a system scales directly with storage capacity, so the slowdown during burst periods should be reduced.
NASA Technical Reports Server (NTRS)
Kobler, Benjamin (Editor); Hariharan, P. C. (Editor)
1998-01-01
This document contains copies of those technical papers received in time for publication prior to the Sixth Goddard Conference on Mass Storage Systems and Technologies which is being held in cooperation with the Fifteenth IEEE Symposium on Mass Storage Systems at the University of Maryland-University College Inn and Conference Center March 23-26, 1998. As one of an ongoing series, this Conference continues to provide a forum for discussion of issues relevant to the management of large volumes of data. The Conference encourages all interested organizations to discuss long term mass storage requirements and experiences in fielding solutions. Emphasis is on current and future practical solutions addressing issues in data management, storage systems and media, data acquisition, long term retention of data, and data distribution. This year's discussion topics include architecture, tape optimization, new technology, performance, standards, site reports, vendor solutions. Tutorials will be available on shared file systems, file system backups, data mining, and the dynamics of obsolescence.
Organizing and Typing Persistent Objects Within an Object-Oriented Framework
NASA Technical Reports Server (NTRS)
Madany, Peter W.; Campbell, Roy H.
1991-01-01
Conventional operating systems provide little or no direct support for the services required for an efficient persistent object system implementation. We have built a persistent object scheme using a customization and extension of an object-oriented operating system called Choices. Choices includes a framework for the storage of persistent data that is suited to the construction of both conventional file systems and persistent object systems. In this paper we describe three areas in which persistent object support differs from file system support: storage organization, storage management, and typing. Persistent object systems must support various sizes of objects efficiently. Customizable containers, which are themselves persistent objects and can be nested, support a wide range of object sizes in Choices. Collections of persistent objects that are accessed as an aggregate and collections of light-weight persistent objects can be clustered in containers that are nested within containers for larger objects. Automated garbage collection schemes are added to storage management and have a major impact on persistent object applications. The Choices persistent object store provides extensible sets of persistent object types. The store contains not only the data for persistent objects but also the names of the classes to which they belong and the code for the operations of the classes. Besides presenting persistent object storage organization, storage management, and typing, this paper discusses how persistent objects are named and used within the Choices persistent data/file system framework.
Design and evaluation of a hybrid storage system in HEP environment
NASA Astrophysics Data System (ADS)
Xu, Qi; Cheng, Yaodong; Chen, Gang
2017-10-01
Nowadays, High Energy Physics experiments produce a large amount of data. These data are stored in mass storage systems, which need to balance cost, performance, and manageability. In this paper, a hybrid storage system including SSDs (solid-state drives) and HDDs (hard disk drives) is designed to accelerate data analysis while maintaining a low cost. The performance of file access is a decisive factor for the HEP computing system. A new deployment model of a hybrid storage system in High Energy Physics is proposed and shown to have higher I/O performance. Detailed evaluation methods and evaluations of the SSD/HDD ratio and the size of the logical block are also given. In all evaluations, sequential read, sequential write, random read, and random write are tested to give comprehensive results. The results show that the hybrid storage system performs well in areas such as accessing big files in HEP.
Integrating UniTree with the data migration API
NASA Technical Reports Server (NTRS)
Schrodel, David G.
1994-01-01
The Data Migration Application Programming Interface (DMAPI) has the potential to allow developers of open systems Hierarchical Storage Management (HSM) products to virtualize native file systems without the requirement to make changes to the underlying operating system. This paper describes advantages of virtualizing native file systems in hierarchical storage management systems, the DMAPI at a high level, what the goals are for the interface, and the integration of the Convex UniTree+HSM with DMAPI along with some of the benefits derived in the resulting product.
Rethinking key–value store for parallel I/O optimization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kougkas, Anthony; Eslami, Hassan; Sun, Xian-He
2015-01-26
Key-value stores are being widely used as the storage system for large-scale internet services and cloud storage systems. However, they are rarely used in HPC systems, where parallel file systems are the dominant storage solution. In this study, we examine the architecture differences and performance characteristics of parallel file systems and key-value stores. We propose using key-value stores to optimize overall Input/Output (I/O) performance, especially for workloads that parallel file systems cannot handle well, such as the cases with intense data synchronization or heavy metadata operations. We conducted experiments with several synthetic benchmarks, an I/O benchmark, and a real application. We modeled the performance of these two systems using collected data from our experiments, and we provide a predictive method to identify which system offers better I/O performance given a specific workload. The results show that we can optimize the I/O performance in HPC systems by utilizing key-value stores.
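A toy illustration of the workload distinction the authors draw: many small, metadata-heavy writes map poorly onto one-file-per-record on a file system but naturally onto a key-value store. The sketch below uses Python's built-in dbm module as a stand-in for a real key-value store; record counts and sizes are invented for illustration.

    import dbm, os, tempfile, time

    records = {f"rank{r}:step{s}".encode(): os.urandom(64)
               for r in range(64) for s in range(16)}

    workdir = tempfile.mkdtemp()

    # One-file-per-record: each tiny write costs metadata operations
    # (create + open + close), the pattern parallel file systems handle worst.
    t0 = time.time()
    for key, value in records.items():
        with open(os.path.join(workdir, key.decode()), "wb") as f:
            f.write(value)
    t_files = time.time() - t0

    # Key-value store: the same records become cheap put operations.
    t0 = time.time()
    with dbm.open(os.path.join(workdir, "kv"), "c") as kv:
        for key, value in records.items():
            kv[key] = value
    t_kv = time.time() - t0

    print(f"1024 small records: files {t_files:.3f}s vs key-value {t_kv:.3f}s")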
Optimizing the Use of Storage Systems Provided by Cloud Computing Environments
NASA Astrophysics Data System (ADS)
Gallagher, J. H.; Potter, N.; Byrne, D. A.; Ogata, J.; Relph, J.
2013-12-01
Cloud computing systems present a set of features that include familiar computing resources (albeit augmented to support dynamic scaling of processing power) bundled with a mix of conventional and unconventional storage systems. The Linux base on which many cloud environments (e.g., Amazon) are built makes it tempting to assume that any Unix software will run efficiently in this environment without change. OPeNDAP and NODC collaborated on a short project to explore how the S3 and Glacier storage systems provided by the Amazon cloud computing infrastructure could be used with a data server developed primarily to access data stored in a traditional Unix file system. Our work used the Amazon cloud system, but we strived for designs that could be adapted easily to other systems like OpenStack. We also evaluated the different architectures from a computer security perspective. We found that there are considerable issues associated with treating S3 as if it were a traditional file system, even though doing so is conceptually simple. These issues include performance penalties: using a software tool that emulates a traditional file system to store data in S3 performs poorly compared to storing data directly in S3. We also found there are important benefits beyond performance to ensuring that data written to S3 can be directly accessed without relying on a specific software tool. To provide a hierarchical organization to the data stored in S3, we wrote 'catalog' files, using XML. These catalog files map discrete files to S3 access keys. Like a traditional file system's directories, the catalogs can also contain references to other catalogs, providing a simple but effective hierarchy overlaid on top of S3's flat storage space. An added benefit of these catalogs is that they can be viewed in a web browser; our storage scheme provides both efficient access for the data server and access via a web browser. We also looked at the Glacier storage system and found that its response characteristics are very different from a traditional file system or database; it behaves like a near-line storage system. To be used by a traditional data server, the underlying access protocol must support asynchronous accesses, because the Glacier system takes a minimum of four hours to deliver any data object; systems built with the expectation of instant access (i.e., most web systems) must be fundamentally changed to use Glacier. Part of a related project has been to develop an asynchronous access mode for OPeNDAP, and we have developed a design using that new addition to the DAP protocol with Glacier as a near-line mass store. In summary, we found that both S3 and Glacier require special treatment to be used effectively by a data server. It is important to add (new) interfaces to data servers that enable them to use these storage devices through their native interfaces. We also found that our designs could easily map to a cloud environment based on OpenStack. Lastly, we noted that while these designs invite more liberal use of remote references for data objects, that can expose software to new security risks.
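A sketch of the catalog idea described above: an XML document maps logical file names to S3 access keys and to child catalogs, overlaying a hierarchy on S3's flat key space. Element and attribute names here are invented for illustration; the project's actual catalog schema is not reproduced.

    import xml.etree.ElementTree as ET

    def build_catalog(name, files, subcatalogs=()):
        """files: mapping of logical name -> S3 access key;
        subcatalogs: S3 keys of child catalog documents."""
        root = ET.Element("catalog", name=name)
        for logical, s3key in files.items():
            ET.SubElement(root, "file", name=logical, s3key=s3key)
        for key in subcatalogs:
            ET.SubElement(root, "catalog_ref", s3key=key)
        return ET.tostring(root, encoding="unicode")

    xml_doc = build_catalog(
        "sst/2013",
        {"sst_2013_07.nc": "data/9f31ab", "sst_2013_08.nc": "data/77c0de"},
        subcatalogs=["catalogs/sst_2012.xml"],
    )
    print(xml_doc)
    # The data server resolves a path like /sst/2013/sst_2013_07.nc by walking
    # catalog_ref elements instead of file-system directories; a web browser
    # can render the same documents directly.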
PVFS 2000: An operational parallel file system for Beowulf
NASA Technical Reports Server (NTRS)
Ligon, Walt
2004-01-01
The approach has been to develop the Parallel Virtual File System version 2 (PVFS2), retaining the basic philosophy of the original file system but completely rewriting the code. The architecture is organized into server and client components built on common abstraction layers. BMI - BMI is the network abstraction layer. It is designed with a common driver and modules for each protocol supported. The interface is non-blocking and provides mechanisms for optimizations, including pinning user buffers. Currently, TCP/IP and GM (Myrinet) modules have been implemented. Trove - Trove is the storage abstraction layer. It provides for storing both data spaces and name/value pairs. Trove can also be implemented using different underlying storage mechanisms, including native files, raw disk partitions, SQL and other databases. The current implementation uses native files for data spaces and Berkeley DB for name/value pairs.
NASA ARCH- A FILE ARCHIVAL SYSTEM FOR THE DEC VAX
NASA Technical Reports Server (NTRS)
Scott, P. J.
1994-01-01
The function of the NASA ARCH system is to provide a permanent storage area for files that are infrequently accessed. The NASA ARCH routines were designed to provide a simple mechanism by which users can easily store and retrieve files. The user treats NASA ARCH as the interface to a black box where files are stored. There are only five NASA ARCH user commands, even though NASA ARCH employs standard VMS directives and the VAX BACKUP utility. Special care is taken to provide the security needed to ensure file integrity over a period of years. The archived files may exist in any of three storage areas: a temporary buffer, the main buffer, and a magnetic tape library. When the main buffer fills up, it is transferred to permanent magnetic tape storage and deleted from disk. Files may be restored from any of the three storage areas. A single file, multiple files, or entire directories can be stored and retrieved. Archived entities retain the same name, extension, version number, and VMS file protection scheme as they had in the user's account prior to archival. NASA ARCH is capable of handling up to 7 directory levels. Wildcards are supported. User commands include TEMPCOPY, DISKCOPY, DELETE, RESTORE, and DIRECTORY. The DIRECTORY command searches a directory of savesets covering all three archival areas, listing matches according to area, date, filename, or other criteria supplied by the user. The system manager commands include 1) ARCHIVE - to transfer the main buffer to duplicate magnetic tapes, 2) REPORT - to determine when the main buffer is full enough to archive, 3) INCREMENT - to back up the partially filled main buffer, and 4) FULLBACKUP - to back up the entire main buffer. On-line help files are provided for all NASA ARCH commands. NASA ARCH is written in DEC VAX DCL for interactive execution and has been implemented on a DEC VAX computer operating under VMS 4.X. This program was developed in 1985.
Using the K-25 C TD Common File System: A guide to CFSI (CFS Interface)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1989-12-01
A CFS (Common File System) is a large, centralized file management and storage facility based on software developed at Los Alamos National Laboratory. This manual is a guide to use of the CFS available to users of the Cray UNICOS system at Martin Marietta Energy Systems, Inc., in Oak Ridge, Tennessee.
Experiences From NASA/Langley's DMSS Project
NASA Technical Reports Server (NTRS)
1996-01-01
There is a trend in institutions with high performance computing and data management requirements to explore mass storage systems with peripherals directly attached to a high speed network. The Distributed Mass Storage System (DMSS) Project at the NASA Langley Research Center (LaRC) has placed such a system into production use. This paper will present the experiences, both good and bad, we have had with this system since putting it into production usage. The system is comprised of: 1) National Storage Laboratory (NSL)/UniTree 2.1, 2) IBM 9570 HIPPI-attached disk arrays (both RAID 3 and RAID 5), 3) an IBM RS6000 server, 4) HIPPI/IPI3 third party transfers between the disk array systems and the supercomputer clients, a CRAY Y-MP and a CRAY 2, 5) a "warm spare" file server, 6) transition software to convert from CRAY's Data Migration Facility (DMF) based system to DMSS, 7) an NSC PS32 HIPPI switch, and 8) a STK 4490 robotic library accessed from the IBM RS6000 block mux interface. This paper will cover: the performance of the DMSS in the following areas: file transfer rates, migration and recall, and file manipulation (listing, deleting, etc.); the appropriateness of a workstation class of file server for NSL/UniTree with LaRC's present storage requirements in mind; the role of the third party transfers between the supercomputers and the DMSS disk array systems in DMSS; a detailed comparison (both in performance and functionality) between the DMF and DMSS systems; LaRC's enhancements to the NSL/UniTree system administration environment; the mechanism for DMSS to provide file server redundancy; the statistics on the availability of DMSS; and the design of and experiences with the locally developed transparent transition software which allowed us to make over 1.5 million DMF files available to NSL/UniTree with minimal system outage.
Dynamic Non-Hierarchical File Systems for Exascale Storage
DOE Office of Scientific and Technical Information (OSTI.GOV)
Long, Darrell E.; Miller, Ethan L
This constitutes the final report for “Dynamic Non-Hierarchical File Systems for Exascale Storage”. The ultimate goal of this project was to improve data management in scientific computing and high-end computing (HEC) applications, and to achieve this goal we proposed: to develop the first, HEC-targeted, file system featuring rich metadata and provenance collection, extreme scalability, and future storage hardware integration as core design goals, and to evaluate and develop a flexible non-hierarchical file system interface suitable for providing more powerful and intuitive data management interfaces to HEC and scientific computing users. Data management is swiftly becoming a serious problem in the scientific community – while copious amounts of data are good for obtaining results, finding the right data is often daunting and sometimes impossible. Scientists participating in a Department of Energy workshop noted that most of their time was spent “...finding, processing, organizing, and moving data and it’s going to get much worse”. Scientists should not be forced to become data mining experts in order to retrieve the data they want, nor should they be expected to remember the naming convention they used several years ago for a set of experiments they now wish to revisit. Ideally, locating the data you need would be as easy as browsing the web. Unfortunately, existing data management approaches are usually based on hierarchical naming, a 40 year-old technology designed to manage thousands of files, not exabytes of data. Today’s systems do not take advantage of the rich array of metadata that current high-end computing (HEC) file systems can gather, including content-based metadata and provenance information. As a result, current metadata search approaches are typically ad hoc and often work by providing a parallel management system to the “main” file system, as is done in Linux (the locate utility), personal computers, and enterprise search appliances. These search applications are often optimized for a single file system, making it difficult to move files and their metadata between file systems. Users have tried to solve this problem in several ways, including the use of separate databases to index file properties, the encoding of file properties into file names, and separately gathering and managing provenance data, but none of these approaches has worked well, either due to limited usefulness or scalability, or both. Our research addressed several key issues: High-performance, real-time metadata harvesting: extracting important attributes from files dynamically and immediately updating indexes used to improve search; Transparent, automatic, and secure provenance capture: recording the data inputs and processing steps used in the production of each file in the system; Scalable indexing: indexes that are optimized for integration with the file system; Dynamic file system structure: our approach provides dynamic directories similar to those in semantic file systems, but these are the native organization rather than a feature grafted onto a conventional system. In addition to these goals, our research effort will include evaluating the impact of new storage technologies on the file system design and performance. In particular, the indexing and metadata harvesting functions can potentially benefit from the performance improvements promised by new storage class memories.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bent, John M.; Faibish, Sorin; Pedone, Jr., James M.
A cluster file system is provided having a plurality of distributed metadata servers with shared access to one or more shared low latency persistent key-value metadata stores. A metadata server comprises an abstract storage interface comprising a software interface module that communicates with at least one shared persistent key-value metadata store providing a key-value interface for persistent storage of key-value metadata. The software interface module provides the key-value metadata to the at least one shared persistent key-value metadata store in a key-value format. The shared persistent key-value metadata store is accessed by a plurality of metadata servers. A metadata request can be processed by a given metadata server independently of other metadata servers in the cluster file system. A distributed metadata storage environment is also disclosed that comprises a plurality of metadata servers having an abstract storage interface to at least one shared persistent key-value metadata store.
Mass-storage management for distributed image/video archives
NASA Astrophysics Data System (ADS)
Franchi, Santina; Guarda, Roberto; Prampolini, Franco
1993-04-01
The realization of an image/video database requires a specific design for both database structures and mass storage management. This issue was addressed in the project of the digital image/video database system designed at the IBM SEMEA Scientific & Technical Solution Center. Proper database structures have been defined to catalog image/video coding techniques with the related parameters, and the description of image/video contents. User workstations and servers are distributed along a local area network. Image/video files are not managed directly by the DBMS server. Because of their large size, they are stored outside the database on network devices. The database contains the pointers to the image/video files and the description of the storage devices. The system can use different kinds of storage media, organized in a hierarchical structure. Three levels of functions are available to manage the storage resources. The functions of the lower level provide media management: they allow devices to be cataloged and device status and network location to be modified. The medium level manages image/video files on a physical basis, handling file migration between high-capacity media and low-access-time media. The functions of the upper level work on image/video files on a logical basis, as they archive, move, and copy image/video data selected by user-defined queries. These functions are used to support the implementation of a storage management strategy. The database information about the characteristics of both storage devices and coding techniques is used by the third-level functions to fit delivery/visualization requirements and to reduce archiving costs.
Data Resilience in the dCache Storage System
Rossi, A. L.; Adeyemi, F.; Ashish, A.; ...
2017-11-23
In this study we discuss the design, implementation considerations, and performance of a new Resilience Service in the dCache storage system, responsible for file availability and durability.
Permanent-File-Validation Utility Computer Program
NASA Technical Reports Server (NTRS)
Derry, Stephen D.
1988-01-01
Errors in files detected and corrected during operation. Permanent File Validation (PFVAL) utility computer program provides CDC CYBER NOS sites with a mechanism to verify integrity of permanent file base. Locates and identifies permanent file errors in Mass Storage Table (MST) and Track Reservation Table (TRT), in permanent file catalog entries (PFC's) in permit sectors, and in disk sector linkage. All detected errors written to listing file and system and job day files. Program operates by reading system tables, catalog track, permit sectors, and disk linkage bytes to validate expected and actual file linkages. Used extensively to identify and locate errors in permanent files and enable online correction, reducing computer-system downtime.
NAFFS: network attached flash file system for cloud storage on portable consumer electronics
NASA Astrophysics Data System (ADS)
Han, Lin; Huang, Hao; Xie, Changsheng
Cloud storage technology has become a research hotspot in recent years, but existing cloud storage services are mainly designed for data storage needs with stable, high-speed Internet connections. Mobile Internet connections are often unstable and relatively slow. These native features of the mobile Internet limit the use of cloud storage on portable consumer electronics. The Network Attached Flash File System (NAFFS) presents the idea of using a portable device's built-in NAND flash memory as the front-end cache of a virtualized cloud storage device. Modern portable devices with Internet connections have more than 1 GB of built-in NAND flash, which is quite enough for daily data storage. The data transfer rate of NAND flash devices is much higher than that of mobile Internet connections [1], and their non-volatile nature makes them very suitable as cache devices for Internet cloud storage on portable devices, which often have unstable power supplies and intermittent Internet connections. In the present work, NAFFS is evaluated with several benchmarks, and its performance is compared with traditional network-attached file systems, such as NFS. Our evaluation results indicate that NAFFS achieves an average access speed of 3.38 MB/s, which is about 3 times faster than directly accessing cloud storage over a mobile Internet connection, and offers a more stable interface than directly using a cloud storage API. Unstable Internet connections and sudden power-off conditions are tolerated, and no data in the cache is lost in such situations.
Using Solid State Disk Array as a Cache for LHC ATLAS Data Analysis
NASA Astrophysics Data System (ADS)
Yang, W.; Hanushevsky, A. B.; Mount, R. P.; Atlas Collaboration
2014-06-01
User data analysis in high energy physics presents a challenge to spinning-disk based storage systems. The analysis is data intensive, yet reads are small, sparse, and cover a large volume of data files. It is also unpredictable due to users' responses to storage performance. We describe here a system with an array of solid-state disks as a non-conventional, standalone, file-level cache in front of the spinning-disk storage to help improve the performance of LHC ATLAS user analysis at SLAC. The system uses several days of data access records to make caching decisions. It can also use information from other sources, such as a workflow management system. We evaluate the performance of the system both in terms of caching and its impact on user analysis jobs. The system currently uses Xrootd technology, but the technique can be applied to any storage system.
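A sketch of history-based cache admission of the kind described: only files whose recent access counts cross a threshold are admitted to the SSD array, so sparse one-off reads do not pollute the cache. The window and threshold values below are invented for illustration and are not the paper's tuned parameters.

    from collections import Counter, deque

    WINDOW_DAYS = 5       # days of access records consulted
    THRESHOLD = 3         # accesses required before a file is cached

    history = deque(maxlen=WINDOW_DAYS)   # one Counter of file accesses per day

    def end_of_day(accesses):
        """accesses: list of file names read today."""
        history.append(Counter(accesses))

    def should_cache(path):
        """Admit to the SSD cache only files that are hot across the window."""
        total = sum(day[path] for day in history)
        return total >= THRESHOLD

    end_of_day(["a.root", "a.root", "b.root"])
    end_of_day(["a.root", "c.root"])
    assert should_cache("a.root")       # hot over several days -> cache on SSD
    assert not should_cache("b.root")   # sparse access stays on spinning disk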
The Scalable Checkpoint/Restart Library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moody, A.
The Scalable Checkpoint/Restart (SCR) library provides an interface that codes may use to write out and read in application-level checkpoints in a scalable fashion. In the current implementation, checkpoint files are cached in local storage (hard disk or RAM disk) on the compute nodes. This technique provides scalable aggregate bandwidth and uses storage resources that are fully dedicated to the job. This approach addresses the two common drawbacks of checkpointing a large-scale application to a shared parallel file system, namely, limited bandwidth and file system contention. In fact, on current platforms, SCR scales linearly with the number of compute nodes. It has been benchmarked as high as 720 GB/s on 1094 nodes of Atlas, which is nearly two orders of magnitude faster than the parallel file system.
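The caching strategy SCR implements — write the checkpoint to node-local storage at full local bandwidth, then drain it to the parallel file system in the background — can be sketched as follows. This is a generic Python illustration, not the SCR C API; paths and names are stand-ins.

    import pathlib, shutil, threading

    LOCAL = pathlib.Path("/tmp/ckpt")   # stand-in for node-local RAM/SSD disk
    PFS = pathlib.Path("/tmp/pfs")      # stand-in for the parallel file system
    LOCAL.mkdir(parents=True, exist_ok=True)
    PFS.mkdir(parents=True, exist_ok=True)

    def checkpoint(step, data: bytes):
        """Blocking write goes to local storage only: aggregate bandwidth
        scales with the number of compute nodes, not with the shared PFS."""
        path = LOCAL / f"ckpt_{step}.bin"
        path.write_bytes(data)
        # Drain to the shared file system asynchronously; the application
        # resumes computing immediately.
        t = threading.Thread(target=shutil.copy2, args=(path, PFS / path.name))
        t.start()
        return t

    flusher = checkpoint(1, b"\x00" * (1 << 20))
    # ... application continues computing here ...
    flusher.join()   # ensure the checkpoint reached the PFS before exiting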
ERIC Educational Resources Information Center
Glantz, Richard S.
Until recently, the emphasis in information storage and retrieval systems has been towards batch-processing of large files. In contrast, SHOEBOX is designed for the unformatted, personal file collection of the computer-naive individual. Operating through display terminals in a time-sharing, interactive environment on the IBM 360, the user can…
NoSQL for Storage and Retrieval of Large LiDAR Data Collections
NASA Astrophysics Data System (ADS)
Boehm, J.; Liu, K.
2015-08-01
Developments in LiDAR technology over the past decades have made LiDAR a mature and widely accepted source of geospatial information. This in turn has led to an enormous growth in data volume. The central idea for file-centric storage of LiDAR point clouds is the observation that large collections of LiDAR data are typically delivered as large collections of files, rather than single files of terabyte size. This split of the dataset, commonly referred to as tiling, was usually done to accommodate a specific processing pipeline. It therefore makes sense to preserve this split. A document-oriented NoSQL database can easily emulate this data partitioning by representing each tile (file) in a separate document. The document stores the metadata of the tile. The actual files are stored in a distributed file system emulated by the NoSQL database. We demonstrate the use of MongoDB, a highly scalable document-oriented NoSQL database, for storing large LiDAR files. MongoDB, like any NoSQL database, allows queries on the attributes of the document. As a specialty, MongoDB also supports spatial queries. Hence we can perform spatial queries on the bounding boxes of the LiDAR tiles. Inserting and retrieving files on a cloud-based database is compared to native file system and cloud storage transfer speeds.
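A sketch of the tile-per-document scheme with spatial queries, using the pymongo driver and GridFS (MongoDB's mechanism for storing large files across chunked documents). Collection and field names here are invented; the paper's actual schema may differ.

    import gridfs
    from pymongo import MongoClient, GEOSPHERE

    db = MongoClient()["lidar"]
    fs = gridfs.GridFS(db)                      # emulated distributed file store
    db.tiles.create_index([("bounds", GEOSPHERE)])

    def insert_tile(las_path, bbox_polygon, metadata):
        """One document per LAS tile: metadata plus a GridFS id for the bytes."""
        with open(las_path, "rb") as f:
            gid = fs.put(f, filename=las_path)
        db.tiles.insert_one({"bounds": bbox_polygon, "gridfs_id": gid, **metadata})

    def tiles_intersecting(polygon):
        """Spatial query against the tiles' bounding boxes."""
        return db.tiles.find({"bounds": {"$geoIntersects": {"$geometry": polygon}}})

    aoi = {"type": "Polygon",
           "coordinates": [[[-0.2, 51.4], [-0.2, 51.6], [0.1, 51.6],
                            [-0.2, 51.4]]]}
    for doc in tiles_intersecting(aoi):
        data = fs.get(doc["gridfs_id"]).read()   # fetch the tile's bytes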
Adding Data Management Services to Parallel File Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brandt, Scott
2015-03-04
The objective of this project, called DAMASC for “Data Management in Scientific Computing”, is to coalesce data management with parallel file system management to present a declarative interface to scientists for managing, querying, and analyzing extremely large data sets efficiently and predictably. Managing extremely large data sets is a key challenge of exascale computing. The overhead, energy, and cost of moving massive volumes of data demand designs where computation is close to storage. In current architectures, compute/analysis clusters access data in a physically separate parallel file system and largely leave it to the scientist to reduce data movement. Over the past decades, the high-end computing community has adopted middleware with multiple layers of abstractions and specialized file formats such as NetCDF-4 and HDF5. These abstractions provide a limited set of high-level data processing functions, but have inherent functionality and performance limitations: middleware that provides access to the highly structured contents of scientific data files stored in the (unstructured) file systems can only optimize to the extent that file system interfaces permit; the highly structured formats of these files often impede native file system performance optimizations. We are developing Damasc, an enhanced high-performance file system with native rich data management services. Damasc will enable efficient queries and updates over files stored in their native byte-stream format while retaining the inherent performance of file system data storage via declarative queries and updates over views of underlying files. Damasc has four key benefits for the development of data-intensive scientific code: (1) applications can use important data-management services, such as declarative queries, views, and provenance tracking, that are currently available only within database systems; (2) the use of these services becomes easier, as they are provided within a familiar file-based ecosystem; (3) common optimizations, e.g., indexing and caching, are readily supported across several file formats, avoiding effort duplication; and (4) performance improves significantly, as data processing is integrated more tightly with data storage. Our key contributions are: SciHadoop, which explores changes to MapReduce assumptions by taking advantage of the semantics of structured data while preserving MapReduce's failure and resource management; DataMods, which extends common abstractions of parallel file systems so they become programmable, such that they can be extended to natively support a variety of data models and can be hooked into emerging distributed runtimes such as Stanford's Legion; and Miso, which combines Hadoop and relational data warehousing to minimize time to insight, taking into account the overhead of ingesting data into the data warehouse.
Online data handling and storage at the CMS experiment
NASA Astrophysics Data System (ADS)
Andre, J.-M.; Andronidis, A.; Behrens, U.; Branson, J.; Chaze, O.; Cittolin, S.; Darlea, G.-L.; Deldicque, C.; Demiragli, Z.; Dobson, M.; Dupont, A.; Erhan, S.; Gigi, D.; Glege, F.; Gómez-Ceballos, G.; Hegeman, J.; Holzner, A.; Jimenez-Estupiñán, R.; Masetti, L.; Meijers, F.; Meschi, E.; Mommsen, RK; Morovic, S.; Nuñez-Barranco-Fernández, C.; O'Dell, V.; Orsini, L.; Paus, C.; Petrucci, A.; Pieri, M.; Racz, A.; Roberts, P.; Sakulin, H.; Schwick, C.; Stieger, B.; Sumorok, K.; Veverka, J.; Zaza, S.; Zejdl, P.
2015-12-01
During the LHC Long Shutdown 1, the CMS Data Acquisition (DAQ) system underwent a partial redesign to replace obsolete network equipment, use more homogeneous switching technologies, and support new detector back-end electronics. The software and hardware infrastructure to provide input, execute the High Level Trigger (HLT) algorithms and deal with output data transport and storage has also been redesigned to be completely file-based. All the metadata needed for bookkeeping are stored in files as well, in the form of small documents using the JSON encoding. The Storage and Transfer System (STS) is responsible for aggregating these files produced by the HLT, storing them temporarily and transferring them to the T0 facility at CERN for subsequent offline processing. The STS merger service aggregates the output files from the HLT from ∼62 sources produced with an aggregate rate of ∼2 GB/s. An estimated bandwidth of 7 GB/s in concurrent read/write mode is needed. Furthermore, the STS has to be able to store several days of continuous running, so an estimated 250 TB of total usable disk space is required. In this article we present the various technological and implementation choices of the three components of the STS: the distributed file system, the merger service and the transfer system.
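A sketch of the file-based bookkeeping and merging pattern described here: each HLT source writes its output plus a small JSON metadata document, and the merger aggregates both. Field names and file extensions are illustrative; the actual CMS JSON schema is not reproduced.

    import json, pathlib

    def write_output(directory, source, payload: bytes, n_events: int):
        """Each HLT source emits a data file plus a JSON bookkeeping document."""
        d = pathlib.Path(directory)
        (d / f"{source}.dat").write_bytes(payload)
        (d / f"{source}.jsn").write_text(
            json.dumps({"source": source, "events": n_events,
                        "size": len(payload)}))

    def merge(directory, out_name):
        """Merger service: concatenate data files, aggregate their metadata."""
        d = pathlib.Path(directory)
        total = {"events": 0, "size": 0, "sources": []}
        with open(d / f"{out_name}.dat", "wb") as out:
            for meta_path in sorted(d.glob("*.jsn")):
                meta = json.loads(meta_path.read_text())
                out.write((d / f"{meta['source']}.dat").read_bytes())
                total["events"] += meta["events"]
                total["size"] += meta["size"]
                total["sources"].append(meta["source"])
        (d / f"{out_name}.merged.json").write_text(json.dumps(total))

Keeping the bookkeeping in plain files alongside the data is what lets every component of the chain be restarted or audited independently, without a central database in the critical path.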
10 CFR 2.305 - Service of documents, methods, proof.
Code of Federal Regulations, 2013 CFR
2013-01-01
... optical storage media containing the electronic document. (3) A participant granted an exemption under § 2... certificate of service. (i) If a document is served on participants through only the E-filing system, then the certificate of service must state that the document has been filed through the E-Filing system. (ii) If a...
10 CFR 2.305 - Service of documents, methods, proof.
Code of Federal Regulations, 2014 CFR
2014-01-01
... optical storage media containing the electronic document. (3) A participant granted an exemption under § 2... certificate of service. (i) If a document is served on participants through only the E-filing system, then the certificate of service must state that the document has been filed through the E-Filing system. (ii) If a...
Storage-based Intrusion Detection: Watching storage activity for suspicious behavior
2002-10-01
password management involves a pair of inter-related files (/etc/passwd and /etc/shadow). The corresponding access patterns seen at the storage... example, consider a UNIX system password file (/etc/passwd), which consists of a set of well-defined records. Records are delimited by a line-break, and... /etc/passwd and verify that they conform to a set of basic integrity rules: 7-field records, non-empty password field, legal default shell, legal home...
Yakami, Masahiro; Ishizu, Koichi; Kubo, Takeshi; Okada, Tomohisa; Togashi, Kaori
2011-04-01
Thin-slice CT data, useful for clinical diagnosis and research, is now widely available but is typically discarded in many institutions after a short period of time due to data storage capacity limitations. We designed and built a low-cost, high-capacity Digital Imaging and Communications in Medicine (DICOM) storage system able to store thin-slice image data for years, using off-the-shelf consumer hardware components, such as a Macintosh computer, a Windows PC, and network-attached storage units. "Ordinary" hierarchical file systems, instead of a centralized data management system such as a relational database, were adopted to manage patient DICOM files by arranging them in directories, enabling quick and easy access to the DICOM files of each study by following the directory trees with Windows Explorer via study date and patient ID. The software used for this system was the open-source OsiriX and additional programs we developed ourselves, both of which were freely available via the Internet. The initial cost of this system was about $3,600, with an incremental storage cost of about $900 per terabyte (TB). This system has been running since 7 February 2008, with the data stored increasing at a rate of about 1.3 TB per month. Total data stored was 21.3 TB on 23 June 2009. The maintenance workload was found to be about 30 to 60 min once every 2 weeks. In conclusion, this newly developed DICOM storage system is useful for research due to its cost-effectiveness, enormous capacity, high scalability, sufficient reliability, and easy data access.
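The hierarchical layout described — plain directories keyed by study date and patient ID instead of a database — can be sketched with the third-party pydicom package. PatientID and StudyDate are standard DICOM attributes; the archive path and exact directory convention below are illustrative stand-ins for the institution's own.

    import pathlib, shutil
    import pydicom

    ARCHIVE = pathlib.Path("/archive/dicom")   # a network-attached storage mount

    def store(dicom_path):
        """File the object under <archive>/<StudyDate>/<PatientID>/ so studies
        can be located with a plain file browser, no database required."""
        ds = pydicom.dcmread(dicom_path, stop_before_pixels=True)
        dest = ARCHIVE / ds.StudyDate / ds.PatientID
        dest.mkdir(parents=True, exist_ok=True)
        shutil.copy2(dicom_path, dest)
        return dest

    # store("/incoming/IM000001.dcm") -> /archive/dicom/20080207/PT0001/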
10 CFR 13.26 - Filing and service of papers.
Code of Federal Regulations, 2010 CFR
2010-01-01
... found in the E-Filing Guidance and on the NRC Web site at http://www.nrc.gov/site-help/e-submittals.html... electronically to the E-Filing system. In addition, optical storage media (OSM) containing the entire filing must... document (e.g., motion to quash subpoena). (6) Filing is complete when the filer performs the last act that...
The NEEDS Data Base Management and Archival Mass Memory System
NASA Technical Reports Server (NTRS)
Bailey, G. A.; Bryant, S. B.; Thomas, D. T.; Wagnon, F. W.
1980-01-01
A Data Base Management System and an Archival Mass Memory System are being developed that will have a 10^12-bit on-line and a 10^13-bit off-line storage capacity. The integrated system will accept packetized data from the data staging area at 50 Mbps, create a comprehensive directory, provide for file management, record the data, perform error detection and correction, accept user requests, retrieve the requested data files and provide the data to multiple users at a combined rate of 50 Mbps. Stored and replicated data files will have a bit error rate of less than 10^-9 even after ten years of storage. The integrated system will be demonstrated to prove the technology late in 1981.
Development of a Technical Data File On the Design and Use of Instructional Systems.
ERIC Educational Resources Information Center
Schumacher, Sanford P.
A technical data file concerned with the technology of Instructional System Development suitable for a variety of users was developed. The file was prepared in a way amenable to later computerized storage and retrieval. General information sources and indexes of highly probable relevance were reviewed with key words and relevant specialty journals…
File concepts for parallel I/O
NASA Technical Reports Server (NTRS)
Crockett, Thomas W.
1989-01-01
The subject of input/output (I/O) was often neglected in the design of parallel computer systems, although for many problems I/O rates will limit the speedup attainable. The I/O problem is addressed by considering the role of files in parallel systems. The notion of parallel files is introduced. Parallel files provide for concurrent access by multiple processes, and utilize parallelism in the I/O system to improve performance. Parallel files can also be used conventionally by sequential programs. A set of standard parallel file organizations using multiple storage devices is proposed. Problem areas are also identified and discussed.
A Study of NetCDF as an Approach for High Performance Medical Image Storage
NASA Astrophysics Data System (ADS)
Magnus, Marcone; Coelho Prado, Thiago; von Wangenhein, Aldo; de Macedo, Douglas D. J.; Dantas, M. A. R.
2012-02-01
The spread of telemedicine systems increases every day, and systems and PACS based on DICOM images have become common. This rise reflects the need to develop new storage systems that are more efficient and have lower computational costs. With this in mind, this article discusses a study of the NetCDF data format as the basic platform for storage of DICOM images. The case-study comparison adopts an ordinary database, HDF5, and NetCDF to store the medical images. Empirical results, using a real set of images, indicate that the time to retrieve images from NetCDF has a higher latency for large images compared to the other two methods. In addition, the latency is proportional to the file size, which represents a drawback for a telemedicine system that is characterized by a large number of large image files.
Data Storage and Transfer | High-Performance Computing | NREL
Guidance on data storage and transfer for NREL's High-Performance Computing (HPC) systems: use WinSCP for Windows file transfers from a local computer to a remote computer, and use Robinhood for file management of your data files on Peregrine.
76 FR 7825 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2011-02-11
... practices for storing, retrieving, accessing, retaining, and disposing of records in the system: Storage..., accessing, retaining, and disposing of records in the system: Storage: Paper file folders. Retrievability...; System of Records AGENCY: Office of the Secretary, DoD. ACTION: Notice to alter a system of records...
Implementation of a Campuswide Distributed Mass Storage Service: the Dream Versus Reality
NASA Technical Reports Server (NTRS)
Prahst, Stephen; Armstead, Betty Jo
1996-01-01
In 1990, a technical team at NASA Lewis Research Center, Cleveland, Ohio, began defining a Mass Storage Service to provide long-term archival storage, short-term storage for very large files, distributed Network File System access, and backup services for critical data that resides on workstations and personal computers. Because of software availability and budgets, the total service was phased in over several years. During the process of building the service from the commercial technologies available, our Mass Storage Team refined the original vision and learned from the problems and mistakes that occurred. We also enhanced some technologies to better meet the needs of users and system administrators. This report describes our team's journey from dream to reality, outlines some of the problem areas that still exist, and suggests some solutions.
Faibish, Sorin; Bent, John M.; Tzelnic, Percy; Grider, Gary; Torres, Aaron
2015-10-20
Techniques are provided for storing files in a parallel computing system using different resolutions. A method is provided for storing at least one file generated by a distributed application in a parallel computing system. The file comprises one or more of a complete file and a sub-file. The method comprises the steps of obtaining semantic information related to the file; generating a plurality of replicas of the file with different resolutions based on the semantic information; and storing the file and the plurality of replicas of the file in one or more storage nodes of the parallel computing system. The different resolutions comprise, for example, a variable number of bits and/or a different sub-set of data elements from the file. A plurality of the sub-files can be merged to reproduce the file.
Implementing Journaling in a Linux Shared Disk File System
NASA Technical Reports Server (NTRS)
Preslan, Kenneth W.; Barry, Andrew; Brassow, Jonathan; Cattelan, Russell; Manthei, Adam; Nygaard, Erling; VanOort, Seth; Teigland, David; Tilstra, Mike; O'Keefe, Matthew;
2000-01-01
In computer systems today, speed and responsiveness are often determined by network and storage subsystem performance. Faster, more scalable networking interfaces like Fibre Channel and Gigabit Ethernet provide the scaffolding from which higher performance computer system implementations may be constructed, but new thinking is required about how machines interact with network-enabled storage devices. In this paper we describe how we implemented journaling in the Global File System (GFS), a shared-disk, cluster file system for Linux. Our previous three papers on GFS at the Mass Storage Symposium discussed our first three GFS implementations, their performance, and the lessons learned. Our fourth paper describes, appropriately enough, the evolution of GFS version 3 to version 4, which supports journaling and recovery from client failures. In addition, GFS scalability tests extending to 8 machines accessing 8 four-disk enclosures were conducted; these tests showed good scaling. We describe the GFS cluster infrastructure, which is necessary for proper recovery from machine and disk failures in a collection of machines sharing disks using GFS. Finally, we discuss the suitability of Linux for handling the big-data requirements of supercomputing centers.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-25
...-high, 2,932-foot-long earth embankment dam; (2) an upper reservoir with a surface area of 27.5 acres and a 2,262 acre-foot storage capacity; (3) a 120-foot-high, 2,475-foot-long earth embankment dam..., using the eComment system at http://www.ferc.gov/docs-filing/ecomment.asp . You must include your name...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-16
...-long earth embankment dam; (2) an upper reservoir with a surface area of 100 acres and a 7,100 acre-foot storage capacity; (3) a 120-foot-high, 920-foot-long earth embankment dam creating; (4) a lower..., using the eComment system at http://www.ferc.gov/docs-filing/ecomment.asp . You must include your name...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-25
...-long earth embankment dam; (2) an upper reservoir with a surface area of 57.4 acres and a 4,563 acre-foot storage capacity; (3) a 180-foot-high, 323-foot-long earth embankment dam creating; (4) a lower... registration, using the eComment system at http://www.ferc.gov/docs-filing/ecomment.asp . You must include your...
Goddard Conference on Mass Storage Systems and Technologies, volume 2
NASA Technical Reports Server (NTRS)
Kobler, Ben (Editor); Hariharan, P. C. (Editor)
1993-01-01
Papers and viewgraphs from the conference are presented. Discussion topics include the IEEE Mass Storage System Reference Model, data archiving standards, high-performance storage devices, magnetic and magneto-optic storage systems, magnetic and optical recording technologies, high-performance helical scan recording systems, and low end helical scan tape drives. Additional discussion topics addressed the evolution of the identifiable unit for processing (file, granule, data set, or some similar object) as data ingestion rates increase dramatically, and the present state of the art in mass storage technology.
NASA Technical Reports Server (NTRS)
Katz, Randy H.; Anderson, Thomas E.; Ousterhout, John K.; Patterson, David A.
1991-01-01
Rapid advances in high performance computing are making possible more complete and accurate computer-based modeling of complex physical phenomena, such as weather front interactions, dynamics of chemical reactions, numerical aerodynamic analysis of airframes, and ocean-land-atmosphere interactions. Many of these 'grand challenge' applications are as demanding of the underlying storage system, in terms of their capacity and bandwidth requirements, as they are of the computational power of the processor. A global view of the Earth's ocean chlorophyll and land vegetation requires over 2 terabytes of raw satellite image data. In this paper, we describe our planned research program in high capacity, high bandwidth storage systems. The project has four overall goals. First, we will examine new methods for high capacity storage systems, made possible by low cost, small form factor magnetic and optical tape systems. Second, access to the storage system will be low latency and high bandwidth. To achieve this, we must interleave data transfer at all levels of the storage system, including devices, controllers, servers, and communications links. Latency will be reduced by extensive caching throughout the storage hierarchy. Third, we will provide effective management of a storage hierarchy, extending the techniques already developed for the Log Structured File System. Finally, we will construct a prototype high capacity file server, suitable for use on the National Research and Education Network (NREN). Such research must be a cornerstone of any coherent program in high performance computing and communications.
75 FR 74019 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2010-11-30
... storage media. Retrievability: Information is retrieved by individual's name, Social Security Number (SSN... ``Records include individual's name, Social Security Number (SSN); employee identification number....'' * * * * * Storage: Delete entry and replace with ``Paper records in file folders and electronic storage media...
Registered File Support for Critical Operations Files at (Space Infrared Telescope Facility) SIRTF
NASA Technical Reports Server (NTRS)
Turek, G.; Handley, Tom; Jacobson, J.; Rector, J.
2001-01-01
The SIRTF Science Center's (SSC) Science Operations System (SOS) has to contend with nearly one hundred critical operations files via comprehensive file management services. The management is accomplished via the registered file system (otherwise known as TFS) which manages these files in a registered file repository composed of a virtual file system accessible via a TFS server and a file registration database. The TFS server provides controlled, reliable, and secure file transfer and storage by registering all file transactions and meta-data in the file registration database. An API is provided for application programs to communicate with TFS servers and the repository. A command line client implementing this API has been developed as a client tool. This paper describes the architecture, current implementation, but more importantly, the evolution of these services based on evolving community use cases and emerging information system technology.
File Transfers from Peregrine to the Mass Storage System - Gyrfalcon
Transfers can be run from a login node or data-transfer queue node. Below is an example of accessing the data-transfer queue interactively and of combining files into a smaller number of container files using the tar command. For example, $ cd /scratch/
Problems in the long-term storage of data obtained from scientific space experiments
NASA Technical Reports Server (NTRS)
Zlotin, G. N.; Khovanskiy, Y. D.
1975-01-01
It is shown that long-term data storage systems can be achieved when the system which organizes and conducts the scientific space experiments is equipped with a specialized subsystem: the information filing system. Its main functions are described along with the necessity of stage-by-stage development and compatibility with the data processing systems. The requirements for long-term data storage media are discussed.
MICE data handling on the Grid
NASA Astrophysics Data System (ADS)
Martyniak, J.; Mice Collaboration
2014-06-01
The international Muon Ionisation Cooling Experiment (MICE) is designed to demonstrate the principle of muon ionisation cooling for the first time, for application to a future Neutrino Factory or Muon Collider. The experiment is currently under construction at the ISIS synchrotron at the Rutherford Appleton Laboratory (RAL), UK. In this paper we present a system, the Raw Data Mover, which allows us to store and distribute MICE raw data, and a framework for offline reconstruction and data management. The aim of the Raw Data Mover is to upload raw data files onto safe tape storage as soon as the data have been written out by the DAQ system and marked as ready to be uploaded. The internal integrity of the files is verified, and they are uploaded to the RAL Tier-1 Castor Storage Element (SE) and placed on two tapes for redundancy. We also make another copy at a separate disk-based SE at this stage to make it easier for users to access data quickly. Both copies are check-summed and the replicas are registered with an instance of the LCG File Catalog (LFC). On success, a record with basic file properties is added to the MICE Metadata DB. The reconstruction process is triggered by new raw data records filled in by the mover system described above. Off-line reconstruction jobs for new raw files are submitted to the RAL Tier-1 and the output is stored on tape. Batch reprocessing is done at multiple MICE-enabled Grid sites, and output files are shipped to central tape or disk storage at RAL using a custom File Transfer Controller.
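The mover cycle described above (verify, upload replicas, register, record metadata) can be sketched as follows. The storage elements, catalogue, and metadata DB are stand-in dictionaries; the real system uses Castor-, LFC- and MICE-specific tooling not shown here.

# Minimal sketch of the Raw Data Mover cycle; dictionaries stand in for the
# real storage elements, file catalogue and metadata database.
import hashlib, os

def md5sum(path):
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

def move_raw_file(path, storage_elements, catalog, metadata_db):
    digest = md5sum(path)                          # verify integrity before upload
    with open(path, "rb") as f:
        payload = f.read()
    for name, se in storage_elements.items():      # tape copies + fast disk copy
        se[path] = payload                         # stand-in for an SE upload
        catalog.setdefault(path, []).append(name)  # register replica in catalogue
    metadata_db[path] = {"size": os.path.getsize(path), "md5": digest}

# usage: move_raw_file("run123.dat", {"castor-tape": {}, "disk-se": {}}, {}, {})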
Parallel compression of data chunks of a shared data object using a log-structured file system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bent, John M.; Faibish, Sorin; Grider, Gary
2016-10-25
Techniques are provided for parallel compression of data chunks being written to a shared object. A client executing on a compute node or a burst buffer node in a parallel computing system stores a data chunk generated by the parallel computing system to a shared data object on a storage node by compressing the data chunk and providing the compressed data chunk to the storage node that stores the shared object. The client and storage node may employ Log-Structured File techniques. The compressed data chunk can be de-compressed by the client when the data chunk is read. A storage node stores a data chunk as part of a shared object by receiving a compressed version of the data chunk from a compute node and storing the compressed version of the data chunk to the shared data object on the storage node.
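A minimal sketch of the compress-before-write idea, with zlib standing in for whatever codec is used and a dictionary standing in for the shared object and its log-structured index:

# Each client compresses its own chunk; the storage node stores the
# compressed bytes keyed by the chunk's logical offset in the shared object.
import zlib

def write_chunk(shared_object: dict, offset: int, chunk: bytes) -> None:
    shared_object[offset] = zlib.compress(chunk)   # client-side compression

def read_chunk(shared_object: dict, offset: int) -> bytes:
    return zlib.decompress(shared_object[offset])  # client-side decompression

obj = {}
write_chunk(obj, 0, b"checkpoint data" * 1000)
assert read_chunk(obj, 0) == b"checkpoint data" * 1000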
Content-aware network storage system supporting metadata retrieval
NASA Astrophysics Data System (ADS)
Liu, Ke; Qin, Leihua; Zhou, Jingli; Nie, Xuejun
2008-12-01
Nowadays, content-based network storage has become a hot research topic in academia and industry [1]. In order to solve the problem of hit-rate decline caused by migration and to achieve content-based query, we exploit a new content-aware storage system that supports metadata retrieval to improve query performance. Firstly, we extend the SCSI command descriptor block to enable the system to understand self-defined query requests. Secondly, the extracted metadata is encoded in extensible markup language to improve universality. Thirdly, according to the demands of information lifecycle management (ILM), we store data at different storage levels and use corresponding query strategies to retrieve them. Fourthly, as the file content identifier plays an important role in locating data and calculating block correlation, we use it to fetch files and sort query results through a friendly user interface. Finally, the experiments indicate that the retrieval strategy and sort algorithm have enhanced retrieval efficiency and precision.
Distributed Storage Algorithm for Geospatial Image Data Based on Data Access Patterns.
Pan, Shaoming; Li, Yongkai; Xu, Zhengquan; Chong, Yanwen
2015-01-01
Declustering techniques are widely used in distributed environments to reduce query response time through parallel I/O by splitting large files into several small blocks and then distributing those blocks among multiple storage nodes. Unfortunately, however, many small geospatial image data files cannot be further split for distributed storage. In this paper, we propose a complete theoretical system for the distributed storage of small geospatial image data files based on mining the access patterns of geospatial image data from their historical access log information. First, an algorithm is developed to construct an access correlation matrix based on analysis of the log information, which reveals the patterns of access to the geospatial image data. Then, a practical heuristic algorithm is developed to determine a reasonable solution based on the access correlation matrix. Finally, a number of comparative experiments are presented, demonstrating that our algorithm achieves a total parallel access probability approximately 10-15% higher than those of other algorithms, and that performance can be further improved by more than 20% by simultaneously applying a copy storage strategy. These experiments show that the algorithm can be applied in distributed environments to help realize parallel I/O and thereby improve system performance.
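The access correlation matrix can be illustrated with a short sketch: two blocks are correlated when they co-occur in a logged request, and highly correlated blocks are candidates for placement on different nodes. The log format here is an assumption.

# Build a co-access count matrix from a request log; each request is the
# set of block ids touched together. High counts suggest placing those
# blocks on different storage nodes so they can be read in parallel.
from collections import defaultdict
from itertools import combinations

def correlation_matrix(access_log):
    corr = defaultdict(int)
    for request in access_log:
        for a, b in combinations(sorted(request), 2):
            corr[(a, b)] += 1                      # co-access count
    return corr

log = [{"b1", "b2"}, {"b1", "b2", "b3"}, {"b3"}]
print(dict(correlation_matrix(log)))  # {('b1','b2'): 2, ('b1','b3'): 1, ('b2','b3'): 1}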
NASA Technical Reports Server (NTRS)
Kobler, Benjamin (Editor); Hariharan, P. C. (Editor); Blasso, L. G. (Editor)
1992-01-01
Papers and viewgraphs from the conference are presented. This conference served as a broad forum for the discussion of a number of important issues in the field of mass storage systems. Topics include magnetic disk and tape technologies, optical disks and tape, software storage and file management systems, and experiences with the use of a large, distributed storage system. The technical presentations describe, among other things, integrated mass storage systems that are expected to be available commercially. Also included is a series of presentations from Federal Government organizations and research institutions covering their mass storage requirements for the 1990's.
76 FR 14654 - UGI Storage Company; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-17
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. RP11-1745-001] UGI Storage Company; Notice of Filing Take notice that on March 9, 2011, UGI Storage Company (UGI) submitted an amendment to its January 31, 2011, filing. Any person desiring to protest this filing must file in...
An information retrieval system for research file data
Joan E. Lengel; John W. Koning
1978-01-01
Research file data have been successfully retrieved at the Forest Products Laboratory through a high-speed cross-referencing system involving the computer program FAMULUS as modified by the Madison Academic Computing Center at the University of Wisconsin. The method of data input, transfer to computer storage, system utilization, and effectiveness are discussed....
Federal Register 2010, 2011, 2012, 2013, 2014
2010-01-12
...., desktop, laptop, handheld or other computer types) containing protected personal identifiers or PHI is... as the National Indian Women's Resource Center, to conduct analytical and evaluation studies. 8... SYSTEM: STORAGE: File folders, ledgers, card files, microfiche, microfilm, computer tapes, disk packs...
77 FR 789 - Tres Palacios Gas Storage LLC; Notice of Application
Federal Register 2010, 2011, 2012, 2013, 2014
2012-01-06
... Gas Storage LLC; Notice of Application Take notice that on December 20, 2011, Tres Palacios Gas Storage LLC (Tres Palacios), Two Brush Creek Boulevard, Kansas City, Missouri 64112, filed in the above... on its storage facility header pipeline system by: (i) Constructing a 19.7-mile, 24-inch diameter...
Mass storage technology in networks
NASA Astrophysics Data System (ADS)
Ishii, Katsunori; Takeda, Toru; Itao, Kiyoshi; Kaneko, Reizo
1990-08-01
Trends and features of mass storage subsystems in networks are surveyed and their key technologies spotlighted. Storage subsystems are becoming increasingly important in new network systems in which communications and data processing are systematically combined. These systems require a new class of high-performance mass-information storage in order to effectively utilize their processing power. The requirements of high transfer rates, high transaction rates and large storage capacities, coupled with high functionality, fault tolerance and flexibility in configuration, are major challenges in storage subsystems. Recent progress in optical disk technology has resulted in improved performance of on-line external memories such as optical disk drives, which are competing with mid-range magnetic disks. Optical disks are more effective than magnetic disks for low-traffic random-access files storing multimedia data that require large capacity, such as in archive use and in information distribution by ROM disks. Finally, image-coded document file servers for local area network use that employ 130 mm rewritable magneto-optical disk subsystems are demonstrated.
NASA Astrophysics Data System (ADS)
Murata, K. T.
2014-12-01
Data-intensive or data-centric science is the 4th paradigm, after observational and/or experimental science (the 1st paradigm), theoretical science (the 2nd) and numerical science (the 3rd). A science cloud is an infrastructure for this 4th methodology. The NICT science cloud is designed for big-data sciences of Earth, space and other fields based on modern informatics and information technologies [1]. Data flow on the cloud relies on three techniques: (1) data crawling and transfer, (2) data preservation and stewardship, and (3) data processing and visualization. Original tools and applications for these techniques have been designed and implemented. We mash up these tools and applications on the NICT Science Cloud to build customized systems for each project. In this paper, we discuss science data processing through these three steps. For big-data science, data file deployment on a distributed storage system should be well designed in order to save storage cost and transfer time. We developed a high-bandwidth virtual remote storage system (HbVRS) together with a data crawling tool, NICTY/DLA, and a Wide-area Observation Network Monitoring (WONM) system. Data files are saved on the cloud storage system according to both the data preservation policy and the data processing plan. The storage system is built on distributed file system middleware (Gfarm: GRID datafarm). It is effective since disaster recovery (DR) and parallel data processing are carried out simultaneously without moving the big data from storage to storage. Data files are managed by our Web application, WSDBank (World Science Data Bank). The big data on the cloud are processed via Pwrake, a workflow tool with high I/O bandwidth. There are several visualization tools on the cloud: VirtualAurora for the magnetosphere and ionosphere, VDVGE for Google Earth, STICKER for urban environment data and STARStouch for multi-disciplinary data. There are 30 projects running on the NICT Science Cloud for Earth and space science; in 2013, 56 refereed papers were published. At the end, we introduce a couple of successful results of Earth and space sciences using these three techniques carried out on the NICT Science Cloud. [1] http://sc-web.nict.go.jp
Final Report for File System Support for Burst Buffers on HPC Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yu, W.; Mohror, K.
Distributed burst buffers are a promising storage architecture for handling I/O workloads for exascale computing. As they are being deployed on more supercomputers, a file system that efficiently manages these burst buffers for fast I/O operations carries great consequence. Over the past year, the FSU team has undertaken several efforts to design, prototype and evaluate distributed file systems for burst buffers on HPC systems. These include MetaKV, a key-value store for metadata management of distributed burst buffers; a user-level file system with multiple backends; and a specialized file system for large datasets of deep neural networks. Our progress on these respective efforts is elaborated further in this report.
Recommended Systems for the Incremental Automation of the Morgue of "The Daily Texan."
ERIC Educational Resources Information Center
Voges, Mickie; And Others
A modular program is recommended for automation of the clippings file of "The Daily Texan" (student newspaper of the University of Texas at Austin). The proposed system will lead ultimately to on-line storage of the index, on-line storage of local, staff-written news stories from the previous twenty-four months, micrographic storage for backup and…
75 FR 33789 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-15
... be effective without further notice on July 15, 2010 unless comments are received which result in a... of records in the system: Storage: Paper files folders and electronic storage media. Retrievability... categories: Individual, social workers, rehabilitation counselors, and/or health care personnel. Exemptions...
Data storage and retrieval system
NASA Technical Reports Server (NTRS)
Nakamoto, Glen
1991-01-01
The Data Storage and Retrieval System (DSRS) consists of off-the-shelf system components integrated as a file server supporting very large files. These files are on the order of one gigabyte of data per file, although smaller files on the order of one megabyte can be accommodated as well. For instance, one gigabyte of data occupies approximately six 9-track tape reels (recorded at 6250 bpi). Due to this large volume of media, it was desirable to shrink the size of the proposed media to a single portable cassette. In addition to large size, a key requirement was that the data needs to be transferred to a (VME based) workstation at very high data rates. One gigabyte (GB) of data needed to be transferred from an archivable media on a file server to a workstation in less than 5 minutes. Equivalent size, on-line data needed to be transferred in less than 3 minutes. These requirements imply effective transfer rates on the order of four to eight megabytes per second (4-8 MB/s). The DSRS also needed to be able to send and receive data from a variety of other sources accessible from an Ethernet local area network.
Data storage and retrieval system
NASA Technical Reports Server (NTRS)
Nakamoto, Glen
1992-01-01
The Data Storage and Retrieval System (DSRS) consists of off-the-shelf system components integrated as a file server supporting very large files. These files are on the order of one gigabyte of data per file, although smaller files on the order of one megabyte can be accommodated as well. For instance, one gigabyte of data occupies approximately six 9-track tape reels (recorded at 6250 bpi). Due to this large volume of media, it was desirable to 'shrink' the size of the proposed media to a single portable cassette. In addition to large size, a key requirement was that the data needs to be transferred to a (VME based) workstation at very high data rates. One gigabyte (GB) of data needed to be transferred from an archivable media on a file server to a workstation in less than 5 minutes. Equivalent size, on-line data needed to be transferred in less than 3 minutes. These requirements imply effective transfer rates on the order of four to eight megabytes per second (4-8 MB/s). The DSRS also needed to be able to send and receive data from a variety of other sources accessible from an Ethernet local area network.
Requirements for a network storage service
NASA Technical Reports Server (NTRS)
Kelly, Suzanne M.; Haynes, Rena A.
1992-01-01
Sandia National Laboratories provides a high performance classified computer network as a core capability in support of its mission of nuclear weapons design and engineering, physical sciences research, and energy research and development. The network, locally known as the Internal Secure Network (ISN), was designed in 1989 and comprises multiple distributed local area networks (LAN's) residing in Albuquerque, New Mexico and Livermore, California. The TCP/IP protocol suite is used for inter-node communications. Scientific workstations and mid-range computers, running UNIX-based operating systems, compose most LAN's. One LAN, operated by the Sandia Corporate Computing Directorate, is a general purpose resource providing a supercomputer and a file server to the entire ISN. The current file server on the supercomputer LAN is an implementation of the Common File System (CFS) developed by Los Alamos National Laboratory. Subsequent to the design of the ISN, Sandia reviewed its mass storage requirements and chose to enter into a competitive procurement to replace the existing file server with one more adaptable to a UNIX/TCP/IP environment. The requirements study for the network was the starting point for the requirements study for the new file server. The file server is called the Network Storage Service (NSS), and its requirements are described in this paper. The next section gives an application or functional description of the NSS. The final section adds performance, capacity, and access constraints to the requirements.
Files synchronization from a large number of insertions and deletions
NASA Astrophysics Data System (ADS)
Ellappan, Vijayan; Kumari, Savera
2017-11-01
Synchronization between different versions of files is becoming a major issue for most applications. To make applications more efficient, an economical algorithm is developed from the previously used "File Loading Algorithm". We extend this algorithm in three ways: first, it deals with non-binary files; second, a backup is generated for uploaded files; and lastly, each file is synchronized under insertions and deletions. A user can reconstruct a file from the former file while minimizing the error, and interactive communication is provided without disturbance. The drawback of the previous system is overcome by using synchronization, in which multiple copies of each file/record are created and stored in a backup database, to be efficiently restored in case of any unwanted deletion or loss of data. That is, we introduce a protocol that user B may use to reconstruct file X from file Y with suitably low probability of error. Synchronization algorithms find numerous areas of use, including data storage, file sharing, source code control systems, and cloud applications. For example, cloud storage services such as Dropbox synchronize between local copies and cloud backups each time users make changes to local versions. Similarly, synchronization tools are necessary in mobile devices. Specialized synchronization algorithms are used for video and sound editing. Synchronization tools are also capable of performing data deduplication.
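A toy version of the reconstruct-X-from-Y idea, using Python's difflib to compute an edit script of insertions and deletions that is shipped instead of the whole file; the paper's actual algorithm is not difflib-based, so this is only a sketch of the concept.

# Compute an edit script against the old version, ship only the script,
# and replay it on the receiver's copy of the old version.
import difflib

def edit_script(old: list, new: list):
    sm = difflib.SequenceMatcher(a=old, b=new)
    return [(op, i1, i2, new[j1:j2]) for op, i1, i2, j1, j2 in sm.get_opcodes()]

def apply_script(old: list, script):
    out = []
    for op, i1, i2, payload in script:
        out.extend(old[i1:i2] if op == "equal" else payload)
    return out

old = ["a\n", "b\n", "c\n"]
new = ["a\n", "x\n", "c\n", "d\n"]
assert apply_script(old, edit_script(old, new)) == new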
77 FR 67203 - Privacy Act of 1974; Republication of Systems of Records Notices
Federal Register 2010, 2011, 2012, 2013, 2014
2012-11-08
... file folders and on electronic media. RETRIEVABILITY: Accessed by name, tag number, and/or permit... DISPOSING OF RECORDS IN THE SYSTEM: STORAGE: Records are maintained on electronic media. RETRIEVABILITY... electronic media. RETRIEVABILITY: Records are accessed by individual action file number or by the name of the...
Fair-share scheduling algorithm for a tertiary storage system
NASA Astrophysics Data System (ADS)
Jakl, Pavel; Lauret, Jérôme; Šumbera, Michal
2010-04-01
Any experiment facing petabyte-scale problems needs a highly scalable mass storage system (MSS) to keep a permanent copy of its valuable data. But beyond the permanent storage aspects, the sheer amount of data makes complete data-set availability on live storage (centralized or aggregated space such as that provided by Scalla/Xrootd) cost prohibitive, implying that dynamic population from the MSS to faster storage is needed. One of the most challenging aspects of dealing with an MSS is the robotic tape component. If a robotic system is used as the primary storage solution, the intrinsically long access times (latencies) can dramatically affect overall performance. To speed the retrieval of such data, one could order the requests according to criteria aimed at delivering maximal data throughput. However, such approaches are often orthogonal to fair resource allocation, and a trade-off between quality of service, responsiveness and throughput is necessary for achieving an optimal and practical implementation of a truly fair-share oriented file restore policy. Starting from an explanation of the key criteria of such a policy, we present evaluations and comparisons of three different MSS file restoration algorithms which meet fair-share requirements, and discuss their respective merits. We quantify their impact on a typical file restoration cycle for the RHIC/STAR experimental setup, within a development, analysis and production environment relying on a shared MSS service [1].
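One such trade-off can be sketched as a priority function that favors requests on the currently mounted tape (throughput) but ages waiting requests so no user is starved (fairness); the weights below are illustrative and are not those of the three evaluated algorithms.

# Serve the lowest-priority-value request: tape locality wins by default,
# but long-waiting requests eventually overtake it.
import time

def priority(req, mounted_tape, now):
    locality = 0.0 if req["tape"] == mounted_tape else 1.0   # prefer mounted tape
    waited_h = (now - req["submitted"]) / 3600.0             # aging term for fairness
    return locality - waited_h                               # smaller = served sooner

def next_request(pending, mounted_tape):
    now = time.time()
    return min(pending, key=lambda r: priority(r, mounted_tape, now))

pending = [{"tape": "T7", "submitted": time.time() - 7200},   # waited 2 hours
           {"tape": "T3", "submitted": time.time()}]          # on mounted tape
print(next_request(pending, mounted_tape="T3"))               # the aged request wins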
Sawmill: A Logging File System for a High-Performance RAID Disk Array
1995-01-01
from limiting disk performance, new controller architectures connect the disks directly to the network so that data movement bypasses the file server...These developments raise two questions for file systems: how to get the best performance from a RAID, and how to use such a controller architecture ...the RAID-II storage system; this architecture provides a fast data path that moves data rapidly among the disks, high-speed controller memory, and the
An effective XML based name mapping mechanism within StoRM
NASA Astrophysics Data System (ADS)
Corso, E.; Forti, A.; Ghiselli, A.; Magnoni, L.; Zappi, R.
2008-07-01
In a Grid environment the naming capability allows users to refer to specific data resources in a physical storage system using a high-level logical identifier. This logical identifier is typically organized in a file-system-like structure, a hierarchical tree of names. Storage Resource Manager (SRM) services map the logical identifier to the physical location of data by evaluating a set of parameters such as the desired quality of service and the VOMS attributes specified in the requests. StoRM is an SRM service developed by INFN and ICTP-EGRID to manage files and space on standard POSIX and high-performing parallel and cluster file systems. An upcoming requirement in the Grid data scenario is the orthogonality of the logical name and the physical location of data, in order to refer, with the same identifier, to different copies of data archived in various storage areas with different qualities of service. The mapping mechanism proposed in StoRM is based on an XML document that represents the different storage components managed by the service, the storage areas defined by the site administrator, the quality of service they provide and the Virtual Organizations that want to use the storage area. An appropriate directory tree is realized in each storage component reflecting the XML schema. In this scenario StoRM is able to identify the physical location of requested data by evaluating the logical identifier and the specified attributes following the XML schema, without querying any database service. This paper presents the namespace schema defined, the different entities represented and the technical details of the StoRM implementation.
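The mapping idea can be illustrated with a small resolver over an XML namespace document; the element and attribute names below are invented for the sketch and are not StoRM's actual schema.

# Resolve a logical file name to a physical path by matching the requested
# VO and quality-of-service attributes against the XML namespace document,
# with no database lookup involved.
import xml.etree.ElementTree as ET

NAMESPACE_XML = """
<namespace>
  <storage-area vo="atlas" quality="disk" root="/storage/disk/atlas"/>
  <storage-area vo="atlas" quality="tape" root="/storage/tape/atlas"/>
</namespace>
"""

def resolve(lfn: str, vo: str, quality: str) -> str:
    tree = ET.fromstring(NAMESPACE_XML)
    for sa in tree.iter("storage-area"):
        if sa.get("vo") == vo and sa.get("quality") == quality:
            return sa.get("root") + lfn          # physical path
    raise LookupError("no storage area matches the requested attributes")

print(resolve("/raw/run123/file.root", "atlas", "tape"))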
75 FR 6000 - Privacy Act of 1974; Systems of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2010-02-05
... storage media. RETRIEVABILITY: Records are retrieved by the user's name, Social Security Number, or... amended. DATES: This proposed action will be effective without further notice on March 8, 2010, unless....'' STORAGE: Delete entry and replace with ``Paper in file folders and electronic storage media...
Performance Analysis of the Unitree Central File
NASA Technical Reports Server (NTRS)
Pentakalos, Odysseas I.; Flater, David
1994-01-01
This report consists of two parts. The first part briefly comments on the documentation status of two major systems at NASA's Center for Computational Sciences, specifically the Cray C98 and the Convex C3830. The second part describes the work done on improving the performance of file transfers between the Unitree Mass Storage System running on the Convex file server and the users' workstations distributed over a large geographic area.
Recent evolution of the offline computing model of the NOvA experiment
Habig, Alec; Norman, A.; Group, Craig
2015-12-23
The NOvA experiment at Fermilab is a long-baseline neutrino experiment designed to study ν e appearance in a ν μ beam. Over the last few years there has been intense work to streamline the computing infrastructure in preparation for data, which started to flow in from the far detector in Fall 2013. Major accomplishments for this effort include migration to the use of off-site resources through the use of the Open Science Grid and upgrading the file-handling framework from simple disk storage to a tiered system using a comprehensive data management and delivery system to find and access files on either disk or tape storage. NOvA has already produced more than 6.5 million files and more than 1 PB of raw data and Monte Carlo simulation files which are managed under this model. In addition, the current system has demonstrated sustained rates of up to 1 TB/hour of file transfer by the data handling system. NOvA pioneered the use of new tools and this paved the way for their use by other Intensity Frontier experiments at Fermilab. Most importantly, the new framework places the experiment's infrastructure on a firm foundation, and is ready to produce the files needed for first physics.
Recent Evolution of the Offline Computing Model of the NOvA Experiment
NASA Astrophysics Data System (ADS)
Habig, Alec; Norman, A.
2015-12-01
The NOvA experiment at Fermilab is a long-baseline neutrino experiment designed to study νe appearance in a νμ beam. Over the last few years there has been intense work to streamline the computing infrastructure in preparation for data, which started to flow in from the far detector in Fall 2013. Major accomplishments for this effort include migration to the use of off-site resources through the use of the Open Science Grid and upgrading the file-handling framework from simple disk storage to a tiered system using a comprehensive data management and delivery system to find and access files on either disk or tape storage. NOvA has already produced more than 6.5 million files and more than 1 PB of raw data and Monte Carlo simulation files which are managed under this model. The current system has demonstrated sustained rates of up to 1 TB/hour of file transfer by the data handling system. NOvA pioneered the use of new tools and this paved the way for their use by other Intensity Frontier experiments at Fermilab. Most importantly, the new framework places the experiment's infrastructure on a firm foundation, and is ready to produce the files needed for first physics.
The challenge of a data storage hierarchy
NASA Technical Reports Server (NTRS)
Ruderman, Michael
1992-01-01
A discussion of Mesa Archival Systems' data archiving system is presented. This data archiving system is strictly a software system that is implemented on a mainframe and manages data in permanent file storage. Emphasis is placed on the fact that any kind of client system on the network can be connected through the Unix interface of the data archiving system.
Measurements over distributed high performance computing and storage systems
NASA Technical Reports Server (NTRS)
Williams, Elizabeth; Myers, Tom
1993-01-01
A strawman proposal is given for a framework for presenting a common set of metrics for supercomputers, workstations, file servers, mass storage systems, and the networks that interconnect them. Production control and database systems are also included. Though other applications and third-party software systems are not addressed, it is important to measure them as well.
76 FR 22682 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2011-04-22
...: Maintained in file folders and computer storage media. Retrievability: Retrieved by name and/or Social... folders and computer storage media.'' * * * * * System Manager(s) and address: Delete entry and replace... provide their full name, Social Security Number (SSN), any details which may assist in locating records...
Non-volatile main memory management methods based on a file system.
Oikawa, Shuichi
2014-01-01
There are upcoming non-volatile (NV) memory technologies that provide byte addressability and high performance; PCM, MRAM, and STT-RAM are examples. Such NV memory can be used as storage because of its data persistency without power supply, while it can be used as main memory because its performance matches that of DRAM. A number of studies have investigated its use for main memory and storage; they were, however, conducted independently. This paper presents methods that enable the integration of main memory and file system management for NV memory. Such integration allows NV memory to be utilized simultaneously as both main memory and storage. The presented methods use a file system as their basis for NV memory management. We implemented the proposed methods in the Linux kernel and performed an evaluation on the QEMU system emulator. The evaluation results show that (1) the proposed methods can perform comparably to the existing DRAM memory allocator and significantly better than page swapping, (2) their performance is affected by the internal data structures of a file system, and (3) data structures appropriate for traditional hard disk drives do not always work effectively for byte-addressable NV memory. We also evaluated the effects caused by the longer access latency of NV memory via cycle-accurate full-system simulation. The results show that the effect on page allocation cost is limited if the increase in latency is moderate.
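On a conventional OS the integration can be approximated with a memory-mapped file: the same bytes act as byte-addressable memory and as persistent storage. This is only an analogy to the paper's in-kernel methods, sketched under that assumption.

# The "heap" lives in a file-system file and is mapped: stores through the
# mapping are memory accesses, and flushing makes them persistent.
import mmap, os

path = "nvheap.bin"
with open(path, "wb") as f:
    f.truncate(4096)                     # reserve one page of "NV memory"
with open(path, "r+b") as f:
    mem = mmap.mmap(f.fileno(), 4096)    # byte-addressable view of the file
    mem[0:5] = b"hello"                  # a store instruction, in effect
    mem.flush()                          # persistence point
    mem.close()
with open(path, "rb") as f:
    assert f.read(5) == b"hello"         # the same bytes, read as storage
os.remove(path)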
75 FR 36642 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-28
... relating to the issue, return, and accountability of keys to secure areas. Records may contain name, Social... disposing of records in the system: Storage: Paper records in file folders and electronic storage media. Retrievability: By name, Social Security Number (SSN), key number, personal identification number (PIN), Magnetic...
75 FR 17707 - Arlington Storage Company, LLC; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2010-04-07
... Storage Company, LLC; Notice of Filing March 30, 2010. Take notice that on March 24, 2010, Arlington Storage Company, LLC (ASC), Two Brush Creek Boulevard, Kansas City, Missouri 64112, filed an application... existing underground natural gas storage facility located in Schuyler County, New York known as the Seneca...
NASA Technical Reports Server (NTRS)
Kobler, Ben (Editor); Hariharan, P. C. (Editor); Blasso, L. G. (Editor)
1992-01-01
This report contains copies of nearly all of the technical papers and viewgraphs presented at the NSSDC Conference on Mass Storage Systems and Technologies for Space and Earth Science Application. This conference served as a broad forum for the discussion of a number of important issues in the field of mass storage systems. Topics include the following: magnetic disk and tape technologies; optical disk and tape; software storage and file management systems; and experiences with the use of a large, distributed storage system. The technical presentations describe, among other things, integrated mass storage systems that are expected to be available commercially. Also included is a series of presentations from Federal Government organizations and research institutions covering their mass storage requirements for the 1990's.
NASA Technical Reports Server (NTRS)
Kobler, Ben (Editor); Hariharan, P. C. (Editor); Blasso, L. G. (Editor)
1992-01-01
This report contains copies of nearly all of the technical papers and viewgraphs presented at the National Space Science Data Center (NSSDC) Conference on Mass Storage Systems and Technologies for Space and Earth Science Applications. This conference served as a broad forum for the discussion of a number of important issues in the field of mass storage systems. Topics include magnetic disk and tape technologies, optical disk and tape, software storage and file management systems, and experiences with the use of a large, distributed storage system. The technical presentations describe, among other things, integrated mass storage systems that are expected to be available commercially. Also included is a series of presentations from Federal Government organizations and research institutions covering their mass storage requirements for the 1990s.
High performance network and channel-based storage
NASA Technical Reports Server (NTRS)
Katz, Randy H.
1991-01-01
In the traditional mainframe-centered view of a computer system, storage devices are coupled to the system through complex hardware subsystems called input/output (I/O) channels. With the dramatic shift towards workstation-based computing, and its associated client/server model of computation, storage facilities are now found attached to file servers and distributed throughout the network. We discuss the underlying technology trends that are leading to high performance network-based storage, namely advances in networks, storage devices, and I/O controller and server architectures. We review several commercial systems and research prototypes that are leading to a new approach to high performance computing based on network-attached storage.
Integration of cloud-based storage in BES III computing environment
NASA Astrophysics Data System (ADS)
Wang, L.; Hernandez, F.; Deng, Z.
2014-06-01
We present on-going work that aims to evaluate the suitability of cloud-based storage as a supplement to the Lustre file system for storing experimental data for the BES III physics experiment, and as a backend for storing files belonging to individual members of the collaboration. In particular, we discuss our findings regarding the support of cloud-based storage in the software stack of the experiment. We report on our development work that improves the support of CERN's ROOT data analysis framework and allows efficient remote access to data through several cloud storage protocols. We also present our efforts to provide the experiment with efficient command line tools for navigating and interacting with cloud storage-based data repositories, both from interactive sessions and grid jobs.
ZFS on RBODs - Leveraging RAID Controllers for Metrics and Enclosure Management
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stearman, D. M.
2015-03-30
Traditionally, the Lustre file system has relied on the ldiskfs file system with reliable RAID (Redundant Array of Independent Disks) storage underneath. As of Lustre 2.4, ZFS was added as a backend file system, with built-in software RAID, thereby removing the need for expensive RAID controllers. ZFS was designed to work with JBOD (Just a Bunch Of Disks) storage enclosures under the Solaris Operating System, which provided a rich device management system. Long-time users of the Lustre file system have relied on the RAID controllers to provide metrics and enclosure monitoring and management services, with rich APIs and command line interfaces. This paper studies a hybrid approach using an advanced, full-featured RAID enclosure which is presented to the host as a JBOD. This RBOD (RAIDed Bunch Of Disks) allows ZFS to do the RAID protection and error correction, while the RAID controller handles management of the disks and monitors the enclosure. It was hoped that the value of the RAID controller features would offset the additional cost, and that performance would not suffer in this mode. The test results revealed that the hybrid RBOD approach did suffer reduced performance.
High volume data storage architecture analysis
NASA Technical Reports Server (NTRS)
Malik, James M.
1990-01-01
A High Volume Data Storage Architecture Analysis was conducted. The results, presented in this report, will be applied to problems of high volume data requirements such as those anticipated for the Space Station Control Center. High volume data storage systems at several different sites were analyzed for archive capacity, storage hierarchy and migration philosophy, and retrieval capabilities. Proposed architectures were solicited from the sites selected for in-depth analysis. Model architectures for a hypothetical data archiving system, for a high speed file server, and for high volume data storage are attached.
75 FR 63465 - Hill-Lake Gas Storage, LLC; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-15
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR10-137-000] Hill-Lake Gas Storage, LLC; Notice of Filing October 7, 2010. Take notice that on September 30, 2010, Hill-Lake Gas Storage, LLC (Hill-Lake) filed a revised Statement of Operating Conditions (SOC) for its Storage Services...
75 FR 35780 - ONEOK Texas Gas Storage, LLC; Notice of Baseline Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-23
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR10-36-000] ONEOK Texas Gas Storage, LLC; Notice of Baseline Filing June 16, 2010. Take notice that on June 15, 2010, ONEOK Texas Gas Storage, LLC submitted a baseline filing of its Storage Statement of Operating Conditions for services...
A Patient Record-Filing System for Family Practice
Levitt, Cheryl
1988-01-01
The efficient storage and easy retrieval of quality records are a central concern of good family practice. Many physicians starting out in practice have difficulty choosing a practical and lasting system for storing their records. Some who have established practices are installing computers in their offices and finding that their filing systems are worn, outdated, and incompatible with computerized systems. This article describes a new filing system installed simultaneously with a new computer system in a family-practice teaching centre. The approach adopted solved all identifiable problems and is applicable in family practices of all sizes.
Parallel file system with metadata distributed across partitioned key-value store c
Bent, John M.; Faibish, Sorin; Grider, Gary; Torres, Aaron
2017-09-19
Improved techniques are provided for storing metadata associated with a plurality of sub-files associated with a single shared file in a parallel file system. The shared file is generated by a plurality of applications executing on a plurality of compute nodes. A compute node implements a Parallel Log Structured File System (PLFS) library to store at least one portion of the shared file generated by an application executing on the compute node and metadata for the at least one portion of the shared file on one or more object storage servers. The compute node is also configured to implement a partitioned data store for storing a partition of the metadata for the shared file, wherein the partitioned data store communicates with partitioned data stores on other compute nodes using a message passing interface. The partitioned data store can be implemented, for example, using Multidimensional Data Hashing Indexing Middleware (MDHIM).
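A toy model of the partitioned metadata store: each sub-file write becomes a record keyed by (file name, logical offset), and keys are range-partitioned across nodes. The MPI transport and MDHIM specifics are omitted, and partitioning by stripe is an assumption for illustration.

# Range-partition sub-file write records across metadata nodes.
NODES = 4

def partition(offset: int, stripe: int = 1 << 20) -> int:
    return (offset // stripe) % NODES        # which node owns this key range

stores = [dict() for _ in range(NODES)]      # one key-value store per node

def record_write(fname: str, offset: int, length: int, object_id: str) -> None:
    stores[partition(offset)][(fname, offset)] = (length, object_id)

record_write("ckpt.0001", 0, 65536, "obj-17")        # lands on node 0
record_write("ckpt.0001", 3 << 20, 65536, "obj-18")  # lands on node 3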
2007-01-15
it can detect specifically proscribed content changes to critical files (e.g., illegal shells inserted into /etc/passwd). Fourth, it can detect the...UNIX password management involves a pair of inter-related files (/etc/passwd and /etc/shadow). The corresponding access patterns seen at the storage...content integrity verification is utilized. As a concrete example, consider a UNIX system password file (/etc/passwd), which consists of a set of well
Social Influences on User Behavior in Group Information Repositories
ERIC Educational Resources Information Center
Rader, Emilee Jeanne
2009-01-01
Group information repositories are systems for organizing and sharing files kept in a central location that all group members can access. These systems are often assumed to be tools for storage and control of files and their metadata, not tools for communication. The purpose of this research is to better understand user behavior in group…
cadcVOFS: A FUSE Based File System Layer for VOSpace
NASA Astrophysics Data System (ADS)
Kavelaars, J.; Dowler, P.; Jenkins, D.; Hill, N.; Damian, A.
2012-09-01
The CADC is now making extensive use of the VOSpace protocol for user-managed storage. The VOSpace standard allows a diverse set of rich data services to be delivered to users via a simple protocol. We have recently developed cadcVOFS, a FUSE-based file-system layer for VOSpace. cadcVOFS provides a filesystem layer on top of VOSpace so that standard Unix tools (such as 'find', 'emacs', 'awk', etc.) can be used directly on the data objects stored in VOSpace. Once mounted, the VOSpace appears as a network storage volume inside the operating system. Within the CADC Cloud Computing project (CANFAR) we have used VOSpace as the method for retrieving and storing processing inputs and products. The abstraction of storage is an important component of Cloud Computing, and the high use level of our VOSpace service reflects this.
Systems and methods for an extensible business application framework
NASA Technical Reports Server (NTRS)
Bell, David G. (Inventor); Crawford, Michael (Inventor)
2012-01-01
Method and systems for editing data from a query result include requesting a query result using a unique collection identifier for a collection of individual files and a unique identifier for a configuration file that specifies a data structure for the query result. A query result is generated that contains a plurality of fields as specified by the configuration file, by combining each of the individual files associated with a unique identifier for a collection of individual files. The query result data is displayed with a plurality of labels as specified in the configuration file. Edits can be performed by querying a collection of individual files using the configuration file, editing a portion of the query result, and transmitting only the edited information for storage back into a data repository.
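The query flow can be sketched as follows; the configuration keys and record fields are hypothetical, chosen only to show a configuration file driving which fields and labels appear in the combined result.

# A config names the fields to extract and the labels to display; the query
# result is assembled by projecting each individual file in the collection.
config = {"fields": ["id", "status"], "labels": {"id": "ID", "status": "State"}}
collection = [{"id": 1, "status": "open", "owner": "kim"},
              {"id": 2, "status": "closed", "owner": "lee"}]

def run_query(collection, config):
    # keep only the fields named in the configuration file
    return [{f: rec.get(f) for f in config["fields"]} for rec in collection]

headers = [config["labels"][f] for f in config["fields"]]
print(headers, run_query(collection, config))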
Scalable global grid catalogue for Run3 and beyond
NASA Astrophysics Data System (ADS)
Martinez Pedreira, M.; Grigoras, C.;
2017-10-01
The AliEn (ALICE Environment) file catalogue is a global unique namespace providing mapping between a UNIX-like logical name structure and the corresponding physical files distributed over 80 storage elements worldwide. Powerful search tools and hierarchical metadata information are integral parts of the system and are used by the Grid jobs as well as local users to store and access all files on the Grid storage elements. The catalogue has been in production since 2005 and over the past 11 years has grown to more than 2 billion logical file names. The backend is a set of distributed relational databases, ensuring smooth growth and fast access. Due to the anticipated fast future growth, we are looking for ways to enhance the performance and scalability by simplifying the catalogue schema while keeping the functionality intact. We investigated different backend solutions, such as distributed key value stores, as replacement for the relational database. This contribution covers the architectural changes in the system, together with the technology evaluation, benchmark results and conclusions.
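A minimal model of such a catalogue, mapping a UNIX-like logical name to its physical replicas; the logical name and storage URLs below are invented examples, not real AliEn entries.

# Logical namespace whose leaves point at replicas on physical storage
# elements; lookups and prefix listings need no knowledge of the backend.
catalogue = {
    "/alice/data/2016/run244918/raw.root": [
        "root://se01.example.org//eos/alice/raw/0042",
        "root://se02.example.org//lustre/alice/raw/0042",
    ],
}

def whereis(lfn: str) -> list:
    return catalogue.get(lfn, [])            # all physical replicas of the LFN

def ls(prefix: str) -> list:
    return [n for n in catalogue if n.startswith(prefix.rstrip("/") + "/")]

print(whereis("/alice/data/2016/run244918/raw.root"))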
Performance of the engineering analysis and data system 2 common file system
NASA Technical Reports Server (NTRS)
Debrunner, Linda S.
1993-01-01
The Engineering Analysis and Data System (EADS) was used from April 1986 to July 1993 to support large-scale scientific and engineering computation (e.g. computational fluid dynamics) at Marshall Space Flight Center. The need for an updated system resulted in an RFP in June 1991, after which a contract was awarded to Cray Grumman. EADS II was installed in February 1993, and by July 1993 most users had been migrated. EADS II is a network of heterogeneous computer systems supporting scientific and engineering applications. The Common File System (CFS) is a key component of this system. The CFS provides a seamless, integrated environment to the users of EADS II including both disk and tape storage. UniTree software is used to implement this hierarchical storage management system. The performance of the CFS suffered during the early months of the production system. Several of the performance problems were traced to software bugs which have been corrected. Other problems were associated with hardware. However, the use of NFS in UniTree UCFM software limits the performance of the system. The performance issues related to the CFS have led to a need to develop a greater understanding of the CFS organization. This paper will first describe the EADS II with emphasis on the CFS. Then, a discussion of mass storage systems will be presented, and methods of measuring the performance of the Common File System will be outlined. Finally, areas for further study will be identified and conclusions will be drawn.
Baran, Michael C; Moseley, Hunter N B; Sahota, Gurmukh; Montelione, Gaetano T
2002-10-01
Modern protein NMR spectroscopy laboratories have a rapidly growing need for an easily queried local archival system of raw experimental NMR datasets. SPINS (Standardized ProteIn Nmr Storage) is an object-oriented relational database that provides facilities for high-volume NMR data archival, organization of analyses, and dissemination of results to the public domain by automatic preparation of the header files required for submission of data to the BioMagResBank (BMRB). The current version of SPINS coordinates the process from data collection to BMRB deposition of raw NMR data by standardizing and integrating the storage and retrieval of these data in a local laboratory file system. Additional facilities include a data mining query tool, graphical database administration tools, and an NMRStar v2.1.1 file generator. SPINS also includes a user-friendly internet-based graphical user interface, which is optionally integrated with Varian VNMR NMR data collection software. This paper provides an overview of the data model underlying the SPINS database system, a description of its implementation in Oracle, and an outline of future plans for the SPINS project.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-27
... Storage Water Supply, LLC; Notice of Preliminary Permit Application Accepted for Filing and Soliciting...-acre reservoir; (4) a turnout to supply project effluent water to an existing irrigation system; (5) a...,000 megawatt-hours. Applicant Contact: Bart M. O'Keeffe, West Maui Pumped Storage Water Supply, LLC, P...
Requirements for a network storage service
NASA Technical Reports Server (NTRS)
Kelly, Suzanne M.; Haynes, Rena A.
1991-01-01
Sandia National Laboratories provides a high performance classified computer network as a core capability in support of its mission of nuclear weapons design and engineering, physical sciences research, and energy research and development. The network, locally known as the Internal Secure Network (ISN), comprises multiple distributed local area networks (LANs) residing in New Mexico and California. The TCP/IP protocol suite is used for inter-node communications. Scientific workstations and mid-range computers, running UNIX-based operating systems, compose most LANs. One LAN, operated by the Sandia Corporate Computing Directorate, is a general purpose resource providing a supercomputer and a file server to the entire ISN. The current file server on the supercomputer LAN is an implementation of the Common File Server (CFS). Subsequent to the design of the ISN, Sandia reviewed its mass storage requirements and chose to enter into a competitive procurement to replace the existing file server with one more adaptable to a UNIX/TCP/IP environment. The requirements study for the network was the starting point for the requirements study for the new file server. The file server is called the Network Storage Service (NSS) and its requirements are described. An application or functional description of the NSS is given. The final section adds performance, capacity, and access constraints to the requirements.
Active Storage with Analytics Capabilities and I/O Runtime System for Petascale Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Choudhary, Alok
Computational scientists must understand results from experimental, observational and computational simulation generated data to gain insights and perform knowledge discovery. As systems approach the petascale range, problems that were unimaginable a few years ago are within reach. With the increasing volume and complexity of data produced by ultra-scale simulations and high-throughput experiments, understanding the science is largely hampered by the lack of comprehensive I/O, storage, acceleration of data manipulation, analysis, and mining tools. Scientists require techniques, tools and infrastructure to facilitate better understanding of their data, in particular the ability to effectively perform complex data analysis, statistical analysis and knowledge discovery. The goal of this work is to enable more effective analysis of scientific datasets through the integration of enhancements in the I/O stack, from active storage support at the file system layer to MPI-IO and high-level I/O library layers. We propose to provide software components to accelerate data analytics, mining, I/O, and knowledge discovery for large-scale scientific applications, thereby increasing productivity of both scientists and the systems. Our approaches include: 1) designing interfaces in high-level I/O libraries, such as parallel netCDF, for applications to activate data mining operations at the lower I/O layers; 2) enhancing MPI-IO runtime systems to incorporate the functionality developed as part of the runtime system design; 3) developing parallel data mining programs as part of the runtime library for the server-side file system in PVFS; and 4) prototyping an active storage cluster, which will utilize multicore CPUs, GPUs, and FPGAs to carry out the data mining workload.
Random access in large-scale DNA data storage.
Organick, Lee; Ang, Siena Dumas; Chen, Yuan-Jyue; Lopez, Randolph; Yekhanin, Sergey; Makarychev, Konstantin; Racz, Miklos Z; Kamath, Govinda; Gopalan, Parikshit; Nguyen, Bichlien; Takahashi, Christopher N; Newman, Sharon; Parker, Hsing-Yeh; Rashtchian, Cyrus; Stewart, Kendall; Gupta, Gagan; Carlson, Robert; Mulligan, John; Carmean, Douglas; Seelig, Georg; Ceze, Luis; Strauss, Karin
2018-03-01
Synthetic DNA is durable and can encode digital data with high density, making it an attractive medium for data storage. However, recovering stored data on a large scale currently requires all the DNA in a pool to be sequenced, even if only a subset of the information needs to be extracted. Here, we encode and store 35 distinct files (over 200 MB of data) in more than 13 million DNA oligonucleotides, and show that we can recover each file individually and with no errors, using a random access approach. We design and validate a large library of primers that enable individual recovery of all files stored within the DNA. We also develop an algorithm that greatly reduces the sequencing read coverage required for error-free decoding by maximizing information from all sequence reads. These advances demonstrate a viable, large-scale system for DNA data storage and retrieval.
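The bookkeeping side of primer-based random access is simple to sketch, even though the laboratory side is not: each file identifier maps to a primer pair, and a sequencing read is attributed to a file by its flanking primer sequences. The sequences and selection rule below are invented for illustration:

    # Toy primer-pair bookkeeping for random access; real primer libraries
    # are carefully screened for cross-reactions, unlike these made-up ones.
    primer_library = {
        'file_001': ('ACGTACGTACGTACGTACGT', 'TGCATGCATGCATGCATGCA'),
        'file_002': ('GGTTCCAAGGTTCCAAGGTT', 'CCAAGGTTCCAAGGTTCCAA'),
    }

    def reads_for_file(file_id, reads):
        fwd, rev = primer_library[file_id]
        return [r for r in reads if r.startswith(fwd) and r.endswith(rev)]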
76 FR 4102 - Washington 10 Storage Corporation; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2011-01-24
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR11-80-000] Washington 10 Storage Corporation; Notice of Filing January 13, 2011. Take notice that on January 12, 2011, Washington 10 Storage Corporation filed a revised Statement of Operating Conditions (SOC) to correct...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-09-29
... efficiency; (7) optimization of generation and energy storage ranging from a 4 unit, 500 megawatts (MW) (4... brief comments up to 6,000 characters, without prior registration, using the eComment system at http...
Hierarchical storage management system evaluation
NASA Technical Reports Server (NTRS)
Woodrow, Thomas S.
1993-01-01
The Numerical Aerodynamic Simulation (NAS) Program at NASA Ames Research Center has been developing a hierarchical storage management system, NAStore, for some 6 years. This evaluation compares functionality, performance, reliability, and other factors of NAStore and three commercial alternatives. FileServ is found to be slightly better overall than NAStore and DMF. UniTree is found to be severely lacking in comparison.
Cooperative storage of shared files in a parallel computing system with dynamic block size
Bent, John M.; Faibish, Sorin; Grider, Gary
2015-11-10
Improved techniques are provided for parallel writing of data to a shared object in a parallel computing system. A method is provided for storing data generated by a plurality of parallel processes to a shared object in a parallel computing system. The method is performed by at least one of the processes and comprises: dynamically determining a block size for storing the data; exchanging a determined amount of the data with at least one additional process to achieve a block of the data having the dynamically determined block size; and writing the block of the data having the dynamically determined block size to a file system. The determined block size comprises, e.g., a total amount of the data to be stored divided by the number of parallel processes. The file system comprises, for example, a log structured virtual parallel file system, such as a Parallel Log-Structured File System (PLFS).
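The block-size rule quoted above (total data divided by the number of processes) and the aligned collective write are easy to sketch with mpi4py; the all-gather below is a simplistic stand-in for the pairwise exchange the patent describes, and remainder bytes are ignored for brevity:

    # Run under MPI, e.g. mpiexec -n 4 python cooperative_write.py
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank, nprocs = comm.Get_rank(), comm.Get_size()

    # Unevenly sized per-process data (single-byte elements for simplicity).
    local = np.full(1000 + rank * 10, rank, dtype=np.uint8)

    # Dynamically determined block size: total data / number of processes.
    total = comm.allreduce(local.size, op=MPI.SUM)
    block = total // nprocs  # remainder handling elided

    # Gather everything and carve out this rank's aligned block (a crude
    # stand-in for the neighbor exchange described in the abstract).
    alldata = np.concatenate(comm.allgather(local))
    mine = alldata[rank * block:(rank + 1) * block]

    fh = MPI.File.Open(comm, 'shared.out', MPI.MODE_CREATE | MPI.MODE_WRONLY)
    fh.Write_at_all(rank * block, mine)  # every rank writes one aligned block
    fh.Close()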
76 FR 5571 - Combined Notice of Filings No. 2
Federal Register 2010, 2011, 2012, 2013, 2014
2011-02-01
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission Combined Notice of Filings No. 2 January.... Description: Freebird Gas Storage, L.L.C. submits tariff filing per 154.203: Freebird Gas Storage Baseline....203: ASC Baseline Compliance Filing, to be effective 1/19/2011. Filed Date: 01/19/2011. Accession...
NASA Technical Reports Server (NTRS)
Blackwell, Kim; Blasso, Len (Editor); Lipscomb, Ann (Editor)
1991-01-01
The proceedings of the National Space Science Data Center Conference on Mass Storage Systems and Technologies for Space and Earth Science Applications held July 23 through 25, 1991 at the NASA/Goddard Space Flight Center are presented. The program includes a keynote address, invited technical papers, and selected technical presentations to provide a broad forum for the discussion of a number of important issues in the field of mass storage systems. Topics include magnetic disk and tape technologies, optical disk and tape, software storage and file management systems, and experiences with the use of a large, distributed storage system. The technical presentations describe integrated mass storage systems that are expected to be available commercially. Also included is a series of presentations from Federal Government organizations and research institutions covering their mass storage requirements for the 1990's.
A high-speed network for cardiac image review.
Elion, J L; Petrocelli, R R
1994-01-01
A high-speed fiber-based network for the transmission and display of digitized full-motion cardiac images has been developed. Based on Asynchronous Transfer Mode (ATM), the network is scalable, meaning that the same software and hardware are used for a small local area network or for a large multi-institutional network. The system can handle uncompressed digital angiographic images, considered to be at the "high-end" of the bandwidth requirements. Along with the networking, a general-purpose multi-modality review station has been implemented without specialized hardware. This station can store a full injection sequence in "loop RAM" in a 512 x 512 format, then interpolate to 1024 x 1024 while displaying at 30 frames per second. The network and review stations connect to a central file server that uses a virtual file system to make a large high-speed RAID storage disk and associated off-line storage tapes and cartridges all appear as a single large file system to the software. In addition to supporting archival storage and review, the system can also digitize live video using high-speed Direct Memory Access (DMA) from the frame grabber to present uncompressed data to the network. Fully functional prototypes have provided the proof of concept, with full deployment in the institution planned as the next stage.
76 FR 2368 - Washington 10 Storage Corporation; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2011-01-13
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR11-78-000] Washington 10 Storage Corporation; Notice of Filing January 5, 2011. Take notice that on January 4, 2011, Washington 10 Storage Corporation filed a Statement of Operating Conditions to revise certain provisions of its Firm and...
77 FR 10490 - Arcadia Gas Storage, LLC; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-22
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR12-12-001] Arcadia Gas Storage, LLC; Notice of Filing Take notice that on February 13, 2012, Arcadia Gas Storage, LLC filed a revised Statement of Operating Conditions to further define the priority of service of its proposed...
77 FR 6107 - Arcadia Gas Storage, LLC; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-07
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR12-12-000] Arcadia Gas Storage, LLC; Notice of Filing Take notice that on January 30, 2012, Arcadia Gas Storage, LLC filed a Statement of Operating Conditions to set forth the addition of its Enhanced Authorized Overrun Service. Any...
75 FR 74706 - Washington 10 Storage Corporation; Notice of Baseline Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2010-12-01
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR10-37-002] Washington 10 Storage Corporation; Notice of Baseline Filing November 23, 2010. Take notice that on November 19, 2010, Washington 10 Storage Corporation submitted a revised baseline filing of its Statement of Operating...
75 FR 37786 - Washington 10 Storage Corporation; Notice of Baseline Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-30
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR10-37-000] Washington 10 Storage Corporation; Notice of Baseline Filing June 23, 2010. Take notice that on June 18, 2010, Washington 10 Storage Corporation submitted a baseline filing of its Statement of Operating Conditions for...
76 FR 26719 - Washington 10 Storage Corporation; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-09
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR11-80-001] Washington 10 Storage Corporation; Notice of Filing Take notice that on April 29, 2011, Washington 10 Storage Corporation (Washington 10) filed a revised Statement of Operating Conditions (SOC) to comply with an April 25...
76 FR 78915 - Washington 10 Storage Corporation; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2011-12-20
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR12-10-000] Washington 10 Storage Corporation; Notice of Filing Take notice that on December 13, 2011, Washington 10 Storage Corporation (Washington 10) filed a Statement of Operating Conditions to revise certain provisions of its Firm...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-28
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Project No. 14125-000] Reliable Storage 1 LLC; Notice of Preliminary Permit Application Accepted for Filing and Soliciting Comments, Motions To Intervene, and Competing Applications On March 25, 2011, Reliable Storage 1 LLC, filed an...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-28
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Project No. 14121-000] Reliable Storage 1 LLC; Notice of Preliminary Permit Application Accepted for Filing and Soliciting Comments, Motions To Intervene, and Competing Applications On March 25, 2011, Reliable Storage 1 LLC, filed an...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-28
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Project No. 14122-000] Reliable Storage 1 LLC; Notice of Preliminary Permit Application Accepted for Filing and Soliciting Comments, Motions To Intervene, and Competing Applications On March 25, 2011, Reliable Storage 1 LLC, filed an...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-28
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Project No. 14124-000] Reliable Storage 1 LLC; Notice of Preliminary Permit Application Accepted for Filing and Soliciting Comments, Motions To Intervene, and Competing Applications On March 25, 2011, Reliable Storage 1 LLC, filed an...
Grid data access on widely distributed worker nodes using scalla and SRM
NASA Astrophysics Data System (ADS)
Jakl, P.; Lauret, J.; Hanushevsky, A.; Shoshani, A.; Sim, A.; Gu, J.
2008-07-01
Facing the reality of storage economics, NP experiments such as RHIC/STAR have been engaged in a shift of the analysis model, and now heavily rely on using cheap disks attached to processing nodes, as such a model is extremely beneficial over expensive centralized storage. Additionally, exploiting storage aggregates with enhanced distributed computing capabilities such as dynamic space allocation (lifetime of spaces), file management on shared storages (lifetime of files, pinning file), storage policies or a uniform access to heterogeneous storage solutions is not an easy task. The Xrootd/Scalla system allows for storage aggregation. We will present an overview of the largest deployment of Scalla (Structured Cluster Architecture for Low Latency Access) in the world spanning over 1000 CPUs co-sharing the 350 TB Storage Elements and the experience on how to make such a model work in the RHIC/STAR standard analysis framework. We will explain the key features and approach on how to make access to mass storage (HPSS) possible in such a large deployment context. Furthermore, we will give an overview of a fully 'gridified' solution using the plug-and-play features of Scalla architecture, replacing standard storage access with grid middleware SRM (Storage Resource Manager) components designed for space management and will compare the solution with the standard Scalla approach in use in STAR for the past 2 years. Integration details, future plans and status of development will be explained in the area of best transfer strategy between multiple-choice data pools and best placement with respect to load balancing and interoperability with other SRM aware tools or implementations.
Low-cost high performance distributed data storage for multi-channel observations
NASA Astrophysics Data System (ADS)
Liu, Ying-bo; Wang, Feng; Deng, Hui; Ji, Kai-fan; Dai, Wei; Wei, Shou-lin; Liang, Bo; Zhang, Xiao-li
2015-10-01
The New Vacuum Solar Telescope (NVST) is a 1-m solar telescope that aims to observe the fine structures in both the photosphere and the chromosphere of the Sun. The observational data acquired simultaneously from one channel for the chromosphere and two channels for the photosphere bring great challenges to the data storage of NVST. The multi-channel instruments of NVST, including scientific cameras and multi-band spectrometers, generate at least 3 terabytes of data per day and require high access performance while storing massive short-exposure images. It is worth studying and implementing a storage system for NVST which would balance the data availability, access performance and the cost of development. In this paper, we build a distributed data storage system (DDSS) for NVST and then deeply evaluate the availability of real-time data storage on a distributed computing environment. The experimental results show that two factors, i.e., the number of concurrent reads/writes and the file size, are critically important for improving the performance of data access on a distributed environment. Referring to these two factors, three strategies for storing FITS files are presented and implemented to ensure the access performance of the DDSS under conditions of simultaneous multi-host writes and reads. The real applications of the DDSS prove that the system is capable of meeting the requirements of NVST real-time high performance observational data storage. Our study on the DDSS is the first attempt for modern astronomical telescope systems to store real-time observational data on a low-cost distributed system. The research results and corresponding techniques of the DDSS provide a new option for designing real-time massive astronomical data storage systems and will be a reference for future astronomical data storage.
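The two factors the authors single out, concurrency and file size, are straightforward to probe with a small benchmark; the sketch below writes FITS frames from a thread pool using astropy, with the output directory and sizes as placeholders:

    import os, time
    from concurrent.futures import ThreadPoolExecutor
    import numpy as np
    from astropy.io import fits

    OUT_DIR = '/tmp/ddss_test'  # placeholder for the storage under test
    os.makedirs(OUT_DIR, exist_ok=True)

    def write_frame(i, shape=(1024, 1024)):
        # Synthetic short-exposure frame, 12-bit values in int16.
        frame = np.random.randint(0, 4096, size=shape).astype(np.int16)
        fits.writeto(os.path.join(OUT_DIR, f'frame_{i:05d}.fits'),
                     frame, overwrite=True)

    def throughput(n_writers, n_files=64):
        t0 = time.time()
        with ThreadPoolExecutor(max_workers=n_writers) as pool:
            list(pool.map(write_frame, range(n_files)))
        return n_files / (time.time() - t0)

    for n in (1, 4, 16):
        print(f'{n:2d} concurrent writers: {throughput(n):.1f} files/s')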
NASA Astrophysics Data System (ADS)
Piasecki, M.; Ji, P.
2014-12-01
Geoscience data comes in many flavors determined by the type of data: continuous data on a grid or mesh, or discrete data collected at points either as one-time samples or as streams coming off sensors; it can also encompass digital files of any type, such as text files, WORD or EXCEL documents, or audio and video files. We present a storage facility comprised of six nodes, each specialized to host a certain data type: grid-based data (netCDF on a THREDDS server), GIS data (shapefiles using GeoServer), point time-series data (CUAHSI ODM), sample data (EDBS), and any digital data (RAMADDA), plus a server for remote sensing data and its products. While there is overlap in data-type storage capabilities (rasters can go into several of these nodes), we prefer to use dedicated storage facilities that are a) freeware, b) have a good degree of maturity, and c) have shown their utility for storing a certain type. In addition, this arrangement places these commonly used software stacks and storage solutions side by side, which helps in developing interoperability strategies. We have used a DRUPAL-based system to handle user registration and authentication, and also use that system for data submission and data search. In support of this system we developed an extensive controlled vocabulary that amalgamates various CVs used in the geoscience community in order to achieve as high a degree of recognition as possible, such as the CF conventions, CUAHSI CVs, NASA (GCMD), EPA and USGS taxonomies, and GEMET, in addition to ontological representations such as SWEET.
Data Elevator: Efficient Asynchronous Data Movement in Hierarchical Storage Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Byna, Surendra; Dong, Bin; Wu, Kesheng
Multi-layer storage subsystems, including SSD-based burst buffers and disk-based parallel file systems (PFS), are becoming part of HPC systems. However, software for this storage hierarchy is still in its infancy. Applications may have to explicitly move data among the storage layers. We propose Data Elevator for transparently and efficiently moving data between a burst buffer and a PFS. Users specify the final destination for their data, typically on PFS; Data Elevator intercepts the I/O calls, stages data on the burst buffer, and then asynchronously transfers the data to their final destination in the background. This system allows extensive optimizations, such as overlapping read and write operations, choosing I/O modes, and aligning buffer boundaries. In tests with large-scale scientific applications, Data Elevator is as much as 4.2X faster than Cray DataWarp, the state-of-the-art software for burst buffers, and 4X faster than directly writing to PFS. The Data Elevator library uses HDF5's Virtual Object Layer (VOL) for intercepting parallel I/O calls that write data to PFS. The intercepted calls are redirected to the Data Elevator, which provides a handle to write the file in a faster, intermediate burst buffer system. Once the application finishes writing the data to the burst buffer, the Data Elevator job uses HDF5 to move the data to the final destination in an asynchronous manner. Hence, the Data Elevator library is currently useful for applications that call HDF5 for writing data files. Also, the Data Elevator depends on the HDF5 VOL functionality.
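The staging pattern itself reduces to "write fast, migrate in the background"; the sketch below mimics it with h5py and a mover thread, with placeholder paths, and is a conceptual stand-in rather than the Data Elevator's VOL-based call interception:

    import os, shutil, threading
    import numpy as np
    import h5py

    BURST_DIR = '/burst/job1234'   # placeholder burst-buffer mount
    PFS_DIR = '/pfs/project/run1'  # placeholder final destination
    os.makedirs(BURST_DIR, exist_ok=True)
    os.makedirs(PFS_DIR, exist_ok=True)

    def write_checkpoint(name, array):
        path = os.path.join(BURST_DIR, name)
        with h5py.File(path, 'w') as f:
            f.create_dataset('data', data=array)
        # Hand the finished file to a mover thread and return immediately.
        t = threading.Thread(target=shutil.move,
                             args=(path, os.path.join(PFS_DIR, name)))
        t.start()
        return t

    mover = write_checkpoint('step_0001.h5', np.zeros((1024, 1024)))
    mover.join()  # a real system overlaps this with further computation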
Cloud Engineering Principles and Technology Enablers for Medical Image Processing-as-a-Service.
Bao, Shunxing; Plassard, Andrew J; Landman, Bennett A; Gokhale, Aniruddha
2017-04-01
Traditional in-house, laboratory-based medical imaging studies use hierarchical data structures (e.g., NFS file stores) or databases (e.g., COINS, XNAT) for storage and retrieval. The resulting performance from these approaches is, however, impeded by standard network switches since they can saturate network bandwidth during transfer from storage to processing nodes for even moderate-sized studies. To that end, a cloud-based "medical image processing-as-a-service" offers promise in utilizing the ecosystem of Apache Hadoop, which is a flexible framework providing distributed, scalable, fault tolerant storage and parallel computational modules, and HBase, which is a NoSQL database built atop Hadoop's distributed file system. Despite this promise, HBase's load distribution strategy of region split and merge is detrimental to the hierarchical organization of imaging data (e.g., project, subject, session, scan, slice). This paper makes two contributions to address these concerns by describing key cloud engineering principles and technology enhancements we made to the Apache Hadoop ecosystem for medical imaging applications. First, we propose a row-key design for HBase, which is a necessary step that is driven by the hierarchical organization of imaging data. Second, we propose a novel data allocation policy within HBase to strongly enforce collocation of hierarchically related imaging data. The proposed enhancements accelerate data processing by minimizing network usage and localizing processing to machines where the data already exist. Moreover, our approach is amenable to the traditional scan, subject, and project-level analysis procedures, and is compatible with standard command line/scriptable image processing software. Experimental results for an illustrative sample of imaging data reveal that our new HBase policy results in a three-fold time improvement in conversion of classic DICOM to NiFTI file formats when compared with the default HBase region split policy, and nearly a six-fold improvement over a commonly available network file system (NFS) approach even for relatively small file sets. Moreover, file access latency is lower than network attached storage.
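The row-key idea can be shown in a few lines: serialize the imaging hierarchy into a fixed-width, lexicographically ordered key so that related rows stay adjacent, and therefore collocated, in HBase. The layout and the happybase usage below are illustrative assumptions, not the paper's exact design:

    # Fixed-width components keep all rows for one project/subject/session/
    # scan lexicographically adjacent, which HBase stores contiguously.
    def row_key(project, subject, session, scan, slice_idx):
        return (f'{project:>8}|{subject:>10}|{session:>6}|'
                f'{scan:>4}|{slice_idx:06d}').encode()

    # Writing one slice with the happybase client (names are assumptions):
    # import happybase
    # table = happybase.Connection('hbase-master').table('images')
    # table.put(row_key('proj01', 'subj000042', 'sess01', 'T1w', 128),
    #           {b'image:dicom': dicom_bytes})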
Painless File Extraction: The A(rc)--Z(oo) of Internet Archive Formats.
ERIC Educational Resources Information Center
Simmonds, Curtis
1993-01-01
Discusses extraction programs needed to postprocess software downloaded from the Internet that has been archived and compressed for the purposes of storage and file transfer. Archiving formats for DOS, Macintosh, and UNIX operating systems are described; and cross-platform compression utilities are explained. (LRW)
Cardio-PACs: a new opportunity
NASA Astrophysics Data System (ADS)
Heupler, Frederick A., Jr.; Thomas, James D.; Blume, Hartwig R.; Cecil, Robert A.; Heisler, Mary
2000-05-01
It is now possible to replace film-based image management in the cardiac catheterization laboratory with a Cardiology Picture Archiving and Communication System (Cardio-PACS) based on digital imaging technology. The first step in the conversion process is installation of a digital image acquisition system that is capable of generating high-quality DICOM-compatible images. The next three steps, which are the subject of this presentation, involve image display, distribution, and storage. Clinical requirements and associated cost considerations for these three steps are listed below: Image display: (1) Image quality equal to film, with DICOM format, lossless compression, image processing, desktop PC-based with color monitor, and physician-friendly imaging software; (2) Performance specifications include: acquire 30 frames/sec; replay 15 frames/sec; access to file server 5 seconds, and to archive 5 minutes; (3) Compatibility of image file, transmission, and processing formats; (4) Image manipulation: brightness, contrast, gray scale, zoom, biplane display, and quantification; (5) User-friendly control of image review. Image distribution: (1) Standard IP-based network between cardiac catheterization laboratories, file server, long-term archive, review stations, and remote sites; (2) Non-proprietary formats; (3) Bidirectional distribution. Image storage: (1) CD-ROM vs disk vs tape; (2) Verification of data integrity; (3) User-designated storage capacity for catheterization laboratory, file server, long-term archive. Costs: (1) Image acquisition equipment, file server, long-term archive; (2) Network infrastructure; (3) Review stations and software; (4) Maintenance and administration; (5) Future upgrades and expansion; (6) Personnel.
75 FR 18200 - Monroe Gas Storage Company, LLC; Notice of Compliance Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2010-04-09
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. RP09-447-004] Monroe Gas Storage Company, LLC; Notice of Compliance Filing April 1, 2010. Take notice that on March 23, 2010, Monroe Gas Storage Company, LLC (Monroe), submitted a compliance filing to comply with the February 18...
76 FR 28970 - Worsham-Steed Gas Storage, LLC; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-19
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR11-109-000] Worsham-Steed Gas Storage, LLC; Notice of Filing Take notice that on May 12, 2011, Worsham-Steed Gas Storage, LLC filed to update its address in its Statement of Operating Conditions as more fully described in the...
76 FR 30338 - Hill-Lake Gas Storage, LLC; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-25
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR11-110-000] Hill-Lake Gas Storage, LLC; Notice of Filing Take notice that on May 13, 2011, Hill-Lake Gas Storage, LLC filed to update its address and to clarify definitions for Maximum Daily Withdrawal Quantity and Maximum Daily...
77 FR 14514 - Bay Gas Storage, LLC: Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2012-03-12
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR12-18-000] Bay Gas Storage, LLC: Notice of Filing Take notice that on March 2, 2012, Bay Gas Storage, LLC filed pursuant to Section 12.2.4 of its Statement of Operating Conditions to revise its Company Use Percentage as more fully...
76 FR 13611 - Bay Gas Storage, LLC; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-14
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR11-91-000] Bay Gas Storage, LLC; Notice of Filing Take notice that on February 28, 2011, Bay Gas Storage, LLC (Bay Gas) filed pursuant to Section 12.2.4 of its Statement of Operating Conditions to revise its Company Use Percentage as...
78 FR 2982 - Steuben Gas Storage Company (Steuben); Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-15
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. AC13-14-000] Steuben Gas Storage Company (Steuben); Notice of Filing Take notice that on October 19, 2012, Steuben Gas Storage Company (Steuben) submitted a request for a waiver of the reporting requirement to file the FERC Form 2-A...
78 FR 16495 - Bay Gas Storage, LLC; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2013-03-15
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR13-39-000] Bay Gas Storage, LLC; Notice of Filing Take notice that on February 28, 2013, Bay Gas Storage, LLC filed pursuant to Section 12.2.4 of its Statement of Operating Conditions to revise its Company Use Percentage as more fully...
77 FR 36527 - Enstor Katy Storage and Transportation, L.P.; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2012-06-19
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR12-29-000] Enstor Katy Storage and Transportation, L.P.; Notice of Filing Take notice that on June 12, 2012, Enstor Katy Storage and Transportation, L.P. filed to revise its Statement of Operating Conditions to correct, update, and...
76 FR 48841 - Liberty Gas Storage, LLC; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2011-08-09
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. AC11-121-000] Liberty Gas Storage, LLC; Notice of Filing Take notice that on July 25, 2011, Liberty Gas Storage, LLC (Liberty) submitted a request for confirmation that it is not required to file FERC Form No. 2-A and will not be...
76 FR 47569 - Arcadia Gas Storage, LLC; Notice of Baseline Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2011-08-05
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket Nos. PR11-111-000; PR11-111-001] Arcadia Gas Storage, LLC; Notice of Baseline Filing Take notice that on May 19, 2011 and July 26, 2011, Arcadia Gas Storage, LLC submitted a revised baseline filing of their Statement of Operating Conditions...
75 FR 66077 - Bay Gas Storage Company Ltd.; Notice of Compliance Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-27
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR10-15-002] Bay Gas Storage Company Ltd.; Notice of Compliance Filing October 20, 2010. Take notice that on October 13, 2010, Bay Gas Storage Company Ltd. (Bay Gas) filed its Refund Report pursuant to its August 30, 2010 Settlement...
76 FR 25328 - Worsham-Steed Gas Storage, LLC; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-04
... Gas Storage, LLC; Notice of Filing Take notice that on April 27, 2011, Worsham-Steed Gas Storage, LLC... of Operating Conditions for Gas Storage and Transportation Services provided under section 311 of the... and to consolidate various firm and interruptible storage and transportation services into more...
Attaching IBM-compatible 3380 disks to Cray X-MP
DOE Office of Scientific and Technical Information (OSTI.GOV)
Engert, D.E.; Midlock, J.L.
1989-01-01
A method of attaching IBM-compatible 3380 disks directly to a Cray X-MP via the XIOP with a BMC is described. The IBM 3380 disks appear to the UNICOS operating system as DD-29 disks with UNICOS file systems. IBM 3380 disks provide cheap, reliable large capacity disk storage. Combined with a small number of high-speed Cray disks, the IBM disks provide for the bulk of the storage for small files and infrequently used files. Cray Research designed the BMC and its supporting software in the XIOP to allow IBM tapes and other devices to be attached to the X-MP. No hardware changes were necessary, and we added less than 2000 lines of code to the XIOP to accomplish this project. This system has been in operation for over eight months. Future enhancements, such as the use of a cache controller and attachment to a Y-MP, are also described.
INFOL for the CDC 6400 Information Storage and Retrieval System. Reference Manual.
ERIC Educational Resources Information Center
Mittman, B.; And Others
INFOL for the CDC 6400 is a rewrite in FORTRAN IV of the CDC 3600/3800 INFOL (Information Oriented Language), a generalized information storage and retrieval system developed by the Control Data Corporation for the CDC 3600/3800 computer. With INFOL, selected pieces of information are extracted from a file and presented to the user quickly and…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holden, N.E.
A short history of CSISRS, pronounced 'scissors' and standing for the Cross Section Information Storage and Retrieval System, is given. The relationship of CSISRS to CINDA, to the neutron nuclear data four-centers, to EXFOR and to ENDF, the evaluated neutron nuclear data file, is briefly explained.
DIRAC File Replica and Metadata Catalog
NASA Astrophysics Data System (ADS)
Tsaregorodtsev, A.; Poss, S.
2012-12-01
File replica and metadata catalogs are essential parts of any distributed data management system, largely determining its functionality and performance. A new File Catalog (DFC) was developed in the framework of the DIRAC Project that combines both replica and metadata catalog functionality. The DFC design is based on the practical experience with the data management system of the LHCb Collaboration. It is optimized for the most common patterns of catalog usage in order to achieve maximum performance from the user perspective. The DFC supports bulk operations for replica queries and allows quick analysis of the storage usage globally and for each Storage Element separately. It supports flexible ACL rules with plug-ins for various policies that can be adopted by a particular community. The DFC catalog allows various types of metadata associated with files and directories to be stored, and supports efficient queries for the data based on complex metadata combinations. Definition of file ancestor-descendent relation chains is also possible. The DFC catalog is implemented in the general DIRAC distributed computing framework following the standard grid security architecture. In this paper we describe the design of the DFC and its implementation details. The performance measurements are compared with other grid file catalog implementations. The experience with DFC Catalog usage in the CLIC detector project is discussed.
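Directory-level metadata with inheritance, one of the DFC features described here, can be sketched with plain dictionaries; the paths, fields, and query interface below are invented for illustration and do not match the DIRAC API:

    # A file's effective metadata is its own plus what it inherits from
    # ancestor directories; a query matches against the combined view.
    dir_meta = {'/lhcb/MC/2012': {'campaign': 'MC2012'}}
    file_meta = {'/lhcb/MC/2012/sim_001.dst': {'energy': 8000},
                 '/lhcb/MC/2012/sim_002.dst': {'energy': 7000}}

    def effective_meta(path):
        meta = {}
        for d, m in dir_meta.items():
            if path.startswith(d + '/'):
                meta.update(m)
        meta.update(file_meta.get(path, {}))
        return meta

    def query(**conditions):
        return [p for p in file_meta
                if all(effective_meta(p).get(k) == v
                       for k, v in conditions.items())]

    print(query(campaign='MC2012', energy=8000))
    # -> ['/lhcb/MC/2012/sim_001.dst']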
Computer Storage and Retrieval of Position-Dependent Data.
1982-06-01
This thesis covers the design of a new digital database system to replace the merged (observation and geographic location) record, one file per cruise... "The Digital Data Library System: Library Storage and Retrieval of Digital Geophysical Data" (Robert C. Groman) provided a relatively simple... position-dependent, 'geophysical' data. The system is operational on a Digital Equipment Corporation VAX-11/780 computer. Values of measured and computed...
Cloud archiving and data mining of High-Resolution Rapid Refresh forecast model output
NASA Astrophysics Data System (ADS)
Blaylock, Brian K.; Horel, John D.; Liston, Samuel T.
2017-12-01
Weather-related research often requires synthesizing vast amounts of data that need archival solutions that are both economical and viable during and past the lifetime of the project. Public cloud computing services (e.g., from Amazon, Microsoft, or Google) or private clouds managed by research institutions are providing object data storage systems potentially appropriate for long-term archives of such large geophysical data sets. We illustrate the use of a private cloud object store developed by the Center for High Performance Computing (CHPC) at the University of Utah. Since early 2015, we have been archiving thousands of two-dimensional gridded fields (each one containing over 1.9 million values over the contiguous United States) from the High-Resolution Rapid Refresh (HRRR) data assimilation and forecast modeling system. The archive is being used for retrospective analyses of meteorological conditions during high-impact weather events, assessing the accuracy of the HRRR forecasts, and providing initial and boundary conditions for research simulations. The archive is accessible interactively and through automated download procedures that researchers at other institutions can tailor to extract individual two-dimensional grids from within the highly compressed files. Characteristics of the CHPC object storage system are summarized relative to network file system storage or tape storage solutions. The CHPC storage system is proving to be a scalable, reliable, extensible, affordable, and usable archive solution for our research.
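Extracting an individual field from such an archive hinges on pairing each compressed GRIB2 file with a small inventory of byte offsets and issuing HTTP range requests; the sketch below follows the standard wgrib2 .idx layout, with placeholder URLs:

    # Pull one 2-D field (2-m temperature) from a remote GRIB2 file.
    import requests

    grib_url = 'https://archive.example.edu/hrrr/hrrr.t00z.wrfsfcf00.grib2'
    idx = requests.get(grib_url + '.idx', timeout=30).text.splitlines()
    records = [line.split(':') for line in idx]  # msg:offset:date:var:level:fcst

    for i, rec in enumerate(records):
        if rec[3] == 'TMP' and rec[4] == '2 m above ground':
            start = rec[1]
            end = str(int(records[i + 1][1]) - 1) if i + 1 < len(records) else ''
            r = requests.get(grib_url, timeout=30,
                             headers={'Range': f'bytes={start}-{end}'})
            with open('tmp_2m.grib2', 'wb') as f:
                f.write(r.content)  # decodable with pygrib or cfgrib
            break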
Grand challenges in mass storage: A system integrator's perspective
NASA Technical Reports Server (NTRS)
Mintz, Dan; Lee, Richard
1993-01-01
The grand challenges are the following: to develop more innovation in approach; to expand the I/O barrier; to achieve increased volumetric efficiency and incremental cost improvements; to reinforce the 'weakest link' software; to implement improved architectures; and to minimize the impact of self-destructing technologies. Mass storage is defined as any type of storage system exceeding 100 GBytes in total size, under the control of a centralized file management scheme. The topics covered are presented in viewgraph form.
A system for the input and storage of data in the Besm-6 digital computer
NASA Technical Reports Server (NTRS)
Schmidt, K.; Blenke, L.
1975-01-01
Computer programs used for the decoding and storage of large volumes of data on the BESM-6 computer are described. The following factors are discussed: the programming control language allows the programs to be run as part of a modular programming system used in data processing; data control is executed in a hierarchically built file on magnetic tape with sequential index storage; and the programs are not dependent on the structure of the data.
Scientific Data Storage for Cloud Computing
NASA Astrophysics Data System (ADS)
Readey, J.
2014-12-01
Traditionally data storage used for geophysical software systems has centered on file-based systems and libraries such as NetCDF and HDF5. In contrast cloud based infrastructure providers such as Amazon AWS, Microsoft Azure, and the Google Cloud Platform generally provide storage technologies based on an object based storage service (for large binary objects) complemented by a database service (for small objects that can be represented as key-value pairs). These systems have been shown to be highly scalable, reliable, and cost effective. We will discuss a proposed system that leverages these cloud-based storage technologies to provide an API-compatible library for traditional NetCDF and HDF5 applications. This system will enable cloud storage suitable for geophysical applications that can scale up to petabytes of data and thousands of users. We'll also cover other advantages of this system such as enhanced metadata search.
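One plausible shape for such a library is to split each array into chunks stored as objects and keep the small descriptive record alongside them; the boto3 sketch below is an assumed layout, not the proposed system's actual design:

    # Map an array's chunks to object-store keys (S3 or S3-compatible).
    # Bucket name and key scheme are illustrative.
    import json
    import boto3
    import numpy as np

    s3 = boto3.client('s3')
    BUCKET = 'example-science-data'

    def put_dataset(name, array, chunk_rows):
        meta = {'shape': array.shape, 'dtype': str(array.dtype),
                'chunk_rows': chunk_rows}
        s3.put_object(Bucket=BUCKET, Key=f'{name}/meta.json',
                      Body=json.dumps(meta))
        for i in range(0, array.shape[0], chunk_rows):
            s3.put_object(Bucket=BUCKET, Key=f'{name}/chunk_{i:08d}',
                          Body=array[i:i + chunk_rows].tobytes())

    put_dataset('sst/monthly', np.random.rand(720, 1440), chunk_rows=90)

Reads then fetch only the chunks that intersect a requested slice, which is what lets the object layout scale to many concurrent users.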
76 FR 52323 - Combined Notice of Filings; Filings Instituting Proceedings
Federal Register 2010, 2011, 2012, 2013, 2014
2011-08-22
.... Applicants: Young Gas Storage Company, Ltd. Description: Young Gas Storage Company, Ltd. submits tariff..., but intervention is necessary to become a party to the proceeding. The filings are accessible in the.... More detailed information relating to filing requirements, interventions, protests, and service can be...
77 FR 36527 - Enstor Grama Ridge Storage and Transportation, L.L.C.; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2012-06-19
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR12-28-000] Enstor Grama Ridge Storage and Transportation, L.L.C.; Notice of Filing Take notice that on June 11, 2012, Enstor Grama Ridge Storage and Transportation, L.L.C. filed to revise its Statement of Operating Conditions to...
75 FR 31429 - Bay Gas Storage Company, Ltd.; Notice of Compliance Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-03
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR08-17-002] Bay Gas Storage Company, Ltd.; Notice of Compliance Filing May 27, 2010. Take notice that on May 21, 2010, Bay Gas Storage Company, Ltd (Bay Gas) filed to comply with the April 15, 2010, Commission Order which directed Bay Gas to...
75 FR 63452 - ONEOK Gas Storage, L.L.C.; Notice of Baseline Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-15
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR11-67-001] ONEOK Gas Storage, L.L.C.; Notice of Baseline Filing October 7, 2010. Take notice that on October 1, 2010, ONEOK Gas Storage, L.L.C. submitted a revised baseline filing of its Statement of Operating Conditions for services...
Microcomputer Database Management Systems for Bibliographic Data.
ERIC Educational Resources Information Center
Pollard, Richard
1986-01-01
Discusses criteria for evaluating microcomputer database management systems (DBMS) used for storage and retrieval of bibliographic data. Two popular types of microcomputer DBMS--file management systems and relational database management systems--are evaluated with respect to these criteria. (Author/MBR)
Arctic Boreal Vulnerability Experiment (ABoVE) Science Cloud
NASA Astrophysics Data System (ADS)
Duffy, D.; Schnase, J. L.; McInerney, M.; Webster, W. P.; Sinno, S.; Thompson, J. H.; Griffith, P. C.; Hoy, E.; Carroll, M.
2014-12-01
The effects of climate change are being revealed at alarming rates in the Arctic and Boreal regions of the planet. NASA's Terrestrial Ecology Program has launched a major field campaign to study these effects over the next 5 to 8 years. The Arctic Boreal Vulnerability Experiment (ABoVE) will challenge scientists to take measurements in the field, study remote observations, and even run models to better understand the impacts of a rapidly changing climate for areas of Alaska and western Canada. The NASA Center for Climate Simulation (NCCS) at the Goddard Space Flight Center (GSFC) has partnered with the Terrestrial Ecology Program to create a science cloud designed for this field campaign - the ABoVE Science Cloud. The cloud combines traditional high performance computing with emerging technologies to create an environment specifically designed for large-scale climate analytics. The ABoVE Science Cloud utilizes (1) virtualized high-speed InfiniBand networks, (2) a combination of high-performance file systems and object storage, and (3) virtual system environments tailored for data intensive, science applications. At the center of the architecture is a large object storage environment, much like a traditional high-performance file system, that supports data proximal processing using technologies like MapReduce on a Hadoop Distributed File System (HDFS). Surrounding the storage is a cloud of high performance compute resources with many processing cores and large memory coupled to the storage through an InfiniBand network. Virtual systems can be tailored to a specific scientist and provisioned on the compute resources with extremely high-speed network connectivity to the storage and to other virtual systems. In this talk, we will present the architectural components of the science cloud and examples of how it is being used to meet the needs of the ABoVE campaign. In our experience, the science cloud approach significantly lowers the barriers and risks to organizations that require high performance computing solutions and provides the NCCS with the agility required to meet our customers' rapidly increasing and evolving requirements.
The Challenges Facing Science Data Archiving on Current Mass Storage Systems
NASA Technical Reports Server (NTRS)
Peavey, Bernard; Behnke, Jeanne (Editor)
1996-01-01
This paper discusses the desired characteristics of a tape-based petabyte science data archive and retrieval system required to store and distribute several terabytes (TB) of data per day over an extended period of time, probably more than 15 years, in support of programs such as the Earth Observing System Data and Information System (EOSDIS). These characteristics take into consideration not only cost effective and affordable storage capacity, but also rapid access to selected files, and reading rates that are needed to satisfy thousands of retrieval transactions per day. It seems that where rapid random access to files is not crucial, the tape medium, magnetic or optical, continues to offer cost effective data storage and retrieval solutions, and is likely to do so for many years to come. However, in environments like EOS these tape-based archive solutions provide less than full user satisfaction. Therefore, the objective of this paper is to describe the performance and operational enhancements that need to be made to the current tape-based archival systems in order to achieve greater acceptance by the EOS and similar user communities.
The Jade File System. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Rao, Herman Chung-Hwa
1991-01-01
File systems have long been the most important and most widely used form of shared permanent storage. File systems in traditional time-sharing systems, such as Unix, support a coherent sharing model for multiple users. Distributed file systems implement this sharing model in local area networks. However, most distributed file systems fail to scale from local area networks to an internet. Four characteristics of scalability were recognized: size, wide area, autonomy, and heterogeneity. Owing to size and wide area, techniques such as broadcasting, central control, and central resources, which are widely adopted by local area network file systems, are not adequate for an internet file system. An internet file system must also support the notion of autonomy because an internet is made up by a collection of independent organizations. Finally, heterogeneity is the nature of an internet file system, not only because of its size, but also because of the autonomy of the organizations in an internet. The Jade File System, which provides a uniform way to name and access files in the internet environment, is presented. Jade is a logical system that integrates a heterogeneous collection of existing file systems, where heterogeneous means that the underlying file systems support different file access protocols. Because of autonomy, Jade is designed under the restriction that the underlying file systems may not be modified. In order to avoid the complexity of maintaining an internet-wide, global name space, Jade permits each user to define a private name space. In Jade's design, we pay careful attention to avoiding unnecessary network messages between clients and file servers in order to achieve acceptable performance. Jade's name space supports two novel features: (1) it allows multiple file systems to be mounted under one directory; and (2) it permits one logical name space to mount other logical name spaces. A prototype of Jade was implemented to examine and validate its design. The prototype consists of interfaces to the Unix File System, the Sun Network File System, and the File Transfer Protocol.
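The private name space reduces to a per-user mount table resolved by longest-prefix match; the sketch below is a schematic reconstruction, with the underlying file systems reduced to labels:

    # Per-user logical name space: prefixes map to underlying file systems.
    mounts = {
        '/': ('ufs', '/home/alice/jade-root'),
        '/projects': ('nfs', 'fileserver:/export/projects'),
        '/mirrors': ('ftp', 'ftp.example.org:/pub'),
    }

    def resolve(logical_path):
        # Longest matching mounted prefix wins.
        best = max((p for p in mounts
                    if logical_path == p
                    or logical_path.startswith(p.rstrip('/') + '/')),
                   key=len)
        fs, root = mounts[best]
        return fs, root, logical_path[len(best):].lstrip('/')

    print(resolve('/mirrors/gnu/README'))
    # -> ('ftp', 'ftp.example.org:/pub', 'gnu/README')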
Federal Register 2010, 2011, 2012, 2013, 2014
2011-06-08
... storage, conveyor transfer points, bagging and bulk loading and unloading systems. These standards rely on... part shall maintain a file of these measurements, and retain the file for at least two years following... total time, effort, or financial resources expended by persons to generate, maintain, retain, or...
Optimising LAN access to grid enabled storage elements
NASA Astrophysics Data System (ADS)
Stewart, G. A.; Cowan, G. A.; Dunne, B.; Elwell, A.; Millar, A. P.
2008-07-01
When operational, the Large Hadron Collider experiments at CERN will collect tens of petabytes of physics data per year. The worldwide LHC computing grid (WLCG) will distribute this data to over two hundred Tier-1 and Tier-2 computing centres, enabling particle physicists around the globe to access the data for analysis. Although different middleware solutions exist for effective management of storage systems at collaborating institutes, the patterns of access envisaged for Tier-2s fall into two distinct categories. The first involves bulk transfer of data between different Grid storage elements using protocols such as GridFTP. This data movement will principally involve writing ESD and AOD files into Tier-2 storage. Secondly, once datasets are stored at a Tier-2, physics analysis jobs will read the data from the local SE. Such jobs require a POSIX-like interface to the storage so that individual physics events can be extracted. In this paper we consider the performance of POSIX-like access to files held in Disk Pool Manager (DPM) storage elements, a popular lightweight SRM storage manager from EGEE.
An introduction to the Marshall information retrieval and display system
NASA Technical Reports Server (NTRS)
1974-01-01
An on-line terminal oriented data storage and retrieval system is presented which allows a user to extract and process information from stored data bases. The use of on-line terminals for extracting and displaying data from the data bases provides a fast and responsive method for obtaining needed information. The system consists of general purpose computer programs that provide the overall capabilities of the total system. The system can process any number of data files via a Dictionary (one for each file) which describes the data format to the system. New files may be added to the system at any time, and reprogramming is not required. Illustrations of the system are shown, and sample inquiries and responses are given.
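The per-file Dictionary idea, a format description that lets the general-purpose programs read any data base without reprogramming, can be sketched with Python's struct module; the record layout below is invented for illustration:

    # A 'Dictionary' tells generic code how to decode one file's records.
    import struct

    dictionary = {
        'record_format': '>i10sf',  # id:int32, name:10 bytes, value:float32
        'fields': ('id', 'name', 'value'),
    }

    def read_records(path, d):
        size = struct.calcsize(d['record_format'])
        with open(path, 'rb') as f:
            while True:
                chunk = f.read(size)
                if len(chunk) < size:
                    return
                yield dict(zip(d['fields'],
                               struct.unpack(d['record_format'], chunk)))

Adding a new file to such a system means writing a new Dictionary, not new code, which is the property the abstract emphasizes.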
Data General Corporation Advanced Operating System/Virtual Storage (AOS/VS). Revision 7.60
1989-02-22
control list for each directory and data file. An access control list includes the users who can and cannot access files as well as the access... and any required data, it can operate asynchronously and in parallel... memory. The IOC can perform the data transfer without further intervention from the CPU. The I/O channels interface with the processor or system
Utilizing HDF4 File Content Maps for the Cloud
NASA Technical Reports Server (NTRS)
Lee, Hyokyung Joe
2016-01-01
We demonstrate a prototype study showing that HDF4 file content maps can be used to organize data efficiently in a cloud object storage system and so facilitate cloud computing. This approach can be extended to any binary data format and to any existing big data analytics solution powered by cloud computing, because the HDF4 file content map project started as a long-term preservation effort for NASA data that does not require the HDF4 APIs to access the data.
Evaluation of ZFS as an efficient WLCG storage backend
NASA Astrophysics Data System (ADS)
Ebert, M.; Washbrook, A.
2017-10-01
A ZFS-based software RAID system was tested for performance against a hardware RAID system providing storage based on the traditional Linux file systems XFS and EXT4. These tests were done for a healthy RAID array as well as for a degraded RAID array and during the rebuild of a RAID array. It was found that ZFS performs better in almost all test scenarios. In addition, distinct features of ZFS were tested for WLCG data storage use, such as compression and higher RAID levels with triple redundancy information. The long-term reliability was observed after converting all production storage servers at the Edinburgh WLCG Tier-2 site to ZFS, resulting in about 1.2 PB of ZFS-based storage at this site.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gibson, Garth
Petascale computing infrastructures for scientific discovery make petascale demands on information storage capacity, performance, concurrency, reliability, availability, and manageability. The Petascale Data Storage Institute focuses on the data storage problems found in petascale scientific computing environments, with special attention to community issues such as interoperability, community buy-in, and shared tools. The Petascale Data Storage Institute is a collaboration between researchers at Carnegie Mellon University, National Energy Research Scientific Computing Center, Pacific Northwest National Laboratory, Oak Ridge National Laboratory, Sandia National Laboratory, Los Alamos National Laboratory, University of Michigan, and the University of California at Santa Cruz. Because the Institute focuses on low-level file systems and storage systems, its role in improving SciDAC systems was one of supporting application middleware such as data management and system-level performance tuning. In retrospect, the Petascale Data Storage Institute's most innovative and impactful contribution is the Parallel Log-structured File System (PLFS). Published at SC09, PLFS is middleware that operates in MPI-IO or embedded in FUSE for non-MPI applications. Its function is to decouple concurrently written files into a per-process log file, whose impact (the contents of the single file that the parallel application was concurrently writing) is determined on later reading, rather than during its writing. PLFS is transparent to the parallel application, offering a POSIX or MPI-IO interface, and it shows an order of magnitude speedup on the Chombo benchmark and two orders of magnitude on the FLASH benchmark. Moreover, LANL production applications see speedups of 5X to 28X, so PLFS has been put into production at LANL. Originally conceived and prototyped in a PDSI collaboration between LANL and CMU, it has grown to engage many other PDSI institutes and international partners like AWE, and has a large team at EMC supporting and enhancing it. PLFS is open-sourced under a BSD license on SourceForge. Post-PDSI funding comes from NNSA and industry sources. Moreover, PLFS has spun out half a dozen or more papers, partnered on research with multiple schools and vendors, and has projects to transparently 1) distribute metadata over independent metadata servers, 2) exploit drastically non-POSIX Hadoop storage for HPC POSIX applications, 3) compress checkpoints on the fly, 4) batch delayed writes for write speed, 5) compress read-back indexes and parallelize their redistribution, 6) double-buffer writes in NAND flash storage to decouple host blocking during checkpoint from disk write time in the storage system, and 7) pack small files into a smaller number of bigger containers. There are two large-scale open-source Linux software projects that PDSI significantly incubated, though neither was initiated in PDSI. These are 1) Ceph, a UCSC parallel object storage research project that has continued to be a vehicle for research and has become a released part of Linux, and 2) Parallel NFS (pNFS), a portion of the IETF's NFSv4.1 that brings the core data parallelism found in Lustre, PanFS, PVFS, and Ceph to the industry-standard NFS, with released code in Linux 3.0 and vendor offerings with products from NetApp, EMC, BlueArc, and RedHat. Both are fundamentally supported and advanced by vendor companies now, but were critically transferred from research demonstration to viable product with funding, in part, from PDSI.
At this point Lustre remains the primary path to scalable IO in exascale systems, but both Ceph and pNFS are viable alternatives with different fundamental advantages. Finally, research community building was a big success for PDSI. Through the HECFSIO workshops and the HECURA project with NSF, PDSI stimulated and helped to steer leveraged funding of over $25M. Through the Petascale (now Parallel) Data Storage Workshop series, www.pdsw.org, colocated with SCxy each year, PDSI created and incubated five offerings of this high-attendance workshop. The workshop has gone on without PDSI support with two more highly successful workshops, rewriting its organizational structure to be community managed. More than 70 peer-reviewed papers have been presented at PDSW workshops.
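To make the log-structured decoupling concrete, here is a minimal sketch of the PLFS idea (illustrative names, not the real PLFS API): each writer rank appends to a private log, an index records where each logical extent landed, and reads replay the index so that later writes win.

```python
# Minimal sketch of the PLFS idea: N writer ranks share one logical file, but
# each rank appends to a private log; an index maps logical extents to log
# extents. Names are illustrative, not the real PLFS API.
import os

class LogStructuredContainer:
    def __init__(self, container_dir):
        self.dir = container_dir
        os.makedirs(container_dir, exist_ok=True)
        self.index = []  # (logical_off, length, log_path, log_off), write order

    def write(self, rank, logical_off, data):
        """Each rank appends to its own log: no cross-process contention."""
        log_path = os.path.join(self.dir, f"data.{rank}")
        with open(log_path, "ab") as log:
            log.seek(0, os.SEEK_END)
            log_off = log.tell()
            log.write(data)
        self.index.append((logical_off, len(data), log_path, log_off))

    def read(self, logical_off, length):
        """Resolve a logical extent at read time; later writes win."""
        out = bytearray(length)
        for off, ln, path, loff in self.index:  # replay in write order
            lo, hi = max(off, logical_off), min(off + ln, logical_off + length)
            if lo < hi:
                with open(path, "rb") as log:
                    log.seek(loff + (lo - off))
                    out[lo - logical_off:hi - logical_off] = log.read(hi - lo)
        return bytes(out)
```

Because each process appends to its own log, the shared-file write contention that throttles N-to-1 checkpoints disappears; the cost is deferred to read time, exactly the trade described above.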
75 FR 50757 - Combined Notice of Filings No. 1
Federal Register 2010, 2011, 2012, 2013, 2014
2010-08-17
... Refund Report filings: Docket Numbers: RP10-1046-000. Applicants: Young Gas Storage Company, Ltd. Description: Young Gas Storage Company, Ltd. submits tariff filing per 154.203: Baseline to be effective 8/3... action to be taken, but will not serve to make protestants parties to the proceeding. Anyone filing a...
75 FR 61464 - Washington 10 Storage Corporation; Notice of Compliance Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-05
... Storage Corporation; Notice of Compliance Filing September 24, 2010. Take notice that on September 22, 2010, Washington 10 Storage Corporation, in compliance with the Commission's September 17, 2010 Letter... an effective date of June 18, 2010. \\1\\ See Washington 10 Storage Corporation, Docket No. PR10-37...
Extending DIRAC File Management with Erasure-Coding for efficient storage.
NASA Astrophysics Data System (ADS)
Cadellin Skipsey, Samuel; Todev, Paulin; Britton, David; Crooks, David; Roy, Gareth
2015-12-01
The state of the art in Grid-style data management is to achieve increased resilience of data via multiple complete replicas of data files across multiple storage endpoints. While this is effective, it is not the most space-efficient approach to resilience, especially when the reliability of individual storage endpoints is sufficiently high that only a few will be inactive at any point in time. We report on work performed as part of GridPP [1], extending the DIRAC File Catalogue and file management interface to allow the placement of erasure-coded files: each file is distributed as N identically-sized chunks of data striped across a vector of storage endpoints, encoded such that any M chunks can be lost and the original file still reconstructed. The tools developed are transparent to the user and, as well as allowing uploading and downloading of data to Grid storage, also provide the possibility of parallelising access across all of the distributed chunks at once, improving data transfer and IO performance. We expect this approach to be of most interest to smaller VOs, who have tighter bounds on the storage available to them, but larger (WLCG) VOs may be interested as their total data increases during Run 2. We provide an analysis of the costs and benefits of the approach, along with future development and implementation plans in this area. In general, overheads for multiple file transfers present the largest issue for the competitiveness of this approach at present.
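As a worked example of the space argument, the sketch below implements the simplest possible erasure code, a single XOR parity (the M = 1 case): a file is split into N-1 data chunks plus one parity chunk, and any one lost chunk can be rebuilt from the others. The GridPP work would use a more general code (e.g. Reed-Solomon) to tolerate M > 1 losses; the function names here are illustrative.

```python
# Single-parity erasure coding sketch: tolerates the loss of any ONE chunk.
def encode(data: bytes, n: int):
    """Split data into n-1 equal data chunks plus one XOR parity chunk."""
    k = n - 1
    size = -(-len(data) // k)  # ceiling division
    chunks = [bytearray(data[i * size:(i + 1) * size].ljust(size, b"\0"))
              for i in range(k)]
    parity = bytearray(size)
    for chunk in chunks:
        for i, b in enumerate(chunk):
            parity[i] ^= b
    return chunks + [parity]

def reconstruct(chunks, lost: int):
    """Rebuild chunk `lost` (data or parity) as the XOR of all the others;
    chunks[lost] itself is ignored and may be None."""
    size = len(next(c for i, c in enumerate(chunks) if i != lost))
    rebuilt = bytearray(size)
    for i, chunk in enumerate(chunks):
        if i != lost:
            for j, b in enumerate(chunk):
                rebuilt[j] ^= b
    return rebuilt
```

With N = 5 the overhead is 25% of the file size, against 100% for one full replica offering the same single-failure tolerance, which is the space-efficiency point made above.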
An Improved B+ Tree for Flash File Systems
NASA Astrophysics Data System (ADS)
Havasi, Ferenc
Nowadays mobile devices such as mobile phones, MP3 players and PDAs are becoming ever more common. Most of them use flash chips as storage. To store data efficiently on flash, it is necessary to adapt ordinary file systems because they are designed for use on hard disks. Most file systems use some kind of search tree to store index information, which is very important from a performance aspect. Here we improved the B+ search tree algorithm so as to make flash devices more efficient. Our implementation of this solution saves 98%-99% of the flash operations, and is now part of the Linux kernel.
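The paper's exact algorithm is not reproduced here, but the core principle behind savings of that magnitude can be sketched as write buffering: absorb many index updates in RAM and write each dirty node to flash once per flush rather than once per update. A minimal illustration, with hypothetical names:

```python
# Write-buffering sketch: absorb index updates in RAM, then write each dirty
# node to flash once per flush instead of once per update. Hypothetical names.
class BufferedIndex:
    def __init__(self, flash_write):
        self.flash_write = flash_write  # callable(node_id, entries) -> None
        self.dirty = {}                 # node_id -> accumulated key/value pairs

    def insert(self, node_id, key, value):
        self.dirty.setdefault(node_id, {})[key] = value  # RAM only, no flash op

    def flush(self):
        for node_id, entries in self.dirty.items():
            self.flash_write(node_id, entries)  # one flash operation per node
        self.dirty.clear()
```

A thousand inserts touching the same handful of hot nodes then cost a handful of flash writes instead of a thousand, which is the spirit of the reported reduction.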
DOE Office of Scientific and Technical Information (OSTI.GOV)
Orrell, S.; Ralstin, S.
1992-04-01
Many computer security plans specify that only a small percentage of the data processed will be classified. Thus, the bulk of the data on secure systems must be unclassified. Secure limited access sites operating approved classified computing systems sometimes also have a system ostensibly containing only unclassified files but operating within the secure environment. That system could be networked or otherwise connected to a classified system(s) in order that both be able to use common resources for file storage or computing power. Such a system must operate under the same rules as the secure classified systems. It is in the nature of unclassified files that they either came from, or will eventually migrate to, a non-secure system. Today, unclassified files are exported from systems within the secure environment typically by loading transport media and carrying them to an open system. Import of unclassified files is handled similarly. This media transport process, sometimes referred to as sneaker net, often is manually logged and controlled only by administrative procedures. A comprehensive system for secure bi-directional transfer of unclassified files between secure and open environments has yet to be developed. Any such secure file transport system should be required to meet several stringent criteria. It is the purpose of this document to begin a definition of these criteria.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brooks, Kriston P.; Sprik, Samuel J.; Tamburello, David A.
The U.S. Department of Energy (DOE) has developed a vehicle framework model to simulate fuel cell-based light-duty vehicle operation for various hydrogen storage systems. This transient model simulates the performance of the storage system, fuel cell, and vehicle for comparison to DOE's Technical Targets using four drive cycles/profiles. Chemical hydrogen storage models have been developed for the Framework model for both exothermic and endothermic materials. Despite the utility of such models, they require that material researchers input system design specifications that cannot be easily estimated. To address this challenge, a design tool has been developed that allows researchers to directly enter kinetic and thermodynamic chemical hydrogen storage material properties into a simple sizing module that then estimates the system parameters required to run the storage system model. Additionally, this design tool can be used as a standalone executable file to estimate the storage system mass and volume outside of the framework model and compare it to the DOE Technical Targets. These models will be explained and exercised with existing hydrogen storage materials.
Experience in running relational databases on clustered storage
NASA Astrophysics Data System (ADS)
Gaspar Aparicio, Ruben; Potocky, Miroslav
2015-12-01
For the past eight years, the CERN IT Database group has based its backend storage on NAS (Network-Attached Storage) architecture, providing database access via the NFS (Network File System) protocol. In the last two and a half years, our storage has evolved from a scale-up architecture to a scale-out one. This paper describes our setup and a set of functionalities providing key features to other services like Database on Demand [1] or the CERN Oracle backup and recovery service. It also outlines a possible evolution path that storage for databases could follow.
A design for a new catalog manager and associated file management for the Land Analysis System (LAS)
NASA Technical Reports Server (NTRS)
Greenhagen, Cheryl
1986-01-01
Due to the large number of different types of files used in an image processing system, a mechanism for file management beyond the bounds of typical operating systems is necessary. The Transportable Applications Executive (TAE) Catalog Manager was written to meet this need. Land Analysis System (LAS) users at the EROS Data Center (EDC) encountered some problems in using the TAE catalog manager, including catalog corruption, networking difficulties, and lack of a reliable tape storage and retrieval capability. These problems, coupled with the complexity of the TAE catalog manager, led to the decision to design a new file management system for LAS, tailored to the needs of the EDC user community. This design effort, which addressed catalog management, label services, associated data management, and enhancements to LAS applications, is described. The new file management design will provide many benefits including improved system integration, increased flexibility, enhanced reliability, enhanced portability, improved performance, and improved maintainability.
The mass storage testing laboratory at GSFC
NASA Technical Reports Server (NTRS)
Venkataraman, Ravi; Williams, Joel; Michaud, David; Gu, Heng; Kalluri, Atri; Hariharan, P. C.; Kobler, Ben; Behnke, Jeanne; Peavey, Bernard
1998-01-01
Industry-wide benchmarks exist for measuring the performance of processors (SPECmarks), and of database systems (Transaction Processing Council). Despite storage having become the dominant item in computing and IT (Information Technology) budgets, no such common benchmark is available in the mass storage field. Vendors and consultants provide services and tools for capacity planning and sizing, but these do not account for the complete set of metrics needed in today's archives. The availability of automated tape libraries, high-capacity RAID systems, and high-bandwidth interconnectivity between processor and peripherals has led to demands for services which traditional file systems cannot provide. File Storage and Management Systems (FSMS), which began to be marketed in the late 80's, have helped to some extent with large tape libraries, but their use has introduced additional parameters affecting performance. The aim of the Mass Storage Test Laboratory (MSTL) at Goddard Space Flight Center is to develop a test suite that includes not only a comprehensive check list to document a mass storage environment but also benchmark code. Benchmark code is being tested which will provide measurements for both baseline systems, i.e. applications interacting with peripherals through the operating system services, and for combinations involving an FSMS. The benchmarks are written in C, and are easily portable. They are initially being aimed at the UNIX Open Systems world. Measurements are being made using a Sun Ultra 170 Sparc with 256MB memory running Solaris 2.5.1 with the following configuration: 4mm tape stacker on SCSI 2 Fast/Wide; 4GB disk device on SCSI 2 Fast/Wide; and Sony Petaserve on Fast/Wide differential SCSI 2.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-11-29
... Transportation, LLC, Enstor Katy Storage and Transportation, LP, et al.; Notice of Baseline Filings November 18... a revised baseline filing of their Statement of Operating Conditions for services provided under...
Technology for national asset storage systems
NASA Technical Reports Server (NTRS)
Coyne, Robert A.; Hulen, Harry; Watson, Richard
1993-01-01
An industry-led collaborative project, called the National Storage Laboratory, was organized to investigate technology for storage systems that will be the future repositories for our national information assets. Industry participants are IBM Federal Systems Company, Ampex Recording Systems Corporation, General Atomics DISCOS Division, IBM ADSTAR, Maximum Strategy Corporation, Network Systems Corporation, and Zitel Corporation. Industry members of the collaborative project are funding their own participation. Lawrence Livermore National Laboratory through its National Energy Research Supercomputer Center (NERSC) will participate in the project as the operational site and the provider of applications. The expected result is an evaluation of a high performance storage architecture assembled from commercially available hardware and software, with some software enhancements to meet the project's goals. It is anticipated that the integrated testbed system will represent a significant advance in the technology for distributed storage systems capable of handling gigabyte class files at gigabit-per-second data rates. The National Storage Laboratory was officially launched on 27 May 1992.
National Storage Laboratory: a collaborative research project
NASA Astrophysics Data System (ADS)
Coyne, Robert A.; Hulen, Harry; Watson, Richard W.
1993-01-01
The grand challenges of science and industry that are driving computing and communications have created corresponding challenges in information storage and retrieval. An industry-led collaborative project has been organized to investigate technology for storage systems that will be the future repositories of national information assets. Industry participants are IBM Federal Systems Company, Ampex Recording Systems Corporation, General Atomics DISCOS Division, IBM ADSTAR, Maximum Strategy Corporation, Network Systems Corporation, and Zitel Corporation. Industry members of the collaborative project are funding their own participation. Lawrence Livermore National Laboratory through its National Energy Research Supercomputer Center (NERSC) will participate in the project as the operational site and provider of applications. The expected result is the creation of a National Storage Laboratory to serve as a prototype and demonstration facility. It is expected that this prototype will represent a significant advance in the technology for distributed storage systems capable of handling gigabyte-class files at gigabit-per-second data rates. Specifically, the collaboration expects to make significant advances in hardware, software, and systems technology in four areas of need: (1) network-attached high-performance storage; (2) multiple, dynamic, distributed storage hierarchies; (3) layered access to storage system services; and (4) storage system management.
78 FR 5175 - Combined Notice of Filings #1
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-24
...: Energy Storage Holdings, LLC. Description: Energy Storage Holdings, LLC to be effective 1/12/ 2013. Filed... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission Combined Notice of Filings 1 Take notice.... Applicants: Puget Sound Energy, Inc. Description: Application for Authorization for Disposition of...
NASA Astrophysics Data System (ADS)
Baru, Chaitan; Nandigam, Viswanath; Krishnan, Sriram
2010-05-01
Increasingly, the geoscience user community expects modern IT capabilities to be available in service of their research and education activities, including the ability to easily access and process large remote sensing datasets via online portals such as GEON (www.geongrid.org) and OpenTopography (opentopography.org). However, serving such datasets via online data portals presents a number of challenges. In this talk, we will evaluate the pros and cons of alternative storage strategies for management and processing of such datasets using binary large object implementations (BLOBs) in database systems versus implementation in Hadoop files using the Hadoop Distributed File System (HDFS). The storage and I/O requirements for providing online access to large datasets dictate the need for declustering data across multiple disks, for capacity as well as bandwidth and response time performance. This requires partitioning larger files into a set of smaller files, and is accompanied by the concomitant requirement for managing large numbers of files. Storing these sub-files as blobs in a shared-nothing database implemented across a cluster provides the advantage that all the distributed storage management is done by the DBMS. Furthermore, subsetting and processing routines can be implemented as user-defined functions (UDFs) on these blobs and would run in parallel across the set of nodes in the cluster. On the other hand, there are both storage overheads and constraints, and software licensing dependencies created by such an implementation. Another approach is to store the files in an external filesystem with pointers to them from within database tables. The filesystem may be a regular UNIX filesystem, a parallel filesystem, or HDFS. In the HDFS case, HDFS would provide the file management capability, while the subsetting and processing routines would be implemented as Hadoop programs using the MapReduce model. Hadoop and its related software libraries are freely available. Another consideration is the strategy used for partitioning large data collections, and large datasets within collections, using round-robin vs. hash vs. range partitioning methods. Each has different characteristics in terms of spatial locality of data and the resultant degree of declustering of the computations on the data. Furthermore, we have observed that, in practice, there can be large variations in the frequency of access to different parts of a large data collection and/or dataset, thereby creating "hotspots" in the data. We will evaluate the ability of different approaches to deal effectively with such hotspots, along with alternative mitigation strategies.
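The three partitioning strategies compared in the talk can be summarised in a few lines; these helpers are illustrative, not the portals' actual code. Round-robin and hash give an even spread but destroy spatial locality, while range partitioning preserves locality at the price of hotspot sensitivity.

```python
# Illustrative partitioners for declustering a large dataset across n nodes.
import bisect
import hashlib

def round_robin(chunk_no: int, n_nodes: int) -> int:
    """Even spread, no spatial locality: chunk i goes to node i mod n."""
    return chunk_no % n_nodes

def hash_partition(key: str, n_nodes: int) -> int:
    """Statistically even spread; destroys range locality, resists hotspots."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % n_nodes

def range_partition(coord: float, boundaries: list) -> int:
    """Preserves spatial locality (good for subsetting) but a popular region
    maps to a single node, which is exactly how hotspots arise."""
    return bisect.bisect_right(boundaries, coord)
```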
Characterizing output bottlenecks in a supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xie, Bing; Chase, Jeffrey; Dillow, David A
2012-01-01
Supercomputer I/O loads are often dominated by writes. HPC (High Performance Computing) file systems are designed to absorb these bursty outputs at high bandwidth through massive parallelism. However, the delivered write bandwidth often falls well below the peak. This paper characterizes the data absorption behavior of a center-wide shared Lustre parallel file system on the Jaguar supercomputer. We use a statistical methodology to address the challenges of accurately measuring a shared machine under production load and to obtain the distribution of bandwidth across samples of compute nodes, storage targets, and time intervals. We observe and quantify limitations from competing traffic, contention on storage servers and I/O routers, concurrency limitations in the client compute node operating systems, and the impact of variance (stragglers) on coupled output such as striping. We then examine the implications of our results for application performance and the design of I/O middleware systems on shared supercomputers.
SSeCloud: Using secret sharing scheme to secure keys
NASA Astrophysics Data System (ADS)
Hu, Liang; Huang, Yang; Yang, Disheng; Zhang, Yuzhen; Liu, Hengchang
2017-08-01
With the use of cloud storage services, one of the concerns is how to protect sensitive data securely and privately. While users enjoy the convenience of data storage provided by semi-trusted cloud storage providers, they are confronted with all kinds of risks at the same time. In this paper, we present SSeCloud, a secure cloud storage system that improves security and usability by applying a secret sharing scheme to secure keys. The system encrypts uploaded files on the client side and splits the encryption keys into three shares. Each share is respectively held by the user, the cloud storage provider, and an alternative trusted third party. Any two of the parties can reconstruct the keys. Evaluation results of the prototype system show that SSeCloud provides high security without too much performance penalty.
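A minimal sketch of such 2-of-3 key splitting, using Shamir's scheme over a prime field (the paper does not publish its exact parameters, so the prime and share points here are assumptions): a degree-1 polynomial hides the key, and any two shares interpolate it back.

```python
import secrets

P = 2**521 - 1  # a Mersenne prime, comfortably larger than a 256-bit key

def split_2_of_3(key: bytes):
    """Degree-1 Shamir sharing: shares are points on a random line whose
    intercept is the key; any two points determine the line."""
    s = int.from_bytes(key, "big")
    a1 = secrets.randbelow(P)
    return [(x, (s + a1 * x) % P) for x in (1, 2, 3)]

def recover(share_a, share_b, key_len: int) -> bytes:
    """Lagrange interpolation at x = 0 from any two distinct shares."""
    (x1, y1), (x2, y2) = share_a, share_b
    s = (y1 * x2 - y2 * x1) * pow(x2 - x1, -1, P) % P  # needs Python >= 3.8
    return s.to_bytes(key_len, "big")
```

For a 32-byte key, recover(shares[0], shares[2], 32) returns the same bytes as recover(shares[1], shares[0], 32), while any single share alone reveals nothing about the key.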
I/O performance evaluation of a Linux-based network-attached storage device
NASA Astrophysics Data System (ADS)
Sun, Zhaoyan; Dong, Yonggui; Wu, Jinglian; Jia, Huibo; Feng, Guanping
2002-09-01
In a Local Area Network (LAN), clients are permitted to access the files on high-density optical disks via a network server. But the quality of read service offered by a conventional server is unsatisfactory, because the server performs multiple functions and serves too many callers. This paper develops a Linux-based Network-Attached Storage (NAS) server. The Operating System (OS), composed of an optimized kernel and a miniaturized file system, is stored in a flash memory. After initialization, the NAS device is connected to the LAN. The administrator and users can configure and access the server through web pages, respectively. In order to enhance the quality of access, the management of the buffer cache in the file system is optimized. Several benchmark programs were run to evaluate the I/O performance of the NAS device. Since data recorded on optical disks are usually for read access, our attention is focused on the read throughput of the device. The experimental results indicate that the I/O performance of our NAS device is excellent.
Utilizing ORACLE tools within Unix
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ferguson, R.
1995-07-01
Large databases, by their very nature, often serve as repositories of data which may be needed by other systems. The transmission of this data to other systems has in the past involved several layers of human intervention. The Integrated Cargo Data Base (ICDB) developed by Martin Marietta Energy Systems for the Military Traffic Management Command as part of the Worldwide Port System provides data integration and worldwide tracking of cargo that passes through common-user ocean cargo ports. One of the key functions of ICDB is data distribution of a variety of data files to a number of other systems. Development of automated data distribution procedures had to deal with the following constraints: (1) variable generation time for data files, (2) use of only current data for data files, (3) use of a minimum number of select statements, (4) creation of unique data files for multiple recipients, (5) automatic transmission of data files to recipients, and (6) avoidance of extensive and long-term data storage.
Cloud Engineering Principles and Technology Enablers for Medical Image Processing-as-a-Service
Bao, Shunxing; Plassard, Andrew J.; Landman, Bennett A.; Gokhale, Aniruddha
2017-01-01
Traditional in-house, laboratory-based medical imaging studies use hierarchical data structures (e.g., NFS file stores) or databases (e.g., COINS, XNAT) for storage and retrieval. The resulting performance from these approaches is, however, impeded by standard network switches since they can saturate network bandwidth during transfer from storage to processing nodes for even moderate-sized studies. To that end, a cloud-based "medical image processing-as-a-service" offers promise in utilizing the ecosystem of Apache Hadoop, which is a flexible framework providing distributed, scalable, fault-tolerant storage and parallel computational modules, and HBase, which is a NoSQL database built atop Hadoop's distributed file system. Despite this promise, HBase's load distribution strategy of region split and merge is detrimental to the hierarchical organization of imaging data (e.g., project, subject, session, scan, slice). This paper makes two contributions to address these concerns by describing key cloud engineering principles and technology enhancements we made to the Apache Hadoop ecosystem for medical imaging applications. First, we propose a row-key design for HBase, which is a necessary step that is driven by the hierarchical organization of imaging data. Second, we propose a novel data allocation policy within HBase to strongly enforce collocation of hierarchically related imaging data. The proposed enhancements accelerate data processing by minimizing network usage and localizing processing to machines where the data already exist. Moreover, our approach is amenable to the traditional scan, subject, and project-level analysis procedures, and is compatible with standard command line/scriptable image processing software. Experimental results for an illustrative sample of imaging data reveal that our new HBase policy results in a three-fold time improvement in conversion of classic DICOM to NIfTI file formats when compared with the default HBase region split policy, and nearly a six-fold improvement over a commonly available network file system (NFS) approach even for relatively small file sets. Moreover, file access latency is lower than that of network-attached storage. PMID:28884169
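A sketch of the kind of hierarchical row-key design the paper argues for (field names and widths are illustrative): fixed-width, zero-padded components make HBase's lexicographic sort collocate all rows below any prefix of the hierarchy.

```python
def row_key(project: str, subject: int, session: int, scan: int,
            slice_no: int) -> bytes:
    """Fixed-width, hierarchy-ordered row key: HBase sorts rows
    lexicographically, so zero-padded components keep every row under a
    project/subject/session/scan prefix physically adjacent."""
    return ("{}-{:06d}-{:04d}-{:03d}-{:05d}"
            .format(project.ljust(8, "_")[:8], subject, session, scan,
                    slice_no)).encode()
```

A prefix scan on project and subject then retrieves every session, scan, and slice beneath them as one contiguous key range, which is what lets processing stay local to the machines holding the data.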
75 FR 62374 - Combined Notice of Filings #1
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-08
... Energy Storage, LLC. Description: AES Energy Storage, LLC submits tariff filing per 35.12: AES Energy... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission Combined Notice of Filings 1 September... Numbers: EC10-102-000. Applicants: NRG Energy, Inc, NRG Retail Acquisition Inc., Green Mountain Energy...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Chase Qishi
A number of Department of Energy (DOE) science applications, involving exascale computing systems and large experimental facilities, are expected to generate large volumes of data, in the range of petabytes to exabytes, which will be transported over wide-area networks for the purpose of storage, visualization, and analysis. To support such capabilities, significant progress has been made in various components including the deployment of 100 Gbps networks with future 1 Tbps bandwidth, increases in end-host capabilities with multiple cores and buses, capacity improvements in large disk arrays, and deployment of parallel file systems such as Lustre and GPFS. High-performance source-to-sink data flows must be composed of these component systems, which requires significant optimizations of the storage-to-host data and execution paths to match the edge and long-haul network connections. In particular, end systems are currently supported by 10-40 Gbps Network Interface Cards (NIC) and 8-32 Gbps storage Host Channel Adapters (HCAs), which carry the individual flows that collectively must reach network speeds of 100 Gbps and higher. Indeed, such data flows must be synthesized using multicore, multibus hosts connected to high-performance storage systems on one side and to the network on the other side. Current experimental results show that the constituent flows must be optimally composed and preserved from storage systems, across the hosts and the networks with minimal interference. Furthermore, such a capability must be made available transparently to the science users without placing undue demands on them to account for the details of underlying systems and networks. And, this task is expected to become even more complex in the future due to the increasing sophistication of hosts, storage systems, and networks that constitute the high-performance flows. The objectives of this proposal are to (1) develop and test the component technologies and their synthesis methods to achieve source-to-sink high-performance flows, and (2) develop tools that provide these capabilities through simple interfaces to users and applications. In terms of the former, we propose to develop (1) optimization methods that align and transition multiple storage flows to multiple network flows on multicore, multibus hosts; and (2) edge and long-haul network path realization and maintenance using advanced provisioning methods including OSCARS and OpenFlow. We also propose synthesis methods that combine these individual technologies to compose high-performance flows using a collection of constituent storage-network flows, and realize them across the storage and local network connections as well as long-haul connections. We propose to develop automated user tools that profile the hosts, storage systems, and network connections; compose the source-to-sink complex flows; and set up and maintain the needed network connections. These solutions will be tested using (1) 100 Gbps connection(s) between Oak Ridge National Laboratory (ORNL) and Argonne National Laboratory (ANL) with storage systems supported by Lustre and GPFS file systems with an asymmetric connection to University of Memphis (UM); (2) ORNL testbed with multicore and multibus hosts, switches with OpenFlow capabilities, and network emulators; and (3) 100 Gbps connections from ESnet and their Openflow testbed, and other experimental connections. This proposal brings together the expertise and facilities of the two national laboratories, ORNL and ANL, and UM.
It also represents a collaboration between DOE and the Department of Defense (DOD) projects at ORNL by sharing technical expertise and personnel costs, and leveraging the existing DOD Extreme Scale Systems Center (ESSC) facilities at ORNL.
Terascale Cluster for Advanced Turbulent Combustion Simulations
2008-07-25
We have given the name CATS (for Combustion And Turbulence Simulator) to the terascale system that was obtained through this grant. CATS ... InfiniBand interconnect. CATS includes an interactive login node and a file server, each holding in excess of 1 terabyte of file storage. The 35 active... compute nodes of CATS enable us to run up to 140-core parallel MPI batch jobs; one node is reserved to run the scheduler. CATS is operated and
Federal Register 2010, 2011, 2012, 2013, 2014
2010-03-10
... LNG storage tanks; A closed-loop shell and tube heat exchanger vaporization system; Various ancillary..., there are three methods you can use to submit your comments to the Commission. In all instances please... encourages electronic filing of comments and has dedicated eFiling expert staff available to assist you at...
A high performance hierarchical storage management system for the Canadian tier-1 centre at TRIUMF
NASA Astrophysics Data System (ADS)
Deatrich, D. C.; Liu, S. X.; Tafirout, R.
2010-04-01
We describe in this paper the design and implementation of Tapeguy, a high performance non-proprietary Hierarchical Storage Management (HSM) system which is interfaced to dCache for efficient tertiary storage operations. The system has been successfully implemented at the Canadian Tier-1 Centre at TRIUMF. The ATLAS experiment will collect a large amount of data (approximately 3.5 Petabytes each year). An efficient HSM system will play a crucial role in the success of the ATLAS Computing Model, which is driven by intensive large-scale data analysis activities that will be performed on the Worldwide LHC Computing Grid infrastructure continuously. Tapeguy is Perl-based. It controls and manages data and tape libraries. Its architecture is scalable and includes Dataset Writing control, a Read-back Queuing mechanism and I/O tape drive load balancing, as well as on-demand allocation of resources. A central MySQL database records metadata information for every file and transaction (for audit and performance evaluation), as well as an inventory of library elements. Tapeguy Dataset Writing was implemented to group files which are close in time and of similar type. Optional dataset path control dynamically allocates tape families and assigns tapes to them. Tape flushing is based on various strategies: time, threshold, or external callback mechanisms. Tapeguy Read-back Queuing reorders all read requests by using an elevator algorithm, avoiding unnecessary tape loading and unloading. Implementation of priorities will guarantee file delivery to all clients in a timely manner.
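The read-back reordering can be pictured with a small sketch (not Tapeguy's Perl, and simplified to a single sweep direction): group pending requests by tape and read each tape monotonically, so each cartridge is mounted once instead of being loaded and unloaded per request.

```python
from collections import defaultdict

def elevator_order(requests):
    """Reorder pending reads so each tape is mounted once and swept in one
    direction. `requests` is a list of (tape_id, position, file_name)."""
    by_tape = defaultdict(list)
    for tape, pos, name in requests:
        by_tape[tape].append((pos, name))
    ordered = []
    for tape in sorted(by_tape):                 # one mount per tape
        for pos, name in sorted(by_tape[tape]):  # monotonic sweep, no back-seek
            ordered.append((tape, pos, name))
    return ordered
```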
New capabilities in the HENP grand challenge storage access system and its application at RHIC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bernardo, L.; Gibbard, B.; Malon, D.
2000-04-25
The High Energy and Nuclear Physics Data Access Grand Challenge project has developed an optimizing storage access software system that was prototyped at RHIC. It is currently undergoing integration with the STAR experiment in preparation for data taking that starts in mid-2000. The behavior and lessons learned in the RHIC Mock Data Challenge exercises are described, as well as the observed performance under conditions designed to characterize scalability. Up to 250 simultaneous queries were tested and up to 10 million events across 7 event components were involved in these queries. The system coordinates the staging of "bundles" of files from the HPSS tape system, so that all the needed components of each event are in disk cache when accessed by the application software. The caching policy algorithm for the coordinated bundle staging is described in the paper. The initial prototype implementation interfaced to the Objectivity/DB. In this latest version, it evolved to work with arbitrary files and use CORBA interfaces to the tag database and file catalog services. The interface to the tag database and the MySQL-based file catalog services used by STAR are described along with the planned usage scenarios.
Woodson, Kristina E; Sable, Craig A; Cross, Russell R; Pearson, Gail D; Martin, Gerard R
2004-11-01
Live transmission of echocardiograms over integrated services digital network lines is accurate and has led to improvements in the delivery of pediatric cardiology care. Permanent archiving of the live studies has not previously been reported. Specific obstacles to permanent storage of telemedicine files have included the ability to produce accurate images without a significant increase in storage requirements. We evaluated the accuracy of Motion Pictures Expert Group (MPEG) digitization of incoming video streams and assessed the storage requirements of these files for infants in a real-time pediatric tele-echocardiography program. All major cardiac diagnoses were correctly diagnosed by review of MPEG images. MPEG file size ranged from 11.1 to 182 MB (56.5 +/- 29.9 MB). MPEG digitization during live neonatal telemedicine is accurate and provides an efficient method for storage. This modality has acceptable storage requirements; file sizes are comparable to other digital modalities.
NASA Technical Reports Server (NTRS)
Kobler, Benjamin (Editor); Hariharan, P. C. (Editor)
2000-01-01
This document contains copies of those technical papers received in time for publication prior to the Eighth Goddard Conference on Mass Storage Systems and Technologies which is being held in cooperation with the Seventeenth IEEE Symposium on Mass Storage Systems at the University of Maryland University College Inn and Conference Center March 27-30, 2000. As one of an ongoing series, this Conference continues to provide a forum for discussion of issues relevant to the management of large volumes of data. The Conference encourages all interested organizations to discuss long term mass storage requirements and experiences in fielding solutions. Emphasis is on current and future practical solutions addressing issues in data management, storage systems and media, data acquisition, long term retention of data, and data distribution. This year's discussion topics include architecture, future of current technology, new technology with a special emphasis on holographic storage, performance, standards, site reports, vendor solutions. Tutorials will be available on stability of optical media, disk subsystem performance evaluation, I/O and storage tuning, functionality and performance evaluation of file systems for storage area networks.
76 FR 30332 - Combined Notice of Filings No. 1
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-25
..., May 18, 2011. Docket Numbers: RP11-2098-000. Applicants: Young Gas Storage Company, Ltd. Description: Young Gas Storage Company, Ltd. submits tariff filing per 154.204: Revised Reservoir Integrity Limit... action to be taken, but will not serve to make protestants parties to the proceeding. Anyone filing a...
75 FR 57754 - Combined Notice of Filings No. 2
Federal Register 2010, 2011, 2012, 2013, 2014
2010-09-22
... Refund Report filings: Docket Numbers: RP10-1258-001. Applicants: Central New York Oil and Gas Co., LLC. Description: Central New York Oil and Gas Company, LLC submits their compliance filing to incorporate in the...-001. Applicants: Arlington Storage Company, LLC. Description: Arlington Storage Company, LLC submits...
The ATLAS EventIndex: architecture, design choices, deployment and first operation experience
NASA Astrophysics Data System (ADS)
Barberis, D.; Cárdenas Zárate, S. E.; Cranshaw, J.; Favareto, A.; Fernández Casaní, Á.; Gallas, E. J.; Glasman, C.; González de la Hoz, S.; Hřivnáč, J.; Malon, D.; Prokoshin, F.; Salt Cairols, J.; Sánchez, J.; Többicke, R.; Yuan, R.
2015-12-01
The EventIndex is the complete catalogue of all ATLAS events, keeping the references to all files that contain a given event in any processing stage. It replaces the TAG database, which had been in use during LHC Run 1. For each event it contains its identifiers, the trigger pattern and the GUIDs of the files containing it. Major use cases are event picking, feeding the Event Service used on some production sites, and technical checks of the completion and consistency of processing campaigns. The system design is highly modular so that its components (data collection system, storage system based on Hadoop, query web service and interfaces to other ATLAS systems) could be developed separately and in parallel during LS1. The EventIndex is in operation for the start of LHC Run 2. This paper describes the high-level system architecture, the technical design choices and the deployment process and issues. The performance of the data collection and storage systems, as well as the query services, are also reported.
Automated clustering-based workload characterization
NASA Technical Reports Server (NTRS)
Pentakalos, Odysseas I.; Menasce, Daniel A.; Yesha, Yelena
1996-01-01
The demands placed on the mass storage systems at various federal agencies and national laboratories are continuously increasing in intensity. This forces system managers to constantly monitor the system, evaluate the demand placed on it, and tune it appropriately using either heuristics based on experience or analytic models. Performance models require an accurate workload characterization. This can be a laborious and time consuming process. It became evident from our experience that a tool is necessary to automate the workload characterization process. This paper presents the design and discusses the implementation of a tool for workload characterization of mass storage systems. The main features of the tool discussed here are: (1) Automatic support for peak-period determination. Histograms of system activity are generated and presented to the user for peak-period determination; (2) Automatic clustering analysis. The data collected from the mass storage system logs is clustered using clustering algorithms and tightness measures to limit the number of generated clusters; (3) Reporting of varied file statistics. The tool computes several statistics on file sizes such as average, standard deviation, minimum, maximum, frequency, as well as average transfer time. These statistics are given on a per cluster basis; (4) Portability. The tool can easily be used to characterize the workload in mass storage systems of different vendors. The user needs to specify through a simple log description language how a specific log should be interpreted. The rest of this paper is organized as follows. Section two presents basic concepts in workload characterization as they apply to mass storage systems. Section three describes clustering algorithms and tightness measures. The following section presents the architecture of the tool. Section five presents some results of workload characterization using the tool. Finally, section six presents some concluding remarks.
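A toy version of the clustering-plus-statistics step, under the assumption of a plain 1-D k-means on file sizes (the tool's actual algorithms and tightness measures are richer):

```python
import random
import statistics

def kmeans_1d(values, k, iters=50):
    """Tiny 1-D k-means over file sizes; returns k lists of values."""
    centers = random.sample(values, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[min(range(k), key=lambda i: abs(v - centers[i]))].append(v)
        centers = [statistics.mean(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return clusters

def cluster_report(file_sizes, k=3):
    """Per-cluster statistics of the kind the tool reports."""
    for i, c in enumerate(kmeans_1d(file_sizes, k)):
        if c:
            print(f"cluster {i}: n={len(c)} mean={statistics.mean(c):.0f} "
                  f"min={min(c)} max={max(c)} sd={statistics.pstdev(c):.0f}")
```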
Brooks, Kriston P.; Sprik, Samuel J.; Tamburello, David A.; ...
2018-04-07
The U.S. Department of Energy (DOE) developed a vehicle Framework model to simulate fuel cell-based light-duty vehicle operation for various hydrogen storage systems. This transient model simulates the performance of the storage system, fuel cell, and vehicle for comparison to Technical Targets established by DOE for four drive cycles/profiles. Chemical hydrogen storage models have been developed for the Framework for both exothermic and endothermic materials. Despite the utility of such models, they require that material researchers input system design specifications that cannot be estimated easily. To address this challenge, a design tool has been developed that allows researchers to directly enter kinetic and thermodynamic chemical hydrogen storage material properties into a simple sizing module that then estimates system parameters required to run the storage system model. Additionally, the design tool can be used as a standalone executable file to estimate the storage system mass and volume outside of the Framework model. Here, these models will be explained and exercised with the representative hydrogen storage materials exothermic ammonia borane (NH3BH3) and endothermic alane (AlH3).
NASA Technical Reports Server (NTRS)
Runnels, Tyson D.
1993-01-01
This is a case study. It deals with the use of a 'virtual file system' (VFS) for Boeing's UNIX-based Product Standards Data System (PSDS). One of the objectives of PSDS is to store digital standards documents. The file-storage requirements are that the files must be rapidly accessible, stored for long periods of time - as though they were paper, protected from disaster, and accumulative to about 80 billion characters (80 gigabytes). This volume of data will be approached in the first two years of the project's operation. The approach chosen is to install a hierarchical file migration system using optical disk cartridges. Files are migrated from high-performance media to lower performance optical media based on a least-frequently-used algorithm. The optical media are less expensive per character stored and are removable. Vital statistics about the removable optical disk cartridges are maintained in a database. The assembly of hardware and software acts as a single virtual file system transparent to the PSDS user. The files are copied to 'backup-and-recover' media whose vital statistics are also stored in the database. Seventeen months into operation, PSDS is storing 49 gigabytes. A number of operational and performance problems were overcome. Costs are under control. New and/or alternative uses for the VFS are being considered.
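The migration policy can be sketched in a few lines; the data layout is hypothetical, but the selection rule is the least-frequently-used ordering the case study names: demote the coldest files first until enough fast-disk space is freed.

```python
def pick_migration_candidates(files, bytes_needed):
    """Select files to demote from fast disk to optical media,
    least-frequently-used first. `files` maps name -> (access_count, size)."""
    demoted, freed = [], 0
    for name, (count, size) in sorted(files.items(), key=lambda kv: kv[1][0]):
        if freed >= bytes_needed:
            break
        demoted.append(name)
        freed += size
    return demoted
```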
A data distribution strategy for the 1990s (files are not enough)
NASA Technical Reports Server (NTRS)
Tankenson, Mike; Wright, Steven
1993-01-01
Virtually all of the data distribution strategies being contemplated for the EOSDIS era revolve around the use of files. Most, if not all, mass storage technologies are based around the file model. However, files may be the wrong primary abstraction for supporting scientific users in the 1990s and beyond. Other abstractions more closely matching the respective scientific discipline of the end user may be more appropriate. JPL has built a unique multimission data distribution system based on a strategy of telemetry stream emulation to match the responsibilities of spacecraft team and ground data system operators supporting our nation's suite of planetary probes. The current system, operational since 1989 and the launch of the Magellan spacecraft, is supporting over 200 users at 15 remote sites. This stream-oriented data distribution model can provide important lessons learned to builders of future data systems.
Hu, Ding; Xie, Shuqun; Yu, Donglan; Zheng, Zhensheng; Wang, Kuijian
2010-04-01
The development of an external counterpulsation (ECP) local area network system and an extensible markup language (XML)-based remote ECP medical information system conforming to the digital imaging and communications in medicine (DICOM) standard has been improving the digital interchangeability and shareability of ECP data. However, ECP therapy involves continuous, long-duration supervision, which produces a mass of waveform data. In order to reduce the storage space and improve the transmission efficiency, the waveform data in the normative format of ECP data files have to be compressed. In this article, we introduce a compression algorithm based on template matching and an improved quick fitting of linear approximation distance thresholding (LADT), in combination with the characteristics of the enhanced external counterpulsation (EECP) waveform signal. The DICOM standard is used as the storage and transmission standard to make our system compatible with hospital information systems. According to the rules of transfer syntaxes, we defined a private transfer syntax for one-dimensional compressed waveform data and stored the EECP data into a DICOM file. Testing results indicate that the compressed, normative data can be correctly transmitted and displayed between EECP workstations in our EECP laboratory.
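A greedy sketch of the linear-approximation idea behind LADT (the paper's quick-fitting variant is more refined): extend a segment while every intermediate sample stays within a threshold of the line through the segment's endpoints, and store only the endpoints.

```python
def ladt_compress(samples, threshold):
    """Greedy linear approximation: extend the current segment while every
    intermediate sample lies within `threshold` of the straight line through
    the segment's endpoints; store only the endpoints."""
    kept, anchor = [0], 0
    for end in range(2, len(samples)):
        x0, y0 = anchor, samples[anchor]
        x1, y1 = end, samples[end]
        for x in range(anchor + 1, end):
            y_line = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
            if abs(samples[x] - y_line) > threshold:
                anchor = end - 1          # close the segment just before `end`
                kept.append(anchor)
                break
    if kept[-1] != len(samples) - 1:
        kept.append(len(samples) - 1)
    return [(i, samples[i]) for i in kept]  # (index, value) endpoint pairs
```

For slowly varying stretches of a waveform, long runs of samples collapse to two endpoints, which is where the storage and transmission savings come from.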
NASA Astrophysics Data System (ADS)
Bauerdick, L. A. T.; Bloom, K.; Bockelman, B.; Bradley, D. C.; Dasu, S.; Dost, J. M.; Sfiligoi, I.; Tadel, A.; Tadel, M.; Wuerthwein, F.; Yagil, A.; Cms Collaboration
2014-06-01
Following the success of the XRootd-based US CMS data federation, the AAA project investigated extensions of the federation architecture by developing two sample implementations of an XRootd, disk-based, caching proxy. The first one simply starts fetching a whole file as soon as a file-open request is received and is suitable when completely random file access is expected or it is already known that a whole file will be read. The second implementation supports on-demand downloading of partial files. Extensions to the Hadoop Distributed File System have been developed to allow for an immediate fallback to network access when local HDFS storage fails to provide the requested block. Both cache implementations are in pre-production testing at UCSD.
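The second (on-demand, partial-file) implementation can be pictured with this block-granular sketch; `fetch_remote` stands in for the remote federation read, and all names and the block size are illustrative, not the AAA code.

```python
import os

BLOCK = 1 << 20  # 1 MiB cache granularity (an assumption, not the AAA value)

class CachingProxy:
    """Serve a block from the local disk cache if present; otherwise fetch it
    from the federation and keep a copy. `fetch_remote(path, off, length)`
    stands in for the remote XRootd read."""
    def __init__(self, cache_dir, fetch_remote):
        self.cache_dir, self.fetch_remote = cache_dir, fetch_remote

    def read_block(self, path, block_no):
        local = os.path.join(self.cache_dir, path.strip("/").replace("/", "_"),
                             str(block_no))
        if os.path.exists(local):           # cache hit: no network traffic
            with open(local, "rb") as f:
                return f.read()
        data = self.fetch_remote(path, block_no * BLOCK, BLOCK)  # cache miss
        os.makedirs(os.path.dirname(local), exist_ok=True)
        with open(local, "wb") as f:
            f.write(data)
        return data
```

The whole-file variant described first is the degenerate case: fetch every block eagerly on open rather than on demand.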
78 FR 61995 - Combined Notice of Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2013-10-10
.... Comments Due: 5 p.m. ET 10/9/13. Docket Numbers: RP13-1357-000. Applicants: Young Gas Storage Company, Ltd. Description: Annual Operational Purchases and Sales Report of Young Gas Storage Company, Ltd.. Filed Date: 9... necessary to become a party to the proceeding. Filings in Existing Proceedings Docket Numbers: PR13-62-001...
76 FR 6457 - Hill-Lake Gas Storage, LLC; Notice of Baseline Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2011-02-04
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR10-134-001] Hill-Lake Gas Storage, LLC; Notice of Baseline Filings January 31, 2011. Take notice that on January 28, 2011, Hill-Lake submitted a revised baseline filing of their Statement of Operating Conditions for services provided under...
76 FR 7186 - Hill-Lake Gas Storage, LLC; Notice of Baseline Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2011-02-09
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR10-134-002] Hill-Lake Gas Storage, LLC; Notice of Baseline Filings February 2, 2011. Take notice that on February 1, 2011, Hill-Lake submitted a revised baseline filing of their Statement of Operating Conditions for services provided under...
Name It! Store It! Protect It!: A Systems Approach to Managing Data in Research Core Facilities.
DeVries, Matthew; Fenchel, Matthew; Fogarty, R E; Kim, Byong-Do; Timmons, Daniel; White, A Nicole
2017-12-01
As the capabilities of technology increase, so do the production of data and the need for data management. The need for data storage at many academic institutions is increasing exponentially. Technology is expanding rapidly, and institutions are recognizing the need to incorporate data management that can support future data sharing as a critical component of institutional services. The establishment of a process to manage the surge in data storage is complex and often hindered by the lack of a plan. Simple file naming (nomenclature) is also becoming ever more important, since it leaves an established understanding of the contents of a file. This is especially the case as research projects and personnel turn over. Consistent indexing of files also helps to identify past work. Finally, the protection of data contents is becoming increasingly challenging. As the genomic field expands and medicine becomes more personalized, methods to protect the contents of data in both short- and long-term storage need to be established so as not to risk revealing identifiable information. This is often something we do not consider in a nonclinical research environment. The need to establish basic guidelines for institutions is critical, as individual research laboratories are unable to handle the scope of data storage required for their own research. In addition to the immediate need to establish guidelines on data storage, file naming, and information protection, the recognition of the need for specialized support for data management in research cores and laboratories at academic institutions is becoming a critical component of institutional services. Here, we outline some case studies and methods that you may be able to adopt at your own institution.
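As one concrete way to operationalise such a nomenclature guideline, a small helper can enforce a sortable date, lowercase hyphenated fields, and an explicit version; every field choice below is an example, not a prescription from the article.

```python
from datetime import date

def core_filename(project: str, instrument: str, sample: str,
                  version: int, ext: str) -> str:
    """Sortable date first, no spaces, explicit version: files stay
    identifiable and orderable after project and personnel turnover."""
    safe = lambda s: s.strip().lower().replace(" ", "-")
    return (f"{date.today():%Y%m%d}_{safe(project)}_{safe(instrument)}_"
            f"{safe(sample)}_v{version:02d}.{ext}")
```

For example, core_filename("Liver Study", "Confocal A", "Mouse 12", 3, "tif") yields something like 20171201_liver-study_confocal-a_mouse-12_v03.tif.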
Federal Register 2010, 2011, 2012, 2013, 2014
2012-06-26
... proposed reservoirs from 6,225 feet to 7,330 feet; and (7) change the name of the project from ``Jones Canyon Pumped Storage Project'' to ``Oregon Winds Pumped Storage''. FERC Contact: Jennifer Harper, 202..., using the eComment system at http://www.ferc.gov/docs-filing/ecomment.asp . You must include your name...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-04
...-long earth embankment dam creating; (2) an upper reservoir with a surface area of 120 acres and an 6,000 acre-foot storage capacity; (3) a 80-foot-high, 2,800-foot-long earth embankment dam creating; (4... prior registration, using the eComment system at http://www.ferc.gov/docs-filing/ecomment.asp . You must...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-04
...-long earth embankment dam creating; (2) an upper reservoir with a surface area of 85 acres and an 5,000 acre-foot storage capacity; (3) a 60-foot-high, 7,300-foot-long earth embankment dam creating; (4) a... characters, without prior registration, using the eComment system at http://www.ferc.gov/docs-filing/ecomment...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-04
...: (1) A 120-foot-high, 3,200-foot-long earth embankment dam; (2) a 30- foot-high, 300-foot-long earth...-foot storage capacity; (4) a 60-foot-high, 5,600-foot-long earth embankment dam creating; (5) a lower... characters, without prior registration, using the eComment system at http://www.ferc.gov/docs-filing/ecomment...
Solving data-at-rest for the storage and retrieval of files in ad hoc networks
NASA Astrophysics Data System (ADS)
Knobler, Ron; Scheffel, Peter; Williams, Jonathan; Gaj, Kris; Kaps, Jens-Peter
2013-05-01
Based on current trends in both military and commercial applications, the use of mobile devices (e.g., smartphones and tablets) is greatly increasing. Several military applications involve secure peer-to-peer file sharing without a centralized authority. For these applications, if one or more mobile devices are lost or compromised, sensitive files can be exposed to adversaries, since COTS devices and operating systems are used. Complete system files cannot be stored on a single device, since after compromising a device an adversary can attack the data at rest and eventually obtain the original file. Also, after a device is compromised, the remaining peer-to-peer system devices must still be able to access all system files. McQ has teamed with the Cryptographic Engineering Research Group at George Mason University to develop a custom distributed file sharing system that provides a complete solution to the data-at-rest problem for resource-constrained embedded systems and mobile devices. This approach scales well to a large number of network devices, without a single point of failure. We have implemented the approach on representative mobile devices and developed an extensive system simulator to benchmark expected system performance based on detailed modeling of the network/radio characteristics, CONOPS, and secure distributed file system functionality. The simulator is highly customizable for determining expected system performance for other network topologies and CONOPS.
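The abstract does not specify the McQ/GMU scheme, but the general idea (no single device holds recoverable data) can be illustrated with a toy n-of-n XOR secret split; a real system would use a threshold scheme (e.g., Shamir's) so files survive lost devices. This is a sketch, not their design.

```python
import os
from functools import reduce

def split_xor(data: bytes, n: int) -> list:
    """n-of-n split: any n-1 shares together reveal nothing about the data."""
    pads = [os.urandom(len(data)) for _ in range(n - 1)]
    last = bytes(reduce(lambda x, y: x ^ y, t, 0) for t in zip(data, *pads))
    return pads + [last]

def combine_xor(shares) -> bytes:
    """XOR all shares byte-wise to recover the original data."""
    return bytes(reduce(lambda x, y: x ^ y, t, 0) for t in zip(*shares))

secret = b"patrol waypoint file"
shares = split_xor(secret, 3)          # distribute one share per device
assert combine_xor(shares) == secret   # all shares together recover it
```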
An Object-Relational IFC Storage Model Based on Oracle Database
NASA Astrophysics Data System (ADS)
Li, Hang; Liu, Hua; Liu, Yong; Wang, Yuan
2016-06-01
As building models become increasingly complicated, the level of collaboration across professions attracts more attention in the architecture, engineering, and construction (AEC) industry. To adapt to this change, buildingSMART developed the Industry Foundation Classes (IFC) to facilitate interoperability between software platforms. However, IFC data are currently shared in the form of text files, which has drawbacks. In this paper, considering the object-based inheritance hierarchy of IFC and the storage features of different database management systems (DBMS), we propose a novel object-relational storage model that uses an Oracle database to store IFC data. First, we establish mapping rules between the data types in the IFC specification and those in the Oracle database. Second, we design the IFC database according to the relationships among IFC entities. Third, we parse the IFC file and extract the IFC data. Finally, we store the IFC data in the corresponding tables of the IFC database. In experiments, three different building models were selected to demonstrate the effectiveness of our storage model. A comparison of the experimental statistics shows that IFC data are preserved losslessly during data exchange.
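The parse-and-store step can be sketched compactly; the version below substitutes SQLite for Oracle and handles only flat STEP instance lines, so the regex and single-table schema are assumptions for illustration rather than the paper's mapping rules.

```python
import re
import sqlite3

# Matches flat IFC/STEP instance lines such as: #12=IFCWALL('GUID',$,...);
LINE = re.compile(r"^#(\d+)\s*=\s*(IFC\w+)\s*\((.*)\);\s*$")

def load_ifc(ifc_path, db_path):
    """Parse an IFC file and store each entity instance as a table row."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS entity "
                "(id INTEGER PRIMARY KEY, type TEXT, attrs TEXT)")
    with open(ifc_path, encoding="utf-8", errors="replace") as f:
        for line in f:
            m = LINE.match(line.strip())
            if m:
                eid, etype, attrs = m.groups()
                con.execute("INSERT OR REPLACE INTO entity VALUES (?, ?, ?)",
                            (int(eid), etype, attrs))
    con.commit()
    return con

# e.g. count instances per entity type:
# con = load_ifc("model.ifc", "model.db")
# print(con.execute("SELECT type, COUNT(*) FROM entity GROUP BY type").fetchall())
```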
NASA Technical Reports Server (NTRS)
Ferrara, Jeffrey; Calk, William; Atwell, William; Tsui, Tina
2013-01-01
MPISS is an automatic file transfer system that implements a combination of standard and mission-unique transfer protocols required by the Global Precipitation Measurement Mission (GPM) Precipitation Processing System (PPS) to control the flow of data between the MOC and the PPS. The primary features of MPISS are file transfers (both with and without PPS specific protocols), logging of file transfer and system events to local files and a standard messaging bus, short term storage of data files to facilitate retransmissions, and generation of file transfer accounting reports. The system includes a graphical user interface (GUI) to control the system, allow manual operations, and to display events in real time. The PPS specific protocols are an enhanced version of those that were developed for the Tropical Rainfall Measuring Mission (TRMM). All file transfers between the MOC and the PPS use the SSH File Transfer Protocol (SFTP). For reports and data files generated within the MOC, no additional protocols are used when transferring files to the PPS. For observatory data files, an additional handshaking protocol of data notices and data receipts is used. MPISS generates and sends to the PPS data notices containing data start and stop times along with a checksum for the file for each observatory data file transmitted. MPISS retrieves the PPS generated data receipts that indicate the success or failure of the PPS to ingest the data file and/or notice. MPISS retransmits the appropriate files as indicated in the receipt when required. MPISS also automatically retrieves files from the PPS. The unique feature of this software is the use of both standard and PPS specific protocols in parallel. The advantage of this capability is that it supports users that require the PPS protocol as well as those that do not require it. The system is highly configurable to accommodate the needs of future users.
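The notice/receipt handshake lends itself to a small sketch. Field names and the MD5 checksum choice below are assumptions for illustration; they are not taken from the GPM/TRMM interface documents.

```python
import hashlib
import json

def make_data_notice(path, start_utc, stop_utc):
    """Build a data notice for one observatory file (illustrative fields)."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return json.dumps({"file": path, "start": start_utc,
                       "stop": stop_utc, "checksum": h.hexdigest()})

def needs_retransmit(receipt):
    """Retransmit when a receipt reports a failed ingest (assumed field)."""
    return receipt.get("status") != "ingested"
```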
75 FR 37425 - Combined Notice of Filings No. 1
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-29
... 30, 2010. Docket Numbers: RP10-859-000. Applicants: Egan Hub Storage, LLC. Description: Egan Hub... Date: 5 p.m. Eastern Time on Wednesday, June 30, 2010. Docket Numbers: RP10-861-000. Applicants: Egan Hub Storage, LLC. Description: Egan Hub Storage, LLC submits tariff filing per 154.203: Egan Hub...
17 CFR 232.501 - Modular submissions and segmented filings.
Code of Federal Regulations, 2011 CFR
2011-04-01
...) One or more electronic format documents may be submitted for storage in the non-public EDGAR data... data storage area at any time, not to exceed a total of one megabyte of digital information. If an...-public EDGAR data storage area for assembly as a segmented filing. (2) Segments shall be submitted no...
17 CFR 232.501 - Modular submissions and segmented filings.
Code of Federal Regulations, 2014 CFR
2014-04-01
...) One or more electronic format documents may be submitted for storage in the non-public EDGAR data... data storage area at any time, not to exceed a total of one megabyte of digital information. If an...-public EDGAR data storage area for assembly as a segmented filing. (2) Segments shall be submitted no...
17 CFR 232.501 - Modular submissions and segmented filings.
Code of Federal Regulations, 2010 CFR
2010-04-01
...) One or more electronic format documents may be submitted for storage in the non-public EDGAR data... data storage area at any time, not to exceed a total of one megabyte of digital information. If an...-public EDGAR data storage area for assembly as a segmented filing. (2) Segments shall be submitted no...
17 CFR 232.501 - Modular submissions and segmented filings.
Code of Federal Regulations, 2012 CFR
2012-04-01
...) One or more electronic format documents may be submitted for storage in the non-public EDGAR data... data storage area at any time, not to exceed a total of one megabyte of digital information. If an...-public EDGAR data storage area for assembly as a segmented filing. (2) Segments shall be submitted no...
17 CFR 232.501 - Modular submissions and segmented filings.
Code of Federal Regulations, 2013 CFR
2013-04-01
...) One or more electronic format documents may be submitted for storage in the non-public EDGAR data... data storage area at any time, not to exceed a total of one megabyte of digital information. If an...-public EDGAR data storage area for assembly as a segmented filing. (2) Segments shall be submitted no...
76 FR 18753 - Jefferson Island Storage & Hub, L.L.C.; Notice of Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2011-04-05
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR11-97-000] Jefferson Island Storage & Hub, L.L.C.; Notice of Filing Take notice that on March 28, 2011, Jefferson Island Storage & Hub, L.L.C. (Jefferson Island) submitted a revised Statement of Operating Conditions (SOC) for...
Cryptonite: A Secure and Performant Data Repository on Public Clouds
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kumbhare, Alok; Simmhan, Yogesh; Prasanna, Viktor
2012-06-29
Cloud storage has become immensely popular for maintaining synchronized copies of files and for sharing documents with collaborators. However, there is heightened concern about the security and privacy of Cloud-hosted data due to the shared infrastructure model and an implicit trust in the service providers. Emerging needs of secure data storage and sharing for domains like Smart Power Grids, which deal with sensitive consumer data, require the persistence and availability of Cloud storage but with client-controlled security and encryption, low key management overhead, and minimal performance costs. Cryptonite is a secure Cloud storage repository that addresses these requirements using a StrongBox model for shared key management. We describe the Cryptonite service and desktop client, discuss performance optimizations, and provide an empirical analysis of the improvements. Our experiments show that Cryptonite clients achieve a 40% improvement in file upload bandwidth over plaintext storage using the Azure Storage Client API despite the added security benefits, while our file download performance is 5 times faster than the baseline for files greater than 100MB.
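Cryptonite's StrongBox key management is not reproduced here; the sketch below only illustrates the client-controlled encrypt-before-upload pattern the abstract describes, using the third-party `cryptography` package's Fernet recipe and a hypothetical `upload()` helper in place of a real blob-store API.

```python
from cryptography.fernet import Fernet  # pip install cryptography

def upload(blob_name, ciphertext):
    """Stand-in for a cloud blob PUT (e.g., via a storage client API)."""
    raise NotImplementedError

key = Fernet.generate_key()   # held by the client, never by the provider
box = Fernet(key)

def put_encrypted(blob_name, plaintext):
    upload(blob_name, box.encrypt(plaintext))   # provider sees ciphertext only

def get_decrypted(ciphertext):
    return box.decrypt(ciphertext)              # decryption happens client-side
```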
Fifth NASA Goddard Conference on Mass Storage Systems and Technologies. Volume 1
NASA Technical Reports Server (NTRS)
Kobler, Benjamin (Editor); Hariharan, P. C. (Editor)
1996-01-01
This document contains copies of those technical papers received in time for publication prior to the Fifth Goddard Conference on Mass Storage Systems and Technologies. As one of an ongoing series, this conference continues to serve as a unique medium for the exchange of information on topics relating to the ingestion and management of substantial amounts of data and the attendant problems involved. This year's discussion topics include storage architecture, database management, data distribution, file system performance and modeling, and optical recording technology. There will also be a paper on Application Programming Interfaces (API) for a Physical Volume Repository (PVR) defined in Version 5 of the Institute of Electrical and Electronics Engineers (IEEE) Reference Model (RM). In addition, there are papers on specific archives and storage products.
76 FR 45788 - Combined Notice of Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2011-08-01
... Time on Tuesday, July 26, 2011. Docket Numbers: RP11-2273-000. Applicants: Egan Hub Storage, LLC. Description: Egan Hub Storage, LLC submits tariff filing per 154.204: Egan Modifications for Big Sandy LINK...
Federal Register 2010, 2011, 2012, 2013, 2014
2010-12-28
...-Filing system does not support unlisted software, and the NRC Meta System Help Desk will not be able to... Setpoint Methodology for LSSS [Limiting Safety System Setting] Functions,'' which included the instrument... System Instrumentation,'' Function 3, Condensate Storage Tank Level--Low. The supporting TS Bases will...
Digital Stratigraphy: Contextual Analysis of File System Traces in Forensic Science.
Casey, Eoghan
2017-12-28
This work introduces novel methods for conducting forensic analysis of file allocation traces, collectively called digital stratigraphy. These in-depth forensic analysis methods can provide insight into the origin, composition, distribution, and time frame of strata within storage media. Using case examples and empirical studies, this paper illuminates the successes, challenges, and limitations of digital stratigraphy. This study also shows how understanding file allocation methods can provide insight into concealment activities and how real-world computer usage can complicate digital stratigraphy. Furthermore, this work explains how forensic analysts have misinterpreted traces of normal file system behavior as indications of concealment activities. This work raises awareness of the value of taking the overall context into account when analyzing file system traces. This work calls for further research in this area and for forensic tools to provide necessary information for such contextual analysis, such as highlighting mass deletion, mass copying, and potential backdating. © 2017 American Academy of Forensic Sciences.
A mass storage system for supercomputers based on Unix
NASA Technical Reports Server (NTRS)
Richards, J.; Kummell, T.; Zarlengo, D. G.
1988-01-01
The authors present the design, implementation, and utilization of a large mass storage subsystem (MSS) for the numerical aerodynamics simulation. The MSS supports a large networked, multivendor Unix-based supercomputing facility. The MSS at Ames Research Center provides all processors on the numerical aerodynamics system processing network, from workstations to supercomputers, the ability to store large amounts of data in a highly accessible, long-term repository. The MSS uses Unix System V and is capable of storing hundreds of thousands of files ranging from a few bytes to 2 Gb in size.
75 FR 63460 - Combined Notice of Filings No. 1
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-15
...: 5 p.m. Eastern Time on Wednesday, October 20, 2010. Docket Numbers: RP11-48-000. Applicants: Egan Hub Storage, LLC. Description: Egan Hub Storage, LLC submits tariff filing per 154.204: Form of...
NASA Astrophysics Data System (ADS)
You, Xiaozhen; Yao, Zhihong
2005-04-01
As a standard for the communication and storage of medical digital images, DICOM has played a very important role in the integration of hospital information. In DICOM, tags are expressed as numbers, and only standard data elements can be interpreted by looking them up in the Data Dictionary; private tags cannot. As such, a DICOM file's readability and extensibility are limited. In addition, reading DICOM files requires special software. In our research, we introduced XML into DICOM, defining an XML-based DICOM transfer format, XML-DCM, and a DICOM storage format, X-DCM, as well as developing a program package to realize format interchange among DICOM, XML-DCM, and X-DCM. XML-DCM is based on the DICOM structure but replaces numeric tags with accessible XML character-string tags. The merits are as follows: a) every character-string tag of XML-DCM has an explicit meaning, so users can easily understand standard data elements and private data elements without consulting the Data Dictionary; in this way, the readability and data sharing of DICOM files are greatly improved; b) according to their requirements, users can define new character-string tags with explicit meanings to extend the capacity of data elements in their own systems; c) users can conveniently view medical images and associated information through a browser such as IE, ultimately enlarging the scope of data sharing. The storage format X-DCM will reduce data redundancy and save storage space. Practical application shows that XML-DCM facilitates the integration and sharing of medical image data among different systems and devices.
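The tag-renaming idea can be sketched with the third-party pydicom package: read a DICOM file and emit each element under its dictionary keyword instead of a numeric tag. The XML-DCM schema itself is not given in the abstract, so the output format below is an assumption.

```python
import xml.etree.ElementTree as ET
import pydicom  # pip install pydicom

def dicom_to_xml(path):
    """Emit DICOM data elements under readable keyword tags."""
    ds = pydicom.dcmread(path)
    root = ET.Element("dicom")
    for elem in ds:
        if elem.VR == "SQ":   # nested sequences omitted in this sketch
            continue
        # Standard elements get their dictionary keyword; private tags
        # fall back to a hex name rather than being dropped.
        name = elem.keyword or f"tag{elem.tag.group:04X}{elem.tag.element:04X}"
        ET.SubElement(root, name).text = str(elem.value)
    return ET.tostring(root, encoding="unicode")
```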
SIDS-to-ADF File Mapping Manual
NASA Technical Reports Server (NTRS)
McCarthy, Douglas; Smith, Matthew; Poirier, Diane; Smith, Charles A. (Technical Monitor)
2002-01-01
The "CFD General Notation System" (CGNS) consists of a collection of conventions, and conforming software, for the storage and retrieval of Computational Fluid Dynamics (CFD) data. It facilitates the exchange of data between sites and applications, and helps stabilize the archiving of aerodynamic data. This effort was initiated in order to streamline the procedures in exchanging data and software between NASA and its customers, but the goal is to develop CGNS into a National Standard for the exchange of aerodynamic data. The CGNS development team is comprised of members from Boeing Commercial Airplane Group, NASA-Ames, NASA-Langley, NASA-Lewis, McDonnell-Douglas Corporation (now Boeing-St. Louis), Air Force-Wright Lab., and ICEM-CFD Engineering. The elements of CGNS address all activities associated with the storage of data on external media and its movement to and from application programs. These elements include: 1) The Advanced Data Format (ADF) Database manager, consisting of both a file format specification and its I/O software, which handles the actual reading and writing of data from and to external storage media; 2) The Standard Interface Data Structures (SIDS), which specify the intellectual content of CFD data and the conventions governing naming and terminology; 3) The SIDS-to-ADF File Mapping conventions, which specify the exact location where the CFD data defined by the SIDS is to be stored within the ADF file(s); and 4) The CGNS Mid-level Library, which provides CFD-knowledgeable routines suitable for direct installation into application codes. The SIDS-toADF File Mapping Manual specifies the exact manner in which, under CGNS conventions, CFD data structures (the SIDS) are to be stored in (i.e., mapped onto) the file structure provided by the database manager (ADF). The result is a conforming CGNS database. Adherence to the mapping conventions guarantees uniform meaning and location of CFD data within ADF files, and thereby allows the construction of universal software to read and write the data.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-12-01
... project would consist of the following: (1) A 210-foot-high, 1,610-foot-long earth fill dam; (2) a 20-foot... acre-foot storage capacity; (4) a 170-foot-high, 1,270.0-foot-long earth fill dam creating; (5) a lower... prior registration, using the eComment system at http://www.ferc.gov/docs-filing/ecomment.asp . You must...
The Design of a Secure File Storage System
1979-12-01
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-25
...-long earth embankment dam; (2) an upper reservoir with a surface area of 100 acres and an 7,100 acre-foot storage capacity; (3) a 120-foot-high, 7,430-foot-long earth embankment dam creating; (4) a lower... brief comments up to 6,000 characters, without prior registration, using the eComment system at http...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-02-25
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Project No. 13880-000] Cuffs Run Pumped..., Motions To Intervene, and Competing Applications On November 18, 2010, Cuffs Run Pumped Storage, LLC filed... to study the feasibility of the Cuffs Run Pumped Storage Project, located on Cuffs Run and the...
Feasibility of Executing MIMS on Interdata 80.
Keywords: CDC 6500 computers; CDC 6600 computers; MIMS (Medical Information Management System); file structures; computer storage management. The report examines the feasibility of implementing a large information management system on mini-computers. The Medical Information Management System (MIMS) and the Interdata 80 mini-computer were selected as representative systems. The FORTRAN programs currently being used in MIMS ...
A simulation model for wind energy storage systems. Volume 3: Program descriptions
NASA Technical Reports Server (NTRS)
Warren, A. W.; Edsinger, R. W.; Burroughs, J. D.
1977-01-01
Program descriptions, flow charts, and program listings for the SIMWEST model generation program, the simulation program, the file maintenance program, and the printer plotter program are given.
Data Compression in Full-Text Retrieval Systems.
ERIC Educational Resources Information Center
Bell, Timothy C.; And Others
1993-01-01
Describes compression methods for components of full-text systems such as text databases on CD-ROM. Topics discussed include storage media; structures for full-text retrieval, including indexes, inverted files, and bitmaps; compression tools; memory requirements during retrieval; and ranking and information retrieval. (Contains 53 references.)…
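A tiny inverted-file sketch shows the index structure such systems build over their text; postings compression (the survey's main subject) is noted in a comment rather than implemented.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    # Full-text systems store the *gaps* between sorted ids and compress
    # them (e.g., with Elias or Golomb codes) to shrink the inverted file.
    return {term: sorted(ids) for term, ids in index.items()}

docs = ["full text retrieval", "text databases on CD-ROM", "bitmaps and indexes"]
print(build_inverted_index(docs)["text"])   # [0, 1]
```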
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mercier, C.W.
The Network File System (NFS) will be the user interface to a High-Performance Data System (HPDS) being developed at Los Alamos National Laboratory (LANL). HPDS will manage high-capacity, high-performance storage systems connected directly to a high-speed network from distributed workstations. NFS will be modified to maximize performance and to manage massive amounts of data. 6 refs., 3 figs.
A data storage, retrieval and analysis system for endocrine research. [for Skylab
NASA Technical Reports Server (NTRS)
Newton, L. E.; Johnston, D. A.
1975-01-01
This retrieval system builds, updates, retrieves, and performs basic statistical analyses on blood, urine, and diet parameters for the M071 and M073 Skylab and Apollo experiments. This system permits data entry from cards to build an indexed sequential file. Programs are easily modified for specialized analyses.
76 FR 39869 - Lee 8 Storage Partnership; Notice of Motion for Extension of Rate Case Filing Deadline
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-07
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR09-5-003] Lee 8 Storage..., Lee 8 Storage Partnership (Lee 8) filed a request for an extension consistent with the Commission's... extend the cycle for such reviews from three to five years.\\1\\ Therefore, Lee 8 requests that the date...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-11-26
... the NRC's E-Filing system does not support unlisted software, and the NRC Meta System Help Desk will... Osmosis (RO) system borated water storage tank suction connections. Basis for proposed no significant... requirement. For the SFP, the suction to the RO system is above the required TS water level, therefore, the...
Federal Register 2010, 2011, 2012, 2013, 2014
2010-11-30
... a level of > 22 inches water column in support of SGIG system operation. Exelon is submitting this... site, but should note that the NRC's E-Filing system does not support unlisted software, and the NRC... EDGs and the associated support systems, such as the fuel oil storage and transfer systems, are...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-04-16
...-Filing system does not support unlisted software, and the NRC Meta System Help Desk will not be able to... reverse osmosis system during normal plant operation to purify the water in the borated water storage... result of water returned from the RO System with lower boron concentration. Thus, no adverse effects from...
Active Management of Integrated Geothermal-CO2 Storage Reservoirs in Sedimentary Formations
Buscheck, Thomas A.
2012-01-01
Active Management of Integrated Geothermal-CO2 Storage Reservoirs in Sedimentary Formations: An Approach to Improve Energy Recovery and Mitigate Risk: FY1 Final Report. The purpose of Phase 1 is to determine the feasibility of integrating geologic CO2 storage (GCS) with geothermal energy production. Phase 1 includes reservoir analyses to determine injector/producer well schemes that balance the generation of economically useful flow rates at the producers with the need to manage reservoir overpressure to reduce the risks associated with overpressure, such as induced seismicity and CO2 leakage to overlying aquifers. This submittal contains input and output files of the reservoir model analyses. A reservoir-model index HTML file was sent in a previous submittal to organize the reservoir-model input and output files according to the sections of the FY1 Final Report to which they pertain. The recipient should save the file Reservoir-models-inputs-outputs-index.html in the same directory in which the Section2.1.*.tar.gz files are saved.
Cross-Matching of Very Large Catalogs
NASA Astrophysics Data System (ADS)
Martynov, M. V.; Bodryagin, D. V.
Modern astronomical catalogs and sky surveys, which contain billions of objects, belong to the class of "big data". Existing services have limited functionality and do not include all required and available catalogs. The software package ACrId (Astronomical Cross Identification) for cross-matching large astronomical catalogs, which uses the sphere-pixelation algorithm HEALPix, the ReiserFS file system, and JSON-type text files for storage, has been developed at the Research Institution "Mykolaiv Astronomical Observatory".
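ACrId's code is not given in the abstract, but the pixel-bucketing idea can be sketched with the third-party healpy package: bucket sources by HEALPix pixel, then compare only within a pixel and its neighbours. The NSIDE value is an assumption tied to the match radius.

```python
from collections import defaultdict
import healpy as hp   # pip install healpy
import numpy as np

NSIDE = 4096   # pixel scale chosen near the match radius; an assumption

def bucket(coords):
    """coords: Nx2 array of (ra_deg, dec_deg). Returns pixel -> row indices."""
    pix = hp.ang2pix(NSIDE, coords[:, 0], coords[:, 1], lonlat=True)
    buckets = defaultdict(list)
    for row, p in enumerate(pix):
        buckets[int(p)].append(row)
    return buckets

def candidates(buckets_b, pix):
    """Rows of catalog B in a pixel and its neighbours; only these need
    a full angular-distance test against catalog A sources in that pixel."""
    cells = [pix] + [int(n) for n in hp.get_all_neighbours(NSIDE, pix) if n >= 0]
    return [j for c in cells for j in buckets_b.get(c, ())]
```

Bucketing reduces an all-pairs comparison over billions of objects to comparisons within small, fixed neighbourhoods on the sphere.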
Lessons Learned in Deploying the World's Largest Scale Lustre File System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dillow, David A; Fuller, Douglas; Wang, Feiyi
2010-01-01
The Spider system at the Oak Ridge National Laboratory's Leadership Computing Facility (OLCF) is the world's largest scale Lustre parallel file system. Envisioned as a shared parallel file system capable of delivering both the bandwidth and capacity requirements of the OLCF's diverse computational environment, the project had a number of ambitious goals. To support the workloads of the OLCF's diverse computational platforms, the aggregate performance and storage capacity of Spider exceed those of our previously deployed systems by factors of 6x (240 GB/sec) and 17x (10 petabytes), respectively. Furthermore, Spider supports over 26,000 clients concurrently accessing the file system, which exceeds our previously deployed systems by nearly 4x. In addition to these scalability challenges, moving to a center-wide shared file system required dramatically improved resiliency and fault-tolerance mechanisms. This paper details our efforts in designing, deploying, and operating Spider. Through a phased approach of research and development, prototyping, deployment, and transition to operations, this work has resulted in a number of insights into large-scale parallel file system architectures, from both the design and the operational perspectives. We present in this paper our solutions to issues such as network congestion, performance baselining and evaluation, file system journaling overheads, and high availability in a system with tens of thousands of components. We also discuss areas of continued challenges, such as stressed metadata performance and the need for file system quality of service, along with our efforts to address them. Finally, operational aspects of managing a system of this scale are discussed along with real-world data and observations.
ERIC Educational Resources Information Center
Illinois Univ., Urbana. Coordinated Science Lab.
In contrast to conventional information storage and retrieval systems in which a body of knowledge is thought of as an indexed codex of documents to which access is obtained by an appropriately indexed query, this interdisciplinary study aims at an understanding of what is "knowledge" as distinct from a "data file," how this knowledge is acquired,…
75 FR 10474 - Privacy Act of 1974; Systems of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2010-03-08
... storage media. RETRIEVABILITY: Retrieved by last name and Social Security Number (SSN). SAFEGUARDS... proposed action will be effective without further notice on April 7, 2010 unless comments are received... RECORDS IN THE SYSTEM: Delete entry and replace with ``The files contain full name, grade, Social Security...
Cloud Based Drive Forensic and DDoS Analysis on Seafile as Case Study
NASA Astrophysics Data System (ADS)
Bahaweres, R. B.; Santo, N. B.; Ningsih, A. S.
2017-01-01
The rapid development of the Internet, driven by increasing data rates on both broadband cable networks and 4G mobile wireless, makes it easy for everyone to connect. Storage as a Service (StaaS) is increasingly popular: many users want to store their data in one place so that they can access it easily anywhere, any place, and anytime in the cloud. The availability of such services also makes them vulnerable to misuse, whether to commit a crime or to mount a Denial of Service (DoS) attack on cloud storage services; criminals can use cloud storage services to store, upload, and download illegal files or documents. In this study, we implement a private cloud storage service using Seafile on a Raspberry Pi and perform simulations in Local Area Network and Wi-Fi environments to analyze, forensically, whether a criminal act can be traced and proven. We also identify, collect, and analyze artifacts of the server and client, such as the desktop client's registry entries, the file system, Seafile logs, the browser cache, and database evidence.
Accounting Data to Web Interface Using PERL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hargeaves, C
2001-08-13
This document will explain the process to create a web interface for the accounting information generated by the High Performance Storage Systems (HPSS) accounting report feature. The accounting report contains useful data but it is not easily accessed in a meaningful way. The accounting report is the only way to see summarized storage usage information. The first step is to take the accounting data, make it meaningful and store the modified data in persistent databases. The second step is to generate the various user interfaces, HTML pages, that will be used to access the data. The third step is to transfer all required files to the web server. The web pages pass parameters to Common Gateway Interface (CGI) scripts that generate dynamic web pages and graphs. The end result is a web page with specific information presented in text with or without graphs. The accounting report has a specific format that allows the use of regular expressions to verify if a line is storage data. Each storage data line is stored in a detailed database file with a name that includes the run date. The detailed database is used to create a summarized database file that also uses run date in its name. The summarized database is used to create the group.html web page that includes a list of all storage users. Scripts that query the database folder to build a list of available databases generate two additional web pages. A master script that is run monthly as part of a cron job, after the accounting report has completed, manages all of these individual scripts. All scripts are written in the PERL programming language. Whenever possible data manipulation scripts are written as filters. All scripts are written to be single source, which means they will function properly on both the open and closed networks at LLNL. The master script handles the command line inputs for all scripts, file transfers to the web server and records run information in a log file. The rest of the scripts manipulate the accounting data or use the files created to generate HTML pages. Each script will be described in detail herein. The following is a brief description of HPSS taken directly from an HPSS web site. "HPSS is a major development project, which began in 1993 as a Cooperative Research and Development Agreement (CRADA) between government and industry. The primary objective of HPSS is to move very large data objects between high performance computers, workstation clusters, and storage libraries at speeds many times faster than is possible with today's software systems. For example, HPSS can manage parallel data transfers from multiple network-connected disk arrays at rates greater than 1 Gbyte per second, making it possible to access high definition digitized video in real time." The HPSS accounting report is a canned report whose format is controlled by the HPSS developers.
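The PERL sources are not included in the abstract; here is a sketch of the filter stage in Python (the language used for the other sketches in this collection), with the accounting-line regex and field layout assumed for illustration.

```python
import csv
import re
import sys

# Assumed layout: account id, storage class, file count, bytes stored.
STORAGE_LINE = re.compile(r"^\s*(\w+)\s+(\w+)\s+(\d+)\s+(\d+)\s*$")

def filter_report(lines):
    """Yield only the storage-data rows from a raw accounting report."""
    for line in lines:
        m = STORAGE_LINE.match(line)
        if m:
            acct, klass, nfiles, nbytes = m.groups()
            yield acct, klass, int(nfiles), int(nbytes)

if __name__ == "__main__":
    # Filter-style use, as the PERL scripts do: report in, database rows out.
    csv.writer(sys.stdout).writerows(filter_report(sys.stdin))
```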
Mass storage system reference model, Version 4
NASA Technical Reports Server (NTRS)
Coleman, Sam (Editor); Miller, Steve (Editor)
1993-01-01
The high-level abstractions that underlie modern storage systems are identified. The information to generate the model was collected from major practitioners who have built and operated large storage facilities, and represents a distillation of the wisdom they have acquired over the years. The model provides a common terminology and set of concepts to allow existing systems to be examined and new systems to be discussed and built. It is intended that the model and the interfaces identified from it will allow and encourage vendors to develop mutually compatible storage components that can be combined to form integrated storage systems and services. The reference model presents an abstract view of the concepts and organization of storage systems. From this abstraction will come the identification of the interfaces and modules that will be used in IEEE storage system standards. The model is not yet suitable as a standard: it does not contain implementation decisions, such as how abstract objects should be broken up into software modules or how software modules should be mapped to hosts; it does not give policy specifications, such as when files should be migrated; it does not describe how the abstract objects should be used or connected; and it does not refer to specific hardware components. In particular, it does not fully specify the interfaces.
Comparative case study between D3 and Highcharts on Lustre data visualization
NASA Astrophysics Data System (ADS)
ElTayeby, Omar; John, Dwayne; Patel, Pragnesh; Simmerman, Scott
2013-12-01
One of the challenging tasks in visual analytics is targeting clustered time-series data sets, since data analysts must discover patterns that change over time while keeping their focus on particular subsets. To leverage the human ability to quickly perceive these patterns visually, multivariate features should be implemented according to the available attributes. A comparative case study was done using two JavaScript libraries to demonstrate the differences in their capabilities: a web-based application for systems administrators and operations teams to monitor the Lustre file system was developed with both D3 and Highcharts. Lustre file systems are responsible for managing Remote Procedure Calls (RPCs), which include input/output (I/O) requests between clients and Object Storage Targets (OSTs). The objective of the application is to provide time-series visuals of these calls and of users' storage patterns on Kraken, a University of Tennessee High Performance Computing (HPC) resource at Oak Ridge National Laboratory (ORNL).
Reducing I/O variability using dynamic I/O path characterization in petascale storage systems
Son, Seung Woo; Sehrish, Saba; Liao, Wei-keng; ...
2016-11-01
In petascale systems with a million CPU cores, scalable and consistent I/O performance is becoming increasingly difficult to sustain, mainly because of I/O variability. This variability is caused by concurrently running processes/jobs competing for I/O, or by a RAID rebuild when a disk drive fails. We present a mechanism that stripes across a selected subset of I/O nodes with the lightest workload at runtime to achieve the highest I/O bandwidth available in the system. In this paper, we propose a probing mechanism to enable application-level dynamic file striping to mitigate I/O variability. We also implement the proposed mechanism in the high-level I/O library that enables memory-to-file data layout transformation and allows transparent file partitioning using subfiling. Subfiling is a technique that partitions data into a set of smaller files and manages access to them, so that the data can be treated as a single, normal file by users. We demonstrate that our bandwidth probing mechanism can successfully identify temporally slower I/O nodes without noticeable runtime overhead. Experimental results on NERSC's systems also show that our approach isolates I/O variability effectively on shared systems and improves overall collective I/O performance with less variation.
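The node-selection step can be pictured in a few lines: given probed per-node bandwidths, stripe across the k least-loaded I/O nodes. The probing itself and the MPI-IO plumbing are out of scope, and all names below are illustrative.

```python
def pick_stripe_nodes(probed_bw_mbps, k):
    """Choose the k I/O nodes with the highest probed bandwidth."""
    ranked = sorted(probed_bw_mbps, key=probed_bw_mbps.get, reverse=True)
    return ranked[:k]

def assign_subfiles(n_blocks, nodes):
    """Round-robin data blocks over the chosen nodes (one subfile each)."""
    layout = {node: [] for node in nodes}
    for b in range(n_blocks):
        layout[nodes[b % len(nodes)]].append(b)
    return layout

probes = {"ion0": 310.0, "ion1": 95.5, "ion2": 280.2, "ion3": 150.0}
print(assign_subfiles(8, pick_stripe_nodes(probes, 2)))
# {'ion0': [0, 2, 4, 6], 'ion2': [1, 3, 5, 7]}
```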
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ruwart, T M; Eldel, A
2000-01-01
The primary objectives of this project were to evaluate the performance of the SGI CXFS File System in a Storage Area Network (SAN) and compare/contrast it to the performance of a locally attached XFS file system on the same computer and storage subsystems. The University of Minnesota participants were asked to verify that the performance of the SAN/CXFS configuration did not fall below 85% of the performance of the XFS local configuration. There were two basic hardware test configurations constructed from the following equipment: two Onyx 2 computer systems, each with two Qlogic-based Fibre Channel/XIO Host Bus Adapters (HBAs); one 8-port Brocade Silkworm 2400 Fibre Channel switch; and four Ciprico RF7000 RAID disk arrays populated with Seagate Barracuda 50GB disk drives. The operating system on each of the Onyx 2 computer systems was IRIX 6.5.6. The first hardware configuration consisted of directly connecting the Ciprico arrays to the Qlogic controllers without the Brocade switch. The purpose of this configuration was to establish baseline performance data on the raw Qlogic controller/Ciprico disk subsystem. This baseline performance data would then be used to demonstrate any performance differences arising from the addition of the Brocade Fibre Channel switch. Furthermore, the performance of the Qlogic controllers could be compared to that of the older, Adaptec-based XIO dual-channel Fibre Channel adapters previously used on these systems. It should be noted that only raw device tests were performed on this configuration; no file system testing was performed. The second hardware configuration introduced the Brocade Fibre Channel switch. Two FC ports from each of the Onyx 2 computer systems were attached to four ports of the switch, and the four Ciprico arrays were attached to the remaining four. Raw disk subsystem tests were performed on the SAN configuration to demonstrate the performance differences between the direct-connect and switched configurations. After this testing was completed, the Ciprico arrays were formatted with an XFS file system and performance numbers were gathered to establish a file system performance baseline. Finally, the disks were formatted with CXFS and further tests were run to demonstrate the performance of the CXFS file system. A summary of the results of these tests is given.
Storage system architectures and their characteristics
NASA Technical Reports Server (NTRS)
Sarandrea, Bryan M.
1993-01-01
Not all users' storage requirements call for 20 MB/s data transfer rates, multi-tier file or data migration schemes, or even automated retrieval of data. The number of available storage solutions reflects the broad range of user requirements. It is foolish to think that any one solution can address the complete range of requirements. For users with simple off-line storage requirements, the cost and complexity of high-end solutions would provide no advantage over a more simple solution. The correct answer is to match the requirements of a particular storage need to the various attributes of the available solutions. The goal of this paper is to introduce basic concepts of archiving and storage management in combination with the most common architectures, and to provide some insight into how these concepts and architectures address various storage problems. The intent is to provide potential consumers of storage technology with a framework within which to begin the hunt for a solution which meets their particular needs. This paper is not intended to be an exhaustive study or to address all possible solutions or new technologies, but is intended to be a more practical treatment of today's storage system alternatives. Since most commercial storage systems today are built on Open Systems concepts, the majority of these solutions are hosted on the UNIX operating system. For this reason, some of the architectural issues discussed focus on specific UNIX architectural concepts. However, most of the architectures are operating system independent and the conclusions are applicable to such architectures on any operating system.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-12-14
... level be raised to support SSF RC Makeup System operability. Thus, the SFP water level will not be..., but should note that the NRC's E-Filing system does not support unlisted software, and the NRC Meta... a reverse osmosis system during normal plant operation to remove silica from borated water storage...
A STORAGE AND RETRIEVAL SYSTEM FOR DOCUMENTS IN INSTRUCTIONAL RESOURCES. REPORT NO. 13.
ERIC Educational Resources Information Center
DIAMOND, ROBERT M.; LEE, BERTA GRATTAN
IN ORDER TO IMPROVE INSTRUCTION WITHIN TWO-YEAR LOWER DIVISION COURSES, A COMPREHENSIVE RESOURCE LIBRARY WAS DEVELOPED AND A SIMPLIFIED CATALOGING AND INFORMATION RETRIEVAL SYSTEM WAS APPLIED TO IT. THE ROYAL MCBEE "KEYDEX" SYSTEM, CONTAINING THREE MAJOR COMPONENTS--A PUNCH MACHINE, FILE CARDS, AND A LIGHT BOX--WAS USED. CARDS WERE HEADED WITH KEY…
Cache-enabled small cell networks: modeling and tradeoffs.
Baştuǧ, Ejder; Bennis, Mehdi; Kountouris, Marios; Debbah, Mérouane
We consider a network model where small base stations (SBSs) have caching capabilities as a means to alleviate the backhaul load and satisfy users' demand. The SBSs are stochastically distributed over the plane according to a Poisson point process (PPP) and serve their users either (i) by bringing the content from the Internet through a finite rate backhaul or (ii) by serving them from the local caches. We derive closed-form expressions for the outage probability and the average delivery rate as a function of the signal-to-interference-plus-noise ratio (SINR), SBS density, target file bitrate, storage size, file length, and file popularity. We then analyze the impact of key operating parameters on the system performance. It is shown that a certain outage probability can be achieved either by increasing the number of base stations or the total storage size. Our results and analysis provide key insights into the deployment of cache-enabled small cell networks (SCNs), which are seen as a promising solution for future heterogeneous cellular networks.
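The paper's closed forms (outage probability and delivery rate under a Poisson point process) are not reproduced here. The sketch below illustrates only the storage-size/popularity tradeoff the abstract highlights: the probability that a request is served from a cache holding the M most popular of N files under a Zipf popularity law, a standard assumption in the caching literature.

```python
def zipf_cache_hit(N, M, alpha):
    """P(requested file is among the M most popular of N Zipf(alpha) files)."""
    weights = [1.0 / r ** alpha for r in range(1, N + 1)]
    return sum(weights[:M]) / sum(weights)

# Doubling the cache raises the hit probability only sub-linearly:
for M in (10, 20, 40):
    print(M, round(zipf_cache_hit(N=1000, M=M, alpha=0.8), 3))
```

This mirrors the abstract's observation that a target outage level can be traded between base-station density and total storage size.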
NASA Technical Reports Server (NTRS)
Ryan, J. W.; Ma, C.; Schupler, B. R.
1980-01-01
A data base handler which would act to tie Mark 3 system programs together is discussed. The data base handler is written in FORTRAN and is implemented on the Hewlett-Packard 21MX and the IBM 360/91. The system design objectives were to (1) provide for an easily specified method of data interchange among programs, (2) provide for a high level of data integrity, (3) accommodate changing requirements, (4) promote program accountability, (5) provide a single source of program constants, and (6) provide a central point for data archiving. The system consists of two distinct parts: a set of files existing on disk packs and tapes; and a set of utility subroutines which allow users to access the information in these files. Users never directly read or write the files and need not know the details of how the data are formatted in the files. To the users, the storage medium is format free. A user does need to know something about the sequencing of his data in the files but nothing about data in which he has no interest.
Solar heating and hot water system installed at Arlington Raquetball Club, Arlington, Virginia
NASA Technical Reports Server (NTRS)
1981-01-01
A solar space and water heating system is described. The solar energy system consists of 2,520 sq. ft. of flat plate solar collectors and a 4,000 gallon solar storage tank. The transfer medium in the forced closed loop is a nontoxic antifreeze solution (50 percent water, 50 percent propylene glycol). The service hot water system consists of a preheat coil (60 ft. of 1 1/4 in copper tubing) located in the upper third of the solar storage tank and a recirculation loop between the preheat coil and the existing electric water heaters. The space heating system consists of two separate water to air heat exchangers located in the ducts of the existing space heating/cooling systems. The heating water is supplied from the solar storage tank. Extracts from site files, specification references for solar modifications to existing building heating and hot water systems, and installation, operation and maintenance instructions are included.
Dragly, Svenn-Arne; Hobbi Mobarhan, Milad; Lepperød, Mikkel E.; Tennøe, Simen; Fyhn, Marianne; Hafting, Torkel; Malthe-Sørenssen, Anders
2018-01-01
Natural sciences generate an increasing amount of data in a wide range of formats developed by different research groups and commercial companies. At the same time there is a growing desire to share data along with publications in order to enable reproducible research. Open formats have publicly available specifications which facilitate data sharing and reproducible research. Hierarchical Data Format 5 (HDF5) is a popular open format widely used in neuroscience, often as a foundation for other, more specialized formats. However, drawbacks related to HDF5's complex specification have initiated a discussion for an improved replacement. We propose a novel alternative, the Experimental Directory Structure (Exdir), an open specification for data storage in experimental pipelines which amends drawbacks associated with HDF5 while retaining its advantages. HDF5 stores data and metadata in a hierarchy within a complex binary file which, among other things, is not human-readable, not optimal for version control systems, and lacks support for easy access to raw data from external applications. Exdir, on the other hand, uses file system directories to represent the hierarchy, with metadata stored in human-readable YAML files, datasets stored in binary NumPy files, and raw data stored directly in subdirectories. Furthermore, storing data in multiple files makes it easier to track for version control systems. Exdir is not a file format in itself, but a specification for organizing files in a directory structure. Exdir uses the same abstractions as HDF5 and is compatible with the HDF5 Abstract Data Model. Several research groups are already using data stored in a directory hierarchy as an alternative to HDF5, but no common standard exists. This complicates and limits the opportunity for data sharing and development of common tools for reading, writing, and analyzing data. Exdir facilitates improved data storage, data sharing, reproducible research, and novel insight from interdisciplinary collaboration. With the publication of Exdir, we invite the scientific community to join the development to create an open specification that will serve as many needs as possible and as a foundation for open access to and exchange of data. PMID:29706879
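The on-disk layout the abstract describes (directories for hierarchy, YAML for metadata, NumPy files for datasets) can be sketched with the standard library plus NumPy and PyYAML. The file names `data.npy` and `attributes.yaml` below are assumptions consistent with the description above; this is an illustration, not the Exdir reference implementation.

```python
import os
import numpy as np
import yaml   # pip install pyyaml

def create_dataset(group_dir, name, array, attrs=None):
    """Write one dataset the Exdir way: a directory holding a binary
    NumPy file plus human-readable YAML metadata."""
    dset = os.path.join(group_dir, name)
    os.makedirs(dset, exist_ok=True)
    np.save(os.path.join(dset, "data.npy"), array)
    with open(os.path.join(dset, "attributes.yaml"), "w") as f:
        yaml.safe_dump(attrs or {}, f)

create_dataset("experiment1/session1", "lfp",
               np.zeros((4, 1000)), {"unit": "uV", "fs_hz": 1000})
```

Because every object is an ordinary directory or file, version control systems and external tools can track and read the data without a special library.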
Fifth NASA Goddard Conference on Mass Storage Systems and Technologies. Volume 2
NASA Technical Reports Server (NTRS)
Kobler, Benjamin (Editor); Hariharan, P. C. (Editor)
1996-01-01
This document contains copies of those technical papers received in time for publication prior to the Fifth Goddard Conference on Mass Storage Systems and Technologies held September 17 - 19, 1996, at the University of Maryland, University Conference Center in College Park, Maryland. As one of an ongoing series, this conference continues to serve as a unique medium for the exchange of information on topics relating to the ingestion and management of substantial amounts of data and the attendant problems involved. This year's discussion topics include storage architecture, database management, data distribution, file system performance and modeling, and optical recording technology. There will also be a paper on Application Programming Interfaces (API) for a Physical Volume Repository (PVR) defined in Version 5 of the Institute of Electrical and Electronics Engineers (IEEE) Reference Model (RM). In addition, there are papers on specific archives and storage products.
NASA Technical Reports Server (NTRS)
Poole, L. R.
1974-01-01
A study was conducted of an alternate method for storage and use of bathymetry data in the Langley Research Center and Virginia Institute of Marine Science mid-Atlantic continental-shelf wave-refraction computer program. The regional bathymetry array was divided into 105 indexed modules which can be read individually into memory in a nonsequential manner from a peripheral file using special random-access subroutines. In running a sample refraction case, a 75-percent decrease in program field length was achieved by using the random-access storage method in comparison with the conventional method of total regional array storage. This field-length decrease was accompanied by a comparative 5-percent increase in central processing time and a 477-percent increase in the number of operating-system calls. A comparative Langley Research Center computer system cost savings of 68 percent was achieved by using the random-access storage method.
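The module scheme can be sketched as fixed-size tiles stored in one binary file, with a tile loaded only when a depth lookup lands in it. Tile size and dtype below are assumptions, not the values used in the Langley program.

```python
import numpy as np

TILE = 64   # grid points per module edge (assumed)

class TiledBathymetry:
    """Random access to depth modules stored in one flat float32 file."""
    def __init__(self, path, tiles_x):
        self.f = open(path, "rb")
        self.tiles_x = tiles_x      # modules per row of the region
        self.cache = {}             # module coordinates -> 2-D array

    def depth(self, ix, iy):
        t = (ix // TILE, iy // TILE)
        if t not in self.cache:                   # read module on demand
            index = t[1] * self.tiles_x + t[0]
            self.f.seek(index * TILE * TILE * 4)  # 4 bytes per float32
            raw = np.fromfile(self.f, dtype=np.float32, count=TILE * TILE)
            self.cache[t] = raw.reshape(TILE, TILE)
        return self.cache[t][iy % TILE, ix % TILE]
```

Memory holds only the modules a refraction ray actually crosses, which is the source of the 75-percent field-length reduction the study reports.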
First Experiences with CMS Data Storage on the GEMSS System at the INFN-CNAF Tier-1
NASA Astrophysics Data System (ADS)
Andreotti, D.; Bonacorsi, D.; Cavalli, A.; Pra, S. Dal; Dell'Agnello, L.; Forti, Alberto; Grandi, C.; Gregori, D.; Gioi, L. Li; Martelli, B.; Prosperini, A.; Ricci, P. P.; Ronchieri, Elisabetta; Sapunenko, V.; Sartirana, A.; Vagnoni, V.; Zappi, Riccardo
A brand new Mass Storage System solution called "Grid-Enabled Mass Storage System" (GEMSS), based on the Storage Resource Manager (StoRM) developed by INFN, on the General Parallel File System by IBM, and on the Tivoli Storage Manager by IBM, has been tested and deployed at the INFN-CNAF Tier-1 Computing Centre in Italy. After a successful stress test phase, the solution is now being used in production for the data custodiality of the CMS experiment at CNAF. All data previously recorded on the CASTOR system have been transferred to GEMSS. As final validation of the GEMSS system, some of the computing tests done in the context of the WLCG "Scale Test for the Experiment Program" (STEP'09) challenge were repeated in September-October 2009 and compared with the results previously obtained with CASTOR in June 2009. In this paper, the GEMSS system basics, the stress test activity, and the deployment phase, as well as the reliability and performance of the system, are overviewed. The experiences in the use of GEMSS at CNAF in preparing for the first months of data taking of the CMS experiment at the Large Hadron Collider are also presented.
ERIC Educational Resources Information Center
Beiser, Karl
1986-01-01
Describes a product--BiblioFile, Library Corporation's catalog production system--and a service--reproduction of public domain software on CD-ROM for sale to those interested--which revolve around the ultra-high-density storage capacity of CD-ROM discs. Criteria for selecting microcomputers are briefly reviewed. (MBR)
Peregrine System | High-Performance Computing | NREL
Peregrine provides short-term and longer-term (/projects) storage; these file systems are mounted on all nodes. Compute nodes mount the /home, /nopt, and /projects file systems and are built around Intel Xeon E5-2670 "Sandy Bridge" processors (8 cores per processor) with 64 GB of memory per node. (Hardware table: cores/node, memory/node, and peak double-precision performance per node.)
Use of Schema on Read in Earth Science Data Archives
NASA Technical Reports Server (NTRS)
Hegde, Mahabaleshwara; Smit, Christine; Pilone, Paul; Petrenko, Maksym; Pham, Long
2017-01-01
Traditionally, NASA Earth Science data archives have file-based storage using proprietary data file formats, such as HDF and HDF-EOS, which are optimized to support fast and efficient storage of spaceborne and model data as they are generated. The use of file-based storage essentially imposes an indexing strategy based on data dimensions. In most cases, NASA Earth Science data uses time as the primary index, leading to poor performance in accessing data in spatial dimensions. For example, producing a time series for a single spatial grid cell involves accessing a large number of data files. With exponential growth in data volume due to the ever-increasing spatial and temporal resolution of the data, using file-based archives poses significant performance and cost barriers to data discovery and access. Storing and disseminating data in proprietary data formats imposes an additional access barrier for users outside the mainstream research community. At the NASA Goddard Earth Sciences Data Information Services Center (GES DISC), we have evaluated applying the schema-on-read principle to data access and distribution. We used Apache Parquet to store geospatial data, and have exposed data through Amazon Web Services (AWS) Athena, AWS Simple Storage Service (S3), and Apache Spark. Using the schema-on-read approach allows customization of indexing spatially or temporally to suit the data access pattern. The storage of data in open formats such as Apache Parquet has widespread support in popular programming languages. A wide range of solutions for handling big data lowers the access barrier for all users. This presentation will discuss formats used for data storage, frameworks with support for schema-on-read used for data access, and common use cases covering data usage patterns seen in a geospatial data archive.
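A minimal sketch of the schema-on-read idea with pandas and Parquet: the same rows can be partitioned by space or by time at write time, and any Parquet-capable engine imposes its own access pattern later. Column names and bin sizes are illustrative, not the GES DISC layout.

```python
import pandas as pd   # with pyarrow installed for the Parquet engine

df = pd.DataFrame({
    "time": pd.to_datetime(["2017-01-01", "2017-01-01", "2017-01-02"]),
    "lat_bin": [10, 20, 10],    # coarse spatial bins; illustrative columns
    "lon_bin": [40, 40, 50],
    "value": [0.1, 0.7, 0.3],
})
df["date"] = df["time"].dt.date.astype(str)

# Same rows, two layouts: partition by space for time series at a grid
# cell, or by date for global snapshots. Readers choose their own schema.
df.to_parquet("by_space", partition_cols=["lat_bin", "lon_bin"])
df.to_parquet("by_time", partition_cols=["date"])

# Read back a single grid cell, pushing the filter into the scan:
cell = pd.read_parquet("by_space", filters=[("lat_bin", "=", 10)])
```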
Program for Experimentation With Expert Systems
NASA Technical Reports Server (NTRS)
Engle, S. W.
1986-01-01
CERBERUS is forward-chaining, knowledge-based system program useful for experimentation with expert systems. Inference-engine mechanism performs deductions according to user-supplied rule set. Information stored in intermediate area, and user interrogated only when no applicable data found in storage. Each assertion posed by CERBERUS answered with certainty ranging from 0 to 100 percent. Rule processor stops investigating applicable rules when goal reaches certainty of 95 percent or higher. Capable of operating for wide variety of domains. Sample rule files included for animal identification, pixel classification in image processing, and rudimentary car repair for novice mechanic. User supplies set of end goals or actions. System complexity decided by user's rule file. CERBERUS written in FORTRAN 77.
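The mechanism described, forward chaining over a user-supplied rule set with certainties from 0 to 100 and a 95 percent stopping threshold, can be sketched in a few lines. The rules below are invented for illustration and are not taken from the program's sample files.

```python
# Toy forward-chaining sketch in the spirit of the CERBERUS description:
# rules carry certainties in [0, 100]; the engine stops investigating
# rules for a goal once its certainty reaches 95 or higher.
RULES = [
    ({"has_fur", "says_meow"}, "is_cat", 90),
    ({"has_fur", "purrs"}, "is_cat", 97),
]

def infer(facts, goal, threshold=95):
    certainty = 0.0
    for premises, conclusion, conf in RULES:
        if conclusion == goal and premises <= facts:
            # Combine evidence in the classic certainty-factor way.
            certainty = certainty + conf * (100 - certainty) / 100
            if certainty >= threshold:
                break   # goal established; stop investigating rules
    return certainty

print(infer({"has_fur", "purrs"}, "is_cat"))   # 97.0 -> accepted
```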
Evaluating the effect of online data compression on the disk cache of a mass storage system
NASA Technical Reports Server (NTRS)
Pentakalos, Odysseas I.; Yesha, Yelena
1994-01-01
A trace driven simulation of the disk cache of a mass storage system was used to evaluate the effect of an online compression algorithm on various performance measures. Traces from the system at NASA's Center for Computational Sciences were used to run the simulation and disk cache hit ratios, number of files and bytes migrating to tertiary storage were measured. The measurements were performed for both an LRU and a size based migration algorithm. In addition to seeing the effect of online data compression on the disk cache performance measure, the simulation provided insight into the characteristics of the interactive references, suggesting that hint based prefetching algorithms are the only alternative for any future improvements to the disk cache hit ratio.
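A trace-driven cache simulation of the kind used in this study reduces, at its core, to replaying file references against an eviction policy and counting hits. The sketch below implements only the LRU variant with a made-up trace format (file id, size); the size-based migration algorithm and the compression step are omitted.

```python
# Minimal trace-driven sketch of an LRU disk cache in front of tertiary
# storage, measuring the hit ratio over a reference trace.
from collections import OrderedDict

def lru_hit_ratio(trace, capacity):
    cache, used, hits = OrderedDict(), 0, 0
    for fid, size in trace:
        if fid in cache:
            hits += 1
            cache.move_to_end(fid)                 # refresh recency
            continue
        while used + size > capacity and cache:
            _, evicted = cache.popitem(last=False) # evict LRU file
            used -= evicted
        cache[fid] = size
        used += size
    return hits / len(trace)

trace = [("a", 40), ("b", 30), ("a", 40), ("c", 50), ("a", 40)]
print(lru_hit_ratio(trace, capacity=100))          # 0.4 on this toy trace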
75 FR 32932 - Combined Notice of Filings No. 2
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-10
..., 2010. Take notice that the Commission has received the following Natural Gas Pipeline Rate and Refund Report filings: Docket Numbers: RP09-260-005. Applicants: Tres Palacios Gas Storage LLC. Description: Tres Palacios Gas Storage LLC submits Second Substitute First Revised Sheet 138 to FERC Gas Tariff...
Incorporating Brokers within Collaboration Environments
NASA Astrophysics Data System (ADS)
Rajasekar, A.; Moore, R.; de Torcy, A.
2013-12-01
A collaboration environment, such as the integrated Rule Oriented Data System (iRODS - http://irods.diceresearch.org), provides interoperability mechanisms for accessing storage systems, authentication systems, messaging systems, information catalogs, networks, and policy engines from a wide variety of clients. The interoperability mechanisms function as brokers, translating actions requested by clients to the protocol required by a specific technology. The iRODS data grid is used to enable collaborative research within hydrology, seismology, earth science, climate, oceanography, plant biology, astronomy, physics, and genomics disciplines. Although each domain has unique resources, data formats, semantics, and protocols, the iRODS system provides a generic framework that is capable of managing collaborative research initiatives that span multiple disciplines. Each interoperability mechanism (broker) is linked to a name space that enables unified access across the heterogeneous systems. The collaboration environment provides not only support for brokers, but also support for virtualization of name spaces for users, files, collections, storage systems, metadata, and policies. The broker enables access to data or information in a remote system using the appropriate protocol, while the collaboration environment provides a uniform naming convention for accessing and manipulating each object. Within the NSF DataNet Federation Consortium project (http://www.datafed.org), three basic types of interoperability mechanisms have been identified and applied: 1) drivers for managing manipulation at the remote resource (such as data subsetting), 2) micro-services that execute the protocol required by the remote resource, and 3) policies for controlling the execution. For example, drivers have been written for manipulating NetCDF and HDF formatted files within THREDDS servers. Micro-services have been written that manage interactions with the CUAHSI data repository, the DataONE information catalog, and the GeoBrain broker. Policies have been written that manage transfer of messages between an iRODS message queue and the Advanced Message Queuing Protocol. Examples of these brokering mechanisms will be presented. The DFC collaboration environment serves as the intermediary between community resources and compute grids, enabling reproducible data-driven research. It is possible to create an analysis workflow that retrieves data subsets from a remote server, assembles the required input files, automates the execution of the workflow, automatically tracks its provenance, and shares the input files, workflow, and output files. A collaborator can re-execute a shared workflow, compare results, change input files, and re-execute an analysis.
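The broker pattern described here, a thin interface that translates a generic client action into the protocol of one specific remote resource, can be sketched abstractly. The class and method names below are hypothetical, not iRODS APIs.

```python
# Illustrative sketch of a broker: one generic operation, many
# protocol-specific drivers behind it.
from abc import ABC, abstractmethod

class Broker(ABC):
    @abstractmethod
    def fetch_subset(self, dataset: str, region: dict) -> bytes:
        """Translate a generic subset request into the remote protocol."""

class ThreddsNetcdfBroker(Broker):
    def fetch_subset(self, dataset, region):
        # A real driver would issue a NetCDF subset request to a THREDDS
        # server here; this stub only marks where translation happens.
        url = f"https://thredds.example.org/{dataset}?bbox={region}"
        return url.encode()

broker = ThreddsNetcdfBroker()
print(broker.fetch_subset("precip_daily", {"lat": (10, 20), "lon": (30, 40)}))
```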
NASA Technical Reports Server (NTRS)
Jamsek, Damir A.
1993-01-01
A brief example of the use of formal methods techniques in the specification of a software system is presented. The report is part of a larger effort targeted at defining a formal methods pilot project for NASA. One possible application domain that may be used to demonstrate the effective use of formal methods techniques within the NASA environment is presented. It is not intended to provide a tutorial on either formal methods techniques or the application being addressed. It should, however, provide an indication that the application being considered is suitable for a formal methods treatment by showing how such a task may be started. The particular system being addressed is the Structured File Services (SFS), which is a part of the Data Storage and Retrieval Subsystem (DSAR), which in turn is part of the Data Management System (DMS) onboard Space Station Freedom. This is a software system that is currently under development for NASA. An informal mathematical development is presented. Section 3 contains the same development using Penelope (23), an Ada specification and verification system. The complete text of the English version Software Requirements Specification (SRS) is reproduced in Appendix A.
Secure Reliable Processing Systems
1984-02-21
be attainable in principle, the more difficult goal is to meet all of the above while still maintaining good performance within the framework of a well...managing the network, the user sees a conceptually simpler storage facility, composed merely of files, without machine boundaries, replicated copies
A Rewritable, Random-Access DNA-Based Storage System.
Yazdi, S M Hossein Tabatabaei; Yuan, Yongbo; Ma, Jian; Zhao, Huimin; Milenkovic, Olgica
2015-09-18
We describe the first DNA-based storage architecture that enables random access to data blocks and rewriting of information stored at arbitrary locations within the blocks. The newly developed architecture overcomes drawbacks of existing read-only methods that require decoding the whole file in order to read one data fragment. Our system is based on new constrained coding techniques and accompanying DNA editing methods that ensure data reliability, specificity and sensitivity of access, and at the same time provide exceptionally high data storage capacity. As a proof of concept, we encoded parts of the Wikipedia pages of six universities in the USA, and selected and edited parts of the text written in DNA corresponding to three of these schools. The results suggest that DNA is a versatile medium suitable for both ultrahigh-density archival and rewritable storage applications.
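The basic digital-to-DNA mapping behind such systems can be shown with a toy encoder: two bits per nucleotide, plus a short address prefix suggesting how a named block is retrieved. This is far simpler than the constrained codes and editing chemistry the paper actually develops; the address scheme below is invented for illustration.

```python
# Toy illustration of DNA as a digital medium: 2 bits per nucleotide,
# with an address prefix per block standing in for random access.
TO_DNA = {"00": "A", "01": "C", "10": "G", "11": "T"}
TO_BITS = {v: k for k, v in TO_DNA.items()}

def encode(data: bytes, address: str) -> str:
    bits = "".join(f"{byte:08b}" for byte in data)
    return address + "".join(TO_DNA[bits[i:i+2]] for i in range(0, len(bits), 2))

def decode(strand: str, address: str) -> bytes:
    payload = strand[len(address):]
    bits = "".join(TO_BITS[nt] for nt in payload)
    return bytes(int(bits[i:i+8], 2) for i in range(0, len(bits), 8))

strand = encode(b"Hi", address="ACGT")    # 4-nt address, then payload
print(strand, decode(strand, "ACGT"))     # -> ACGTCAGACGGC b'Hi'
```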
A Rewritable, Random-Access DNA-Based Storage System
NASA Astrophysics Data System (ADS)
Tabatabaei Yazdi, S. M. Hossein; Yuan, Yongbo; Ma, Jian; Zhao, Huimin; Milenkovic, Olgica
2015-09-01
We describe the first DNA-based storage architecture that enables random access to data blocks and rewriting of information stored at arbitrary locations within the blocks. The newly developed architecture overcomes drawbacks of existing read-only methods that require decoding the whole file in order to read one data fragment. Our system is based on new constrained coding techniques and accompanying DNA editing methods that ensure data reliability, specificity and sensitivity of access, and at the same time provide exceptionally high data storage capacity. As a proof of concept, we encoded parts of the Wikipedia pages of six universities in the USA, and selected and edited parts of the text written in DNA corresponding to three of these schools. The results suggest that DNA is a versatile medium suitable for both ultrahigh-density archival and rewritable storage applications.
1984-06-01
programming environment and then dumped, as described in the Franz Lisp manual [Ref. 13]. A synopsis of the functional elements which make up this LISP...the average system usage rate. Lines 14 and 15 reflect a function of Franz Lisp wherein past used storage locations are reclaimed for the available... Franz Lisp Opus 38. Also included in this distribution are two library files containing the bonding pad layouts in CIF, and a library file
Inertial Manifolds for Navier-Stokes Equations and Related Dynamical Systems
1991-05-31
Graphics IRIS (SGI). The RLE files for the animation are loaded to an Abekas and recorded to tape by Betacam. This computational work was done by using the...scripts and comments, are loaded to the Abekas-A60 digital image storage device, and then recorded to the Betacam BVW-75 analog tape recorder. Static...interfacing, huge data files are output to the Data Vault in parallel with little cost. In addition to the SGIs, Abekas, Betacam and Solitaire, the
NASA Astrophysics Data System (ADS)
Anantharaj, V.; Mayer, B.; Wang, F.; Hack, J.; McKenna, D.; Hartman-Baker, R.
2012-04-01
The Oak Ridge Leadership Computing Facility (OLCF) facilitates the execution of computational experiments that require tens of millions of CPU hours (typically using thousands of processors simultaneously) while generating hundreds of terabytes of data. A set of ultra-high-resolution climate experiments in progress, using the Community Earth System Model (CESM), will produce over 35,000 files, ranging in size from 21 MB to 110 GB each. The execution of the experiments will require nearly 70 million CPU hours on the Jaguar and Titan supercomputers at OLCF. The total volume of the output from these climate modeling experiments will be in excess of 300 TB. This model output must then be archived, analyzed, distributed to the project partners in a timely manner, and also made available more broadly. Meeting this challenge would require efficient movement of the data, staging the simulation output to a large and fast file system that provides high volume access to other computational systems used to analyze the data and synthesize results. This file system also needs to be accessible via high-speed networks to an archival system that can provide long-term reliable storage. Ideally this archival system is itself directly available to other systems that can be used to host services making the data and analysis available to the participants in the distributed research project and to the broader climate community. The various resources available at the OLCF now support this workflow. The available systems include the new Jaguar Cray XK6 2.63 petaflops (estimated) supercomputer, the 10 PB Spider center-wide parallel file system, the Lens/EVEREST analysis and visualization system, the HPSS archival storage system, the Earth System Grid (ESG), and the ORNL Climate Data Server (CDS). The ESG features federated services, search & discovery, extensive data handling capabilities, deep storage access, and Live Access Server (LAS) integration. The scientific workflow enabled on these systems, and developed as part of the Ultra-High Resolution Climate Modeling Project, allows users of OLCF resources to efficiently share simulated data, often multi-terabyte in volume, as well as the results from the modeling experiments and various synthesized products derived from these simulations. The final objective in the exercise is to ensure that the simulation results and the enhanced understanding will serve the needs of a diverse group of stakeholders across the world, including our research partners in U.S. Department of Energy laboratories & universities, domain scientists, students (K-12 as well as higher education), resource managers, decision makers, and the general public.
Log-less metadata management on metadata server for parallel file systems.
Liao, Jianwei; Xiao, Guoqiang; Peng, Xiaoning
2014-01-01
This paper presents a novel metadata management mechanism on the metadata server (MDS) for parallel and distributed file systems. In this technique, the client file system backs up the metadata requests it has sent and that the metadata server has already handled, so that the MDS does not need to log metadata changes to nonvolatile storage to achieve a highly available metadata service, while also improving metadata-processing performance. Because the client file system backs up certain sent metadata requests in its memory, the overhead of handling these backup requests is much smaller than that incurred by a metadata server that adopts logging or journaling to provide a highly available metadata service. The experimental results show that this newly proposed mechanism can significantly improve the speed of metadata processing and deliver better I/O throughput than conventional metadata management schemes, that is, logging or journaling on the MDS. Moreover, complete metadata recovery can be achieved by replaying the backup logs cached by all involved clients after the metadata server has crashed or otherwise become non-operational.
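The recovery idea, clients retain the requests they sent so a journal-less MDS can rebuild its state by replay, fits in a short sketch. Class names and the request format are illustrative, not from the paper.

```python
# Minimal sketch: client-side backup of sent metadata requests replaces
# MDS-side journaling; after a crash, clients replay their backups.
class Client:
    def __init__(self):
        self.backup_log = []             # in-memory copy of sent requests

    def send(self, mds, request):
        self.backup_log.append(request)  # back up the request as it is sent
        mds.apply(request)

class MDS:
    def __init__(self):
        self.namespace = {}              # memory only, no journal

    def apply(self, request):
        op, path, value = request
        if op == "set":
            self.namespace[path] = value

    def recover(self, clients):
        self.namespace = {}
        for c in clients:                # replay backups from all clients
            for request in c.backup_log:
                self.apply(request)

mds, client = MDS(), Client()
client.send(mds, ("set", "/a", 1))
mds.namespace.clear()                    # simulate a crash losing state
mds.recover([client])
print(mds.namespace)                     # {'/a': 1}
```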
Log-Less Metadata Management on Metadata Server for Parallel File Systems
Xiao, Guoqiang; Peng, Xiaoning
2014-01-01
This paper presents a novel metadata management mechanism on the metadata server (MDS) for parallel and distributed file systems. In this technique, the client file system backs up the metadata requests it has sent and that the metadata server has already handled, so that the MDS does not need to log metadata changes to nonvolatile storage to achieve a highly available metadata service, while also improving metadata-processing performance. Because the client file system backs up certain sent metadata requests in its memory, the overhead of handling these backup requests is much smaller than that incurred by a metadata server that adopts logging or journaling to provide a highly available metadata service. The experimental results show that this newly proposed mechanism can significantly improve the speed of metadata processing and deliver better I/O throughput than conventional metadata management schemes, that is, logging or journaling on the MDS. Moreover, complete metadata recovery can be achieved by replaying the backup logs cached by all involved clients after the metadata server has crashed or otherwise become non-operational. PMID:24892093
Toward Transparent Data Management in Multi-layer Storage Hierarchy for HPC Systems
Wadhwa, Bharti; Byna, Suren; Butt, Ali R.
2018-04-17
Upcoming exascale high performance computing (HPC) systems are expected to comprise a multi-tier storage hierarchy, and thus will necessitate innovative storage and I/O mechanisms. Traditional disk and block-based interfaces and file systems face severe challenges in utilizing the capabilities of storage hierarchies due to the lack of hierarchy support and semantic interfaces. Object-based and semantically-rich data abstractions for scientific data management on large scale systems offer a sustainable solution to these challenges. Such data abstractions can also simplify users' involvement in data movement. Here, we take the first steps of realizing such an object abstraction and explore storage mechanisms for these objects to enhance I/O performance, especially for scientific applications. We explore how an object-based interface can facilitate next-generation scalable computing systems by presenting the mapping of data I/O from two real-world HPC scientific use cases: a plasma physics simulation code (VPIC) and a cosmology simulation code (HACC). Our storage model stores data objects in different physical organizations to support data movement across layers of the memory/storage hierarchy. Our implementation scales well to 16K parallel processes and, compared to the state of the art, such as MPI-IO and HDF5, our object-based data abstractions and data placement strategy in a multi-level storage hierarchy achieve up to 7X I/O performance improvement for scientific data.
Toward Transparent Data Management in Multi-layer Storage Hierarchy for HPC Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wadhwa, Bharti; Byna, Suren; Butt, Ali R.
Upcoming exascale high performance computing (HPC) systems are expected to comprise a multi-tier storage hierarchy, and thus will necessitate innovative storage and I/O mechanisms. Traditional disk and block-based interfaces and file systems face severe challenges in utilizing the capabilities of storage hierarchies due to the lack of hierarchy support and semantic interfaces. Object-based and semantically-rich data abstractions for scientific data management on large scale systems offer a sustainable solution to these challenges. Such data abstractions can also simplify users' involvement in data movement. Here, we take the first steps of realizing such an object abstraction and explore storage mechanisms for these objects to enhance I/O performance, especially for scientific applications. We explore how an object-based interface can facilitate next-generation scalable computing systems by presenting the mapping of data I/O from two real-world HPC scientific use cases: a plasma physics simulation code (VPIC) and a cosmology simulation code (HACC). Our storage model stores data objects in different physical organizations to support data movement across layers of the memory/storage hierarchy. Our implementation scales well to 16K parallel processes and, compared to the state of the art, such as MPI-IO and HDF5, our object-based data abstractions and data placement strategy in a multi-level storage hierarchy achieve up to 7X I/O performance improvement for scientific data.
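A placement policy that maps objects onto tiers of a memory/storage hierarchy, the core of the storage model described above, can be sketched simply. The tier names and thresholds below are assumptions for illustration, not the paper's actual policy.

```python
# Illustrative sketch of placing data objects across a storage hierarchy
# by size and access rate: hot, small objects near compute; cold, large
# objects sink toward tape.
TIERS = ["burst_buffer", "ssd", "disk", "tape"]

def place(obj_size_mb: float, access_rate_per_hr: float) -> str:
    if access_rate_per_hr > 100 and obj_size_mb < 1024:
        return "burst_buffer"
    if access_rate_per_hr > 10:
        return "ssd"
    if access_rate_per_hr > 0.1:
        return "disk"
    return "tape"

for size_mb, rate in [(256, 500), (4096, 50), (8192, 0.01)]:
    print((size_mb, rate), "->", place(size_mb, rate))
```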
NASA Astrophysics Data System (ADS)
Zhu, F.; Yu, H.; Rilee, M. L.; Kuo, K. S.; Yu, L.; Pan, Y.; Jiang, H.
2017-12-01
Since the establishment of data archive centers and the standardization of file formats, scientists have been required to search metadata catalogs for the data they need and download the data files to their local machines to carry out data analysis. This approach has facilitated data discovery and access for decades, but it inevitably leads to data transfer from data archive centers to scientists' computers through low-bandwidth Internet connections. Data transfer becomes a major performance bottleneck in such an approach. Combined with generally constrained local compute/storage resources, this limits the extent of scientists' studies and deprives them of timely outcomes. Thus, this conventional approach is not scalable with respect to either the volume or the variety of geoscience data. A much more viable solution is to couple analysis and storage systems to minimize data transfer. In our study, we compare loosely coupled approaches (exemplified by Spark and Hadoop) and tightly coupled approaches (exemplified by parallel distributed database management systems, e.g., SciDB). In particular, we investigate the optimization of data placement and movement to effectively tackle the variety challenge, and broaden the use of parallelization to address the volume challenge. Our goal is to enable high-performance interactive analysis for a substantial portion of geoscience data analysis tasks. We show that tightly coupled approaches can concentrate data traffic between local storage systems and compute units, thereby optimizing bandwidth utilization to achieve better throughput. Based on our observations, we developed a geoscience data analysis system that tightly couples analysis engines with storage and has direct access to the detailed map of data partition locations. Through an innovative data partitioning and distribution scheme, our system has demonstrated scalable and interactive performance in real-world geoscience data analysis applications.
78 FR 19474 - Combined Notice of Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2013-04-01
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission Combined Notice of Filings Take notice that the Commission has received the following Natural Gas Pipeline Rate and Refund Report filings: Filings Instituting Proceedings Docket Numbers: RP13-702-000. Applicants: Egan Hub Storage, LLC...
Up-to-date state of storage techniques used for large numerical data files
NASA Technical Reports Server (NTRS)
Chlouba, V.
1975-01-01
Methods for data storage and output in data banks and memory files are discussed along with a survey of equipment available for this. Topics discussed include magnetic tapes, magnetic disks, Terabit magnetic tape memory, Unicon 690 laser memory, IBM 1360 photostore, microfilm recording equipment, holographic recording, film readers, optical character readers, digital data storage techniques, and photographic recording. The individual types of equipment are summarized in tables giving the basic technical parameters.
Integrating new Storage Technologies into EOS
NASA Astrophysics Data System (ADS)
Peters, Andreas J.; van der Ster, Dan C.; Rocha, Joaquim; Lensing, Paul
2015-12-01
The EOS[1] storage software was designed to cover CERN disk-only storage use cases in the medium term, trading scalability against latency. To cover and prepare for long-term requirements, the CERN IT data and storage services group (DSS) is actively conducting R&D and making open source contributions to experiment with a next-generation storage software based on CEPH[3] and ethernet-enabled disk drives. CEPH provides a scale-out object storage system, RADOS, and additionally various optional high-level services such as an S3 gateway, RADOS block devices and a POSIX-compliant file system, CephFS. The acquisition of CEPH by Red Hat underlines the promising role of CEPH as the open source storage platform of the future. CERN IT is running a CEPH service in the context of OpenStack on a moderate scale of 1 PB replicated storage. Building a 100+PB storage system based on CEPH will require software and hardware tuning. It is of capital importance to demonstrate the feasibility and possibly iron out bottlenecks and blocking issues beforehand. The main idea behind this R&D is to leverage and contribute to existing building blocks in the CEPH storage stack and implement a few CERN-specific requirements in a thin, customisable storage layer. A second research topic is the integration of ethernet-enabled disks. This paper introduces various ongoing open source developments, their status and applicability.
Toward Millions of File System IOPS on Low-Cost, Commodity Hardware
Zheng, Da; Burns, Randal; Szalay, Alexander S.
2013-01-01
We describe a storage system that removes I/O bottlenecks to achieve more than one million IOPS based on a user-space file abstraction for arrays of commodity SSDs. The file abstraction refactors I/O scheduling and placement for extreme parallelism and non-uniform memory and I/O. The system includes a set-associative, parallel page cache in the user space. We redesign page caching to eliminate CPU overhead and lock-contention in non-uniform memory architecture machines. We evaluate our design on a 32-core NUMA machine with four eight-core processors. Experiments show that our design delivers 1.23 million 512-byte read IOPS. The page cache realizes the scalable IOPS of Linux asynchronous I/O (AIO) and increases user-perceived I/O performance linearly with cache hit rates. The parallel, set-associative cache matches the cache hit rates of the global Linux page cache under real workloads. PMID:24402052
Toward Millions of File System IOPS on Low-Cost, Commodity Hardware.
Zheng, Da; Burns, Randal; Szalay, Alexander S
2013-01-01
We describe a storage system that removes I/O bottlenecks to achieve more than one million IOPS based on a user-space file abstraction for arrays of commodity SSDs. The file abstraction refactors I/O scheduling and placement for extreme parallelism and non-uniform memory and I/O. The system includes a set-associative, parallel page cache in the user space. We redesign page caching to eliminate CPU overhead and lock-contention in non-uniform memory architecture machines. We evaluate our design on a 32-core NUMA machine with four eight-core processors. Experiments show that our design delivers 1.23 million 512-byte read IOPS. The page cache realizes the scalable IOPS of Linux asynchronous I/O (AIO) and increases user-perceived I/O performance linearly with cache hit rates. The parallel, set-associative cache matches the cache hit rates of the global Linux page cache under real workloads.
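The data structure named in both records, a set-associative page cache, is easy to illustrate: a page id hashes to one small set, and LRU state is kept per set, so cores contend only within a set rather than on one global lock. A minimal single-threaded sketch, with invented parameters:

```python
# Sketch of a set-associative page cache: fixed number of small sets,
# each with its own local LRU order and eviction.
from collections import OrderedDict

class SetAssociativeCache:
    def __init__(self, n_sets=1024, ways=8):
        self.sets = [OrderedDict() for _ in range(n_sets)]
        self.ways = ways                       # pages per set

    def _set_for(self, page_id):
        return self.sets[hash(page_id) % len(self.sets)]

    def get(self, page_id):
        s = self._set_for(page_id)
        if page_id in s:
            s.move_to_end(page_id)             # LRU update within the set only
            return s[page_id]
        return None

    def put(self, page_id, data):
        s = self._set_for(page_id)
        if len(s) >= self.ways:
            s.popitem(last=False)              # evict the set-local LRU page
        s[page_id] = data

cache = SetAssociativeCache()
cache.put(42, b"page data")
print(cache.get(42))
```

In the real system each set would carry its own lock, which is what removes the lock contention the abstract describes.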
77 FR 9902 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-21
... effective on March 22, 2012 unless comments are received that would result in a contrary determination... or status, name, Social Security Number (SSN), gender, medical diagnosis, medical condition, special...: Delete entry and replace with ``Paper records in file folders and electronic storage media...
mz5: space- and time-efficient storage of mass spectrometry data sets.
Wilhelm, Mathias; Kirchner, Marc; Steen, Judith A J; Steen, Hanno
2012-01-01
Across a host of MS-driven-omics fields, researchers witness the acquisition of ever increasing amounts of high throughput MS data and face the need for their compact yet efficiently accessible storage. Addressing the need for an open data exchange format, the Proteomics Standards Initiative and the Seattle Proteome Center at the Institute for Systems Biology independently developed the mzData and mzXML formats, respectively. In a subsequent joint effort, they defined an ontology and associated controlled vocabulary that specifies the contents of MS data files, implemented as the newer mzML format. All three formats are based on XML and are thus not particularly efficient in either storage space requirements or read/write speed. This contribution introduces mz5, a complete reimplementation of the mzML ontology that is based on the efficient, industrial strength storage backend HDF5. Compared with the current mzML standard, this strategy yields an average file size reduction to ∼54% and increases linear read and write speeds ∼3-4-fold. The format is implemented as part of the ProteoWizard project and is available under a permissive Apache license. Additional information and download links are available from http://software.steenlab.org/mz5.
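The storage idea behind mz5, numeric spectra kept as chunked, compressed HDF5 datasets instead of XML text, can be sketched with h5py. The dataset names below mirror the description but are illustrative rather than a guaranteed match to the actual mz5 schema.

```python
# Minimal sketch: spectra as chunked, gzip-compressed HDF5 datasets,
# readable slice-by-slice without parsing a whole XML file.
import h5py
import numpy as np

mz = np.random.uniform(100, 2000, size=100_000)
intensity = np.random.exponential(1.0, size=100_000)

with h5py.File("run01.mz5", "w") as f:
    f.create_dataset("SpectrumMZ", data=mz, compression="gzip", chunks=True)
    f.create_dataset("SpectrumIntensity", data=intensity,
                     compression="gzip", chunks=True)

with h5py.File("run01.mz5", "r") as f:
    peak_slice = f["SpectrumMZ"][1000:1010]   # read one slice, not the file
    print(peak_slice)
```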
Electronic signatures for long-lasting storage purposes in electronic archives.
Pharow, Peter; Blobel, Bernd
2005-03-01
Communication and co-operation in healthcare and welfare require a certain set of trusted third party (TTP) services describing both status and relation of communicating principals as well as their corresponding keys and attributes. Additional TTP services are needed to provide trustworthy information about dynamic issues of communication and co-operation such as time and location of processes, workflow relations, and system behaviour. Legal and ethical requirements demand securely stored patient information and well-defined access rights. Among others, electronic signatures based on asymmetric cryptography are important means for securing the integrity of a message or file as well as for accountability purposes, including non-repudiation of both origin and receipt. Electronic signatures along with certified time stamps or time signatures are especially important for electronic archives in general, for electronic health records (EHR) in particular, and above all for typical long-lasting storage purposes. Apart from technical storage problems (e.g. lifetime of the storage devices, interoperability of retrieval and presentation software), this paper identifies mechanisms such as the re-signing and re-stamping of data items, files, messages, sets of archived items or documents, archive structures, and even whole archives.
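The re-signing mechanism mentioned here, signing an archived item together with its previous signature before the old key or algorithm weakens, can be sketched with the Python cryptography package. Ed25519 is my choice for brevity; a real archive would use qualified time-stamping services rather than a local clock.

```python
# Hedged sketch: sign, time-stamp, and later re-sign an archived record,
# preserving the old signature inside the new one as a chain of proof.
import time
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

record = b"patient-file-0001 contents"

old_key = Ed25519PrivateKey.generate()
stamped = record + time.strftime("|%Y-%m-%dT%H:%M:%SZ", time.gmtime()).encode()
old_sig = old_key.sign(stamped)

# Years later: re-sign the item *together with* its previous signature.
new_key = Ed25519PrivateKey.generate()
new_sig = new_key.sign(stamped + old_sig)

# Verification walks the chain outside-in; raises InvalidSignature on failure.
new_key.public_key().verify(new_sig, stamped + old_sig)
old_key.public_key().verify(old_sig, stamped)
print("signature chain verified")
```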
Comparing NetCDF and SciDB on managing and querying 5D hydrologic dataset
NASA Astrophysics Data System (ADS)
Liu, Haicheng; Xiao, Xiao
2016-11-01
Efficiently extracting information from high-dimensional hydro-meteorological modelling datasets requires smart solutions. Traditional methods are mostly file-based; files can be edited and accessed handily, but they suffer from efficiency problems due to their contiguous storage structure. Databases have been proposed as an alternative for advantages such as native functionality for manipulating multidimensional (MD) arrays, smart caching strategies and scalability. In this research, NetCDF file-based solutions and the multidimensional array database management system (DBMS) SciDB, which applies a chunked storage structure, are benchmarked to determine the best solution for storing and querying a large 5D hydrologic modelling dataset. The effect of data storage configurations, including chunk size, dimension order and compression, on query performance is explored. Results indicate that the dimension order used to organize storage of the 5D data has a significant influence on query performance if the chunk size is very large, but the effect becomes insignificant when the chunk size is properly set. Compression in SciDB mostly has a negative influence on query performance. Caching is an advantage but may be influenced by the execution of different query processes. On the whole, the NetCDF solution without compression is in general more efficient than the SciDB DBMS.
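The storage knobs benchmarked here (chunk size, dimension order, compression) are all set at variable-creation time in NetCDF-4. A small sketch with the netCDF4 package; the 5D layout and dimension names are illustrative, not the study's dataset.

```python
# Sketch: creating a chunked, compressed 5D NetCDF-4 variable whose
# chunk shape favors time-series queries at a point.
from netCDF4 import Dataset

ds = Dataset("hydro5d.nc", "w", format="NETCDF4")
for name, size in [("run", 4), ("member", 10), ("time", 240),
                   ("y", 100), ("x", 100)]:
    ds.createDimension(name, size)

h = ds.createVariable("waterlevel", "f4",
                      ("run", "member", "time", "y", "x"),
                      chunksizes=(1, 1, 240, 10, 10),  # full time axis per chunk
                      zlib=True, complevel=4)           # optional compression
ds.close()
```

A map-oriented access pattern would instead use chunks like (1, 1, 1, 100, 100), which is exactly the dimension-order/chunk-shape trade-off the abstract reports on.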
Efficient Access to Massive Amounts of Tape-Resident Data
NASA Astrophysics Data System (ADS)
Yu, David; Lauret, Jérôme
2017-10-01
Randomly restoring files from tape degrades read performance, primarily due to frequent tape mounts. High-latency, time-consuming tape mounts and dismounts are a major issue when accessing massive amounts of data from tape storage. BNL's mass storage system currently holds more than 80 PB of data on tapes, managed by HPSS. To restore files from HPSS, we make use of a scheduler, called ERADAT. This scheduler was originally based on code from Oak Ridge National Lab developed in the early 2000s. After major modifications and enhancements, ERADAT now provides advanced HPSS resource management, priority queuing, resource sharing, web-browser visibility of real-time staging activities, and advanced real-time statistics and graphs. ERADAT is also integrated with ACSLS and HPSS for near-real-time mount statistics and resource control in HPSS. ERADAT is also the interface between HPSS and other applications such as the locally developed Data Carousel, providing fair resource-sharing policies and related capabilities. ERADAT has demonstrated great performance at BNL.
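The core scheduling idea, reordering the restore queue so each tape is mounted once and read in a single sweep, fits in a few lines. The request tuples (tape id, position, file) are an invented format, not ERADAT's interface.

```python
# Sketch: group restore requests by tape and sort by position on tape,
# turning random restores into one mount plus one sweep per tape.
from itertools import groupby

requests = [("T07", 310, "f1"), ("T02", 15, "f2"),
            ("T07", 12, "f3"), ("T02", 890, "f4")]

ordered = sorted(requests, key=lambda r: (r[0], r[1]))
for tape, batch in groupby(ordered, key=lambda r: r[0]):
    print(f"mount {tape}:", [fname for _, _, fname in batch])
# -> mount T02: ['f2', 'f4']
# -> mount T07: ['f3', 'f1']
```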
ArrayBridge: Interweaving declarative array processing with high-performance computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xing, Haoyuan; Floratos, Sofoklis; Blanas, Spyros
Scientists are increasingly turning to datacenter-scale computers to produce and analyze massive arrays. Despite decades of database research that extols the virtues of declarative query processing, scientists still write, debug and parallelize imperative HPC kernels even for the most mundane queries. This impedance mismatch has been partly attributed to the cumbersome data loading process; in response, the database community has proposed in situ mechanisms to access data in scientific file formats. Scientists, however, desire more than a passive access method that reads arrays from files. This paper describes ArrayBridge, a bi-directional array view mechanism for scientific file formats that aims to make declarative array manipulations interoperable with imperative file-centric analyses. Our prototype implementation of ArrayBridge uses HDF5 as the underlying array storage library and seamlessly integrates into the SciDB open-source array database system. In addition to fast querying over external array objects, ArrayBridge produces arrays in the HDF5 file format just as easily as it can read from it. ArrayBridge also supports time travel queries from imperative kernels through the unmodified HDF5 API, and automatically deduplicates between array versions for space efficiency. Our extensive performance evaluation in NERSC, a large-scale scientific computing facility, shows that ArrayBridge exhibits statistically indistinguishable performance and I/O scalability compared to the native SciDB storage engine.
Building analytical platform with Big Data solutions for log files of PanDA infrastructure
NASA Astrophysics Data System (ADS)
Alekseev, A. A.; Barreiro Megino, F. G.; Klimentov, A. A.; Korchuganova, T. A.; Maendo, T.; Padolski, S. V.
2018-05-01
The paper describes the implementation of a high-performance system for the processing and analysis of log files for the PanDA infrastructure of the ATLAS experiment at the Large Hadron Collider (LHC), responsible for the workload management of on the order of 2M daily jobs across the Worldwide LHC Computing Grid. The solution is based on the ELK technology stack, which includes several components: Filebeat, Logstash, ElasticSearch (ES), and Kibana. Filebeat is used to collect data from logs. Logstash processes the data and exports it to Elasticsearch. ES is responsible for centralized data storage. Data accumulated in ES can be viewed using dedicated software, Kibana. These components were integrated with the PanDA infrastructure and replaced previous log processing systems for increased scalability and usability. The authors describe all the components and their configuration tuning for the current tasks and the scale of the actual system, and give several real-life examples of how this centralized log processing and storage service is used to showcase the advantages for daily operations.
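The Elasticsearch leg of such a pipeline can be sketched with the official Python client (v8-style keyword arguments; older clients use body=). The index name, endpoint and log fields below are assumptions for illustration, not PanDA's actual schema.

```python
# Hedged sketch: index a log event in Elasticsearch and run the kind of
# query Kibana issues underneath.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")   # assumed local ES endpoint

es.index(index="panda-logs", document={
    "@timestamp": "2018-05-01T12:00:00Z",
    "level": "ERROR",
    "component": "JEDI",
    "message": "job 123456 failed: lost heartbeat",
})
es.indices.refresh(index="panda-logs")        # make the doc searchable now

hits = es.search(index="panda-logs",
                 query={"match": {"level": "ERROR"}})["hits"]["hits"]
print([h["_source"]["message"] for h in hits])
```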
NASA Technical Reports Server (NTRS)
Kobler, Ben (Editor); Hariharan, P. C. (Editor)
2004-01-01
MSST2004, the Twelfth NASA Goddard / Twenty-first IEEE Conference on Mass Storage Systems and Technologies has as its focus long-term stewardship of globally-distributed storage. The increasing prevalence of e-anything brought about by widespread use of applications based, among others, on the World Wide Web, has contributed to rapid growth of online data holdings. A study released by the School of Information Management and Systems at the University of California, Berkeley, estimates that over 5 exabytes of data was created in 2002. Almost 99 percent of this information originally appeared on magnetic media. The theme for MSST2004 is therefore both timely and appropriate. There have been many discussions about rapid technological obsolescence, incompatible formats and inadequate attention to the permanent preservation of knowledge committed to digital storage. Tutorial sessions at MSST2004 detail some of these concerns, and steps being taken to alleviate them. Over 30 papers deal with topics as diverse as performance, file systems, and stewardship and preservation. A number of short papers, extemporaneous presentations, and works in progress will detail current and relevant research on the MSST2004 theme.
Sharing lattice QCD data over a widely distributed file system
NASA Astrophysics Data System (ADS)
Amagasa, T.; Aoki, S.; Aoki, Y.; Aoyama, T.; Doi, T.; Fukumura, K.; Ishii, N.; Ishikawa, K.-I.; Jitsumoto, H.; Kamano, H.; Konno, Y.; Matsufuru, H.; Mikami, Y.; Miura, K.; Sato, M.; Takeda, S.; Tatebe, O.; Togawa, H.; Ukawa, A.; Ukita, N.; Watanabe, Y.; Yamazaki, T.; Yoshie, T.
2015-12-01
JLDG is a data-grid for the lattice QCD (LQCD) community in Japan. Several large research groups in Japan have been working on lattice QCD simulations using supercomputers distributed over distant sites. The JLDG provides such collaborations with an efficient method of data management and sharing. File servers installed at 9 sites are connected to the NII SINET VPN and are bound into a single file system with Gfarm. The file system looks the same from any site, so that users can run analyses on a supercomputer at one site using data generated and stored in the JLDG at a different site. We present a brief description of the hardware and software of the JLDG, including a recently developed subsystem for cooperating with the HPCI shared storage, and report performance and statistics of the JLDG. As of April 2015, 15 research groups (61 users) store their daily research data, amounting to 4.7PB including replicas and 68 million files in total. The number of publications based on work that used the JLDG is 98. The large number of publications and the recent rapid increase in disk usage convince us that the JLDG has grown into a useful infrastructure for the LQCD community in Japan.
Wiley, Laura K.; Sivley, R. Michael; Bush, William S.
2013-01-01
Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist PMID:23894185
Wiley, Laura K; Sivley, R Michael; Bush, William S
2013-01-01
Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist.
Use of Schema on Read in Earth Science Data Archives
NASA Astrophysics Data System (ADS)
Petrenko, M.; Hegde, M.; Smit, C.; Pilone, P.; Pham, L.
2017-12-01
Traditionally, NASA Earth Science data archives have file-based storage using proprietary data file formats, such as HDF and HDF-EOS, which are optimized to support fast and efficient storage of spaceborne and model data as they are generated. The use of file-based storage essentially imposes an indexing strategy based on data dimensions. In most cases, NASA Earth Science data uses time as the primary index, leading to poor performance in accessing data in spatial dimensions. For example, producing a time series for a single spatial grid cell involves accessing a large number of data files. With exponential growth in data volume due to the ever-increasing spatial and temporal resolution of the data, using file-based archives poses significant performance and cost barriers to data discovery and access. Storing and disseminating data in proprietary data formats imposes an additional access barrier for users outside the mainstream research community. At the NASA Goddard Earth Sciences Data Information Services Center (GES DISC), we have evaluated applying the "schema-on-read" principle to data access and distribution. We used Apache Parquet to store geospatial data, and have exposed data through Amazon Web Services (AWS) Athena, AWS Simple Storage Service (S3), and Apache Spark. Using the "schema-on-read" approach allows customization of indexing—spatial or temporal—to suit the data access pattern. The storage of data in open formats such as Apache Parquet has widespread support in popular programming languages. A wide range of solutions for handling big data lowers the access barrier for all users. This presentation will discuss formats used for data storage, frameworks with support for "schema-on-read" used for data access, and common use cases covering data usage patterns seen in a geospatial data archive.
MolabIS--an integrated information system for storing and managing molecular genetics data.
Truong, Cong V C; Groeneveld, Linn F; Morgenstern, Burkhard; Groeneveld, Eildert
2011-10-31
Long-term sample storage, tracing of data flow and data export for subsequent analyses are of great importance in genetics studies. Therefore, molecular labs need a proper information system to handle an increasing amount of data from different projects. We have developed a molecular labs information management system (MolabIS). It was implemented as a web-based system allowing the users to capture original data at each step of their workflow. MolabIS provides essential functionality for managing information on individuals, tracking samples and storage locations, capturing raw files, importing final data from external files, searching results, and accessing and modifying data. Further important features are options to generate ready-to-print reports and to convert sequence and microsatellite data into various data formats, which can be used as input files in subsequent analyses. Moreover, MolabIS also provides a tool for data migration. MolabIS is designed for small-to-medium sized labs conducting Sanger sequencing and microsatellite genotyping to store and efficiently handle a relatively large amount of data. MolabIS not only helps to avoid time-consuming tasks but also ensures the availability of data for further analyses. The software is packaged as a virtual appliance which can run on different platforms (e.g. Linux, Windows). MolabIS can be distributed to a wide range of molecular genetics labs since it was developed according to a general data model. Released under the GPL, MolabIS is freely available at http://www.molabis.org.
MolabIS - An integrated information system for storing and managing molecular genetics data
2011-01-01
Background Long-term sample storage, tracing of data flow and data export for subsequent analyses are of great importance in genetics studies. Therefore, molecular labs need a proper information system to handle an increasing amount of data from different projects. Results We have developed a molecular labs information management system (MolabIS). It was implemented as a web-based system allowing the users to capture original data at each step of their workflow. MolabIS provides essential functionality for managing information on individuals, tracking samples and storage locations, capturing raw files, importing final data from external files, searching results, and accessing and modifying data. Further important features are options to generate ready-to-print reports and to convert sequence and microsatellite data into various data formats, which can be used as input files in subsequent analyses. Moreover, MolabIS also provides a tool for data migration. Conclusions MolabIS is designed for small-to-medium sized labs conducting Sanger sequencing and microsatellite genotyping to store and efficiently handle a relatively large amount of data. MolabIS not only helps to avoid time-consuming tasks but also ensures the availability of data for further analyses. The software is packaged as a virtual appliance which can run on different platforms (e.g. Linux, Windows). MolabIS can be distributed to a wide range of molecular genetics labs since it was developed according to a general data model. Released under the GPL, MolabIS is freely available at http://www.molabis.org. PMID:22040322
CephFS: a new generation storage platform for Australian high energy physics
NASA Astrophysics Data System (ADS)
Borges, G.; Crosby, S.; Boland, L.
2017-10-01
This paper presents an implementation of a Ceph file system (CephFS) use case at the ARC Centre of Excellence for Particle Physics at the Terascale (CoEPP). CoEPP's CephFS provides a posix-like file system on top of a Ceph RADOS object store, deployed on commodity hardware and without single points of failure. By delivering a unique file system namespace at CoEPP centres spread across Australia, local HEP researchers can store, process and share data independently of their geographical locations. CephFS is also used as the back-end file system for a WLCG ATLAS user area at the Australian Tier-2. Dedicated SRM and XROOTD services, deployed on top of CoEPP's CephFS, integrate it into ATLAS distributed data operations. This setup, while allowing Australian HEP researchers to trigger data movement via ATLAS grid tools, also enables local posix-like read access, giving scientists greater control over their data flows. In this article we present details of CoEPP's Ceph/CephFS implementation and report performance I/O metrics collected during the testing/tuning phase of the system.
Developing a Hadoop-based Middleware for Handling Multi-dimensional NetCDF
NASA Astrophysics Data System (ADS)
Li, Z.; Yang, C. P.; Schnase, J. L.; Duffy, D.; Lee, T. J.
2014-12-01
Climate observations and model simulations are collecting and generating vast amounts of climate data, and these data are being accumulated at an ever-increasing, rapid pace. Effectively managing and analyzing these data is essential for climate change studies. Hadoop, a distributed storage and processing framework for large data sets, has attracted increasing attention for dealing with the Big Data challenge. The maturity of Infrastructure as a Service (IaaS) cloud computing further accelerates the adoption of Hadoop for solving Big Data problems. However, Hadoop is designed to process unstructured data such as texts, documents and web pages, and cannot effectively handle scientific data formats such as array-based NetCDF files and other binary formats. In this paper, we propose to build a Hadoop-based middleware for transparently handling big NetCDF data by 1) designing a distributed climate data storage mechanism based on a POSIX-enabled parallel file system to enable parallel big data processing with MapReduce, as well as to support data access by other systems; 2) modifying the Hadoop framework to transparently process NetCDF data in parallel without sequencing the data, converting it into other file formats, or loading it into HDFS; and 3) seamlessly integrating Hadoop, cloud computing and climate data in a highly scalable and fault-tolerant framework.
ERIC Educational Resources Information Center
Mathies, Lorraine
1972-01-01
The ERIC information system is designed for computerized information storage and retrieval. While the computer can play an increasingly more vital role in facilitating reference searches of large literature collections, experience shows that manual searching gives the user skills and expertise that are essential to effectively use the computerized…
A Novel Navigation Paradigm for XML Repositories.
ERIC Educational Resources Information Center
Azagury, Alain; Factor, Michael E.; Maarek, Yoelle S.; Mandler, Benny
2002-01-01
Discusses data exchange over the Internet and describes the architecture and implementation of an XML document repository that promotes a navigation paradigm for XML documents based on content and context. Topics include information retrieval and semistructured documents; and file systems as information storage infrastructure, particularly XMLFS.…
Federal Register 2010, 2011, 2012, 2013, 2014
2010-07-21
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission Houston Pipe Line Company LP--Bammel Storage, Docket No. PR10-51- 000, et. al.; Notice of Baseline Filings July 14, 2010. Houston Pipe Line..., 2010, respectively the applicants listed above submitted their baseline filing of its Statement of...
Using Cloud-based Storage Technologies for Earth Science Data
NASA Astrophysics Data System (ADS)
Michaelis, A.; Readey, J.; Votava, P.
2016-12-01
Cloud-based infrastructure may offer several key benefits, including scalability, built-in redundancy and reduced total cost of ownership, as compared with a traditional data center approach. However, most of the tools and software systems developed for NASA data repositories were not developed with a cloud-based infrastructure in mind and do not fully take advantage of commonly available cloud-based technologies. Object storage services are provided through all the leading public (Amazon Web Services, Microsoft Azure, Google Cloud, etc.) and private (OpenStack) clouds, and may provide a more cost-effective means of storing large data collections online. We describe a system that utilizes object storage rather than traditional file-system-based storage to vend earth science data. The system described is not only cost effective, but shows superior performance for running many different analytics tasks in the cloud. To enable compatibility with existing tools and applications, we outline client libraries that are API compatible with existing libraries for HDF5 and NetCDF4. Performance of the system is demonstrated using cloud services running on Amazon Web Services.
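Object storage access of this kind is easy to sketch with boto3; the bucket and key names below are placeholders. The ranged GET is the primitive that lets a client library read part of a large HDF5/NetCDF object without downloading the whole file.

```python
# Minimal sketch: store an object in S3 and read back only a byte range,
# e.g. a file header, instead of the entire object.
import boto3

s3 = boto3.client("s3")

s3.put_object(Bucket="my-earth-data", Key="granules/t2m_2016.h5",
              Body=b"example bytes standing in for an HDF5 granule")

resp = s3.get_object(Bucket="my-earth-data", Key="granules/t2m_2016.h5",
                     Range="bytes=0-1023")     # partial read via HTTP Range
header = resp["Body"].read()
print(len(header))
```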
NASA Astrophysics Data System (ADS)
Gallagher, J. H. R.; Jelenak, A.; Potter, N.; Fulker, D. W.; Habermann, T.
2017-12-01
Providing data services based on cloud computing technology that are equivalent to those developed for traditional computing and storage systems is critical for successful migration to cloud-based architectures for data production, scientific analysis and storage. OPeNDAP Web-service capabilities (comprising the Data Access Protocol (DAP) specification plus open-source software for realizing DAP in servers and clients) are among the most widely deployed means of achieving data-as-service functionality in the Earth sciences. OPeNDAP services are especially common in traditional data center environments where servers offer access to datasets stored in (very large) file systems, and a preponderance of the source data for these services is stored in the Hierarchical Data Format Version 5 (HDF5). Three candidate architectures for serving NASA satellite Earth Science HDF5 data via Hyrax running on Amazon Web Services (AWS) were developed and their performance examined for a set of representative use cases, with performance assessed by both runtime and incurred cost. The three architectures differ in how HDF5 files are stored in the Amazon Simple Storage Service (S3) and how the Hyrax server (as an EC2 instance) retrieves their data. Results for both serial and parallel access to HDF5 data in S3 will be presented. While the study focused on HDF5 data, OPeNDAP and the Hyrax data server, the architectures are generic and the analysis can be extrapolated to many different data formats, web APIs, and data servers.
NASA Astrophysics Data System (ADS)
Sangaline, E.; Lauret, J.
2014-06-01
The quantity of information produced in Nuclear and Particle Physics (NPP) experiments necessitates the transmission and storage of data across diverse collections of computing resources. Robust solutions such as XRootD have been used in NPP, but as the usage of cloud resources grows, the difficulties in the dynamic configuration of these systems become a concern. Hadoop File System (HDFS) exists as a possible cloud storage solution with a proven track record in dynamic environments. Though currently not extensively used in NPP, HDFS is an attractive solution offering both elastic storage and rapid deployment. We will present the performance of HDFS in both canonical I/O tests and for a typical data analysis pattern within the RHIC/STAR experimental framework. These tests explore the scaling with different levels of redundancy and numbers of clients. Additionally, the performance of FUSE and NFS interfaces to HDFS were evaluated as a way to allow existing software to function without modification. Unfortunately, the complicated data structures in NPP are non-trivial to integrate with Hadoop and so many of the benefits of the MapReduce paradigm could not be directly realized. Despite this, our results indicate that using HDFS as a distributed filesystem offers reasonable performance and scalability and that it excels in its ease of configuration and deployment in a cloud environment.
A Data Handling System for Modern and Future Fermilab Experiments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Illingworth, R. A.
2014-01-01
Current and future Fermilab experiments such as Minerva, NOνA, and MicroBooNE are now using an improved version of the Fermilab SAM data handling system. SAM was originally used by the CDF and D0 experiments for Run II of the Fermilab Tevatron to provide file metadata and location cataloguing, uploading of new files to tape storage, dataset management, file transfers between global processing sites, and processing history tracking. However, SAM was heavily tailored to the Run II environment and required complex and hard-to-deploy client software, which made it hard to adapt to new experiments. The Fermilab Computing Sector has progressively updated SAM to use modern, standardized technologies in order to more easily deploy it for current and upcoming Fermilab experiments, and to support the data preservation efforts of the Run II experiments.
Storage Media for Microcomputers.
ERIC Educational Resources Information Center
Trautman, Rodes
1983-01-01
Reviews computer storage devices designed to provide additional memory for microcomputers--chips, floppy disks, hard disks, optical disks--and describes how secondary storage is used (file transfer, formatting, ingredients of incompatibility); disk/controller/software triplet; magnetic tape backup; storage volatility; disk emulator; and…
Network issues for large mass storage requirements
NASA Technical Reports Server (NTRS)
Perdue, James
1992-01-01
File servers and supercomputing environments need high performance networks to balance the I/O requirements seen in today's demanding computing scenarios. UltraNet is one solution which permits both high aggregate transfer rates and high task-to-task transfer rates, as demonstrated in actual tests. UltraNet provides this capability as both a server-to-server and server-to-client access network, giving the supercomputing center the following advantages: highest-performance transport-level connections (up to 40 MBytes/sec effective rates); throughput matching the emerging high-performance disk technologies, such as RAID, parallel head transfer devices and software striping; support for standard network and file system applications using a sockets-based application program interface, such as FTP, rcp, rdump, etc.; support for access to the Network File System (NFS) and large aggregate bandwidth for heavy NFS usage; access to a distributed, hierarchical data server capability using the DISCOS UniTree product; and support for file server solutions available from multiple vendors, including Cray, Convex, Alliant, FPS, IBM, and others.
Peregrine System Configuration | High-Performance Computing | NREL
Compute nodes and storage are connected by a high-speed InfiniBand network. Compute nodes are diskless; user directories are mounted on all nodes, along with a file system dedicated to shared projects. Nodes have processors with 64 GB of memory.
Storage, retrieval, and edit of digital video using Motion JPEG
NASA Astrophysics Data System (ADS)
Sudharsanan, Subramania I.; Lee, D. H.
1994-04-01
In a companion paper we describe a Micro Channel adapter card that can perform real-time JPEG (Joint Photographic Experts Group) compression of a 640 by 480 24-bit image within 1/30th of a second. Since this corresponds to NTSC video rates at considerably good perceptual quality, this system can be used for real-time capture and manipulation of continuously fed video. To facilitate capturing the compressed video on a storage medium, an IBM bus-master SCSI adapter with cache is utilized. The efficacy of the data transfer mechanism is considerably improved using the System Control Block architecture, an extension for Micro Channel bus masters. Experimental results show that the overall system can sustain compressed data rates of about 1.5 MBytes/second, with sporadic peaks to about 1.8 MBytes/second depending on the image sequence content. We also describe mechanisms to access the compressed data very efficiently through special file formats. This in turn permits the creation of simpler sequence editors. Another advantage of the special file format is easy control of forward, backward and slow-motion playback. The proposed method can be extended to the design of a video compression subsystem for a variety of personal computing systems.
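The "special file format" idea, an index of per-frame byte offsets stored alongside the Motion JPEG stream so playback can jump to any frame or walk the table backwards for reverse play, can be sketched as follows. The container layout here is invented for illustration, not the paper's actual format.

```python
# Sketch: concatenated JPEG frames followed by an offset table and a
# fixed-size trailer, enabling random access to any frame.
import struct

def write_mjpeg(path, frames):
    with open(path, "wb") as f:
        offsets = []
        for jpeg in frames:                    # concatenated JPEG frames
            offsets.append(f.tell())
            f.write(jpeg)
        index_pos = f.tell()
        for off in offsets:                    # offset table after the frames
            f.write(struct.pack("<Q", off))
        f.write(struct.pack("<QI", index_pos, len(offsets)))   # trailer

def read_frame(path, n):
    with open(path, "rb") as f:
        f.seek(-12, 2)                         # trailer: index pos + count
        index_pos, count = struct.unpack("<QI", f.read(12))
        f.seek(index_pos + 8 * n)
        (off,) = struct.unpack("<Q", f.read(8))
        end = index_pos if n == count - 1 else struct.unpack(
            "<Q", f.read(8))[0]                # next offset bounds this frame
        f.seek(off)
        return f.read(end - off)

write_mjpeg("clip.mjpeg", [b"frame0-jpeg-bytes", b"frame1-jpeg-bytes"])
print(read_frame("clip.mjpeg", 1))
```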
DPM — efficient storage in diverse environments
NASA Astrophysics Data System (ADS)
Hellmich, Martin; Furano, Fabrizio; Smith, David; Brito da Rocha, Ricardo; Álvarez Ayllón, Alejandro; Manzi, Andrea; Keeble, Oliver; Calvet, Ivan; Regala, Miguel Antonio
2014-06-01
Recent developments, including low power devices, cluster file systems and cloud storage, represent an explosion in the possibilities for deploying and managing grid storage. In this paper we present how different technologies can be leveraged to build a storage service with differing cost, power, performance, scalability and reliability profiles, using the popular storage solution Disk Pool Manager (DPM/dmlite) as the enabling technology. The storage manager DPM is designed for these new environments, allowing users to scale up and down as they need, and optimizing their computing centers' energy efficiency and costs. DPM runs on high-performance machines, profiting from multi-core and multi-CPU setups. It supports separating the database from the metadata server, the head node, largely reducing the latter's hard disk requirements. Since version 1.8.6, DPM is released in EPEL and Fedora, simplifying distribution and maintenance, and it also supports the ARM architecture besides i386 and x86_64, allowing it to run on the smallest low-power machines such as the Raspberry Pi or the CuBox. This usage is facilitated by the possibility to scale horizontally using a main database and a distributed memcached-powered namespace cache. Additionally, DPM supports a variety of storage pools in the backend, most importantly HDFS, S3-enabled storage, and cluster file systems, allowing users to fit their DPM installation exactly to their needs. In this paper, we investigate the power-efficiency and total cost of ownership of various DPM configurations. We develop metrics to evaluate the expected performance of a setup, both in terms of namespace and disk access, considering the overall cost including equipment, power consumption, and data/storage fees. The setups tested range from the lowest scale, using Raspberry Pis with only 700 MHz single cores and 100 Mbps network connections, over conventional multi-core servers, to typical virtual machine instances in cloud settings. We evaluate combinations of different name server setups, for example load-balanced clusters, with different storage setups, from a classic local configuration to private and public clouds.
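As a rough illustration of the kind of cost metric the paper develops, the following sketch (my own construction, not the authors') folds equipment cost, power draw, and storage fees into a single cost-per-million-requests figure; all names and numbers are hypothetical.

```python
# Illustrative sketch (not from the paper): a toy total-cost-of-ownership
# metric combining hardware, power, and storage fees per served request.
from dataclasses import dataclass

@dataclass
class Setup:
    name: str
    hardware_eur: float         # one-off equipment cost
    power_watts: float          # average draw
    storage_fees_eur_yr: float  # e.g. cloud or data fees
    requests_per_sec: float     # measured namespace or disk throughput

def tco_per_million_requests(s: Setup, years: float = 3.0,
                             eur_per_kwh: float = 0.25) -> float:
    """Total cost over the amortisation period divided by requests served."""
    energy_eur = s.power_watts / 1000.0 * 24 * 365 * years * eur_per_kwh
    total_eur = s.hardware_eur + energy_eur + s.storage_fees_eur_yr * years
    served = s.requests_per_sec * 3600 * 24 * 365 * years
    return total_eur / served * 1e6

# Example: a Raspberry Pi head node versus a conventional server
# (all figures invented for illustration).
pi = Setup("raspberry-pi", 50, 3.5, 0, 40)
server = Setup("rack-server", 4000, 250, 0, 5000)
for s in (pi, server):
    print(s.name, round(tco_per_million_requests(s), 2), "EUR / 1M requests")
```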
Multipurpose Controller with EPICS integration and data logging: BPM application for ESS Bilbao
NASA Astrophysics Data System (ADS)
Arredondo, I.; del Campo, M.; Echevarria, P.; Jugo, J.; Etxebarria, V.
2013-10-01
This work presents a multipurpose configurable control system which can be integrated in an EPICS control network, with this functionality configured through an XML configuration file. The core of the system is the so-called Hardware Controller, which is in charge of the control hardware management, the set-up and communication with the EPICS network, and the data storage. The reconfigurable nature of the controller is based on a single XML file, allowing any final user to easily modify and adjust the control system to any specific requirement. The selected Java development environment ensures multiplatform operation and large versatility, even regarding the hardware to be controlled. Specifically, this paper, focused on fast control based on a high performance FPGA, also describes an application approach for ESS Bilbao's Beam Position Monitoring system. The implementation of the XML configuration file and the satisfactory performance achieved are presented, as well as a general description of the Multipurpose Controller itself.
Joseph, Mercy; Malhotra, Amit; Rao, Murali; Sharma, Abhimanyu; Talwar, Sangeeta
2016-01-01
Background: Complete removal of old filling material during root canal retreatment is fundamental for predictable cleaning and shaping of the canal anatomy. Most of the retreatment methods tested in earlier studies have been unable to achieve complete removal of root canal filling. The aim of this investigation was therefore to assess the efficacy of three different rotary nickel titanium retreatment systems and Hedstrom files in removing filling material from root canals. Material and Methods: Sixty extracted mandibular premolars were decoronated to leave 15 mm of root. Specimens were hand instrumented and obturated using gutta percha and AH Plus root canal sealer. After a storage period of two weeks, roots were retreated with three rotary retreatment instrument systems (Protaper retreatment files, Mtwo retreatment files, NRT GPR) and Hedstrom files. Subsequently, samples were sectioned longitudinally and examined under a stereomicroscope. Digital images were recorded and evaluated using digital image analysing software. The retreatment time was recorded for each tooth using a stopwatch. The area of the canal and the residual filling material were recorded in mm2 and the percentage of remaining filling material on the canal walls was calculated. Data were analysed using the ANOVA test. Results: Significantly less residual filling material was present in Protaper and Mtwo instrumented teeth (p < 0.05) compared to the NRT GPR and Hedstrom files groups. Protaper instruments also required less time for removal of filling material, followed by Mtwo instruments, NRT GPR files and Hedstrom files. Conclusions: None of the instruments was able to remove the filling material completely from the root canal. The Protaper universal retreatment system and Mtwo retreatment files were more efficient and faster compared to NRT GPR files and Hedstrom files. Key words: Gutta-percha removal, nickel titanium, root canal retreatment, rotary instruments. PMID:27703601
Archival storage solutions for PACS
NASA Astrophysics Data System (ADS)
Chunn, Timothy
1997-05-01
While there are many, one of the inhibitors to the widespread diffusion of PACS systems has been the lack of robust, cost-effective digital archive storage solutions. Moreover, an automated nearline solution is key to a central, sharable data repository, enabling many applications such as PACS, telemedicine and teleradiology, and information warehousing and data mining for research such as patient outcome analysis. Selecting the right solution depends on a number of factors: capacity requirements, write and retrieval performance requirements, scalability in capacity and performance, configuration architecture and flexibility, subsystem availability and reliability, security requirements, system cost, achievable benefits and cost savings, investment protection, strategic fit and more. This paper addresses many of these issues. It compares and positions optical disk and magnetic tape technologies, which are the predominant archive media today. Price and performance comparisons will be made at different archive capacities, and the effect of file size on storage system throughput will be analyzed. The concept of automated migration of images from high performance, high cost storage devices to high capacity, low cost storage devices will be introduced as a viable way to minimize overall storage costs for an archive. The concept of access density will also be introduced and applied to the selection of the most cost-effective archive solution.
78 FR 20908 - Combined Notice of Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2013-04-08
.... Description: UGI Storage Compliance Filing TL-96 to be effective 4/ 1/2013. Filed Date: 3/28/13. Accession... Eastern Pipe Line Company, LP. Description: Flow Through of Cash-Out Revenues filed on 3-28-13. Filed Date: 3/28/13. Accession Number: 20130328-5022. Comments Due: 5 p.m. ET 4/9/13. Docket Numbers: RP13-726...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-20
....315(c), must be filed in accordance with the NRC E-Filing rule (72 FR 49139, August 28, 2007). The E... with the procedural requirements of E-Filing, at least ten (10) days prior to the filing deadline, the... the NRC in accordance with the E-Filing rule, the participant must file the document using the NRC's...
78 FR 5792 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-28
... be effective on February 28, 2013 unless comments are received which result in a contrary...: Delete entry and replace with ``Name, Social Security Number (SSN) and/or DoD ID Number, home address... ``Paper file folders and electronic storage media.'' * * * * * Safeguards: Delete entry and replace with...
78 FR 73516 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-06
... proposed action will be effective on January 6, 2014 unless comments are received which result in a..., Social Security Number (SSN), and case docket number and may include the individual's home address and... entry and replace with ``Paper file folders and electronic storage media.'' Retrievability: Delete entry...
Code of Federal Regulations, 2011 CFR
2011-04-01
... occupied had it been in operation. (3) A proposed annual rule of operation for the storage reservoir or... the reservoir or reservoirs showing the maximum, average, and minimum operating pool levels, the..., weekly and daily, during periods of low and normal flows after the plant is in operation and the system...
Integrated Geothermal-CO2 Storage Reservoirs: FY1 Final Report
Buscheck, Thomas A.
2012-01-01
The purpose of Phase 1 is to determine the feasibility of integrating geologic CO2 storage (GCS) with geothermal energy production. Phase 1 includes reservoir analyses to determine injector/producer well schemes that balance the generation of economically useful flow rates at the producers with the need to manage reservoir overpressure, reducing the associated risks such as induced seismicity and CO2 leakage to overlying aquifers. This submittal contains the input and output files of the reservoir model analyses. A reservoir-model "index.html" file was sent in a previous submittal to organize the reservoir-model input and output files according to the sections of the FY1 Final Report to which they pertain. The recipient should save the file Reservoir-models-inputs-outputs-index.html in the same directory in which the Section2.1.*.tar.gz files are saved.
77 FR 23474 - Combined Notice of Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2012-04-19
...: Young Gas Storage Company, Ltd. Description: EBB Notice Categories to be effective 5/15/2012. Filed Date... intervention is necessary to become a party to the proceeding. Filings in Existing Proceedings Docket Numbers... requirements, interventions, protests, and service can be found at: http://www.ferc.gov/docs-filing/efiling...
78 FR 13050 - Combined Notice of Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2013-02-26
... be considered, but intervention is necessary to become a party to the proceeding. Filings in Existing Proceedings Docket Numbers: RP13-106-002. Applicants: Young Gas Storage Company, Ltd. Description: Young NAESB..., protests, and service can be found at: http://www.ferc.gov/docs-filing/efiling/filing-req.pdf . For other...
SIOExplorer: Modern IT Methods and Tools for Digital Library Management
NASA Astrophysics Data System (ADS)
Sutton, D. W.; Helly, J.; Miller, S.; Chase, A.; Clark, D.
2003-12-01
With more geoscience disciplines becoming data-driven it is increasingly important to utilize modern techniques for data, information and knowledge management. SIOExplorer is a new digital library project with 2 terabytes of oceanographic data collected over the last 50 years on 700 cruises by the Scripps Institution of Oceanography. It is built using a suite of information technology tools and methods that allow for an efficient and effective digital library management system. The library consists of a number of independent collections, each with corresponding metadata formats. The system architecture allows each collection to be built and uploaded based on a collection dependent metadata template file (MTF). This file is used to create the hierarchical structure of the collection, create metadata tables in a relational database, and to populate object metadata files and the collection as a whole. Collections are comprised of arbitrary digital objects stored at the San Diego Supercomputer Center (SDSC) High Performance Storage System (HPSS) and managed using the Storage Resource Broker (SRB), data handling middle ware developed at SDSC. SIOExplorer interoperates with other collections as a data provider through the Open Archives Initiative (OAI) protocol. The user services for SIOExplorer are accessed from CruiseViewer, a Java application served using Java Web Start from the SIOExplorer home page. CruiseViewer is an advanced tool for data discovery and access. It implements general keyword and interactive geospatial search methods for the collections. It uses a basemap to georeference search results on user selected basemaps such as global topography or crustal age. User services include metadata viewing, opening of selective mime type digital objects (such as images, documents and grid files), and downloading of objects (including the brokering of proprietary hold restrictions).
The amino acid's backup bone - storage solutions for proteomics facilities.
Meckel, Hagen; Stephan, Christian; Bunse, Christian; Krafzik, Michael; Reher, Christopher; Kohl, Michael; Meyer, Helmut Erich; Eisenacher, Martin
2014-01-01
Proteomics methods, especially high-throughput mass spectrometry analysis, have been continually developed and improved over the years. The analysis of complex biological samples produces large volumes of raw data. Data storage and recovery management pose substantial challenges to biomedical or proteomics facilities regarding backup and archiving concepts as well as hardware requirements. In this article we describe the differences between the terms backup and archive with regard to manual and automatic approaches. We also introduce different storage concepts and technologies, from transportable media to professional solutions such as redundant array of independent disks (RAID) systems, network attached storage (NAS) and storage area networks (SAN). Moreover, we present a software solution, which we developed for the purpose of long-term preservation of large mass spectrometry raw data files on an object storage device (OSD) archiving system. Finally, advantages, disadvantages, and experiences from routine operations of the presented concepts and technologies are evaluated and discussed. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.
75 FR 63466 - Sawgrass Storage LLC; Notice of Petition
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-15
... Storage LLC; Notice of Petition October 7, 2010. Take notice that on September 27, 2010, Sawgrass Storage LLC (Sawgrass Storage), 3333 Warrenville Road, Suite 630, Lisle, Illinois 60532, filed in Docket No..., Sawgrass Storage proposes to drill a test well to determine the feasibility of developing a depleted...
NASA Technical Reports Server (NTRS)
Shields, Michael F.
1993-01-01
The need to manage large amounts of data on robotically controlled devices has been critical to the mission of this Agency for many years. In many respects this Agency has helped pioneer, with its industry counterparts, the development of a number of products long before these systems became commercially available. Numerous attempts have been made to field both robotically controlled tape and optical disk technology and systems to satisfy our tertiary storage needs. Custom developed products were architected, designed, and developed without vendor partners over the past two decades to field workable systems to handle our ever increasing storage requirements. Many of the attendees of this symposium are familiar with some of the older products, such as the Braegen Automated Tape Libraries (ATLs), the IBM 3850, and the Ampex TeraStore, just to name a few. In addition, we embarked on an in-house development of a shared disk input/output support processor to manage our ever increasing tape storage needs. For all intents and purposes, this system was a file server by current definitions, which used CDC Cyber computers as the control processors. It served us well and was only recently removed from production usage.
An ECG ambulatory system with mobile embedded architecture for ST-segment analysis.
Miranda-Cid, Alejandro; Alvarado-Serrano, Carlos
2010-01-01
A prototype of an ECG ambulatory system for long-term monitoring of the ST segment in 3 leads, with low power, portability, and data storage on solid state memory cards, has been developed. The solution presented is based on the mobile embedded architecture of a portable entertainment device, used as a tool for storage and processing of bioelectric signals, and a mid-range RISC microcontroller, the PIC 16F877, which performs the digitalization and transmission of the ECG. The ECG amplifier stage uses low power and a unipolar voltage supply, and presents minimal distortion from the phase response of the high-pass filter in the ST segment. We developed an algorithm that manages access to files through a FAT32 implementation, and the ECG display on the device screen. The records are stored in TXT format for further processing. After acquisition, the implemented system works as a standard USB mass storage device.
Umeda, Akira; Iwata, Yasushi; Okada, Yasumasa; Shimada, Megumi; Baba, Akiyasu; Minatogawa, Yasuyuki; Yamada, Takayasu; Chino, Masao; Watanabe, Takafumi; Akaishi, Makoto
2004-12-01
The high cost of digital echocardiographs and the large size of data files hinder the adoption of remote diagnosis of digitized echocardiography data. We have developed a low-cost digital filing system for echocardiography data. In this system, data from a conventional analog echocardiograph are captured using a personal computer (PC) equipped with an analog-to-digital converter board. Motion picture data are promptly compressed using a Moving Picture Experts Group (MPEG) 4 codec. The digitized data with preliminary reports obtained in a rural hospital are then sent to cardiologists at distant urban general hospitals via the internet. The cardiologists can evaluate the data using widely available movie-viewing software (Windows Media Player). The diagnostic accuracy of this double-check system was confirmed by comparison with ordinary super-VHS videotapes. We have demonstrated that digitization of echocardiography data from a conventional analog echocardiograph and MPEG 4 compression can be performed using an ordinary PC-based system, and that this system enables highly efficient digital storage and remote diagnosis at low cost.
CruiseViewer: SIOExplorer Graphical Interface to Metadata and Archives.
NASA Astrophysics Data System (ADS)
Sutton, D. W.; Helly, J. J.; Miller, S. P.; Chase, A.; Clark, D.
2002-12-01
We are introducing "CruiseViewer" as a prototype graphical interface for the SIOExplorer digital library project, part of the overall NSF National Science Digital Library (NSDL) effort. When complete, CruiseViewer will provide access to nearly 800 cruises, as well as 100 years of documents and images from the archives of the Scripps Institution of Oceanography (SIO). The project emphasizes data object accessibility, a rich metadata format, efficient uploading methods and interoperability with other digital libraries. The primary function of CruiseViewer is to provide a human interface to the metadata database and to storage systems filled with archival data. The system schema is based on the concept of an "arbitrary digital object" (ADO). Arbitrary in that, if the object can be stored on a computer system, then SIOExplorer can manage it. Common examples are a multibeam swath bathymetry file, a .pdf cruise report, or a tar file containing all the processing scripts used on a cruise. We require a metadata file for every ADO in an ASCII "metadata interchange format" (MIF), which has proven to be highly useful for operability and extensibility. Bulk ADO storage is managed using the Storage Resource Broker (SRB), data handling middleware developed at the San Diego Supercomputer Center that centralizes management of and access to distributed storage devices. MIF metadata are harvested from several sources and housed in a relational (Oracle) database. For CruiseViewer, CGI scripts resident on an Apache server are the primary communication and service request handling tools. Along with the CruiseViewer Java application, users can query, access and download objects via a separate method that operates through standard web browsers, http://sioexplorer.ucsd.edu. Both provide the functionality to query and view object metadata, and to select and download ADOs. For the CruiseViewer application, Java 2D is used to add a geo-referencing feature that allows users to select basemap images and have vector shapes representing query results mapped over the basemap in the image panel. The two methods together address a wide range of user access needs and will allow for widespread use of SIOExplorer.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-09-29
... NRC E-Filing rule, which the NRC promulgated on August 28, 2007 (72 FR 49139). All documents filed in... submission of a request for hearing or petition to intervene, must be filed in accordance with the E-Filing rule. The E-Filing rule requires participants to submit and serve all adjudicatory documents over the...
Is HDF5 a Good Format to Replace UVFITS?
NASA Astrophysics Data System (ADS)
Price, D. C.; Barsdell, B. R.; Greenhill, L. J.
2015-09-01
The FITS (Flexible Image Transport System) data format was developed in the late 1970s for the storage and exchange of astronomy-related image data. Since then, it has become a standard file format not only for images, but also for radio interferometer data (e.g. UVFITS, FITS-IDI). But is FITS the right format for next-generation telescopes to adopt? The newer Hierarchical Data Format (HDF5) file format offers considerable advantages over FITS, but has yet to gain widespread adoption within radio astronomy. One of the major holdbacks is that HDF5 is not well supported by data reduction software packages. Here, we present a comparison of FITS, HDF5, and the MeasurementSet (MS) format for the storage of interferometric data. In addition, we present a tool for converting between formats. We show that the underlying data model of FITS can be ported to HDF5, a first step toward achieving wider HDF5 support.
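To make the porting idea concrete, here is a minimal sketch (not the authors' tool) of mapping a FITS HDU hierarchy onto HDF5 groups, attributes, and datasets, assuming the astropy and h5py packages; the file names are hypothetical.

```python
# Minimal sketch of porting a FITS HDU hierarchy to HDF5 groups/datasets,
# in the spirit of the paper's converter (not the authors' actual tool).
# Requires astropy and h5py.
from astropy.io import fits
import h5py

def fits_to_hdf5(fits_path: str, h5_path: str) -> None:
    with fits.open(fits_path) as hdul, h5py.File(h5_path, "w") as h5:
        for i, hdu in enumerate(hdul):
            grp = h5.create_group(f"HDU{i}")
            # Header cards map naturally onto HDF5 attributes
            # (repeated keywords such as COMMENT collapse to the last value).
            for key, value in hdu.header.items():
                if key:  # skip cards with blank keywords
                    grp.attrs[str(key)] = str(value)
            # The image or table payload becomes a compressed dataset.
            if hdu.data is not None:
                grp.create_dataset("data", data=hdu.data, compression="gzip")

fits_to_hdf5("visibilities.uvfits", "visibilities.h5")  # hypothetical names
```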
The Iranian National Geodata Revision Strategy and Realization Based on Geodatabase
NASA Astrophysics Data System (ADS)
Haeri, M.; Fasihi, A.; Ayazi, S. M.
2012-07-01
In recent years, the use of spatial databases for storing and managing spatial data has become a hot topic in the field of GIS. Accordingly, the National Cartographic Center of Iran (NCC) produces, from time to time, spatial data which is usually included in databases. One of the NCC's major projects was designing the National Topographic Database (NTDB). NCC decided to create a National Topographic Database of the entire country based on 1:25000 coverage maps. The standard of the NTDB was published in 1994 and its database was created at the same time. In the NTDB, geometric data was stored in MicroStation design format (DGN), in which each feature has a link to its attribute data (stored in a Microsoft Access file). NTDB files were also produced in a sheet-wise mode and then stored in a file-based style. Besides map compilation, revision of existing maps has already started. The key problems for NCC are the revision strategy, the NTDB's file-based storage style, and operator challenges (NCC operators mostly prefer to edit and revise geometry data in CAD environments). A GeoDatabase solution for national geodata, based on NTDB map files and operators' revision preferences, is introduced and released herein. The proposed solution extends the traditional methods to a seamless spatial database which can be revised in CAD and GIS environments simultaneously. The proposed system is the common data framework to create a central data repository for spatial data storage and management.
NASA Astrophysics Data System (ADS)
Suftin, I.; Read, J. S.; Walker, J.
2013-12-01
Scientists prefer not having to be tied down to a specific machine or operating system in order to analyze local and remote data sets or publish work. Increasingly, analysis has been migrating to decentralized web services and data sets, using web clients to provide the analysis interface. While simplifying workflow access, analysis, and publishing of data, the move does bring with it its own unique set of issues. Web clients used for analysis typically offer workflows geared towards a single user, with steps and results that are often difficult to recreate and share with others. Furthermore, workflow results often may not be easily used as input for further analysis. Older browsers further complicate things by having no way to maintain larger chunks of information, often offloading the job of storage to the back-end server or trying to squeeze it into a cookie. It has been difficult to provide a concept of "session storage" or "workflow sharing" without a complex orchestration of the back-end for storage depending on either a centralized file system or database. With the advent of HTML5, browsers gained the ability to store more information through the use of the Web Storage API (a browser-cookie holds a maximum of 4 kilobytes). Web Storage gives us the ability to store megabytes of arbitrary data in-browser either with an expiration date or just for a session. This allows scientists to create, update, persist and share their workflow without depending on the backend to store session information, providing the flexibility for new web-based workflows to emerge. In the DSASWeb portal ( http://cida.usgs.gov/DSASweb/ ), using these techniques, the representation of every step in the analyst's workflow is stored as plain-text serialized JSON, which we can generate as a text file and provide to the analyst as an upload. This file may then be shared with others and loaded back into the application, restoring the application to the state it was in when the session file was generated. A user may then view results produced during that session or go back and alter input parameters, creating new results and producing new, unique sessions which they can then again share. This technique not only provides independence for the user to manage their session as they like, but also allows much greater freedom for the application provider to scale out without having to worry about carrying over user information or maintaining it in a central location.
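The session mechanism described above amounts to a JSON round trip: serialize each workflow step to plain text, hand the file to the analyst, and later replay it to restore state. A minimal sketch of that round trip follows (in Python for illustration; the portal itself performs this in the browser via Web Storage), with the tool names and parameters invented for the example.

```python
# Sketch of the session round-trip described above: every workflow step is
# kept as plain-text JSON, exported as a file, and re-imported to restore
# state. (Illustrative only; the DSASweb portal does this in the browser.)
import json

session = {"steps": []}

def record_step(tool: str, params: dict, result_id: str) -> None:
    """Append one workflow step to the in-memory session."""
    session["steps"].append({"tool": tool, "params": params,
                             "result": result_id})

# Hypothetical analysis steps.
record_step("shoreline-extract", {"threshold": 0.4}, "res-001")
record_step("rate-calc", {"method": "linear-regression"}, "res-002")

# Export: hand the analyst a portable text file.
with open("session.json", "w") as f:
    json.dump(session, f, indent=2)

# Import: restore the application to the state the file describes.
with open("session.json") as f:
    restored = json.load(f)
assert restored == session  # replay restored["steps"] to rebuild results
```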
mz5: Space- and Time-efficient Storage of Mass Spectrometry Data Sets*
Wilhelm, Mathias; Kirchner, Marc; Steen, Judith A. J.; Steen, Hanno
2012-01-01
Across a host of MS-driven-omics fields, researchers witness the acquisition of ever increasing amounts of high throughput MS data and face the need for their compact yet efficiently accessible storage. Addressing the need for an open data exchange format, the Proteomics Standards Initiative and the Seattle Proteome Center at the Institute for Systems Biology independently developed the mzData and mzXML formats, respectively. In a subsequent joint effort, they defined an ontology and associated controlled vocabulary that specifies the contents of MS data files, implemented as the newer mzML format. All three formats are based on XML and are thus not particularly efficient in either storage space requirements or read/write speed. This contribution introduces mz5, a complete reimplementation of the mzML ontology that is based on the efficient, industrial strength storage backend HDF5. Compared with the current mzML standard, this strategy yields an average file size reduction to ∼54% and increases linear read and write speeds ∼3–4-fold. The format is implemented as part of the ProteoWizard project and is available under a permissive Apache license. Additional information and download links are available from http://software.steenlab.org/mz5. PMID:21960719
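The following sketch illustrates the general technique mz5 exploits, namely spectra stored as chunked, compressed HDF5 datasets that support fast partial reads, using h5py; the group and attribute layout shown is an assumption for illustration, not the actual mz5 schema.

```python
# Illustrative sketch of the general approach (chunked, compressed HDF5
# datasets for spectra); this is NOT the actual mz5 schema.
import h5py
import numpy as np

rng = np.random.default_rng(0)
mz = np.sort(rng.uniform(100.0, 2000.0, size=10_000))
intensity = rng.exponential(1e4, size=10_000).astype(np.float32)

with h5py.File("run.mz5-sketch.h5", "w") as f:
    spec = f.create_group("spectrum_0001")
    # Chunking plus gzip is what buys the file-size reduction over XML.
    spec.create_dataset("mz", data=mz, chunks=True, compression="gzip")
    spec.create_dataset("intensity", data=intensity,
                        chunks=True, compression="gzip")
    spec.attrs["ms_level"] = 1

with h5py.File("run.mz5-sketch.h5") as f:
    print(f["spectrum_0001/mz"][:10])  # fast partial reads, no XML parsing
```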
Partial storage optimization and load control strategy of cloud data centers.
Al Nuaimi, Klaithem; Mohamed, Nader; Al Nuaimi, Mariam; Al-Jaroodi, Jameela
2015-01-01
We present a novel approach to solve the cloud storage issues and provide a fast load balancing algorithm. Our approach is based on partitioning and concurrent dual direction download of the files from multiple cloud nodes. Partitions of the files are saved on the cloud rather than the full files, which provides a good optimization of the cloud storage usage. Only partial replication is used in this algorithm to ensure the reliability and availability of the data. Our focus is to improve the performance and optimize the storage usage by providing DaaS on the cloud. This algorithm solves the problem of having to fully replicate large data sets, which uses up a lot of precious space on the cloud nodes. Reducing the space needed will help in reducing the cost of providing such space. Moreover, performance is also increased, since multiple cloud servers will collaborate to provide the data to the cloud clients in a faster manner. PMID:25973444
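A toy sketch of the two ideas in this abstract (partitions placed on cloud nodes with partial replication, then read back concurrently from both ends) is given below; the round-robin placement, replication factor, and in-memory stand-in for the cloud nodes are my assumptions, not details from the paper.

```python
# Toy sketch: partition a file across nodes with partial replication, then
# read it back concurrently from both ends ("dual direction"). The placement
# policy and all names are my assumptions, not the paper's algorithm.
from concurrent.futures import ThreadPoolExecutor

NODES = ["node-a", "node-b", "node-c"]
REPLICAS = 2  # partial replication: each partition lives on 2 of 3 nodes

store = {}  # (partition, node) -> bytes, standing in for real cloud nodes

def place(partitions: int) -> dict[int, list[str]]:
    """Round-robin assignment of each partition to REPLICAS nodes."""
    return {p: [NODES[(p + r) % len(NODES)] for r in range(REPLICAS)]
            for p in range(partitions)}

def write(data: bytes, partitions: int = 6) -> dict[int, list[str]]:
    layout, size = place(partitions), -(-len(data) // partitions)
    for p, nodes in layout.items():
        for n in nodes:
            store[(p, n)] = data[p * size:(p + 1) * size]
    return layout

def read(layout: dict[int, list[str]]) -> bytes:
    # Dual direction: one worker walks partitions from the front, the
    # other from the back, each fetching a partition's first replica.
    order = list(layout)
    halves = [order[:len(order) // 2], order[len(order) // 2:][::-1]]
    def fetch(ps): return {p: store[(p, layout[p][0])] for p in ps}
    got = {}
    with ThreadPoolExecutor(2) as ex:
        for part in ex.map(fetch, halves):
            got.update(part)
    return b"".join(got[p] for p in sorted(got))

layout = write(b"0123456789abcdefghijklmnopqrstuvwxyz")
assert read(layout) == b"0123456789abcdefghijklmnopqrstuvwxyz"
```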
Federal Register 2010, 2011, 2012, 2013, 2014
2011-12-22
... project would be a closed-loop pumped storage system, with an initial fill from the existing Otter Creek...: Federal Power Act 16 U.S.C. 791(a)-825(r). h. Applicant Contact: Parker Knoll Hydro, LLC., 975 South State... system; (12) approximately 1 mile of 345-kV transmission line; and (13) appurtenant facilities. The...
NASA Technical Reports Server (NTRS)
Snyder, W. V.; Hanson, R. J.
1986-01-01
Text Exchange System (TES) exchanges and maintains organized textual information including source code, documentation, data, and listings. System consists of two computer programs and definition of format for information storage. Comprehensive program used to create, read, and maintain TES files. TES developed to meet three goals: First, easy and efficient exchange of programs and other textual data between similar and dissimilar computer systems via magnetic tape. Second, provide transportable management system for textual information. Third, provide common user interface, over wide variety of computing systems, for all activities associated with text exchange.
18 CFR 11.16 - Filing requirements.
Code of Federal Regulations, 2010 CFR
2010-04-01
... generating capacity separately designated. (3) A description of the total storage capacity of the reservoir..., irrigation storage, and flood control storage. Identification, by reservoir elevation, of the portion of the reservoir assigned to each of its respective storage functions. (4) An elevation-capacity curve, or a...
18 CFR 11.16 - Filing requirements.
Code of Federal Regulations, 2011 CFR
2011-04-01
... generating capacity separately designated. (3) A description of the total storage capacity of the reservoir..., irrigation storage, and flood control storage. Identification, by reservoir elevation, of the portion of the reservoir assigned to each of its respective storage functions. (4) An elevation-capacity curve, or a...
A land-surface Testbed for EOSDIS
NASA Technical Reports Server (NTRS)
Emery, William; Kelley, Tim
1994-01-01
The main objective of the Testbed project was to deliver satellite images via the Internet to scientific and educational users free of charge. The main method of operation was to store satellite images on a low cost tape library system, visually browse the raw satellite data, access the raw data files, navigate the imagery through 'C' programming and X-Windows interface software, and deliver the finished image to the end user over the Internet by means of file transfer protocol methods. The conclusion is that the distribution of satellite imagery by means of the Internet is feasible, and that the archiving of large data sets can be accomplished with low cost storage systems supporting multiple users.
[A new tool for retrieving clinical data from various sources].
Nielsen, Erik Waage; Hovland, Anders; Strømsnes, Oddgeir
2006-02-23
A doctor's tool for extracting clinical data on groups of hospital patients from various sources into one file has been in demand. For this purpose we evaluated Qlikview. Based on clinical information required by two cardiologists, an IT specialist with thorough knowledge of the hospital's data system (www.dips.no) took 30 days to assemble one Qlikview file. Data was also assembled from a pre-hospital ambulance system. The 13 Mb Qlikview file held various information on 12,430 patients admitted to the cardiac unit 26,287 times over the last 21 years. Also included were 530,912 clinical laboratory analyses from these patients during the past five years. Some information required by the cardiologists was inaccessible due to lack of coding or data storage. Some databases could not export their data. Others were encrypted by the software company. A major part of the required data could be extracted to Qlikview. Searches went fast in spite of the huge amount of data. Qlikview could assemble clinical information for doctors from different data systems. Doctors from different hospitals could share and further refine empty Qlikview files for their own use. Once the file is assembled, doctors can, on their own, search for answers to constantly changing clinical questions, also at odd hours.
Dynamic federations: storage aggregation using open tools and protocols
NASA Astrophysics Data System (ADS)
Furano, Fabrizio; Brito da Rocha, Ricardo; Devresse, Adrien; Keeble, Oliver; Álvarez Ayllón, Alejandro; Fuhrmann, Patrick
2012-12-01
A number of storage elements now offer standard protocol interfaces like NFS 4.1/pNFS and WebDAV for access to their data repositories, in line with the standardization effort of the European Middleware Initiative (EMI). The LCG FileCatalogue (LFC) can also offer such features. Here we report on work that seeks to exploit the federation potential of these protocols and build a system that offers a unique view of the storage and metadata ensemble and the possibility of integration of other compatible resources such as those from cloud providers. The challenge, here undertaken by the providers of dCache and DPM, and pragmatically open to other Grid and Cloud storage solutions, is to build such a system while being able to accommodate name translations from existing catalogues (e.g. LFCs), experiment-based metadata catalogues, or stateless algorithmic name translations, also known as "trivial file catalogues". Such so-called storage federations of standard protocols-based storage elements give a unique view of their content, thus promoting simplicity in accessing the data they contain and offering new possibilities for resilience and data placement strategies. The goal is to consider HTTP and NFS4.1-based storage elements and metadata catalogues and make them able to cooperate through an architecture that properly feeds the redirection mechanisms that they are based upon, thus giving the functionalities of a "loosely coupled" storage federation. One of the key requirements is to use standard clients (provided by operating systems or open source distributions, e.g. Web browsers) to access an already aggregated system; this approach is quite different from aggregating the repositories at the client side through some wrapper API, such as GFAL, or by developing new custom clients. Other technical challenges that will determine the success of this initiative include performance, latency and scalability, and the ability to create worldwide storage federations that are able to redirect clients to repositories that they can efficiently access, for instance by trying to choose the closest endpoints or applying other criteria. We believe that the features of a loosely coupled federation of open-protocols-based storage elements will open many possibilities of evolving the current computing models without disrupting them, and, at the same time, will be able to operate with the existing infrastructures, follow their evolution path and add storage centers that can be acquired as a third-party service.
An analysis of file migration in a UNIX supercomputing environment
NASA Technical Reports Server (NTRS)
Miller, Ethan L.; Katz, Randy H.
1992-01-01
The supercomputer center at the National Center for Atmospheric Research (NCAR) migrates large numbers of files to and from its mass storage system (MSS) because there is insufficient space to store them on the Cray supercomputer's local disks. This paper presents an analysis of file migration data collected over two years. The analysis shows that requests to the MSS are periodic, with one day and one week periods. Read requests to the MSS account for the majority of the periodicity, as write requests are relatively constant over the course of a week. Additionally, reads show a far greater fluctuation than writes over a day and week, since reads are driven by human users while writes are machine-driven.
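Periodicity of this kind can be made visible with a simple spectral analysis of binned request counts, as in the following illustrative sketch (the synthetic data and FFT approach are mine, not necessarily the paper's method).

```python
# Sketch of detecting daily/weekly periodicity in request logs, in the
# spirit of the analysis above (my illustration, not the paper's method).
import numpy as np

hours = np.arange(24 * 7 * 8)                   # eight weeks of hourly bins
daily = 100 + 40 * np.sin(2 * np.pi * hours / 24)
weekly = 25 * np.sin(2 * np.pi * hours / (24 * 7))
reads = daily + weekly + np.random.default_rng(1).normal(0, 5, hours.size)

spectrum = np.abs(np.fft.rfft(reads - reads.mean()))
freqs = np.fft.rfftfreq(hours.size, d=1.0)      # cycles per hour
for k in np.argsort(spectrum)[-2:]:             # two strongest peaks
    print(f"period ~ {1 / freqs[k]:.1f} hours")  # expect ~24 and ~168
```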
Solar space and water heating system installed at Charlottesville, Virginia
NASA Technical Reports Server (NTRS)
1980-01-01
The solar energy system located at David C. Wilson Neuropsychiatric Hospital, Charlottesville, Virginia, is described. The solar energy system consists of 88 single glazed, Sunworks 'Solector' copper base plate collector modules, hot water coils in the hot air ducts, a Domestic Hot Water (DHW) preheat tank, a 3,000 gallon concrete urethane insulated storage tank and other miscellaneous components. Extracts from the site files, specifications, drawings, installation, operation and maintenance instructions are included.
1976-09-01
technology has made possible the deployment of very sophisticated and highly capable weapon systems. Taking advantage of this technology has carried... ...maintains the Configuration Item Identification File (CIIF). The CIIF provides storage and retrieval capability for technical and logistics data specified on
An Efficient Format for Nearly Constant-Time Access to Arbitrary Time Intervals in Large Trace Files
Chan, Anthony; Gropp, William; Lusk, Ewing
2008-01-01
A powerful method to aid in understanding the performance of parallel applications uses log or trace files containing time-stamped events and states (pairs of events). These trace files can be very large, often hundreds or even thousands of megabytes. Because of the cost of accessing and displaying such files, other methods are often used that reduce the size of the trace files at the cost of sacrificing detail or other information. This paper describes a hierarchical trace file format that provides for display of an arbitrary time window in a time independent of the total size of the file and roughly proportional to the number of events within the time window. This format eliminates the need to sacrifice data to achieve a smaller trace file size (since storage is inexpensive, it is necessary only to make efficient use of bandwidth to that storage). The format can be used to organize a trace file or to create a separate file of annotations that may be used with conventional trace files. We present an analysis of the time to access all of the events relevant to an interval of time and we describe experiments demonstrating the performance of this file format.
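The essence of the approach, locating a time window in time that does not grow with total file size, can be suggested with a much-simplified sketch: a sorted in-memory index searched by bisection, standing in for the paper's hierarchical on-disk format.

```python
# Minimal sketch of the idea: an index over event times lets a viewer jump
# straight to a window without scanning the whole trace. This flat,
# in-memory index is a simplification of the paper's hierarchical format.
import bisect

class TraceIndex:
    def __init__(self, events):        # events: sorted (time, payload) pairs
        self.events = events
        self.times = [t for t, _ in events]

    def window(self, t0: float, t1: float):
        """All events with t0 <= time < t1, via binary search:
        O(log n) to locate the window, then O(k) for the k events returned."""
        lo = bisect.bisect_left(self.times, t0)
        hi = bisect.bisect_left(self.times, t1)
        return self.events[lo:hi]

trace = TraceIndex([(0.1, "MPI_Send"), (0.4, "MPI_Recv"),
                    (0.9, "compute"), (1.7, "MPI_Barrier")])
print(trace.window(0.3, 1.0))   # -> [(0.4, 'MPI_Recv'), (0.9, 'compute')]
```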
Federal Register 2010, 2011, 2012, 2013, 2014
2011-04-25
... accordance with the NRC E-Filing rule (72 FR 49139, August 28, 2007). The E-Filing process requires... requirements of E-Filing, at least ten (10) days prior to the filing deadline, the participant should contact... may attempt to use other software not listed on the Web site, but should note that the NRC's E-Filing...
75 FR 71092 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2010-11-22
... (DIBRS); 18 U.S.C. 922 note, The Brady Handgun Violence Prevention Act; 28 U.S.C. 534 note, Uniform...: Delete entry and replace with ``Electronic storage media and file folders.'' Retrievability: Delete entry..., The Brady Handgun Violence Prevention Act; 28 U.S.C. 534 note, Uniform Federal Crime Reporting Act; 42...
12 CFR 978.5 - Storage of confidential information.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 12 Banks and Banking 7 2010-01-01 2010-01-01 false Storage of confidential information. 978.5... OPERATIONS AND AUTHORITIES BANK REQUESTS FOR INFORMATION § 978.5 Storage of confidential information. Each Bank shall: (a) Store all identified confidential information in secure storage areas or filing...
76 FR 30341 - Reliable Storage 1 LLC;
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-25
... DEPARTMENT OF ENERGY [Project No. 14152-000] Reliable Storage 1 LLC; Notice of Preliminary Permit... March 25, 2011, Reliable Storage 1 LLC filed an application, pursuant to section 4(f) of the Federal... waters owned by others without the owners' express permission. The proposed pumped storage project would...
Interactive display of molecular models using a microcomputer system
NASA Technical Reports Server (NTRS)
Egan, J. T.; Macelroy, R. D.
1980-01-01
A simple, microcomputer-based, interactive graphics display system has been developed for the presentation of perspective views of wire frame molecular models. The display system is based on a TERAK 8510a graphics computer system with a display unit consisting of microprocessor, television display and keyboard subsystems. The operating system includes a screen editor, file manager, PASCAL and BASIC compilers and command options for linking and executing programs. The graphics program, written in UCSD PASCAL, involves the centering of the coordinate system, the transformation of centered model coordinates into homogeneous coordinates, the construction of a viewing transformation matrix to operate on the coordinates, clipping invisible points, perspective transformation and scaling to screen coordinates; commands available include ZOOM, ROTATE, RESET, and CHANGEVIEW. The data file structure was chosen to minimize the amount of disk storage space. Despite the inherent slowness of the system, its low cost and flexibility suggest general applicability.
Image selection system. [computerized data storage and retrieval system
NASA Technical Reports Server (NTRS)
Knutson, M. A.; Hurd, D.; Hubble, L.; Kroeck, R. M.
1974-01-01
An image selection system (ISS) was developed for the NASA-Ames Research Center Earth Resources Aircraft Project. The ISS is an interactive, graphics-oriented, computer retrieval system for aerial imagery. An analysis of user coverage requests and retrieval strategies is presented, followed by a complete system description. The data base structure, retrieval processors, command language, interactive display options, file structures, and the system's capability to manage sets of selected imagery are described. A detailed example of an area coverage request is graphically presented.
77 FR 31346 - Combined Notice of Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2012-05-25
...: Filings Instituting Proceedings Docket Numbers: RP12-732-000. Applicants: Golden Triangle Storage, Inc...-000. Applicants: Columbia Gas Transmission, LLC. Description: Negotiated Rate Service Agreement--Rice...
The growth of the UniTree mass storage system at the NASA Center for Computational Sciences
NASA Technical Reports Server (NTRS)
Tarshish, Adina; Salmon, Ellen
1993-01-01
In October 1992, the NASA Center for Computational Sciences made its Convex-based UniTree system generally available to users. The ensuing months saw the growth of near-online data from nil to nearly three terabytes, a doubling of the number of CPU's on the facility's Cray YMP (the primary data source for UniTree), and the necessity for an aggressive regimen for repacking sparse tapes and hierarchical 'vaulting' of old files to freestanding tape. Connectivity was enhanced as well with the addition of UltraNet HiPPI. This paper describes the increasing demands placed on the storage system's performance and throughput that resulted from the significant augmentation of compute-server processor power and network speed.
Master Metadata Repository and Metadata-Management System
NASA Technical Reports Server (NTRS)
Armstrong, Edward; Reed, Nate; Zhang, Wen
2007-01-01
A master metadata repository (MMR) software system manages the storage and searching of metadata pertaining to data from national and international satellite sources of the Global Ocean Data Assimilation Experiment (GODAE) High Resolution Sea Surface Temperature Pilot Project [GHRSSTPP]. These sources produce a total of hundreds of data files daily, each file classified as one of more than ten data products representing global sea-surface temperatures. The MMR is a relational database wherein the metadata are divided into granule-level records [denoted file records (FRs)] for individual satellite files and collection-level records [denoted data set descriptions (DSDs)] that describe metadata common to all the files from a specific data product. FRs and DSDs adhere to the NASA Directory Interchange Format (DIF). The FRs and DSDs are contained in separate subdatabases linked by a common field. The MMR is configured in MySQL database software with custom Practical Extraction and Reporting Language (PERL) programs to validate and ingest the metadata records. The database contents are converted into the Federal Geographic Data Committee (FGDC) standard format by use of the Extensible Markup Language (XML). A Web interface enables users to search for availability of data from all sources.
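The validate-and-ingest step might look roughly like the following sketch; the field names, and the use of Python rather than the PERL programs the system actually employs, are assumptions for illustration.

```python
# Hedged sketch of the validate-then-emit-XML step. The field names here
# are hypothetical; the real MMR uses PERL and the DIF/FGDC schemas.
import xml.etree.ElementTree as ET

REQUIRED = ("entry_id", "start_time", "dataset", "granule_file")

def validate(fr: dict) -> None:
    """Reject a file record (FR) that lacks any required field."""
    missing = [k for k in REQUIRED if not fr.get(k)]
    if missing:
        raise ValueError(f"file record rejected, missing: {missing}")

def to_xml(fr: dict) -> bytes:
    """Serialize a validated record as a flat XML document."""
    root = ET.Element("file_record")
    for key, value in fr.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="utf-8")

fr = {"entry_id": "GHRSST-0001", "start_time": "2007-01-01T00:00:00Z",
      "dataset": "L2P_SST", "granule_file": "sst_20070101.nc"}
validate(fr)
print(to_xml(fr).decode())
```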
77 FR 28869 - Worsham Steed Gas Storage, LLC; Notice of Compliance Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2012-05-16
... in the filing. Any person desiring to participate in this rate proceeding must file a motion to... in determining the appropriate action to be taken, but will not serve to make protestants parties to the proceeding. Any person wishing to become a party must file a notice of intervention or motion to...
78 FR 6089 - Combined Notice of Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-29
...: 5 p.m. ET 2/4/13. Docket Numbers: RP13-467-000. Applicants: Texas Eastern Transmission, LP... LLC submits tariff filing per 154.203: Cadeville Gas Storage Tariff Filing 1/18/13 to be effective 1/ 18/2013. Filed Date: 1/18/13. Accession Number: 20130118-5175. Comments Due: 5 p.m. ET 1/30/13...
Optimal File-Distribution in Heterogeneous and Asymmetric Storage Networks
NASA Astrophysics Data System (ADS)
Langner, Tobias; Schindelhauer, Christian; Souza, Alexander
We consider an optimisation problem which is motivated from storage virtualisation in the Internet. While storage networks make use of dedicated hardware to provide homogeneous bandwidth between servers and clients, in the Internet, connections between storage servers and clients are heterogeneous and often asymmetric with respect to upload and download. Thus, for a large file, the question arises how it should be fragmented and distributed among the servers to grant "optimal" access to the contents. We concentrate on the transfer time of a file, which is the time needed for one upload and a sequence of n downloads, using a set of m servers with heterogeneous bandwidths. We assume that fragments of the file can be transferred in parallel to and from multiple servers. This model yields a distribution problem that examines the question of how these fragments should be distributed onto those servers in order to minimise the transfer time. We present an algorithm, called FlowScaling, that finds an optimal solution within running time O(m log m). We formulate the distribution problem as a maximum flow problem, which involves a function that states whether a solution with a given transfer time bound exists. This function is then used with a scaling argument to determine an optimal solution within the claimed time complexity.
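To convey the flavor of the feasibility-check-plus-scaling approach, here is a heavily simplified sketch: it binary-searches a transfer-time bound and tests whether per-server upload and download caps can jointly accommodate the file. The model (a fixed split between upload and download time, symmetric treatment of the n downloads) is my simplification, not the paper's exact formulation or its O(m log m) algorithm.

```python
# Simplified sketch of the feasibility-plus-scaling idea behind FlowScaling.
# Model (my simplification, not the paper's): fragment i of size x_i is
# uploaded in parallel within time t_up (x_i <= u[i] * t_up) and each of the
# n downloads completes within t_dl (x_i <= d[i] * t_dl); total transfer
# time is t_up + n * t_dl, searched here with a fixed split ratio.
def feasible(size, u, d, t_up, t_dl):
    # Max total bytes placeable under both per-server caps.
    return sum(min(ui * t_up, di * t_dl) for ui, di in zip(u, d)) >= size

def min_transfer_time(size, u, d, n, split=0.5, iters=60):
    lo, hi = 0.0, 1e12
    for _ in range(iters):              # binary search on total time T
        mid = (lo + hi) / 2
        t_up, t_dl = split * mid, (1 - split) * mid / n
        if feasible(size, u, d, t_up, t_dl):
            hi = mid
        else:
            lo = mid
    return hi

# 1 GB file, three servers with asymmetric up/down bandwidths (MB/s),
# four downloads; all numbers invented for illustration.
print(min_transfer_time(1000.0, u=[10, 5, 2], d=[40, 30, 10], n=4))
```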
NASA Technical Reports Server (NTRS)
1977-01-01
Components of a videotape storage and retrieval system originally developed for NASA have been adapted as a tool for law enforcement agencies. Ampex Corp., Redwood City, Cal., built a unique system for NASA-Marshall. The first application of professional broadcast technology to computerized record-keeping, it incorporates new equipment for transporting tapes within the system. After completing the NASA system, Ampex continued development, primarily to improve image resolution. The resulting advanced system, known as the Ampex Videofile, offers advantages over microfilm for filing, storing, retrieving, and distributing large volumes of information. The system's computer stores information in digital code rather than in pictorial form. While microfilm allows visual storage of whole documents, it requires a step before usage--developing the film. With Videofile, the actual document is recorded, complete with photos and graphic material, and a picture of the document is available instantly.
75 FR 57011 - Tallulah Gas Storage LLC; Notice of Application
Federal Register 2010, 2011, 2012, 2013, 2014
2010-09-17
... Storage LLC; Notice of Application September 9, 2010. Take notice that on August 31, 2010, Tallulah Gas Storage LLC (Tallulah), 10370 Richmond Avenue, Suite 510, Houston, TX 77042, filed in Docket No. CP10-494... necessity authorizing Tallulah to construct and operate a natural gas storage facility and pipeline...
78 FR 30918 - Perryville Gas Storage LLC; Notice of Request Under Blanket Authorization
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-23
... Storage LLC; Notice of Request Under Blanket Authorization Take notice that on May 3, 2013, Perryville Gas Storage LLC (Perryville), Three Riverway, Suite 1350, Houston, Texas 77056, filed a prior notice request... Perryville's natural gas storage facility in Franklin and Richland Parishes, Louisiana. Perryville does not...
ESCHER: An interactive mesh-generating editor for preparing finite-element input
NASA Technical Reports Server (NTRS)
Oakes, W. R., Jr.
1984-01-01
ESCHER is an interactive mesh generation and editing program designed to help the user create a finite-element mesh, create additional input for finite-element analysis, including initial conditions, boundary conditions, and slidelines, and generate a NEUTRAL FILE that can be postprocessed for input into several finite-element codes, including ADINA, ADINAT, DYNA, NIKE, TSAAS, and ABAQUS. Two important ESCHER capabilities, interactive geometry creation and mesh archival storage, are described in detail. Also described is the interactive command language and the use of interactive graphics. The archival storage and restart file is a modular, entity-based mesh data file. Modules of this file correspond to separate editing modes in the mesh editor, with data definition syntax preserved between the interactive commands and the archival storage file. Because ESCHER was expected to be highly interactive, extensive user documentation is provided in the form of an interactive HELP package.
Globus Identity, Access, and Data Management: Platform Services for Collaborative Science
NASA Astrophysics Data System (ADS)
Ananthakrishnan, R.; Foster, I.; Wagner, R.
2016-12-01
Globus is software-as-a-service for research data management, developed at, and operated by, the University of Chicago. Globus, accessible at www.globus.org, provides high speed, secure file transfer; file sharing directly from existing storage systems; and data publication to institutional repositories. 40,000 registered users have used Globus to transfer tens of billions of files totaling hundreds of petabytes between more than 10,000 storage systems within campuses and national laboratories in the US and internationally. Web, command line, and REST interfaces support both interactive use and integration into applications and infrastructures. An important component of the Globus system is its foundational identity and access management (IAM) platform service, Globus Auth. Both Globus research data management and other applications use Globus Auth for brokering authentication and authorization interactions between end-users, identity providers, resource servers (services), and a range of clients, including web, mobile, and desktop applications, and other services. Compliant with important standards such as OAuth, OpenID, and SAML, Globus Auth provides mechanisms required for an extensible, integrated ecosystem of services and clients for the research and education community. It underpins projects such as the US National Science Foundation's XSEDE system, NCAR's Research Data Archive, and the DOE Systems Biology Knowledge Base. Current work is extending Globus services to be compliant with FEDRAMP standards for security assessment, authorization, and monitoring for cloud services. We will present Globus IAM solutions and give examples of Globus use in various projects for federated access to resources. We will also describe how Globus Auth and Globus research data management capabilities enable rapid development and low-cost operations of secure data sharing platforms that leverage Globus services and integrate them with local policy and security.
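Since Globus Auth is standards-compliant OAuth 2.0, the token brokering it performs for services can be pictured with a generic client-credentials request like the sketch below; the endpoint URL and scope are placeholders, not documented Globus values (consult the Globus Auth documentation or its Python SDK for the real interface).

```python
# Generic OAuth 2.0 client-credentials request, sketching the kind of
# token brokering Globus Auth performs. The endpoint URL and scope are
# placeholders; consult the Globus Auth documentation for real values.
import requests

TOKEN_URL = "https://auth.example.org/oauth2/token"   # placeholder endpoint

def get_access_token(client_id: str, client_secret: str, scope: str) -> str:
    resp = requests.post(
        TOKEN_URL,
        data={"grant_type": "client_credentials", "scope": scope},
        auth=(client_id, client_secret),   # HTTP Basic client authentication
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]     # bearer token for API calls

# token = get_access_token("my-client-id", "my-secret", "transfer:read")
```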
78 FR 14531 - ANR Storage Company; Notice of Request Under Blanket Authorization
Federal Register 2010, 2011, 2012, 2013, 2014
2013-03-06
... Company (ANR Storage), 717 Texas Street, Suite 2400, Houston, Texas 77002-2761, filed in Docket No. CP13... Storage Company, 717 Texas Street, Suite 2400, Houston, Texas 77002-2761, or by calling (832) 320-5487...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ilsche, Thomas; Schuchart, Joseph; Cope, Joseph
Event tracing is an important tool for understanding the performance of parallel applications. As concurrency increases in leadership-class computing systems, the quantity of performance log data can overload the parallel file system, perturbing the application being observed. In this work we present a solution for event tracing at leadership scales. We enhance the I/O forwarding system software to aggregate and reorganize log data prior to writing to the storage system, significantly reducing the burden on the underlying file system for this type of traffic. Furthermore, we augment the I/O forwarding system with a write buffering capability to limit the impact of artificial perturbations from log data accesses on traced applications. To validate the approach, we modify the Vampir tracing tool to take advantage of this new capability and show that the approach increases the maximum traced application size by a factor of 5x to more than 200,000 processors.
Moving Large Data Sets Over High-Performance Long Distance Networks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hodson, Stephen W; Poole, Stephen W; Ruwart, Thomas
2011-04-01
In this project we look at the performance characteristics of three tools used to move large data sets over dedicated long distance networking infrastructure. Although performance studies of wide area networks have been a frequent topic of interest, performance analyses have tended to focus on network latency characteristics and peak throughput using network traffic generators. In this study we instead perform an end-to-end long distance networking analysis that includes reading large data sets from a source file system and committing large data sets to a destination file system. An evaluation of end-to-end data movement is also an evaluation of the system configurations employed and the tools used to move the data. For this paper, we have built several storage platforms and connected them with a high performance long distance network configuration. We use these systems to analyze the capabilities of three data movement tools: BBcp, GridFTP, and XDD. Our studies demonstrate that existing data movement tools do not provide efficient performance levels or exercise the storage devices in their highest performance modes. We describe the device information required to achieve high levels of I/O performance and discuss how this data is applicable in use cases beyond data movement performance.
Development of climate data storage and processing model
NASA Astrophysics Data System (ADS)
Okladnikov, I. G.; Gordov, E. P.; Titov, A. G.
2016-11-01
We present a storage and processing model for climate datasets elaborated in the framework of a virtual research environment (VRE) for climate and environmental monitoring and analysis of the impact of climate change on socio-economic processes at local and regional scales. The model is based on a "shared nothing" distributed computing architecture and assumes a computing network in which each node is independent and self-sufficient. Each node hosts dedicated software for the processing and visualization of geospatial data and provides programming interfaces to communicate with the other nodes. The nodes are interconnected by a local network or the Internet and exchange data and control instructions via SSH connections and web services. Geospatial data are represented by collections of netCDF files stored in a directory hierarchy within a file system. To speed up data reading and processing, three approaches are proposed: precalculation of intermediate products, distribution of data across multiple storage systems (with or without redundancy), and caching and reuse of previously obtained products. For fast search and retrieval of the required data, a metadata database is developed in accordance with the data storage and processing model. It contains descriptions of the space-time features of the datasets available for processing and their locations, as well as descriptions and run options of the software components for data analysis and visualization. Together, the model and the metadata database will provide a reliable technological basis for the development of a high-performance virtual research environment for climatic and environmental monitoring.
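A minimal sketch of such a metadata database, assuming an SQLite catalog and illustrative column names, might record the space-time extent and location of each netCDF file so that nodes can locate required data without scanning directories:

```python
# Minimal sketch of the metadata catalog described above: the schema
# and attribute names are illustrative assumptions, not the actual VRE.
import sqlite3

schema = """
CREATE TABLE IF NOT EXISTS datasets (
    path     TEXT PRIMARY KEY,   -- file location on a storage node
    node     TEXT,               -- owning compute/storage node
    variable TEXT,               -- e.g. 'tas', 'pr'
    t_start  TEXT, t_end TEXT,   -- ISO-8601 time coverage
    lat_min REAL, lat_max REAL,
    lon_min REAL, lon_max REAL
);
"""

db = sqlite3.connect("climate_catalog.db")
db.executescript(schema)

def find(variable, t0, t1):
    """Return (path, node) pairs overlapping the requested time interval."""
    return db.execute(
        "SELECT path, node FROM datasets "
        "WHERE variable = ? AND t_start <= ? AND t_end >= ?",
        (variable, t1, t0)).fetchall()
```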
NASA Astrophysics Data System (ADS)
McFall, Steve
1994-03-01
With the increase in business automation and the widespread availability and low cost of computer systems, law enforcement agencies have seen a corresponding increase in criminal acts involving computers. The examination of computer evidence is a new field of forensic science with numerous opportunities for research and development. Research is needed to develop new software utilities for examining computer storage media, expert systems capable of finding evidence of criminal activity in large amounts of data, and methods of recovering data from chemically and physically damaged computer storage media. In addition, defeating encryption and password protection of computer files is a topic requiring further research and development.
Oak Ridge Leadership Computing Facility Position Paper
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oral, H Sarp; Hill, Jason J; Thach, Kevin G
This paper discusses the business, administration, reliability, and usability aspects of storage systems at the Oak Ridge Leadership Computing Facility (OLCF). The OLCF has developed key competencies in architecting and administering large-scale Lustre deployments as well as HPSS archival systems. Additionally, as these systems are architected, deployed, and expanded over time, reliability and availability are primary drivers. This paper focuses on the implementation of the Spider parallel Lustre file system as well as the implementation of the HPSS archive at the OLCF.
75 FR 54606 - Combined Notice of Filings 2
Federal Register 2010, 2011, 2012, 2013, 2014
2010-09-08
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission Combined Notice of Filings 2 August 27, 2010. Take notice that the Commission has received the following Natural Gas Pipeline Rate and Refund Report filings: Docket Numbers: RP10-1072-001. Applicants: Egan Hub Storage, LLC. Description: Egan Hub...
76 FR 3619 - Combined Notice of Filings #1
Federal Register 2010, 2011, 2012, 2013, 2014
2011-01-20
... tariff filing per 35.13(a)(2)(iii): Compressed Air Energy Storage Station Power Definition Revision, to... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission Combined Notice of Filings 1 Monday... Numbers: EG11-43-000. Applicants: LSP Energy, Inc. Description: Notice of Self-Certification of EWG Status...
Low-Speed Fingerprint Image Capture System User's Guide, June 1, 1993
DOE Office of Scientific and Technical Information (OSTI.GOV)
Whitus, B.R.; Goddard, J.S.; Jatko, W.B.
1993-06-01
The Low-Speed Fingerprint Image Capture System (LS-FICS) uses a Sun workstation controlling a Lenzar ElectroOptics Opacity 1000 imaging system to digitize fingerprint card images to support the Federal Bureau of Investigation's (FBI's) Automated Fingerprint Identification System (AFIS) program. The system also supports the operations performed by the Oak Ridge National Laboratory (ORNL)-developed Image Transmission Network (ITN) prototype card scanning system. The input to the system is a single FBI fingerprint card of the agreed-upon standard format and a user-specified identification number. The output is a file formatted to be compatible with the National Institute of Standards and Technology (NIST) draft standard for fingerprint data exchange dated June 10, 1992. These NIST-compatible files contain the required print and text images. The LS-FICS is designed to provide the FBI with the capability of scanning fingerprint cards into a digital format. The FBI will replicate the system to generate a database of test images. The Host Workstation contains the image data paths and the compression algorithm. A local area network interface, disk storage, and tape drive are used for image storage and retrieval, and the Lenzar Opacity 1000 scanner is used to acquire the image. The scanner is capable of resolving 500 pixels/in. in both x and y directions. The print images are maintained in full 8-bit gray scale and compressed with an FBI-approved wavelet-based compression algorithm. The text fields are downsampled to 250 pixels/in. and 2-bit gray scale. The text images are then compressed using a lossless Huffman coding scheme. The text fields retrieved from the output files are easily interpreted when displayed on the screen. Detailed procedures are provided for system calibration and operation. Software tools are provided to verify proper system operation.
Incorporating Oracle on-line space management with long-term archival technology
NASA Technical Reports Server (NTRS)
Moran, Steven M.; Zak, Victor J.
1996-01-01
The storage requirements of today's organizations are exploding. As computers continue to escalate in processing power, applications grow in complexity and data files grow in size and number. As a result, organizations are forced to procure more and more megabytes of storage space. This paper focuses on how to expand the storage capacity of a Very Large Database (VLDB) cost-effectively within an Oracle7 data warehouse system by integrating long-term archival storage subsystems with traditional magnetic media. The Oracle architecture described in this paper was based on an actual proof of concept for a customer looking to store archived data on optical disks yet still have access to those data without user intervention. The customer had a requirement to maintain 10 years' worth of data on-line. Data less than a year old still had the potential to be updated and thus will reside on conventional magnetic disks. Data older than a year will be considered archived and will be placed on optical disks. The ability to archive data to optical disk and still have access to those data gives the system a means to retain large amounts of readily accessible data while significantly reducing the cost of total system storage. Therefore, the cost benefits of archival storage devices can be incorporated into the Oracle storage medium and I/O subsystem without losing any transaction processing functionality, while at the same time providing an organization access to all of its data.
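A hedged sketch of the underlying routing rule follows; the threshold and tier names are hypothetical, and in a real Oracle7 deployment this policy would be expressed through partitioned tablespaces on the respective media rather than application code.

```python
# Hedged sketch of the age-based archiving policy described above:
# data older than one year moves from magnetic storage to the optical
# archive tier. Threshold and tier names are illustrative assumptions.
import datetime as dt

ARCHIVE_AGE = dt.timedelta(days=365)

def tier_for(record_date: dt.date, today: dt.date) -> str:
    """Route data by age: recent -> magnetic, older -> optical archive."""
    return "magnetic" if (today - record_date) < ARCHIVE_AGE else "optical"

today = dt.date(1996, 6, 1)
print(tier_for(dt.date(1996, 1, 15), today))  # magnetic (still updatable)
print(tier_for(dt.date(1992, 3, 2), today))   # optical (archived)
```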
Analysis Report for Exascale Storage Requirements for Scientific Data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ruwart, Thomas M.
Over the next 10 years, the Department of Energy will be transitioning from Petascale to Exascale computing, causing data storage, networking, and infrastructure requirements to increase by three orders of magnitude. The technologies and best practices used today are the result of a relatively slow evolution of ancestral technologies developed in the 1950s and 1960s. These include magnetic tape, magnetic disk, networking, databases, file systems, and operating systems. These technologies will continue to evolve over the next 10 to 15 years on a reasonably predictable path. Experience with the challenges involved in transitioning these fundamental technologies from Terascale to Petascale computing systems has raised questions about how they will scale another 3 or 4 orders of magnitude to meet the requirements imposed by Exascale computing systems. This report focuses on the most concerning scaling issues with data storage systems as they relate to High Performance Computing, and presents options for a path forward. Given the ability to store exponentially increasing amounts of data, far more advanced concepts and uses of metadata will be critical to managing data in Exascale computing systems.
The European Southern Observatory-MIDAS table file system
NASA Technical Reports Server (NTRS)
Peron, M.; Grosbol, P.
1992-01-01
The new and substantially upgraded version of the Table File System (TFS) in MIDAS is presented as a scientific database system. MIDAS applications for performing database operations on tables are discussed, for instance, the exchange of data to and from the TFS, the selection of objects, uncertainty joins across tables, and the graphical representation of data. This upgraded version of the TFS is a full implementation of the binary table extension of the FITS format; in addition, it also supports arrays of strings. Different storage strategies for optimal access to very large data sets are implemented and addressed in detail. As a simple relational database, the TFS may be used for the management of personal data files. This opens the way to intelligent pipeline processing of large amounts of data. One of the key features of the Table File System is to also provide an extensive set of tools for the analysis of the final results of a reduction process. Column operations using standard and special mathematical functions as well as statistical distributions can be carried out; commands for linear regression and model fitting using nonlinear least-squares methods and user-defined functions are available. Finally, statistical hypothesis tests and multivariate methods can also operate on tables.
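The column operations and fitting commands described are specific to MIDAS; purely as an illustration, the NumPy sketch below performs analogous table-column arithmetic, statistics, and a linear least-squares fit on hypothetical columns.

```python
# Illustrative sketch (not MIDAS code) of the kinds of table-column
# operations the TFS exposes: column arithmetic, statistics, and a
# linear least-squares fit, here expressed with NumPy.
import numpy as np

mag = np.array([12.1, 13.4, 11.8, 14.0, 12.9])     # hypothetical columns
logflux = np.array([-4.8, -5.3, -4.7, -5.6, -5.1])

color = mag - mag.mean()                  # derived column
print("std of derived column:", color.std())

# Linear regression logflux = a*mag + b via least squares
A = np.vstack([mag, np.ones_like(mag)]).T
(a, b), *_ = np.linalg.lstsq(A, logflux, rcond=None)
print(f"fit: logflux = {a:.3f}*mag + {b:.3f}")
```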
78 FR 123 - Diablo Canyon, Independent Spent Fuel Storage Installation; License Amendment Request...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-02
... receipt of the document. The E-Filing system also distributes an email notice that provides access to the...: Rulemaking and Adjudications Staff; or (2) courier, express mail, or expedited delivery service to the Office... mail as of the time of deposit in the mail, or by courier, express mail, or expedited delivery service...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-16
...-Filing system also distributes an email notice that provides access to the document to the NRC's Office... Adjudications Staff; or (2) courier, express mail, or expedited delivery service to the Office of the Secretary... time of deposit in the mail, or by courier, express mail, or expedited delivery service upon depositing...
78 FR 58529 - Floridian Natural Gas Storage Company, LLC; Notice of Application
Federal Register 2010, 2011, 2012, 2013, 2014
2013-09-24
... Natural Gas Storage Company, LLC; Notice of Application Take notice that on September 4, 2013, Floridian Natural Gas Storage Company, LLC (Floridian Gas Storage), 1000 Louisiana Street, Suite 4361, Houston, Texas 77002, filed in Docket No. CP13-541-000 an application under section 7(c) of the Natural Gas Act...
77 FR 15097 - ANR Storage Company; Notice of Petition for Declaratory Order
Federal Register 2010, 2011, 2012, 2013, 2014
2012-03-14
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. RP12-479-000] ANR Storage... 284.501, et seq., of the Commission's regulations, 18 CFR 284.501, et seq., (2011), ANR Storage Company (ANR Storage) filed a petition for a declaratory order requesting that the Commission issue an...
75 FR 70727 - Perryville Gas Storage LLC ; Notice of Application
Federal Register 2010, 2011, 2012, 2013, 2014
2010-11-18
... Storage LLC ; Notice of Application November 10, 2010. Take notice that on November 5, 2010, Perryville Gas Storage LLC (Perryville), Three Riverway, Suite 1350, Houston, Texas 77056, filed in Docket No... interpretations for the location of the edge of the salt dome relative to the approved natural gas storage Cavern...
75 FR 57747 - Tres Palacios Gas Storage LLC; Notice of Application
Federal Register 2010, 2011, 2012, 2013, 2014
2010-09-22
... Gas Storage LLC; Notice of Application September 15, 2010. Take notice that on September 3, 2010, Tres Palacios Gas Storage LLC (Tres Palacios), 53 Riverside Avenue, Westport, Connecticut 06880, filed in Docket... natural gas storage caverns to the actual capacities available in each cavern as established by the most...
76 FR 13612 - Freebird Gas Storage, LLC; Notice of Request Under Blanket Authorization
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-14
... Storage, LLC; Notice of Request Under Blanket Authorization Take notice that on March 1, 2011, Freebird Gas Storage, LLC (Freebird) filed a Prior Notice Request pursuant to sections 157.205 and 157.208 of... blanket certificate for authorization to increase the storage capacity and deliverability at its East...
75 FR 45610 - Liberty Gas Storage LLC; Notice of Amendment
Federal Register 2010, 2011, 2012, 2013, 2014
2010-08-03
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. CP05-92-007] Liberty Gas Storage LLC; Notice of Amendment Take notice that on July 26, Liberty Gas Storage LLC (``Liberty''), 101... Gas Storage, 101 Ash Street, San Diego, CA 92101, phone (619) 699-5050. The filing is available for...
75 FR 61478 - D'Lo Gas Storage, LLC; Notice of Petition
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-05
... Storage, LLC; Notice of Petition September 24, 2010. Take notice that on September 21, 2010, D'Lo Gas Storage, LLC (Petitioner), 1002 East St. Mary Boulevard, Lafayette, Louisiana 70503, filed in Docket No... determine feasibility of developing the underlying salt dome formation for natural gas storage, all as more...
78 FR 298 - Combined Notice of Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-03
.... Protests may be considered, but intervention is necessary to become a party to the proceeding. Filings in...: 5 p.m. ET 12/28/12. Docket Numbers: RP13-106-001. Applicants: Young Gas Storage Company, Ltd. Description: Young NAESB 2.0 Compliance Filing to be effective 12/1/2012.
75 FR 70229 - Combined Notice of Filings No. 2
Federal Register 2010, 2011, 2012, 2013, 2014
2010-11-17
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission Combined Notice of Filings No. 2... Storage Company submits tariff section 6.11.11- GT&C North American Energy Standards Board, v.3.0 etc., to...-001. Applicants: National Grid LNG, LP. Description: National Grid LNG, LP submits tariff filing per...
75 FR 47291 - Notice of Baseline Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2010-08-05
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission Notice of Baseline Filings July 29, 2010. ONEOK Gas Storage, L.L.C Docket No. PR10-67-000. Atmos Energy--Kentucky/Mid-States Division Docket No... applicants listed above submitted their baseline filing of its Statement of Operating Conditions for services...
75 FR 35780 - Centana Intrastate Pipeline, LLC; Notice of Baseline Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-23
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR10-35-000] Centana Intrastate Pipeline, LLC; Notice of Baseline Filing June 16, 2010. Take notice that on June 15, 2010, Centana Intrastate Pipeline, LLC submitted a baseline filing of its Storage Statement of Operating Conditions for...
75 FR 37786 - DCP Guadalupe Pipeline, LLC; Notice of Baseline Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-30
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR10-31-000] DCP Guadalupe Pipeline, LLC; Notice of Baseline Filing June 23, 2010. Take notice that on June 10, 2010, DCP Guadalupe Pipeline, LLC submitted a baseline filing of its Storage Statement of Operating Conditions for services...
Data Service: Distributed Data Capture and Replication
NASA Astrophysics Data System (ADS)
Warner, P. B.; Pietrowicz, S. R.
2007-10-01
Data Service is a critical component of the NOAO Data Management and Science Support (DMaSS) Solutions Platform, which is based on a service-oriented architecture, and is to replace the current NOAO Data Transport System. Its responsibilities include capturing data from NOAO and partner telescopes and instruments and replicating the data across multiple (currently six) storage sites. Java 5 was chosen as the implementation language, and Java EE as the underlying enterprise framework. Application metadata persistence is performed using EJB and Hibernate on the JBoss Application Server, with PostgreSQL as the persistence back-end. Although potentially any underlying mass storage system may be used as the Data Service file persistence technology, DTS deployments and Data Service test deployments currently use the Storage Resource Broker from SDSC. This paper presents an overview and high-level design of the Data Service, including aspects of deployment, e.g., for the LSST Data Challenge at the NCSA computing facilities.
Forming an ad-hoc nearby storage, based on IKAROS and social networking services
NASA Astrophysics Data System (ADS)
Filippidis, Christos; Cotronis, Yiannis; Markou, Christos
2014-06-01
We present an ad-hoc "nearby" storage, based on IKAROS and social networking services, such as Facebook. By design, IKAROS is capable to increase or decrease the number of nodes of the I/O system instance on the fly, without bringing everything down or losing data. IKAROS is capable to decide the file partition distribution schema, by taking on account requests from the user or an application, as well as a domain or a Virtual Organization policy. In this way, it is possible to form multiple instances of smaller capacity higher bandwidth storage utilities capable to respond in an ad-hoc manner. This approach, focusing on flexibility, can scale both up and down and so can provide more cost effective infrastructures for both large scale and smaller size systems. A set of experiments is performed comparing IKAROS with PVFS2 by using multiple clients requests under HPC IOR benchmark and MPICH2.
A Future Accelerated Cognitive Distributed Hybrid Testbed for Big Data Science Analytics
NASA Astrophysics Data System (ADS)
Halem, M.; Prathapan, S.; Golpayegani, N.; Huang, Y.; Blattner, T.; Dorband, J. E.
2016-12-01
As increased sensor spectral data volumes from current and future Earth Observing satellites are assimilated into high-resolution climate models, intensive cognitive machine learning technologies are needed to data mine, extract, and intercompare model outputs. It is clear today that the next generation of computers and storage, beyond petascale cluster architectures, will be data centric. They will manage data movement and process data in place. Future cluster nodes have been announced that integrate multiple CPUs with high-speed links to GPUs and MICs on their backplanes, with massive non-volatile RAM and access to active flash RAM disk storage. Active Ethernet-connected key-value store disk drives with 10 GbE or higher are now available through the Kinetic Open Storage Alliance. At the UMBC Center for Hybrid Multicore Productivity Research, a future state-of-the-art Accelerated Cognitive Computer System (ACCS) for Big Data science is being integrated into the current IBM iDataplex computational system 'bluewave'. Based on the next-generation IBM 200 PF Sierra processor, an interim two-node IBM Power S822 testbed is being integrated with dual Power 8 processors with 10 cores, 1 TB RAM, a PCIe link to a K80 GPU, and an FPGA Coherent Accelerator Processor Interface card to 20 TB of flash RAM. This system is to be updated to the Power 8+ with NVLink 1.0 and the Pascal GPU late in 2016. Moreover, the Seagate 96 TB Kinetic Disk system with 24 Ethernet-connected active disks is integrated into the ACCS storage system. A Lightweight Virtual File System developed at NASA GSFC is installed on bluewave. Since remote access to publicly available quantum annealing computers is available at several government labs, the ACCS will offer an in-line Restricted Boltzmann Machine optimization capability to the D-Wave 2X quantum annealing processor over the campus high-speed 100 Gb network to Internet2 for large files. As an evaluation test of the cognitive functionality of the architecture, the following studies utilizing all the system components will be presented: (i) a near-real-time climate change study generating CO2 fluxes, (ii) a deep-dive capability into an 8000 x 8000 pixel image pyramid display, and (iii) large dense and sparse eigenvalue decompositions.
Building an organic block storage service at CERN with Ceph
NASA Astrophysics Data System (ADS)
van der Ster, Daniel; Wiebalck, Arne
2014-06-01
Emerging storage requirements, such as the need for block storage for both OpenStack VMs and file services like AFS and NFS, have motivated the development of a generic backend storage service for CERN IT. The goals for such a service include (a) vendor neutrality, (b) horizontal scalability with commodity hardware, (c) fault tolerance at the disk, host, and network levels, and (d) support for geo-replication. Ceph is an attractive option due to its native block device layer RBD which is built upon its scalable, reliable, and performant object storage system, RADOS. It can be considered an "organic" storage solution because of its ability to balance and heal itself while living on an ever-changing set of heterogeneous disk servers. This work will present the outcome of a petabyte-scale test deployment of Ceph by CERN IT. We will first present the architecture and configuration of our cluster, including a summary of best practices learned from the community and discovered internally. Next the results of various functionality and performance tests will be shown: the cluster has been used as a backend block storage system for AFS and NFS servers as well as a large OpenStack cluster at CERN. Finally, we will discuss the next steps and future possibilities for Ceph at CERN.
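For readers unfamiliar with the RBD layer mentioned above, the following minimal sketch uses Ceph's Python bindings (python-rados/python-rbd) to create and write a block image; the pool name, image name, and configuration path are assumptions.

```python
# Minimal sketch using Ceph's Python bindings to create and write an
# RBD image, the native block layer discussed above. Pool name, image
# name, and conffile path are placeholder assumptions.
import rados, rbd

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    ioctx = cluster.open_ioctx("rbd")                   # target pool
    rbd.RBD().create(ioctx, "vm-disk-0", 10 * 1024**3)  # 10 GiB image
    with rbd.Image(ioctx, "vm-disk-0") as image:
        image.write(b"hello ceph", 0)                   # write at offset 0
    ioctx.close()
finally:
    cluster.shutdown()
```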
Grider, Gary A.; Poole, Stephen W.
2015-09-01
Collective buffering and data pattern solutions are provided for storage, retrieval, and/or analysis of data in a collective parallel processing environment. For example, a method can be provided for data storage in a collective parallel processing environment. The method comprises receiving data to be written for a plurality of collective processes within a collective parallel processing environment, extracting a data pattern for the data to be written for the plurality of collective processes, generating a representation describing the data pattern, and saving the data and the representation.
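A hedged sketch of the pattern-extraction step, under the simplifying assumption that collective writes form a uniform stride: given the (offset, length) pairs from many processes, detect the stride and store a compact descriptor instead of every individual request.

```python
# Hedged sketch of data-pattern extraction: detect a uniform stride in
# sorted (offset, length) write requests and keep a compact descriptor.
def extract_pattern(requests):
    """requests: list of (offset, length) tuples sorted by offset."""
    offsets = [o for o, _ in requests]
    lengths = {l for _, l in requests}
    strides = {b - a for a, b in zip(offsets, offsets[1:])}
    if len(lengths) == 1 and len(strides) == 1:
        return {"start": offsets[0], "stride": strides.pop(),
                "length": lengths.pop(), "count": len(offsets)}
    return None  # irregular: fall back to storing requests verbatim

print(extract_pattern([(0, 4096), (65536, 4096), (131072, 4096)]))
# -> {'start': 0, 'stride': 65536, 'length': 4096, 'count': 3}
```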
Portable Map-Reduce Utility for MIT SuperCloud Environment
2015-09-17
A big data architecture designed to address these challenges is made up of computing resources, a scheduler, a central storage file system, databases, analytics software, and web interfaces. These components are common to many big data and supercomputing systems, and the platform builds on them together with HDFS-style distributed storage.
Effect of combined digital imaging parameters on endodontic file measurements.
de Oliveira, Matheus Lima; Pinto, Geraldo Camilo de Souza; Ambrosano, Glaucia Maria Bovi; Tosoni, Guilherme Monteiro
2012-10-01
This study assessed the effect of the combination of a dedicated endodontic filter, spatial resolution, and contrast resolution on the determination of endodontic file lengths. Forty extracted single-rooted teeth were x-rayed with K-files (ISO size 10 and 15) in the root canals. Images were acquired using the VistaScan system (Dürr Dental, Beitigheim-Bissingen, Germany) under different combined parameters of spatial resolution (10 and 25 line pairs per millimeter [lp/mm]) and contrast resolution (8- and 16-bit depths). Subsequently, a dedicated endodontic filter was applied to the 16-bit images, creating 2 additional parameters. Six observers measured the length of the endodontic files in the root canals using the software that accompanies the system. The mean values of the actual file lengths and the measurements of the radiographic images were submitted to 1-way analysis of variance and the Tukey test at a significance level of 5%. Intraobserver reproducibility was assessed by the intraclass correlation coefficient. All combined image parameters showed excellent intraobserver agreement, with intraclass correlation coefficient means higher than 0.98. The imaging parameter of 25 lp/mm and 16 bit associated with the use of the endodontic filter did not differ significantly from the actual file lengths when both file sizes were analyzed together or separately (P > .05). When the size 15 file was evaluated separately, only 8-bit images differed significantly from the actual file lengths (P ≤ .05). The combination of an endodontic filter with high spatial resolution and high contrast resolution is recommended for the determination of file lengths when using storage phosphor plates.
An Optimizing Compiler for Petascale I/O on Leadership Class Architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Choudhary, Alok; Kandemir, Mahmut
In high-performance computing systems, parallel I/O architectures usually have very complex hierarchies with multiple layers that collectively constitute an I/O stack, including high-level I/O libraries such as PnetCDF and HDF5, I/O middleware such as MPI-IO, and parallel file systems such as PVFS and Lustre. Our project explored automated instrumentation and compiler support for I/O-intensive applications. It made significant progress towards understanding the complex I/O hierarchies of high-performance storage systems (including storage caches, HDDs, and SSDs), and towards designing and implementing state-of-the-art compiler/runtime technology for I/O-intensive HPC applications targeting leadership-class machines. This final report summarizes the major achievements of the project and also points out promising future directions.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-13
... Storage Water Supply LLC; Notice of Preliminary Permit Application Accepted for Filing and Soliciting... Act (FPA), proposing to study the feasibility of the East Maui Pumped Storage Water Supply Project to.... Bart M. O'Keeffe, East Maui Pumped Storage Water Supply LLC; P.O. Box 1916; Discovery Bay, CA 94505...
78 FR 63179 - Notice of Request Under Blanket Authorization; Petal Gas Storage, LLC.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-10-23
... Request Under Blanket Authorization; Petal Gas Storage, LLC. Take notice that on October 9, 2013, Petal Gas Storage, L.L.C. (Petal), 9 Greenway Plaza, Suite 2800, Houston, Texas 77046, filed in Docket No... storage capacity in the Petal Salt Dome's Cavern 12A, located in Forrest County, Mississippi, from 8.2 Bcf...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-10-09
... determination by the presiding officer that the filing demonstrates good cause by satisfying the following three... on electronic storage media. Participants may not submit paper copies of their filings unless they.... Participants who believe that they have a good cause for not submitting documents electronically must file an...
75 FR 33799 - Moss Bluff Hub, LLC; Notice of Baseline Filing
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-15
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. PR10-28-000] Moss Bluff Hub, LLC; Notice of Baseline Filing June 8, 2010. Take notice that on June 1, 2010, Moss Bluff Hub, LLC submitted a baseline filing of its Statement of General Terms and Standard Operations Conditions for storage...
Yokohama, Noriya; Tsuchimoto, Tadashi; Oishi, Masamichi; Itou, Katsuya
2007-01-20
It has been noted that the downtime of medical informatics systems is often long. Many systems encounter downtimes of hours or even days, which can have a critical effect on daily operations. Such systems remain especially weak in the areas of database and medical imaging data. The schematic design shows the three-layer architecture of the system: application, database, and storage layers. The application layer uses the DICOM protocol (Digital Imaging and Communications in Medicine) and HTTP (Hypertext Transfer Protocol) with AJAX (Asynchronous JavaScript+XML). The database is designed to be decentralized in parallel using cluster technology. Consequently, restoration of the database can be done not only with ease but also with improved retrieval speed. In the storage layer, a network RAID (Redundant Array of Independent Disks) system makes it possible to construct exabyte-scale parallel file systems that exploit storage spread across the network. Development and evaluation of the test-bed have been successful for medical information data backup and recovery in a network environment. This paper presents a schematic design of the new medical informatics system, covering recovery and a dynamic Web application for medical imaging distribution using AJAX.
Schauer, Stephanie L; Maerz, Thomas R; Verdon, Matthew J; Hopfensperger, Daniel J; Davis, Jeffrey P
2014-06-01
The Wisconsin Immunization Registry is a confidential, web-based system used since 1999 as a centralized repository of immunization information for Wisconsin residents. The objective was to provide evidence, based on Registry experiences with electronic data exchange, comparing the benefits and drawbacks of using the Health Level 7 standard, including the option for real-time data exchange, versus the flat file method. For data regarding vaccinations received by children aged 4 months through 6 years with Wisconsin addresses that were submitted to the Registry during 2010 and 2011, data timeliness (days from vaccine administration to the date the information was received) and completeness (percentage of records received that include core data elements for electronic storage) were compared by file submission method. Data submitted using Health Level 7 were substantially more timely than data submitted using the flat file method. Additionally, data submitted using Health Level 7 were substantially more complete for each of the core elements compared with flat file submission. Health care organizations that submit electronic data to immunization information systems should be aware that the technical decision to use the Health Level 7 format, particularly if real-time data exchange is employed, can result in more timely and accurate data. This will assist clinicians in adhering to the Advisory Committee on Immunization Practices schedule and in reducing over-immunization.
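Part of the timeliness and completeness advantage comes from HL7 v2's structured, machine-parseable segments. As a minimal sketch, the fabricated VXU fragment below shows how core immunization fields can be extracted from the pipe-delimited RXA segment.

```python
# Illustrative sketch: HL7 v2 messages are pipe-delimited segments, so
# core immunization fields can be pulled out with plain string splits.
# This sample VXU fragment is fabricated for demonstration.
msg = (
    "MSH|^~\\&|CLINIC|WI|WIR|WI|20110301||VXU^V04|123|P|2.4\r"
    "RXA|0|1|20110228||03^MMR^CVX|0.5|mL"
)

for segment in msg.split("\r"):
    fields = segment.split("|")
    if fields[0] == "RXA":
        # RXA-3 is the administration date, RXA-5 the administered code.
        admin_date, vaccine = fields[3], fields[5]
        code, name, system = vaccine.split("^")
        print(f"{name} ({system} {code}) administered {admin_date}")
```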
75 FR 47295 - Combined Notice of Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2010-08-05
.... Take notice that the Commission has received the following Natural Gas Pipeline Rate and Refund Report.... Applicants: Monroe Gas Storage Company, L.P. Description: Monroe Gas Storage Company, LLC submits Substitute... 28, 2010. Docket Numbers: RP10-847-001. Applicants: Monroe Gas Storage Company, L.P. Description...
The new CMS DAQ system for run-2 of the LHC
Bawej, Tomasz; Behrens, Ulf; Branson, James; ...
2015-05-21
The data acquisition (DAQ) system of the CMS experiment at the CERN Large Hadron Collider assembles events at a rate of 100 kHz, transporting event data at an aggregate throughput of 100 GB/s to the high level trigger (HLT) farm. The HLT farm selects interesting events for storage and offline analysis at a rate of around 1 kHz. The DAQ system has been redesigned during the accelerator shutdown in 2013/14. The motivation is twofold: firstly, the current compute nodes, networking, and storage infrastructure will have reached the end of their lifetime by the time the LHC restarts. Secondly, in order to handle higher LHC luminosities and event pileup, a number of sub-detectors will be upgraded, increasing the number of readout channels and replacing the off-detector readout electronics with a μTCA implementation. The new DAQ architecture will take advantage of the latest developments in the computing industry. For data concentration, 10/40 Gb/s Ethernet technologies will be used, as well as an implementation of a reduced TCP/IP in FPGA for reliable transport between custom electronics and commercial computing hardware. A Clos network based on 56 Gb/s FDR Infiniband has been chosen for the event builder, with a throughput of ~4 Tb/s. The HLT processing is entirely file based. This allows the DAQ and HLT systems to be independent, and the HLT software to be used in the same way as for offline processing. The fully built events are sent to the HLT with 1/10/40 Gb/s Ethernet via network file systems. Hierarchically collected HLT-accepted events and monitoring meta-data are stored in a global file system. This paper presents the requirements, technical choices, and performance of the new system.
A Highly Scalable Data Service (HSDS) using Cloud-based Storage Technologies for Earth Science Data
NASA Astrophysics Data System (ADS)
Michaelis, A.; Readey, J.; Votava, P.; Henderson, J.; Willmore, F.
2017-12-01
Cloud-based infrastructure may offer several key benefits of scalability, built-in redundancy, security mechanisms, and reduced total cost of ownership as compared with a traditional data center approach. However, most of the tools and legacy software systems developed for online data repositories within the federal government were not developed with a cloud-based infrastructure in mind and do not fully take advantage of commonly available cloud-based technologies. Moreover, services based on object storage are well established and provided by all the leading cloud service providers (Amazon Web Services, Microsoft Azure, Google Cloud, etc.), which can often provide unmatched "scale-out" capabilities and data availability to a large and growing consumer base at a price point unachievable by in-house solutions. We describe a system that utilizes object storage rather than traditional file-system-based storage to vend earth science data. The system described is not only cost effective, but shows a performance advantage for running many different analytics tasks in the cloud. To enable compatibility with existing tools and applications, we outline client libraries that are API compatible with existing libraries for HDF5 and NetCDF4. Performance of the system is demonstrated using cloud services running on Amazon Web Services.
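As a hedged sketch of the API-compatible access pattern, the example below uses the h5pyd client library (a drop-in replacement for h5py that talks to an HDF REST service backed by object storage); the domain path and service endpoint are placeholder assumptions.

```python
# Hedged sketch: access HDF5-style data over a REST service backed by
# object storage, using an h5py-compatible client. The domain path and
# endpoint URL are placeholder assumptions, not a real deployment.
import h5pyd  # pip install h5pyd

f = h5pyd.File("/shared/nasa/merra2/sample.h5", "r",
               endpoint="https://hsds.example.org")  # assumed service URL
dset = f["temperature"]              # behaves like an h5py.Dataset
print(dset.shape, dset.dtype)
subset = dset[0, 100:200, 100:200]   # server-side selection: only the
print(subset.mean())                 # requested chunks cross the wire
f.close()
```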
Re-Organizing Earth Observation Data Storage to Support Temporal Analysis of Big Data
NASA Technical Reports Server (NTRS)
Lynnes, Christopher
2017-01-01
The Earth Observing System Data and Information System archives many datasets that are critical to understanding long-term variations in Earth science properties. Thus, some of these are large, multi-decadal datasets. Yet the challenge in long time series analysis comes less from the sheer volume than the data organization, which is typically one (or a small number of) time steps per file. The overhead of opening and inventorying complex, API-driven data formats such as Hierarchical Data Format introduces a small latency at each time step, which nonetheless adds up for datasets with O(10^6) single-timestep files. Several approaches to reorganizing the data can mitigate this overhead by an order of magnitude: pre-aggregating data along the time axis (time-chunking); storing the data in a highly distributed file system; or storing data in distributed columnar databases. Storing a second copy of the data incurs extra costs, so some selection criteria must be employed, which would be driven by expected or actual usage by the end user community, balanced against the extra cost.
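A minimal sketch of the time-chunking approach, assuming xarray and illustrative file and variable names: aggregate many single-timestep files into one output chunked along the time axis, so that a long time-series read touches a handful of chunks rather than O(10^6) file opens.

```python
# Hedged sketch of time-chunking: combine many single-timestep files
# into one file chunked along 'time'. The file pattern, variable name,
# and chunk sizes are illustrative assumptions.
import xarray as xr

# Each input file holds one time step; combine on the time coordinate.
ds = xr.open_mfdataset("daily/precip_*.nc4", combine="by_coords")

# Write one aggregated file with large chunks along 'time'; chunk sizes
# must not exceed the corresponding dimension sizes.
ds.to_netcdf(
    "precip_timechunked.nc4",
    encoding={"precip": {"chunksizes": (365, 90, 90)}},  # (time, lat, lon)
)
```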
Introduction to the Space Weather Monitoring System at KASI
NASA Astrophysics Data System (ADS)
Baek, J.; Choi, S.; Kim, Y.; Cho, K.; Bong, S.; Lee, J.; Kwak, Y.; Hwang, J.; Park, Y.; Hwang, E.
2014-05-01
We have developed the Space Weather Monitoring System (SWMS) at the Korea Astronomy and Space Science Institute (KASI). Since 2007, the system has continuously evolved into a better system. The SWMS consists of several subsystems: applications which acquire and process observational data, servers which run the applications, data storage, and display facilities which show the space weather information. The applications collect solar and space weather data from domestic and overseas sites. The collected data are converted to other formats and/or visualized in real time as graphs and illustrations. We manage 3 data acquisition and processing servers, a file service server, a web server, and 3 sets of storage systems. We have developed 30 applications for a variety of data, and the volume of data is about 5.5 GB per day. We provide our customers with space weather content displayed at the Space Weather Monitoring Lab (SWML) using web services.
Data management system for USGS/USEPA urban hydrology studies program
Doyle, W.H.; Lorens, J.A.
1982-01-01
A data management system was developed to store, update, and retrieve data collected in urban stormwater studies jointly conducted by the U.S. Geological Survey and the U.S. Environmental Protection Agency in 11 cities in the United States. The data management system is used to retrieve and combine data from USGS data files for use in rainfall, runoff, and water-quality models and for data computations such as storm loads. The system is based on the data management aspect of the Statistical Analysis System (SAS), which was used to create all the data files in the data base. SAS is used for storage and retrieval of basin physiography, land-use, and environmental practices inventory data. Storm-event water-quality characteristics are also stored in the data base. The advantages of using SAS to create and manage a data base are many; a few are that it is simple and easy to use, contains a comprehensive statistical package, and can be used to modify files very easily. Data base system development has progressed rapidly during the last two decades, and the data management system concepts used in this study reflect the advancements made in computer technology during this era. Urban stormwater data is, however, just one application for which the system can be used. (USGS)
Mass Storage System - Gyrfalcon | High-Performance Computing | NREL
At the command line of one of Peregrine's login nodes, enter one of the following commands to copy directory.tgz to /mss/
78 FR 25739 - Combined Notice of Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-02
...: Transcontinental Gas Pipe Line Company. Description: Annual Adjustment to Rate Schedule SS-2 Storage Gas Balances.... Description: Removal of Non Conforming and Neg Rate TSAs and Points of Contact Update Filing to be effective 5...
Solar heating system installed at Troy, Ohio
NASA Technical Reports Server (NTRS)
1980-01-01
The completed system was composed of three basic subsystems: the collector system, consisting of 3,264 square feet of Owens-Illinois evacuated glass tube collectors; the storage system, which included a 5,000 gallon insulated steel tank; and the distribution and control system, which included piping, pumping, and heat transfer components as well as the solenoid-activated valves and control logic for the efficient and safe operation of the entire system. This solar heating system was installed in an existing facility and was, therefore, a retrofit system. Extracts from the site files, specifications, drawings, and installation, operation, and maintenance instructions are included.
Distributed Storage Healthcare — The Basis of a Planet-Wide Public Health Care Network
Kakouros, Nikolaos
2013-01-01
Background: As health providers move towards higher levels of information technology (IT) integration, they become increasingly dependent on the availability of the electronic health record (EHR). Current solutions of individually managed storage by each healthcare provider focus on efforts to ensure data security, availability and redundancy. Such models, however, scale poorly to a future of a planet-wide public health-care network (PWPHN). Our aim was to review the research literature on distributed storage systems and propose methods that may aid the implementation of a PWPHN. Methods: A systematic review was carried out of the research dealing with distributed storage systems and EHR. A literature search was conducted on five electronic databases: PubMed/Medline, CINAHL, EMBASE, Web of Science (ISI) and Google Scholar, and then expanded to include non-authoritative sources. Results: The English National Health Service Spine represents the most established country-wide PHN but is limited in deployment and remains underused. Other distributed EHR attempts identified in the literature are more limited in scope. We discuss the currently available distributed file storage solutions and propose a schema for how one of these technologies can be used to deploy distributed storage of EHR, with benefits in terms of enhanced fault tolerance and global availability within the PWPHN. We conclude that a PWPHN distributed health care record storage system is technically feasible over current Internet infrastructure. Nonetheless, the socioeconomic viability of PWPHN implementations remains to be determined.
Yang, Chunguang G; Granite, Stephen J; Van Eyk, Jennifer E; Winslow, Raimond L
2006-11-01
Protein identification using MS is an important technique in proteomics as well as a major generator of proteomics data. We have designed the protein identification data object model (PDOM) and developed a parser based on this model to facilitate the analysis and storage of these data. The parser works with HTML or XML files saved or exported from MASCOT MS/MS ions search in peptide summary report or MASCOT PMF search in protein summary report. The program creates PDOM objects, eliminates redundancy in the input file, and has the capability to output any PDOM object to a relational database. This program facilitates additional analysis of MASCOT search results and aids the storage of protein identification information. The implementation is extensible and can serve as a template to develop parsers for other search engines. The parser can be used as a stand-alone application or can be driven by other Java programs. It is currently being used as the front end for a system that loads HTML and XML result files of MASCOT searches into a relational database. The source code is freely available at http://www.ccbm.jhu.edu and the program uses only free and open-source Java libraries.
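The published parser is written in Java against the PDOM; purely as an illustration of the same flow, the Python sketch below walks a hypothetical XML result export and loads protein hits into a relational table, eliminating duplicate accessions on insert.

```python
# Illustrative sketch (not the published PDOM parser): walk an XML
# export of search results and load protein hits into a relational
# table. The element and attribute names are hypothetical.
import sqlite3
import xml.etree.ElementTree as ET

db = sqlite3.connect("identifications.db")
db.execute("""CREATE TABLE IF NOT EXISTS hits
              (accession TEXT PRIMARY KEY, score REAL)""")

root = ET.parse("mascot_result.xml").getroot()
for hit in root.iter("protein_hit"):          # hypothetical element name
    acc = hit.get("accession")
    score = float(hit.get("score"))
    # INSERT OR REPLACE eliminates duplicate accessions in the input.
    db.execute("INSERT OR REPLACE INTO hits VALUES (?, ?)", (acc, score))
db.commit()
```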
77 FR 23241 - Floridian Natural Gas Storage Company, LLC; Notice of Application
Federal Register 2010, 2011, 2012, 2013, 2014
2012-04-18
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. CP12-100-000] Floridian Natural Gas Storage Company, LLC; Notice of Application Take notice that on March 30, 2012, Floridian Natural Gas Storage Company, LLC (FGS), 1000 Louisiana Street, Suite 4361, Houston, Texas 77002, filed in...
77 FR 31840 - Perryville Gas Storage LLC; Notice of Amendment
Federal Register 2010, 2011, 2012, 2013, 2014
2012-05-30
... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. CP12-460-000] Perryville Gas Storage LLC; Notice of Amendment Take notice that on May 11, 2012, Perryville Gas Storage LLC (Perryville), Three Riverway, Suite 1350, Houston, Texas 77056, filed in the above referenced docket an application...
Ensuring long-term reliability of the data storage on optical disc
NASA Astrophysics Data System (ADS)
Chen, Ken; Pan, Longfa; Xu, Bin; Liu, Wei
2008-12-01
"Quality requirements and handling regulation of archival optical disc for electronic records filing" is released by The State Archives Administration of the People's Republic of China (SAAC) on its network in March 2007. This document established a complete operative managing process for optical disc data storage in archives departments. The quality requirements of the optical disc used in archives departments are stipulated. Quality check of the recorded disc before filing is considered to be necessary and the threshold of the parameter of the qualified filing disc is set down. The handling regulations for the staffs in the archives departments are described. Recommended environment conditions of the disc preservation, recording, accessing and testing are presented. The block error rate of the disc is selected as main monitoring parameter of the lifetime of the filing disc and three classes pre-alarm lines are created for marking of different quality check intervals. The strategy of monitoring the variation of the error rate curve of the filing discs and moving the data to a new disc or a new media when the error rate of the disc reaches the third class pre-alarm line will effectively guarantee the data migration before permanent loss. Only when every step of the procedure is strictly implemented, it is believed that long-term reliability of the data storage on optical disc for archives departments can be effectively ensured.