SSeCloud: Using secret sharing scheme to secure keys
NASA Astrophysics Data System (ADS)
Hu, Liang; Huang, Yang; Yang, Disheng; Zhang, Yuzhen; Liu, Hengchang
2017-08-01
With the use of cloud storage services, one of the concerns is how to protect sensitive data securely and privately. While users enjoy the convenience of data storage provided by semi-trusted cloud storage providers, they are confronted with all kinds of risks at the same time. In this paper, we present SSeCloud, a secure cloud storage system that improves security and usability by applying a secret sharing scheme to the keys. The system encrypts files on the client side before upload and splits each encryption key into three shares, held respectively by the user, the cloud storage provider, and an alternative trusted third party. Any two of the parties can reconstruct the key. Evaluation results of the prototype system show that SSeCloud provides high security without much performance penalty.
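To make the 2-of-3 key-splitting idea concrete, the sketch below implements Shamir's (2, 3) secret sharing over a prime field in Python. This is an illustration of the general technique rather than SSeCloud's actual construction; the prime, the 32-byte key size, and the share assignment are assumptions.

```python
# Minimal sketch of 2-of-3 key splitting via Shamir's secret sharing.
# The prime, share indices, and key size are illustrative choices.
import secrets

PRIME = 2**521 - 1  # a Mersenne prime comfortably larger than a 256-bit key


def split_key(key_bytes, n=3, k=2):
    """Split a key into n shares, any k of which reconstruct it."""
    secret = int.from_bytes(key_bytes, "big")
    assert secret < PRIME
    # Degree-(k-1) polynomial with the secret as the constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
        shares.append((x, y))
    return shares


def reconstruct_key(shares, key_len=32):
    """Lagrange interpolation at x=0 from any two (or more) shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i == j:
                continue
            num = (num * -xj) % PRIME
            den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret.to_bytes(key_len, "big")


if __name__ == "__main__":
    key = secrets.token_bytes(32)               # per-file encryption key
    user, provider, third_party = split_key(key)
    assert reconstruct_key([user, third_party]) == key   # any two suffice
```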
Scientific Data Storage for Cloud Computing
NASA Astrophysics Data System (ADS)
Readey, J.
2014-12-01
Traditionally, data storage for geophysical software systems has centered on file-based systems and libraries such as NetCDF and HDF5. In contrast, cloud-based infrastructure providers such as Amazon AWS, Microsoft Azure, and the Google Cloud Platform generally provide storage technologies based on an object-based storage service (for large binary objects) complemented by a database service (for small objects that can be represented as key-value pairs). These systems have been shown to be highly scalable, reliable, and cost effective. We will discuss a proposed system that leverages these cloud-based storage technologies to provide an API-compatible library for traditional NetCDF and HDF5 applications. This system will enable cloud storage suitable for geophysical applications that can scale up to petabytes of data and thousands of users. We'll also cover other advantages of this system such as enhanced metadata search.
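As a rough illustration of the object-store-plus-database split described above, the sketch below writes array chunks to S3 as binary objects while keeping their small metadata in a key-value record. The bucket name, key layout, and in-memory metadata store are assumptions for the example, not part of the proposed system.

```python
# Illustrative sketch: large binary chunks go to object storage, small
# metadata lives in a key-value record. Bucket name and key layout are made up.
import json
import numpy as np
import boto3

s3 = boto3.client("s3")
BUCKET = "example-science-data"          # hypothetical bucket
metadata_db = {}                         # stand-in for a key-value/DB service


def put_chunk(dataset, index, chunk):
    key = f"{dataset}/chunks/{index[0]}_{index[1]}"
    s3.put_object(Bucket=BUCKET, Key=key, Body=chunk.tobytes())
    metadata_db[key] = json.dumps(
        {"dtype": str(chunk.dtype), "shape": chunk.shape}
    )
    return key


def get_chunk(key):
    meta = json.loads(metadata_db[key])
    body = s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()
    return np.frombuffer(body, dtype=meta["dtype"]).reshape(meta["shape"])


# Usage (assumes configured AWS credentials):
# key = put_chunk("sst", (0, 0), np.zeros((180, 360), dtype="float32"))
# tile = get_chunk(key)
```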
Evaluation of the Huawei UDS cloud storage system for CERN specific data
NASA Astrophysics Data System (ADS)
Zotes Resines, M.; Heikkila, S. S.; Duellmann, D.; Adde, G.; Toebbicke, R.; Hughes, J.; Wang, L.
2014-06-01
Cloud storage is an emerging architecture aiming to provide increased scalability and access performance compared to more traditional solutions. CERN is evaluating this promise using Huawei UDS and OpenStack SWIFT storage deployments, focusing on the needs of high-energy physics. Both deployed setups implement S3, one of the protocols that are emerging as a standard in the cloud storage market. A set of client machines is used to generate I/O load patterns to evaluate the storage system performance. The presented read and write test results indicate scalability from both metadata and data perspectives. Further, the Huawei UDS cloud storage is shown to be able to recover from a major failure involving the loss of 16 disks. Finally, both cloud storage systems are demonstrated to function as back-end storage for a filesystem, which is used to deliver high-energy physics software.
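The sketch below shows the kind of read/write load generation client machines could run against an S3-compatible endpoint. The endpoint URL, credentials, bucket, object size, and concurrency are placeholders, not the setup used at CERN.

```python
# Toy S3 read/write load generator. Endpoint, credentials, bucket, object
# size, and worker count are placeholders for illustration only.
import time
import concurrent.futures
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example.test",        # placeholder endpoint
    aws_access_key_id="ACCESS_KEY",                 # placeholder credentials
    aws_secret_access_key="SECRET_KEY",
)
BUCKET = "benchmark"                                # placeholder bucket
PAYLOAD = b"x" * (4 * 1024 * 1024)                  # 4 MiB test objects
N_OBJECTS, N_WORKERS = 64, 16


def write_then_read(i):
    key = f"bench/object-{i:06d}"
    s3.put_object(Bucket=BUCKET, Key=key, Body=PAYLOAD)
    s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()


if __name__ == "__main__":
    start = time.time()
    with concurrent.futures.ThreadPoolExecutor(max_workers=N_WORKERS) as pool:
        list(pool.map(write_then_read, range(N_OBJECTS)))
    elapsed = time.time() - start
    moved_mib = 2 * N_OBJECTS * len(PAYLOAD) / 2**20   # written + read back
    print(f"moved {moved_mib:.0f} MiB in {elapsed:.1f} s "
          f"({moved_mib / elapsed:.1f} MiB/s combined)")
```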
Proactive replica checking to assure reliability of data in cloud storage with minimum replication
NASA Astrophysics Data System (ADS)
Murarka, Damini; Maheswari, G. Uma
2017-11-01
The two major issues for cloud storage systems are data reliability and storage cost. For data reliability protection, the multi-replica strategy mostly used in current clouds incurs huge storage consumption, leading to a large storage cost, specifically for applications within the cloud. This paper presents a cost-efficient data reliability mechanism named PRCR to cut back cloud storage consumption. PRCR ensures the reliability of large volumes of cloud data with minimum replication, and can also serve as a cost-effective benchmark for replication. Evaluation shows that, compared to the conventional three-replica approach, PRCR can reduce storage consumption to a fraction of that amount, starting from one-third of the storage, hence considerably minimizing the cloud storage cost.
Notes on a storage manager for the Clouds kernel
NASA Technical Reports Server (NTRS)
Pitts, David V.; Spafford, Eugene H.
1986-01-01
The Clouds project is research directed towards producing a reliable distributed computing system. The initial goal is to produce a kernel which provides a reliable environment with which a distributed operating system can be built. The Clouds kernel consists of a set of replicated subkernels, each of which runs on a machine in the Clouds system. Each subkernel is responsible for the management of resources on its machine; the subkernel components communicate to provide the cooperation necessary to meld the various machines into one kernel. The implementation of a kernel-level storage manager that supports reliability is documented. The storage manager is a part of each subkernel and maintains the secondary storage residing at each machine in the distributed system. In addition to providing the usual data transfer services, the storage manager ensures that data being stored survives machine and system crashes, and that the secondary storage of a failed machine is recovered (made consistent) automatically when the machine is restarted. Since the storage manager is part of the Clouds kernel, efficiency of operation is also a concern.
NAFFS: network attached flash file system for cloud storage on portable consumer electronics
NASA Astrophysics Data System (ADS)
Han, Lin; Huang, Hao; Xie, Changsheng
Cloud storage technology has become a research hotspot in recent years, but existing cloud storage services are mainly designed for data storage needs with a stable, high-speed Internet connection. Mobile Internet connections are often unstable and relatively slow. These native features of the mobile Internet limit the use of cloud storage on portable consumer electronics. The Network Attached Flash File System (NAFFS) presents the idea of using a portable device's built-in NAND flash memory as the front-end cache of a virtualized cloud storage device. Modern portable devices with Internet connections have more than 1 GB of built-in NAND flash, which is quite enough for daily data storage. The data transfer rate of a NAND flash device is much higher than that of mobile Internet connections [1], and its non-volatile nature makes it very suitable as the cache device for Internet cloud storage on portable devices, which often have an unstable power supply and intermittent Internet connectivity. In the present work, NAFFS is evaluated with several benchmarks, and its performance is compared with traditional network attached file systems, such as NFS. Our evaluation results indicate that NAFFS achieves an average access speed of 3.38 MB/s, which is about 3 times faster than directly accessing cloud storage over a mobile Internet connection, and offers a more stable interface than directly using a cloud storage API. Unstable Internet connections and sudden power-off conditions are tolerable, and no data in the cache is lost in such situations.
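A minimal sketch of the caching idea follows: writes land on local flash first and are pushed to the cloud when a connection is available. The cache path, the stub cloud client, and its upload method are assumptions for illustration, not the NAFFS implementation.

```python
# Sketch of a flash-fronted write-back cache for cloud storage.
# CACHE_DIR and the cloud client's upload() API are invented for illustration.
import os

CACHE_DIR = "/data/naffs-cache"          # assumed to live on built-in NAND flash


class FlashFrontedStore:
    def __init__(self, cloud_client):
        self.cloud = cloud_client        # any object with upload(name, path)
        os.makedirs(CACHE_DIR, exist_ok=True)
        self.dirty = set()               # files written but not yet uploaded

    def write(self, name, data: bytes):
        # Fast, durable local write; survives power loss because flash is
        # non-volatile, which is the property the abstract relies on.
        path = os.path.join(CACHE_DIR, name)
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        self.dirty.add(name)

    def read(self, name) -> bytes:
        # Cache-hit path; a full system would fall back to the cloud on a miss.
        with open(os.path.join(CACHE_DIR, name), "rb") as f:
            return f.read()

    def sync(self):
        # Called opportunistically when the mobile link is up.
        for name in list(self.dirty):
            self.cloud.upload(name, os.path.join(CACHE_DIR, name))
            self.dirty.discard(name)
```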
Using Cloud-based Storage Technologies for Earth Science Data
NASA Astrophysics Data System (ADS)
Michaelis, A.; Readey, J.; Votava, P.
2016-12-01
Cloud-based infrastructure may offer several key benefits of scalability, built-in redundancy, and reduced total cost of ownership as compared with a traditional data center approach. However, most of the tools and software systems developed for NASA data repositories were not developed with a cloud-based infrastructure in mind and do not fully take advantage of commonly available cloud-based technologies. Object storage services are provided through all the leading public (Amazon Web Services, Microsoft Azure, Google Cloud, etc.) and private (OpenStack) clouds, and may provide a more cost-effective means of storing large data collections online. We describe a system that utilizes object storage rather than traditional file system based storage to vend earth science data. The system described is not only cost effective, but shows superior performance for running many different analytics tasks in the cloud. To enable compatibility with existing tools and applications, we outline client libraries that are API compatible with existing libraries for HDF5 and NetCDF4. Performance of the system is demonstrated using cloud services running on Amazon Web Services.
NASA Astrophysics Data System (ADS)
Murata, K. T.
2014-12-01
Data-intensive or data-centric science is the 4th paradigm, after observational and/or experimental science (1st paradigm), theoretical science (2nd paradigm), and numerical science (3rd paradigm). A science cloud is an infrastructure for this 4th methodology. The NICT Science Cloud is designed for big-data sciences of Earth, space, and other fields based on modern informatics and information technologies [1]. Data flow on the cloud is handled through the following three techniques: (1) data crawling and transfer, (2) data preservation and stewardship, and (3) data processing and visualization. Original tools and applications for these techniques have been designed and implemented. We mash up these tools and applications on the NICT Science Cloud to build customized systems for each project. In this paper, we discuss science data processing through these three steps. For big-data science, data file deployment on a distributed storage system should be well designed in order to save storage cost and transfer time. We developed a high-bandwidth virtual remote storage system (HbVRS), a data crawling tool (NICTY/DLA), and a Wide-area Observation Network Monitoring (WONM) system. Data files are saved on the cloud storage system according to both the data preservation policy and the data processing plan. The storage system is built on distributed file system middleware (Gfarm: GRID datafarm). It is effective since disaster recovery (DR) and parallel data processing are carried out simultaneously without moving the big data from storage to storage. Data files are managed through our Web application, WSDBank (World Science Data Bank). The big data on the cloud are processed via Pwrake, a workflow tool with high-bandwidth I/O. There are several visualization tools on the cloud: VirtualAurora for the magnetosphere and ionosphere, VDVGE for Google Earth, STICKER for urban environment data, and STARStouch for multi-disciplinary data. There are 30 projects running on the NICT Science Cloud for Earth and space science. In 2003, 56 refereed papers were published. At the end, we introduce a couple of successful results in Earth and space sciences using these three techniques carried out on the NICT Science Cloud. [1] http://sc-web.nict.go.jp
A Highly Scalable Data Service (HSDS) using Cloud-based Storage Technologies for Earth Science Data
NASA Astrophysics Data System (ADS)
Michaelis, A.; Readey, J.; Votava, P.; Henderson, J.; Willmore, F.
2017-12-01
Cloud-based infrastructure may offer several key benefits of scalability, built-in redundancy, security mechanisms, and reduced total cost of ownership as compared with a traditional data center approach. However, most of the tools and legacy software systems developed for online data repositories within the federal government were not developed with a cloud-based infrastructure in mind and do not fully take advantage of commonly available cloud-based technologies. Moreover, services based on object storage are well established and provided through all the leading cloud service providers (Amazon Web Services, Microsoft Azure, Google Cloud, etc.), which can often provide unmatched “scale-out” capabilities and data availability to a large and growing consumer base at a price point unachievable with in-house solutions. We describe a system that utilizes object storage rather than traditional file system based storage to vend earth science data. The system described is not only cost effective, but shows a performance advantage for running many different analytics tasks in the cloud. To enable compatibility with existing tools and applications, we outline client libraries that are API compatible with existing libraries for HDF5 and NetCDF4. Performance of the system is demonstrated using cloud services running on Amazon Web Services.
Bent, John M.; Faibish, Sorin; Grider, Gary
2015-06-30
Cloud object storage is enabled for archived data, such as checkpoints and results, of high performance computing applications using a middleware process. A plurality of archived files, such as checkpoint files and results, generated by a plurality of processes in a parallel computing system are stored by obtaining the plurality of archived files from the parallel computing system; converting the plurality of archived files to objects using a log structured file system middleware process; and providing the objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.
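Not the patented PLFS mechanism itself, but the general step it describes can be sketched as follows: take checkpoint or result files produced by a parallel job, store each as a cloud object, and keep a small manifest so they can be located later. The bucket name and key layout are illustrative assumptions.

```python
# Sketch of file-to-object archiving with a manifest. Bucket and key prefix
# are placeholders; this is not the PLFS middleware described above.
import os
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "hpc-archive"                    # hypothetical bucket


def archive_files(job_id, paths):
    manifest = {}
    for path in paths:
        key = f"{job_id}/{os.path.basename(path)}"
        with open(path, "rb") as f:
            s3.put_object(Bucket=BUCKET, Key=key, Body=f.read())
        manifest[path] = key
    # The manifest itself is stored as one more object for later restores.
    s3.put_object(Bucket=BUCKET, Key=f"{job_id}/manifest.json",
                  Body=json.dumps(manifest).encode())
    return manifest
```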
Gutiérrez, Miguel F; Cajiao, Alejandro; Hidalgo, José A; Cerón, Jesús D; López, Diego M; Quintero, Víctor M; Rendón, Alvaro
2014-01-01
This article presents the development process of an acquisition and data storage system managing clinical variables through a cloud storage service and a Personal Health Record (PHR) System. First, the paper explains how a Wireless Body Area Network (WBAN) that captures data from two sensors, corresponding to arterial pressure and heart rate, is designed. Second, this paper illustrates how data collected by the WBAN are transmitted to a cloud storage service. It is worth mentioning that this cloud service allows the data to be stored in a persistent way on an online database system. Finally, the paper describes how the data stored in the cloud service are sent to the Indivo PHR System, where they are registered and charted for later review by health professionals. The research demonstrated the feasibility of implementing WBAN networks for the acquisition of clinical data, and particularly of using Web technologies and standards to provide interoperability with PHR Systems at the technical and syntactic levels.
Integration of cloud-based storage in BES III computing environment
NASA Astrophysics Data System (ADS)
Wang, L.; Hernandez, F.; Deng, Z.
2014-06-01
We present on-going work that aims to evaluate the suitability of cloud-based storage as a supplement to the Lustre file system for storing experimental data for the BES III physics experiment, and as a backend for storing files belonging to individual members of the collaboration. In particular, we discuss our findings regarding the support of cloud-based storage in the software stack of the experiment. We report on our development work that improves the support of CERN's ROOT data analysis framework and allows efficient remote access to data through several cloud storage protocols. We also present our efforts to provide the experiment with efficient command line tools for navigating and interacting with cloud storage-based data repositories, both from interactive sessions and grid jobs.
Applying a cloud computing approach to storage architectures for spacecraft
NASA Astrophysics Data System (ADS)
Baldor, Sue A.; Quiroz, Carlos; Wood, Paul
As sensor technologies, processor speeds, and memory densities increase, spacecraft command, control, processing, and data storage systems have grown in complexity to take advantage of these improvements and expand the possible missions of spacecraft. Spacecraft systems engineers are increasingly looking for novel ways to address this growth in complexity and mitigate associated risks. Looking to conventional computing, many solutions have been executed to solve both the problem of complexity and heterogeneity in systems. In particular, the cloud-based paradigm provides a solution for distributing applications and storage capabilities across multiple platforms. In this paper, we propose utilizing a cloud-like architecture to provide a scalable mechanism for providing mass storage in spacecraft networks that can be reused on multiple spacecraft systems. By presenting a consistent interface to applications and devices that request data to be stored, complex systems designed by multiple organizations may be more readily integrated. Behind the abstraction, the cloud storage capability would manage wear-leveling, power consumption, and other attributes related to the physical memory devices, critical components in any mass storage solution for spacecraft. Our approach employs SpaceWire networks and SpaceWire-capable devices, although the concept could easily be extended to non-heterogeneous networks consisting of multiple spacecraft and potentially the ground segment.
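The "consistent interface" idea can be sketched as an abstract storage API that applications and instruments program against, while device-specific concerns such as wear-leveling and power management stay behind the abstraction. The class and method names below are invented for illustration and are not from the paper.

```python
# Sketch of a storage abstraction that hides device-level concerns.
# All names here are hypothetical.
from abc import ABC, abstractmethod


class MassStorageService(ABC):
    """What applications and devices on the spacecraft network program against."""

    @abstractmethod
    def store(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def retrieve(self, key: str) -> bytes: ...


class FlashBankBackend(MassStorageService):
    """One possible backend; wear-leveling and power management live here."""

    def __init__(self):
        self._blocks = {}                 # stand-in for physical memory devices

    def store(self, key, data):
        # A real backend would choose a physical bank based on wear counters
        # and power state before committing the write.
        self._blocks[key] = bytes(data)

    def retrieve(self, key):
        return self._blocks[key]
```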
Global Software Development with Cloud Platforms
NASA Astrophysics Data System (ADS)
Yara, Pavan; Ramachandran, Ramaseshan; Balasubramanian, Gayathri; Muthuswamy, Karthik; Chandrasekar, Divya
Offshore and outsourced distributed software development models and processes are facing challenges, previously unknown, with respect to computing capacity, bandwidth, storage, security, complexity, reliability, and business uncertainty. Clouds promise to address these challenges by adopting recent advances in virtualization, parallel and distributed systems, utility computing, and software services. In this paper, we envision a cloud-based platform that addresses some of these core problems. We outline a generic cloud architecture, its design, and our first implementation results for three cloud forms - a compute cloud, a storage cloud and a cloud-based software service - in the context of global distributed software development (GSD). Our "compute cloud" provides computational services such as continuous code integration and a compile server farm, the "storage cloud" offers storage (block or file-based) services with an on-line virtual storage service, whereas the on-line virtual labs represent a useful cloud service. We note some of the use cases for clouds in GSD, the lessons learned with our prototypes, and identify challenges that must be conquered before realizing the full business benefits. We believe that in the future, software practitioners will focus more on these cloud computing platforms and see clouds as a means of supporting an ecosystem of clients, developers and other key stakeholders.
Virtualization and cloud computing in dentistry.
Chow, Frank; Muftu, Ali; Shorter, Richard
2014-01-01
The use of virtualization and cloud computing has changed the way we use computers. Virtualization is a method of placing software called a hypervisor on the hardware of a computer or a host operating system. It allows a guest operating system to run on top of the physical computer with a virtual machine (i.e., virtual computer). Virtualization allows multiple virtual computers to run on top of one physical computer and to share its hardware resources, such as printers, scanners, and modems. This increases the efficient use of the computer by decreasing costs (e.g., hardware, electricity, administration, and management) since only one physical computer is needed and running. This virtualization platform is the basis for cloud computing. It has expanded into areas of server and storage virtualization. One of the commonly used dental storage systems is cloud storage. Patient information is encrypted as required by the Health Insurance Portability and Accountability Act (HIPAA) and stored on off-site private cloud services for a monthly service fee. As computer costs continue to increase, so too will the need for more storage and processing power. Virtual and cloud computing will be a method for dentists to minimize costs and maximize computer efficiency in the near future. This article will provide some useful information on current uses of cloud computing.
Cloud Computing and Its Applications in GIS
NASA Astrophysics Data System (ADS)
Kang, Cao
2011-12-01
Cloud computing is a novel computing paradigm that offers highly scalable and highly available distributed computing services. The objectives of this research are to: 1. analyze and understand cloud computing and its potential for GIS; 2. discover the feasibilities of migrating truly spatial GIS algorithms to distributed computing infrastructures; 3. explore a solution to host and serve large volumes of raster GIS data efficiently and speedily. These objectives thus form the basis for three professional articles. The first article is entitled "Cloud Computing and Its Applications in GIS". This paper introduces the concept, structure, and features of cloud computing. Features of cloud computing such as scalability, parallelization, and high availability make it a very capable computing paradigm. Unlike High Performance Computing (HPC), cloud computing uses inexpensive commodity computers. The uniform administration systems in cloud computing make it easier to use than GRID computing. Potential advantages of cloud-based GIS systems such as lower barrier to entry are consequently presented. Three cloud-based GIS system architectures are proposed: public cloud-based GIS systems, private cloud-based GIS systems and hybrid cloud-based GIS systems. Public cloud-based GIS systems provide the lowest entry barriers for users among these three architectures, but their advantages are offset by data security and privacy related issues. Private cloud-based GIS systems provide the best data protection, though they have the highest entry barriers. Hybrid cloud-based GIS systems provide a compromise between these extremes. The second article is entitled "A cloud computing algorithm for the calculation of Euclidian distance for raster GIS". Euclidean distance is a truly spatial GIS algorithm. Classical algorithms such as the pushbroom and growth ring techniques require computational propagation through the entire raster image, which makes it incompatible with the distributed nature of cloud computing. This paper presents a parallel Euclidean distance algorithm that works seamlessly with the distributed nature of cloud computing infrastructures. The mechanism of this algorithm is to subdivide a raster image into sub-images and wrap them with a one pixel deep edge layer of individually computed distance information. Each sub-image is then processed by a separate node, after which the resulting sub-images are reassembled into the final output. It is shown that while any rectangular sub-image shape can be used, those approximating squares are computationally optimal. This study also serves as a demonstration of this subdivide and layer-wrap strategy, which would enable the migration of many truly spatial GIS algorithms to cloud computing infrastructures. However, this research also indicates that certain spatial GIS algorithms such as cost distance cannot be migrated by adopting this mechanism, which presents significant challenges for the development of cloud-based GIS systems. The third article is entitled "A Distributed Storage Schema for Cloud Computing based Raster GIS Systems". This paper proposes a NoSQL Database Management System (NDDBMS) based raster GIS data storage schema. NDDBMS has good scalability and is able to use distributed commodity computers, which make it superior to Relational Database Management Systems (RDBMS) in a cloud computing environment. In order to provide optimized data service performance, the proposed storage schema analyzes the nature of commonly used raster GIS data sets.
It discriminates two categories of commonly used data sets, and then designs corresponding data storage models for both categories. As a result, the proposed storage schema is capable of hosting and serving enormous volumes of raster GIS data speedily and efficiently on cloud computing infrastructures. In addition, the scheme also takes advantage of the data compression characteristics of Quadtrees, thus promoting efficient data storage. Through this assessment of cloud computing technology, the exploration of the challenges and solutions to the migration of GIS algorithms to cloud computing infrastructures, and the examination of strategies for serving large amounts of GIS data in a cloud computing infrastructure, this dissertation lends support to the feasibility of building a cloud-based GIS system. However, there are still challenges that need to be addressed before a full-scale functional cloud-based GIS system can be successfully implemented. (Abstract shortened by UMI.)
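The subdivide-and-reassemble mechanics from the second article can be sketched as follows. This is scaffolding only, assuming NumPy and SciPy: the per-tile worker runs a plain local distance transform, and the one-pixel edge-wrap exchange of neighbor distance information that the dissertation uses for cross-tile correctness is deliberately omitted.

```python
# Scaffolding sketch: split a raster into near-square tiles, process each
# tile independently (as separate cloud nodes would), and reassemble.
# The edge-wrapping exchange needed for cross-tile correctness is omitted.
import numpy as np
from scipy.ndimage import distance_transform_edt


def split_into_tiles(raster, tiles_y, tiles_x):
    """Yield (row, col, tile) for a grid of roughly square sub-images."""
    for r, rows in enumerate(np.array_split(raster, tiles_y, axis=0)):
        for c, tile in enumerate(np.array_split(rows, tiles_x, axis=1)):
            yield r, c, tile


def process_tile(tile):
    # Placeholder for what one node would do with its sub-image.
    return distance_transform_edt(tile)


def reassemble(results, tiles_y, tiles_x):
    rows = [np.hstack([results[(r, c)] for c in range(tiles_x)])
            for r in range(tiles_y)]
    return np.vstack(rows)


if __name__ == "__main__":
    raster = np.ones((600, 900), dtype=np.uint8)
    raster[100, 200] = 0                  # a single "source" cell
    results = {(r, c): process_tile(t)
               for r, c, t in split_into_tiles(raster, 2, 3)}
    out = reassemble(results, 2, 3)
    print(out.shape)                      # (600, 900)
```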
Arctic Boreal Vulnerability Experiment (ABoVE) Science Cloud
NASA Astrophysics Data System (ADS)
Duffy, D.; Schnase, J. L.; McInerney, M.; Webster, W. P.; Sinno, S.; Thompson, J. H.; Griffith, P. C.; Hoy, E.; Carroll, M.
2014-12-01
The effects of climate change are being revealed at alarming rates in the Arctic and Boreal regions of the planet. NASA's Terrestrial Ecology Program has launched a major field campaign to study these effects over the next 5 to 8 years. The Arctic Boreal Vulnerability Experiment (ABoVE) will challenge scientists to take measurements in the field, study remote observations, and even run models to better understand the impacts of a rapidly changing climate for areas of Alaska and western Canada. The NASA Center for Climate Simulation (NCCS) at the Goddard Space Flight Center (GSFC) has partnered with the Terrestrial Ecology Program to create a science cloud designed for this field campaign - the ABoVE Science Cloud. The cloud combines traditional high performance computing with emerging technologies to create an environment specifically designed for large-scale climate analytics. The ABoVE Science Cloud utilizes (1) virtualized high-speed InfiniBand networks, (2) a combination of high-performance file systems and object storage, and (3) virtual system environments tailored for data intensive, science applications. At the center of the architecture is a large object storage environment, much like a traditional high-performance file system, that supports data proximal processing using technologies like MapReduce on a Hadoop Distributed File System (HDFS). Surrounding the storage is a cloud of high performance compute resources with many processing cores and large memory coupled to the storage through an InfiniBand network. Virtual systems can be tailored to a specific scientist and provisioned on the compute resources with extremely high-speed network connectivity to the storage and to other virtual systems. In this talk, we will present the architectural components of the science cloud and examples of how it is being used to meet the needs of the ABoVE campaign. In our experience, the science cloud approach significantly lowers the barriers and risks to organizations that require high performance computing solutions and provides the NCCS with the agility required to meet our customers' rapidly increasing and evolving requirements.
Bent, John M.; Faibish, Sorin; Grider, Gary
2016-04-19
Cloud object storage is enabled for checkpoints of high performance computing applications using a middleware process. A plurality of files, such as checkpoint files, generated by a plurality of processes in a parallel computing system are stored by obtaining said plurality of files from said parallel computing system; converting said plurality of files to objects using a log structured file system middleware process; and providing said objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.
MarFS, a Near-POSIX Interface to Cloud Objects
DOE Office of Scientific and Technical Information (OSTI.GOV)
Inman, Jeffrey Thornton; Vining, William Flynn; Ransom, Garrett Wilson
The engineering forces driving development of “cloud” storage have produced resilient, cost-effective storage systems that can scale to 100s of petabytes, with good parallel access and bandwidth. These features would make a good match for the vast storage needs of High-Performance Computing datacenters, but cloud storage gains some of its capability from its use of HTTP-style Representational State Transfer (REST) semantics, whereas most large datacenters have legacy applications that rely on POSIX file-system semantics. MarFS is an open-source project at Los Alamos National Laboratory that allows us to present cloud-style object-storage as a scalable near-POSIX file system. We have also developed a new storage architecture to improve bandwidth and scalability beyond what’s available in commodity object stores, while retaining their resilience and economy. Additionally, we present a scheme for scaling the POSIX interface to allow billions of files in a single directory and trillions of files in total.
Research on cloud-based remote measurement and analysis system
NASA Astrophysics Data System (ADS)
Gao, Zhiqiang; He, Lingsong; Su, Wei; Wang, Can; Zhang, Changfan
2015-02-01
The promising potential of cloud computing and its convergence with technologies such as cloud storage, cloud push, and mobile computing allows for the creation and delivery of a new type of cloud service. Based on this idea, this paper presents a cloud-based remote measurement and analysis system. The system mainly consists of three parts: a signal acquisition client, a web server deployed on the cloud service, and a remote client. It is implemented as a website developed using ASP.NET and Flex RIA technology, which resolves the trade-off between the two monitoring modes, B/S (browser/server) and C/S (client/server). The platform, deployed on the cloud server, supplies condition monitoring and data analysis services over the Internet. The signal acquisition device is responsible for collecting data (sensor data, audio, video, etc.) and pushes the monitoring data to the cloud storage database regularly. Data acquisition equipment in this system only needs data collection and network capabilities, such as a smartphone or smart sensor. The system's scale can be adjusted dynamically according to the number of applications and users, so it does not waste resources. As a representative case study, we developed a prototype system based on the Ali cloud service using a rotor test rig as the research object. Experimental results demonstrate that the proposed system architecture is feasible.
Proof of cipher text ownership based on convergence encryption
NASA Astrophysics Data System (ADS)
Zhong, Weiwei; Liu, Zhusong
2017-08-01
Cloud storage systems save disk space and bandwidth through deduplication technology, but the use of this technology has attracted targeted security attacks: an attacker can obtain the original file by using only its hash value to deceive the server into granting file ownership. In order to solve this security problem and address the different security requirements of files in cloud storage systems, an efficient, information-theoretically secure proof-of-ownership scheme is proposed. The scheme protects the data through convergent encryption, uses an improved block-level proof-of-ownership scheme, and can carry out block-level client-side deduplication to achieve efficient and secure cloud storage deduplication.
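Convergent encryption, the building block named above, can be sketched as follows: the key is derived from the file content, so identical plaintexts yield identical ciphertexts and stay deduplicable. This uses the third-party cryptography package, and the nonce derivation is an illustrative convention rather than the paper's exact construction.

```python
# Sketch of convergent encryption: content-derived key, deterministic output.
# Nonce derivation here is an illustrative convention, not the paper's scheme.
import hashlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def convergent_encrypt(plaintext: bytes):
    key = hashlib.sha256(plaintext).digest()              # content-derived key
    nonce = hashlib.sha256(b"nonce" + plaintext).digest()[:12]
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, None)
    return key, nonce, ciphertext


def convergent_decrypt(key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    return AESGCM(key).decrypt(nonce, ciphertext, None)


if __name__ == "__main__":
    k1, n1, c1 = convergent_encrypt(b"same file contents")
    k2, n2, c2 = convergent_encrypt(b"same file contents")
    assert c1 == c2                       # dedup-friendly: ciphertexts match
    assert convergent_decrypt(k1, n1, c1) == b"same file contents"
```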
Centralized Duplicate Removal Video Storage System with Privacy Preservation in IoT.
Yan, Hongyang; Li, Xuan; Wang, Yu; Jia, Chunfu
2018-06-04
In recent years, the Internet of Things (IoT) has found wide application and attracted much attention. Since most of the end-terminals in IoT have limited capabilities for storage and computing, it has become a trend to outsource data from local devices to cloud computing. To further reduce the communication bandwidth and storage space, data deduplication has been widely adopted to eliminate the redundant data. However, since data collected in IoT are sensitive and closely related to users' personal information, the privacy protection of users' information becomes a challenge. As the channels, like the wireless channels between the terminals and the cloud servers in IoT, are public and the cloud servers are not fully trusted, data have to be encrypted before being uploaded to the cloud. However, encryption makes deduplication by the cloud server difficult, because the ciphertexts will differ even if the underlying plaintexts are identical. In this paper, we build a centralized privacy-preserving duplicate removal storage system, which supports both file-level and block-level deduplication. In order to avoid the leakage of statistical information about the data, Intel Software Guard Extensions (SGX) technology is utilized to protect the deduplication process on the cloud server. The results of the experimental analysis demonstrate that the new scheme can significantly improve the deduplication efficiency and enhance security. It is envisioned that the duplicate removal system with privacy preservation will be of great use in the centralized storage environment of IoT.
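As a toy illustration of the file- and block-level deduplication bookkeeping mentioned above (the SGX-protected processing itself is not modeled), the sketch below stores each unique block once and records per-file block lists. The block size and in-memory stores are assumptions.

```python
# Block-level deduplication bookkeeping: each unique block is stored once.
import hashlib

BLOCK_SIZE = 4096
block_store = {}     # block hash -> block bytes (stored once)
file_index = {}      # file name  -> ordered list of block hashes


def upload(name: str, data: bytes):
    hashes = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        h = hashlib.sha256(block).hexdigest()
        block_store.setdefault(h, block)      # duplicate blocks are not re-stored
        hashes.append(h)
    file_index[name] = hashes


def download(name: str) -> bytes:
    return b"".join(block_store[h] for h in file_index[name])


if __name__ == "__main__":
    upload("a.bin", b"A" * 10000)
    upload("b.bin", b"A" * 10000)             # identical content, no new blocks
    assert download("b.bin") == b"A" * 10000
    print(len(block_store), "unique blocks stored")
```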
Cloud archiving and data mining of High-Resolution Rapid Refresh forecast model output
NASA Astrophysics Data System (ADS)
Blaylock, Brian K.; Horel, John D.; Liston, Samuel T.
2017-12-01
Weather-related research often requires synthesizing vast amounts of data that need archival solutions that are both economical and viable during and past the lifetime of the project. Public cloud computing services (e.g., from Amazon, Microsoft, or Google) or private clouds managed by research institutions are providing object data storage systems potentially appropriate for long-term archives of such large geophysical data sets. We illustrate the use of a private cloud object store developed by the Center for High Performance Computing (CHPC) at the University of Utah. Since early 2015, we have been archiving thousands of two-dimensional gridded fields (each one containing over 1.9 million values over the contiguous United States) from the High-Resolution Rapid Refresh (HRRR) data assimilation and forecast modeling system. The archive is being used for retrospective analyses of meteorological conditions during high-impact weather events, assessing the accuracy of the HRRR forecasts, and providing initial and boundary conditions for research simulations. The archive is accessible interactively and through automated download procedures for researchers at other institutions that can be tailored by the user to extract individual two-dimensional grids from within the highly compressed files. Characteristics of the CHPC object storage system are summarized relative to network file system storage or tape storage solutions. The CHPC storage system is proving to be a scalable, reliable, extensible, affordable, and usable archive solution for our research.
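The per-grid retrieval described above can be sketched with an HTTP byte-range request, so that a single two-dimensional field is pulled from a large archived file without downloading the whole file. The URL and byte offsets below are placeholders; a real client would look them up in an index for the desired field.

```python
# Sketch of extracting one field from an archived file via an HTTP range
# request. URL and byte offsets are placeholders, not real archive values.
import requests

ARCHIVE_URL = "https://archive.example.edu/hrrr/2017/hrrr.t00z.wrfsfcf00.grib2"
FIELD_START, FIELD_END = 1_234_567, 2_345_678   # byte span of one field (made up)


def fetch_field(url, start, end):
    resp = requests.get(url, headers={"Range": f"bytes={start}-{end}"}, timeout=60)
    resp.raise_for_status()               # expect 206 Partial Content
    return resp.content                   # compressed bytes of the single grid


# field_bytes = fetch_field(ARCHIVE_URL, FIELD_START, FIELD_END)
# The returned bytes can then be decoded with a suitable reader.
```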
Prototyping manufacturing in the cloud
NASA Astrophysics Data System (ADS)
Ciortea, E. M.
2017-08-01
This paper attempts a theoretical approach to cloud systems and their impact on production systems. These systems are called cloud computing because they form a relatively new concept in informatics, representing a set of distributed computing services, applications, access to information, and data storage, without the user needing to know the physical location and configuration of the systems. The advantages of this approach are, in particular, computing speed and storage capacity without investment in additional configurations, synchronization of user data, and data processing using web applications. The disadvantage is the still-open question of data security, which leads to user mistrust. The case study is applied to one module of the production system, because the system as a whole is complex.
Efficient proof of ownership for cloud storage systems
NASA Astrophysics Data System (ADS)
Zhong, Weiwei; Liu, Zhusong
2017-08-01
Cloud storage systems use deduplication technology to save disk space and bandwidth, but the use of this technology has attracted targeted security attacks: an attacker can deceive the server into granting ownership of a file by obtaining only the hash value of the original file. In order to solve this security problem and address the different security requirements of files in the cloud storage system, an efficient and information-theoretically secure proof-of-ownership scheme is proposed that supports file rating. File rating is implemented with the K-means algorithm, and random-seed technology and a pre-calculation method are used to achieve a safe and efficient proof-of-ownership scheme. The resulting scheme is information-theoretically secure and achieves better performance in the most sensitive areas of client-side I/O and computation.
A model of cloud application assignments in software-defined storages
NASA Astrophysics Data System (ADS)
Bolodurina, Irina P.; Parfenov, Denis I.; Polezhaev, Petr N.; E Shukhman, Alexander
2017-01-01
The aim of this study is to analyze the structure and interaction mechanisms of typical cloud applications and to suggest approaches to optimize their placement in storage systems. In this paper, we describe a generalized model of cloud applications including three basic layers: a model of the application, a model of the service, and a model of the resource. The distinctive feature of the suggested model is that it analyzes cloud resources both from the user's point of view and from the point of view of the software-defined infrastructure of the virtual data center (DC). The innovative character of this model lies in describing, at the same time, the placement of application data and the state of the virtual environment, taking into account the network topology. A model of software-defined storage has been developed as a submodel within the resource model. This model allows implementing an algorithm for controlling the assignment of cloud applications in software-defined storage. Experiments show that this algorithm decreases cloud application response time and increases performance in processing user requests. The use of software-defined storage allows a decrease in the number of physical storage devices, which demonstrates the efficiency of our algorithm.
Cloud Based Drive Forensic and DDoS Analysis on Seafile as Case Study
NASA Astrophysics Data System (ADS)
Bahaweres, R. B.; Santo, N. B.; Ningsih, A. S.
2017-01-01
The rapid development of the Internet, driven by increasing data rates over both broadband cable networks and 4G mobile wireless, makes it easy for everyone to connect. Storage as a Service (StaaS) is increasingly popular, and many users want to store their data in one place so that they can easily access it anywhere and at any time in the cloud. The use of such services, however, makes them vulnerable to being used to commit a crime or to mount Denial of Service (DoS) attacks on cloud storage services. Criminals can use cloud storage services to store, upload, and download illegal files or documents. In this study, we implement a private cloud storage service using Seafile on a Raspberry Pi and perform simulations in Local Area Network and Wi-Fi environments to show that a criminal act can be traced and proved forensically. We also identify, collect, and analyze the artifacts of the server and client, such as the registry of the desktop client, the file system, the Seafile logs, the browser cache, and the database.
Forensic Investigation of Cooperative Storage Cloud Service: Symform as a Case Study.
Teing, Yee-Yang; Dehghantanha, Ali; Choo, Kim-Kwang Raymond; Dargahi, Tooska; Conti, Mauro
2017-05-01
Researchers envisioned Storage as a Service (StaaS) as an effective solution to the distributed management of digital data. Cooperative storage cloud forensic is relatively new and is an under-explored area of research. Using Symform as a case study, we seek to determine the data remnants from the use of cooperative cloud storage services. In particular, we consider both mobile devices and personal computers running various popular operating systems, namely Windows 8.1, Mac OS X Mavericks 10.9.5, Ubuntu 14.04.1 LTS, iOS 7.1.2, and Android KitKat 4.4.4. Potential artefacts recovered during the research include data relating to the installation and uninstallation of the cloud applications, log-in to and log-out from Symform account using the client application, file synchronization as well as their time stamp information. This research contributes to an in-depth understanding of the types of terrestrial artifacts that are likely to remain after the use of cooperative storage cloud on client devices. © 2016 American Academy of Forensic Sciences.
Bailey, Sarah F; Scheible, Melissa K; Williams, Christopher; Silva, Deborah S B S; Hoggan, Marina; Eichman, Christopher; Faith, Seth A
2017-11-01
Next-generation Sequencing (NGS) is a rapidly evolving technology with demonstrated benefits for forensic genetic applications, and the strategies to analyze and manage the massive NGS datasets are currently in development. Here, the computing, data storage, connectivity, and security resources of the Cloud were evaluated as a model for forensic laboratory systems that produce NGS data. A complete front-to-end Cloud system was developed to upload, process, and interpret raw NGS data using a web browser dashboard. The system was extensible, demonstrating analysis capabilities of autosomal and Y-STRs from a variety of NGS instrumentation (Illumina MiniSeq and MiSeq, and Oxford Nanopore MinION). NGS data for STRs were concordant with standard reference materials previously characterized with capillary electrophoresis and Sanger sequencing. The computing power of the Cloud was implemented with on-demand auto-scaling to allow multiple file analysis in tandem. The system was designed to store resulting data in a relational database, amenable to downstream sample interpretations and databasing applications following the most recent guidelines in nomenclature for sequenced alleles. Lastly, a multi-layered Cloud security architecture was tested and showed that industry standards for securing data and computing resources were readily applied to the NGS system without disadvantageous effects for bioinformatic analysis, connectivity or data storage/retrieval. The results of this study demonstrate the feasibility of using Cloud-based systems for secured NGS data analysis, storage, databasing, and multi-user distributed connectivity. Copyright © 2017 Elsevier B.V. All rights reserved.
The design of an m-Health monitoring system based on a cloud computing platform
NASA Astrophysics Data System (ADS)
Xu, Boyi; Xu, Lida; Cai, Hongming; Jiang, Lihong; Luo, Yang; Gu, Yizhi
2017-01-01
Compared to traditional medical services provided within hospitals, m-Health monitoring systems (MHMSs) face more challenges in personalised health data processing. To achieve personalised and high-quality health monitoring by means of new technologies, such as mobile network and cloud computing, in this paper, a framework of an m-Health monitoring system based on a cloud computing platform (Cloud-MHMS) is designed to implement pervasive health monitoring. Furthermore, the modules of the framework, which are Cloud Storage and Multiple Tenants Access Control Layer, Healthcare Data Annotation Layer, and Healthcare Data Analysis Layer, are discussed. In the data storage layer, a multiple tenant access method is designed to protect patient privacy. In the data annotation layer, linked open data are adopted to augment health data interoperability semantically. In the data analysis layer, the process mining algorithm and similarity calculating method are implemented to support personalised treatment plan selection. These three modules cooperate to implement the core functions in the process of health monitoring, which are data storage, data processing, and data analysis. Finally, we study the application of our architecture in the monitoring of antimicrobial drug usage to demonstrate the usability of our method in personal healthcare analysis.
A protect solution for data security in mobile cloud storage
NASA Astrophysics Data System (ADS)
Yu, Xiaojun; Wen, Qiaoyan
2013-03-01
It is popular to access cloud storage from mobile devices. However, this usage suffers from data security risks, especially data leakage and privacy violations. The risk exists not only in the cloud storage system, but also on the mobile client platform. To reduce the security risk, this paper proposes a new security solution that makes full use of searchable encryption and trusted computing technology. Given the performance limits of mobile devices, it proposes a trusted-proxy-based protection architecture. The basic design ideas, deployment model, and key flows are detailed. Analysis of both security and performance shows the advantages of the solution.
Storage quality-of-service in cloud-based scientific environments: a standardization approach
NASA Astrophysics Data System (ADS)
Millar, Paul; Fuhrmann, Patrick; Hardt, Marcus; Ertl, Benjamin; Brzezniak, Maciej
2017-10-01
When preparing the Data Management Plan for larger scientific endeavors, PIs have to balance the most appropriate qualities of storage space along the planned data life-cycle against its price and the available funding. Storage properties can be the media type, implicitly determining access latency and durability of stored data, the number and locality of replicas, as well as available access protocols or authentication mechanisms. Negotiations between the scientific community and the responsible infrastructures generally happen upfront, where the amount of storage space, the media types (disk, tape, and SSD), and the foreseeable data life-cycles are negotiated. With the introduction of cloud management platforms, both in computing and storage, resources can be brokered to achieve the best price per unit of a given quality. However, in order to allow the platform orchestrator to programmatically negotiate the most appropriate resources, a standard vocabulary for the different properties of resources and a commonly agreed protocol to communicate them have to be available. In order to agree on a basic vocabulary for storage space properties, the storage infrastructure group in INDIGO-DataCloud, together with INDIGO-associated and external scientific groups, created a working group under the umbrella of the Research Data Alliance (RDA). As the communication protocol to query and negotiate storage qualities, the Cloud Data Management Interface (CDMI) has been selected. Necessary extensions to CDMI are defined in regular meetings between INDIGO and the Storage Networking Industry Association (SNIA). Furthermore, INDIGO is contributing to the SNIA CDMI reference implementation as the basis for interfacing the various storage systems in INDIGO to the agreed protocol and to provide an official open-source skeleton for systems not maintained by INDIGO partners.
Cloud Computing for radiologists.
Kharat, Amit T; Safvi, Amjad; Thind, Ss; Singh, Amarjit
2012-07-01
Cloud computing is a concept wherein a computer grid is created using the Internet with the sole purpose of utilizing shared resources such as computer software, hardware, on a pay-per-use model. Using Cloud computing, radiology users can efficiently manage multimodality imaging units by using the latest software and hardware without paying huge upfront costs. Cloud computing systems usually work on public, private, hybrid, or community models. Using the various components of a Cloud, such as applications, client, infrastructure, storage, services, and processing power, Cloud computing can help imaging units rapidly scale and descale operations and avoid huge spending on maintenance of costly applications and storage. Cloud computing allows flexibility in imaging. It sets free radiology from the confines of a hospital and creates a virtual mobile office. The downsides to Cloud computing involve security and privacy issues which need to be addressed to ensure the success of Cloud computing in the future.
Context-aware distributed cloud computing using CloudScheduler
NASA Astrophysics Data System (ADS)
Seuster, R.; Leavett-Brown, CR; Casteels, K.; Driemel, C.; Paterson, M.; Ring, D.; Sobie, RJ; Taylor, RP; Weldon, J.
2017-10-01
The distributed cloud using the CloudScheduler VM provisioning service is one of the longest running systems for HEP workloads. It has run millions of jobs for ATLAS and Belle II over the past few years using private and commercial clouds around the world. Our goal is to scale the distributed cloud to the 10,000-core level, with the ability to run any type of application (low I/O, high I/O and high memory) on any cloud. To achieve this goal, we have been implementing changes that utilize context-aware computing designs that are currently employed in the mobile communication industry. Context-awareness makes use of real-time and archived data to respond to user or system requirements. In our distributed cloud, we have many opportunistic clouds with no local HEP services, software or storage repositories. A context-aware design significantly improves the reliability and performance of our system by locating the nearest location of the required services. We describe how we are collecting and managing contextual information from our workload management systems, the clouds, the virtual machines and our services. This information is used not only to monitor the system but also to carry out automated corrective actions. We are incrementally adding new alerting and response services to our distributed cloud. This will enable us to scale the number of clouds and virtual machines. Further, a context-aware design will enable us to run analysis or high I/O application on opportunistic clouds. We envisage an open-source HTTP data federation (for example, the DynaFed system at CERN) as a service that would provide us access to existing storage elements used by the HEP experiments.
Cloud Based Educational Systems and Its Challenges and Opportunities and Issues
ERIC Educational Resources Information Center
Paul, Prantosh Kr.; Lata Dangwal, Kiran
2014-01-01
Cloud Computing (CC) is a set of hardware, software, networks, storage, and services that an interface combines to deliver aspects of computing as a service. Cloud Computing (CC) uses central remote servers to maintain data and applications. Practically, Cloud Computing (CC) is an extension of Grid computing with independency and…
Research on Key Technologies of Cloud Computing
NASA Astrophysics Data System (ADS)
Zhang, Shufen; Yan, Hongcan; Chen, Xuebin
With the development of multi-core processors, virtualization, distributed storage, broadband Internet, and automatic management, a new computing mode named cloud computing has emerged. It distributes computation tasks over a resource pool consisting of massive numbers of computers, so application systems can obtain computing power, storage space, and software services according to demand. It can concentrate all the computing resources and manage them automatically through software without human intervention. This frees application providers from worrying about tedious details and lets them concentrate on their business. It is advantageous to innovation and reduces cost. The ultimate goal of cloud computing is to provide computation, services, and applications as a public facility, so that people can use computing resources just like water, electricity, gas, and the telephone. Currently, the understanding of cloud computing is developing and changing constantly, and cloud computing still has no unanimous definition. This paper describes three main service forms of cloud computing: SaaS, PaaS, and IaaS; compares the definitions of cloud computing given by Google, Amazon, IBM, and other companies; summarizes the basic characteristics of cloud computing; and emphasizes key technologies such as data storage, data management, virtualization, and the programming model.
NASA Astrophysics Data System (ADS)
Weeden, R.; Horn, W. B.; Dimarchi, H.; Arko, S. A.; Hogenson, K.
2017-12-01
A problem often faced by Earth science researchers is how to scale algorithms that were developed against a few datasets up to regional or global scales. This problem only gets worse as we look to a future with larger and larger datasets becoming available. One significant hurdle can be having the processing and storage resources available for such a task, not to mention the administration of those resources. As a processing environment, the cloud offers nearly unlimited potential for compute and storage, with limited administration required. The goal of the Hybrid Pluggable Processing Pipeline (HyP3) project was to demonstrate the utility of the Amazon cloud to process large amounts of data quickly and cost effectively. Principally built by three undergraduate students at the ASF DAAC, the HyP3 system relies on core Amazon cloud services such as Lambda, Relational Database Service (RDS), Elastic Compute Cloud (EC2), Simple Storage Service (S3), and Elastic Beanstalk. HyP3 provides an Application Programming Interface (API) through which users can programmatically interface with the HyP3 system, allowing them to monitor and control processing jobs running in HyP3 and retrieve the generated HyP3 products when completed. This presentation will focus on the development techniques and enabling technologies that were used in developing the HyP3 system. Data and process flow, from new subscription through to order completion, will be shown, highlighting the benefits of the cloud for each step. Because the HyP3 system can be accessed directly from a user's Python scripts, powerful applications leveraging SAR products can be put together fairly easily. This is the true power of HyP3: allowing people to programmatically leverage the power of the cloud.
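The scripted interaction described above might look like the sketch below. The endpoint paths, parameters, and response fields are hypothetical placeholders rather than the actual HyP3 API.

```python
# Hypothetical job-submission/polling client; endpoints and fields are made up.
import time
import requests

API = "https://hyp3.example.org/api"      # placeholder base URL
AUTH = {"Authorization": "Bearer <token>"}


def submit_job(granule):
    resp = requests.post(f"{API}/jobs", json={"granule": granule}, headers=AUTH)
    resp.raise_for_status()
    return resp.json()["job_id"]


def wait_and_fetch(job_id, out_path):
    while True:
        job = requests.get(f"{API}/jobs/{job_id}", headers=AUTH).json()
        if job["status"] == "SUCCEEDED":
            product = requests.get(job["product_url"], headers=AUTH)
            with open(out_path, "wb") as f:
                f.write(product.content)
            return out_path
        time.sleep(60)                    # poll until processing finishes
```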
Mobile healthcare information management utilizing Cloud Computing and Android OS.
Doukas, Charalampos; Pliakas, Thomas; Maglogiannis, Ilias
2010-01-01
Cloud Computing provides functionality for managing information data in a distributed, ubiquitous and pervasive manner supporting several platforms, systems and applications. This work presents the implementation of a mobile system that enables electronic healthcare data storage, update and retrieval using Cloud Computing. The mobile application is developed using Google's Android operating system and provides management of patient health records and medical images (supporting DICOM format and JPEG2000 coding). The developed system has been evaluated using the Amazon's S3 cloud service. This article summarizes the implementation details and presents initial results of the system in practice.
Optimizing the Use of Storage Systems Provided by Cloud Computing Environments
NASA Astrophysics Data System (ADS)
Gallagher, J. H.; Potter, N.; Byrne, D. A.; Ogata, J.; Relph, J.
2013-12-01
Cloud computing systems present a set of features that include familiar computing resources (albeit augmented to support dynamic scaling of processing power) bundled with a mix of conventional and unconventional storage systems. The Linux base on which many Cloud environments (e.g., Amazon) are built makes it tempting to assume that any Unix software will run efficiently in this environment without change. OPeNDAP and NODC collaborated on a short project to explore how the S3 and Glacier storage systems provided by the Amazon Cloud Computing infrastructure could be used with a data server developed primarily to access data stored in a traditional Unix file system. Our work used the Amazon cloud system, but we strove for designs that could be adapted easily to other systems like OpenStack. Lastly, we evaluated different architectures from a computer security perspective. We found that there are considerable issues associated with treating S3 as if it were a traditional file system, even though doing so is conceptually simple. These issues include performance penalties: a software tool that emulates a traditional file system to store data in S3 performs poorly when compared to storing data directly in S3. We also found there are important benefits beyond performance to ensuring that data written to S3 can be accessed directly without relying on a specific software tool. To provide a hierarchical organization to the data stored in S3, we wrote 'catalog' files, using XML. These catalog files map discrete files to S3 access keys. Like a traditional file system's directories, the catalogs can also contain references to other catalogs, providing a simple but effective hierarchy overlaid on top of S3's flat storage space. An added benefit of these catalogs is that they can be viewed in a web browser; our storage scheme provides both efficient access for the data server and access via a web browser. We also looked at the Glacier storage system and found that its response characteristics are very different from a traditional file system or database; it behaves like a near-line storage system. To be used by a traditional data server, the underlying access protocol must support asynchronous accesses. This is because the Glacier system takes a minimum of four hours to deliver any data object, so systems built with the expectation of instant access (i.e., most web systems) must be fundamentally changed to use Glacier. Part of a related project has been to develop an asynchronous access mode for OPeNDAP, and we have developed a design using that new addition to the DAP protocol with Glacier as a near-line mass store. In summary, we found that both S3 and Glacier require special treatment to be used effectively by a data server. It is important to add (new) interfaces to data servers that enable them to use these storage devices through their native interfaces. We also found that our designs could easily map to a cloud environment based on OpenStack. Lastly, we noted that while these designs invite more liberal use of remote references for data objects, that can expose software to new security risks.
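To make the catalog idea above concrete, here is a minimal sketch of what such an XML catalog and its reader might look like in Python. The element and attribute names (catalog, dataset, catalogRef, s3key) are assumptions chosen for illustration; the actual catalog schema used in the project is not given in the abstract.

```python
# Sketch of an XML 'catalog' mapping data files to S3 keys, with references to
# child catalogs to build a hierarchy over S3's flat namespace.
# Element/attribute names are assumptions, not the project's real schema.
import xml.etree.ElementTree as ET

def build_catalog(entries, children):
    """entries: {filename: s3_key}; children: list of child-catalog S3 keys."""
    root = ET.Element("catalog")
    for name, key in entries.items():
        ET.SubElement(root, "dataset", name=name, s3key=key)
    for child in children:
        ET.SubElement(root, "catalogRef", s3key=child)
    return ET.tostring(root, encoding="unicode")

def read_catalog(xml_text):
    """Return (datasets, child catalogs) from a catalog document."""
    root = ET.fromstring(xml_text)
    datasets = {d.get("name"): d.get("s3key") for d in root.findall("dataset")}
    children = [c.get("s3key") for c in root.findall("catalogRef")]
    return datasets, children

xml_text = build_catalog({"sst_2013.nc": "data/sst_2013.nc"},
                         ["catalogs/2012.xml"])
print(read_catalog(xml_text))
```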
dCache, Sync-and-Share for Big Data
NASA Astrophysics Data System (ADS)
Millar, AP; Fuhrmann, P.; Mkrtchyan, T.; Behrmann, G.; Bernardt, C.; Buchholz, Q.; Guelzow, V.; Litvintsev, D.; Schwank, K.; Rossi, A.; van der Reest, P.
2015-12-01
The availability of cheap, easy-to-use sync-and-share cloud services has split the scientific storage world into the traditional big data management systems and the very attractive sync-and-share services. With the former, the location of data is well understood, while the latter is mostly operated in the Cloud, resulting in a rather complex legal situation. Besides legal issues, those two worlds have little overlap in user authentication and access protocols. While traditional storage technologies, popular in HEP, are based on X.509, cloud services and sync-and-share software technologies are generally based on username/password authentication or mechanisms like SAML or OpenID Connect. Similarly, the data access models offered by the two are somewhat different, with sync-and-share services often using proprietary protocols. As both approaches are very attractive, dCache.org developed a hybrid system, providing the best of both worlds. To avoid reinventing the wheel, dCache.org decided to embed another Open Source project: OwnCloud. This offers the required modern access capabilities but does not support the managed data functionality needed for large-capacity data storage. With this hybrid system, scientists can share files and synchronize their data with laptops or mobile devices as easily as with any other cloud storage service. On top of this, the same data can be accessed via established mechanisms, like GridFTP to serve the Globus Transfer Service or the WLCG FTS3 tool, or the data can be made available to worker nodes or HPC applications via a mounted filesystem. As dCache provides a flexible authentication module, the same user can access their storage via different authentication mechanisms, e.g., X.509 and SAML. Additionally, users can specify the desired quality of service or trigger media transitions as necessary, thus tuning data access latency to the planned access profile. Such features are a natural consequence of using dCache. We will describe the design of the hybrid dCache/OwnCloud system, report on several months of operations experience running it at DESY, and elucidate the future road-map.
NASA Astrophysics Data System (ADS)
Nguyen, L.; Chee, T.; Minnis, P.; Spangenberg, D.; Ayers, J. K.; Palikonda, R.; Vakhnin, A.; Dubois, R.; Murphy, P. R.
2014-12-01
The processing, storage and dissemination of satellite cloud and radiation products produced at NASA Langley Research Center are key activities for the Climate Science Branch. A constellation of systems operates in sync to accomplish these goals. Because of the complexity involved with operating such intricate systems, there are both high failure rates and high costs for hardware and system maintenance. Cloud computing has the potential to ameliorate cost and complexity issues. Over time, the cloud computing model has evolved and hybrid systems comprising off-site as well as on-site resources are now common. Towards our mission of providing the highest quality research products to the widest audience, we have explored the use of the Amazon Web Services (AWS) Cloud and Storage and present a case study of our results and efforts. This project builds upon NASA Langley Cloud and Radiation Group's experience with operating large and complex computing infrastructures in a reliable and cost effective manner to explore novel ways to leverage cloud computing resources in the atmospheric science environment. Our case study presents the project requirements and then examines the fit of AWS with the LaRC computing model. We also discuss the evaluation metrics, feasibility, and outcomes and close the case study with the lessons we learned that would apply to others interested in exploring the implementation of the AWS system in their own atmospheric science computing environments.
Performance, Agility and Cost of Cloud Computing Services for NASA GES DISC Giovanni Application
NASA Astrophysics Data System (ADS)
Pham, L.; Chen, A.; Wharton, S.; Winter, E. L.; Lynnes, C.
2013-12-01
The NASA Goddard Earth Science Data and Information Services Center (GES DISC) is investigating the performance, agility and cost of Cloud computing for GES DISC applications. Giovanni (Geospatial Interactive Online Visualization ANd aNalysis Infrastructure), one of the core applications at the GES DISC for online climate-related Earth science data access, subsetting, analysis, visualization, and downloading, was used to evaluate the feasibility and effort of porting an application to the Amazon Cloud Services platform. The performance and the cost of running Giovanni on the Amazon Cloud were compared to similar parameters for the GES DISC local operational system. A Giovanni time-series analysis of aerosol absorption optical depth (388 nm) from OMI (Ozone Monitoring Instrument)/Aura was selected for these comparisons. All required data were pre-cached in both the Cloud and the local system to avoid data transfer delays. The 3-, 6-, 12-, and 24-month data were used for analysis on both the Cloud and the local system, and the processing times for the analysis were used to evaluate system performance. To investigate application agility, Giovanni was installed and tested on multiple Cloud platforms. The cost of using a Cloud computing platform mainly consists of computing, storage, data requests, and data transfer in/out. The Cloud computing cost is calculated based on an hourly rate, and the storage cost is calculated based on a rate per Gigabyte per month. Incoming data transfer is free, and data transfer out is charged at a per-Gigabyte rate. The costs for a local server system consist of buying hardware/software, system maintenance/updating, and operating costs. The results showed that the Cloud platform had 38% better performance and cost 36% less than the local system. This investigation shows the potential of cloud computing to increase system performance and lower the overall cost of system management.
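The cost components listed above (hourly compute, per-GB-month storage, free ingress, per-GB egress) lend themselves to a simple worked estimate, sketched below. The rates are placeholder values chosen only to show the arithmetic; they are not the AWS prices used in the study, and the study's 36% result cannot be reproduced from them.

```python
# Back-of-the-envelope cloud cost estimate following the cost components above:
# compute by the hour, storage by GB-month, free data in, per-GB data out.
# All rates are assumed placeholder values, not actual AWS pricing.
HOURLY_COMPUTE = 0.10      # $/instance-hour (assumed)
STORAGE_GB_MONTH = 0.03    # $/GB-month (assumed)
EGRESS_PER_GB = 0.09       # $/GB transferred out (assumed)

def monthly_cost(instance_hours, storage_gb, egress_gb):
    return (instance_hours * HOURLY_COMPUTE
            + storage_gb * STORAGE_GB_MONTH
            + egress_gb * EGRESS_PER_GB)   # incoming transfer is free

# e.g. one instance running 200 h, 500 GB of cached data, 50 GB of downloads
print(f"${monthly_cost(200, 500, 50):.2f} per month")
```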
Challenges in Securing the Interface Between the Cloud and Pervasive Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lagesse, Brent J
2011-01-01
Cloud computing presents an opportunity for pervasive systems to leverage computational and storage resources to accomplish tasks that would not normally be possible on such resource-constrained devices. Cloud computing can enable hardware designers to build lighter systems that last longer and are more mobile. Despite the advantages cloud computing offers to the designers of pervasive systems, there are some limitations of leveraging cloud computing that must be addressed. We take the position that cloud-based pervasive systems must be secured holistically and discuss ways this might be accomplished. In this paper, we discuss a pervasive system utilizing cloud computing resources and issues that must be addressed in such a system. In this system, the user's mobile device cannot always have network access to leverage resources from the cloud, so it must make intelligent decisions about what data should be stored locally and what processes should be run locally. As a result of these decisions, the user becomes vulnerable to attacks while interfacing with the pervasive system.
Integration of XRootD into the cloud infrastructure for ALICE data analysis
NASA Astrophysics Data System (ADS)
Kompaniets, Mikhail; Shadura, Oksana; Svirin, Pavlo; Yurchenko, Volodymyr; Zarochentsev, Andrey
2015-12-01
Cloud technologies allow easy load balancing between different tasks and projects. From the viewpoint of data analysis in the ALICE experiment, the cloud makes it possible to deploy software using the CERN Virtual Machine (CernVM) and the CernVM File System (CVMFS), to run different (including outdated) versions of software for long-term data preservation, and to dynamically allocate resources to different computing activities, e.g. a grid site, the ALICE Analysis Facility (AAF), and possible use by local projects or other LHC experiments. We present a cloud solution for Tier-3 sites based on OpenStack and Ceph distributed storage with an integrated XRootD-based storage element (SE). A key feature of the solution is that Ceph is used as the backend for the Cinder Block Storage service for OpenStack and, at the same time, as the storage backend for XRootD, with redundancy and availability of data preserved by the Ceph settings. For faster and easier OpenStack deployment, the Packstack solution, which is based on the Puppet configuration management system, was applied. Ceph installation and configuration operations are structured, converted into Puppet manifests describing node configurations, and integrated into Packstack. This solution can be easily deployed, maintained and used even by small groups with limited computing resources and by small organizations, which usually lack IT support. The proposed infrastructure has been tested on two different clouds (SPbSU & BITP) and integrates successfully with the ALICE data analysis model.
Enabling Large-Scale Biomedical Analysis in the Cloud
Lin, Ying-Chih; Yu, Chin-Sheng; Lin, Yen-Jen
2013-01-01
Recent progress in high-throughput instrumentation has led to an astonishing growth in both the volume and the complexity of biomedical data collected from various sources. This planet-scale data brings serious challenges to storage and computing technologies. Cloud computing is an alternative for tackling the problem because it addresses both scalable storage and high-performance computing on large-scale data. This work briefly introduces data-intensive computing systems and summarizes existing cloud-based resources in bioinformatics. These developments and applications should help biomedical research make the vast amount of diverse data meaningful and usable. PMID:24288665
Efficient secure-channel free public key encryption with keyword search for EMRs in cloud storage.
Guo, Lifeng; Yau, Wei-Chuen
2015-02-01
Searchable encryption is an important cryptographic primitive that enables privacy-preserving keyword search on encrypted electronic medical records (EMRs) in cloud storage. Efficiency of such searchable encryption in a medical cloud storage system is very crucial as it involves client platforms such as smartphones or tablets that only have constrained computing power and resources. In this paper, we propose an efficient secure-channel free public key encryption with keyword search (SCF-PEKS) scheme that is proven secure in the standard model. We show that our SCF-PEKS scheme is not only secure against chosen keyword and ciphertext attacks (IND-SCF-CKCA), but also secure against keyword guessing attacks (IND-KGA). Furthermore, our proposed scheme is more efficient than other recent SCF-PEKS schemes in the literature.
A Secure and Efficient Audit Mechanism for Dynamic Shared Data in Cloud Storage
2014-01-01
With the popularization of cloud services, multiple users easily share and update their data through cloud storage. To ensure data integrity and consistency in cloud storage, audit mechanisms have been proposed. However, existing approaches have some security vulnerabilities and require a lot of computational overhead. This paper proposes a secure and efficient audit mechanism for dynamic shared data in cloud storage. The proposed scheme prevents a malicious cloud service provider from deceiving an auditor. Moreover, it devises a new index table management method and reduces the auditing cost by employing less complex operations. We prove the resistance against some attacks and show lower computation cost and shorter auditing time when compared with conventional approaches. The results show that the proposed scheme is secure and efficient for cloud storage services managing dynamic shared data. PMID:24959630
A secure and efficient audit mechanism for dynamic shared data in cloud storage.
Kwon, Ohmin; Koo, Dongyoung; Shin, Yongjoo; Yoon, Hyunsoo
2014-01-01
With the popularization of cloud services, multiple users easily share and update their data through cloud storage. To ensure data integrity and consistency in cloud storage, audit mechanisms have been proposed. However, existing approaches have some security vulnerabilities and require a lot of computational overhead. This paper proposes a secure and efficient audit mechanism for dynamic shared data in cloud storage. The proposed scheme prevents a malicious cloud service provider from deceiving an auditor. Moreover, it devises a new index table management method and reduces the auditing cost by employing less complex operations. We prove the resistance against some attacks and show lower computation cost and shorter auditing time when compared with conventional approaches. The results show that the proposed scheme is secure and efficient for cloud storage services managing dynamic shared data.
BlueSky Cloud Framework: An E-Learning Framework Embracing Cloud Computing
NASA Astrophysics Data System (ADS)
Dong, Bo; Zheng, Qinghua; Qiao, Mu; Shu, Jian; Yang, Jie
Currently, E-Learning has grown into a widely accepted way of learning. With the huge growth of users, services, education content and resources, E-Learning systems are facing the challenges of optimizing resource allocation, dealing with dynamic concurrency demands, handling rapid storage growth requirements and controlling costs. In this paper, an E-Learning framework based on cloud computing, namely the BlueSky cloud framework, is presented. In particular, the architecture and core components of the BlueSky cloud framework are introduced. In the BlueSky cloud framework, physical machines are virtualized and allocated on demand for E-Learning systems. Moreover, the BlueSky cloud framework incorporates traditional middleware functions (such as load balancing and data caching) to serve E-Learning systems as a general architecture. It delivers reliable, scalable and cost-efficient services to E-Learning systems, and E-Learning organizations can establish systems through these services in a simple way. The BlueSky cloud framework addresses the challenges faced by E-Learning and improves the performance, availability and scalability of E-Learning systems.
Investigation of Storage Options for Scientific Computing on Grid and Cloud Facilities
NASA Astrophysics Data System (ADS)
Garzoglio, Gabriele
2012-12-01
In recent years, several new storage technologies, such as Lustre, Hadoop, OrangeFS, and BlueArc, have emerged. While several groups have run benchmarks to characterize them under a variety of configurations, more work is needed to evaluate these technologies for the use cases of scientific computing on Grid clusters and Cloud facilities. This paper discusses our evaluation of the technologies as deployed on a test bed at FermiCloud, one of the Fermilab infrastructure-as-a-service Cloud facilities. The test bed consists of 4 server-class nodes with 40 TB of disk space and up to 50 virtual machine clients, some running on the storage server nodes themselves. With this configuration, the evaluation compares the performance of some of these technologies when deployed on virtual machines and on “bare metal” nodes. In addition to running standard benchmarks such as IOZone to check the sanity of our installation, we have run I/O intensive tests using physics-analysis applications. This paper presents how the storage solutions perform in a variety of realistic use cases of scientific computing. One interesting difference among the storage systems tested is found in a decrease in total read throughput with increasing number of client processes, which occurs in some implementations but not others.
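One way to reproduce the kind of scaling observation mentioned at the end of the abstract (total read throughput versus the number of client processes) is the small harness below. The mount path, block size, and client counts are placeholders, and the real evaluation used IOZone and physics-analysis applications rather than this toy reader.

```python
# Sketch of a scaling test: measure aggregate read throughput while increasing
# the number of concurrent client processes reading from a mounted storage
# system. Path, block size, and client counts are assumed placeholders.
import os, time
from multiprocessing import Pool

TEST_FILE = "/mnt/storage/testfile"   # hypothetical mount of the storage under test
BLOCK = 1 << 20                       # 1 MiB reads

def read_all(_):
    """Read the whole test file sequentially; return bytes read."""
    n = 0
    with open(TEST_FILE, "rb") as f:
        while True:
            chunk = f.read(BLOCK)
            if not chunk:
                return n
            n += len(chunk)

def aggregate_throughput(clients):
    """Total MB/s achieved by `clients` concurrent reader processes."""
    start = time.time()
    with Pool(clients) as pool:
        total = sum(pool.map(read_all, range(clients)))
    return total / (time.time() - start) / 1e6

if __name__ == "__main__" and os.path.exists(TEST_FILE):
    for clients in (1, 2, 4, 8, 16):
        print(clients, "clients:", round(aggregate_throughput(clients)), "MB/s")
```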
Integration of end-user Cloud storage for CMS analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Riahi, Hassen; Aimar, Alberto; Ayllon, Alejandro Alvarez
End-user Cloud storage is increasing rapidly in popularity in research communities thanks to the collaboration capabilities it offers, namely synchronisation and sharing. CERN IT has implemented a model of such storage named CERNBox, integrated with the CERN AuthN and AuthZ services. To exploit the use of end-user Cloud storage for the distributed data analysis activity, the CMS experiment has started the integration of CERNBox as a Grid resource. This will allow CMS users to make use of their own storage in the Cloud for their analysis activities as well as to benefit from synchronisation and sharing capabilities to achieve results faster and more effectively. It will provide an integration model of Cloud storage in the Grid, which is implemented and commissioned over the world's largest computing Grid infrastructure, the Worldwide LHC Computing Grid (WLCG). In this paper, we present the integration strategy and infrastructure changes needed in order to transparently integrate end-user Cloud storage with the CMS distributed computing model. We describe the new challenges faced in data management between Grid and Cloud and how they were addressed, along with details of the support for Cloud storage recently introduced into the WLCG data movement middleware, FTS3. Finally, the commissioning experience of CERNBox for the distributed data analysis activity is also presented.
Integration of end-user Cloud storage for CMS analysis
Riahi, Hassen; Aimar, Alberto; Ayllon, Alejandro Alvarez; ...
2017-05-19
End-user Cloud storage is increasing rapidly in popularity in research communities thanks to the collaboration capabilities it offers, namely synchronisation and sharing. CERN IT has implemented a model of such storage named CERNBox, integrated with the CERN AuthN and AuthZ services. To exploit the use of end-user Cloud storage for the distributed data analysis activity, the CMS experiment has started the integration of CERNBox as a Grid resource. This will allow CMS users to make use of their own storage in the Cloud for their analysis activities as well as to benefit from synchronisation and sharing capabilities to achieve results faster and more effectively. It will provide an integration model of Cloud storage in the Grid, which is implemented and commissioned over the world's largest computing Grid infrastructure, the Worldwide LHC Computing Grid (WLCG). In this paper, we present the integration strategy and infrastructure changes needed in order to transparently integrate end-user Cloud storage with the CMS distributed computing model. We describe the new challenges faced in data management between Grid and Cloud and how they were addressed, along with details of the support for Cloud storage recently introduced into the WLCG data movement middleware, FTS3. Finally, the commissioning experience of CERNBox for the distributed data analysis activity is also presented.
NASA Astrophysics Data System (ADS)
Morikawa, Y.; Murata, K. T.; Watari, S.; Kato, H.; Yamamoto, K.; Inoue, S.; Tsubouchi, K.; Fukazawa, K.; Kimura, E.; Tatebe, O.; Shimojo, S.
2010-12-01
The main methodologies of Solar-Terrestrial Physics (STP) to date have been theoretical, experimental/observational, and computer simulation approaches. Recently, "informatics" has been anticipated as a new (fourth) approach to STP studies. Informatics is a methodology for analyzing large-scale data (observation data and computer simulation data) to obtain new findings using a variety of data processing techniques. At NICT (National Institute of Information and Communications Technology, Japan) we are now developing a new research environment named "OneSpaceNet". The OneSpaceNet is a cloud-computing environment specialized for scientific work, which connects many researchers over a high-speed network (JGN: Japan Gigabit Network). The JGN is a wide-area backbone network operated by NICT; it provides a 10G network and many access points (APs) across Japan. The OneSpaceNet also provides rich computer resources for research, such as supercomputers, large-scale data storage, licensed applications, visualization devices (like a tiled display wall: TDW), databases/DBMS, cluster computers (4-8 nodes) for data processing, and communication devices. A remarkable aspect of the science cloud is that a user only needs to prepare a terminal (a low-cost PC); once the PC is connected to JGN2plus, the user can make full use of the rich resources of the science cloud. Using communication devices such as video-conference systems, streaming and reflector servers, and media players, users on the OneSpaceNet can communicate as if they belonged to a single laboratory: they are members of a virtual laboratory. The specification of the computer resources on the OneSpaceNet is as follows. The data storage developed so far amounts to almost 1 PB. The number of data files managed on the cloud storage keeps growing and now exceeds 40,000,000. Notably, the disks forming the large-scale storage are distributed across 5 data centers over Japan, yet the storage system performs as one disk. There are three supercomputers allocated on the cloud, one in Tokyo, one in Osaka and the other in Nagoya. Simulation job data from any of the supercomputers are saved to the same directory on the cloud data storage; it is a kind of virtual computing environment. The tiled display wall has 36 panels acting as one display; its resolution is as large as 18000x4300 pixels. This is sufficient for previewing and analyzing large-scale computer simulation data, and it also allows many researchers to view multiple pictures (e.g., 100) together on one screen. In our talk we also present a brief report of initial results of using the OneSpaceNet for Global MHD simulations as an example of successful use of our science cloud: (i) ultra-high time resolution visualization of Global MHD simulations using the large-scale storage and parallel processing system on the cloud, (ii) a database of real-time Global MHD simulations and statistical analyses of the data, and (iii) a 3D Web service for Global MHD simulations.
Utilizing HDF4 File Content Maps for the Cloud
NASA Technical Reports Server (NTRS)
Lee, Hyokyung Joe
2016-01-01
We demonstrate a prototype study showing that the HDF4 file content map can be used to organize data efficiently in a cloud object storage system to facilitate cloud computing. This approach can be extended to any binary data format and to any existing big data analytics solution powered by cloud computing, because the HDF4 file content map project began as a long-term preservation effort for NASA data that does not require the HDF4 APIs to access the data.
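The claim that a content map allows access without the HDF4 APIs can be illustrated with a plain byte-range read against object storage. In the sketch below, the bucket name, object key, dataset name, and the offset/length pair are hypothetical; a real content map would be parsed from the mapping project's XML output and would also record datatype and layout information needed to decode the bytes.

```python
# Sketch of using a file content map (dataset -> byte offset/length) to read a
# dataset directly from object storage with a ranged GET, no HDF4 library.
# Bucket, key, dataset name, and map entries are assumptions for illustration.
import boto3

s3 = boto3.client("s3")
CONTENT_MAP = {
    # dataset name -> (byte offset, length) inside the original HDF4 file
    "Cloud_Fraction": (4096, 1048576),
}

def read_dataset(bucket, key, name):
    offset, length = CONTENT_MAP[name]
    resp = s3.get_object(Bucket=bucket, Key=key,
                         Range=f"bytes={offset}-{offset + length - 1}")
    return resp["Body"].read()   # raw bytes; decoding needs the map's type info

# Example usage (requires real credentials, bucket, and object):
# raw = read_dataset("my-archive-bucket", "granule.hdf", "Cloud_Fraction")
# print(len(raw), "bytes read")
```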
AstroCloud, a Cyber-Infrastructure for Astronomy Research: Cloud Computing Environments
NASA Astrophysics Data System (ADS)
Li, C.; Wang, J.; Cui, C.; He, B.; Fan, D.; Yang, Y.; Chen, J.; Zhang, H.; Yu, C.; Xiao, J.; Wang, C.; Cao, Z.; Fan, Y.; Hong, Z.; Li, S.; Mi, L.; Wan, W.; Wang, J.; Yin, S.
2015-09-01
AstroCloud is a cyber-infrastructure for astronomy research initiated by the Chinese Virtual Observatory (China-VO) under funding support from the NDRC (National Development and Reform Commission) and CAS (Chinese Academy of Sciences). Based on CloudStack, an open-source platform, we set up the cloud computing environment for the AstroCloud project. It consists of five distributed nodes across mainland China. Users can use and analyze data in this cloud computing environment. Based on GlusterFS, we built a scalable cloud storage system. Each user has a private space, which can be shared among different virtual machines and desktop systems. With this environment, astronomers can easily access astronomical data collected by different telescopes and data centers, and data producers can archive their datasets safely.
The Third International Cloud Condensation Nuclei Workshop. [conference
NASA Technical Reports Server (NTRS)
Kocmond, W. C.; Rogers, C. R. (Editor); Rea, S. W. (Editor)
1981-01-01
Twenty-five instruments were tested, including size characterization devices and two Aitken counters. The test aerosols were supplied to the instruments by an on-line generation system, thereby eliminating the need for storage bags. Cloud condensation chambers and haze chambers are highlighted.
Bio and health informatics meets cloud : BioVLab as an example.
Chae, Heejoon; Jung, Inuk; Lee, Hyungro; Marru, Suresh; Lee, Seong-Whan; Kim, Sun
2013-01-01
The exponential increase of genomic data brought by the advent of next- or third-generation sequencing (NGS) technologies and the dramatic drop in sequencing cost have turned biological and medical sciences into data-driven sciences. This revolutionary paradigm shift comes with challenges in the transfer, storage, computation, and analysis of big bio/medical data. Cloud computing is a service model sharing a pool of configurable resources, which makes it a suitable workbench to address these challenges. From the medical or biological perspective, providing computing power and storage is the most attractive feature of cloud computing for handling ever-increasing biological data. As data grow in size, many research organizations start to experience a lack of computing power, which becomes a major hurdle to achieving research goals. In this paper, we review the features of publicly available bio and health cloud systems in terms of graphical user interface, external data integration, security and extensibility of features. We then discuss issues and limitations of current cloud systems and conclude by suggesting the concept of a biological cloud environment, which can be defined as a total workbench environment assembling computational tools and databases for analyzing bio/medical big data in particular application domains.
What CFOs should know before venturing into the cloud.
Rajendran, Janakan
2013-05-01
There are three major trends in the use of cloud-based services for healthcare IT: Cloud computing involves the hosting of health IT applications in a service provider cloud. Cloud storage is a data storage service that can involve, for example, long-term storage and archival of information such as clinical data, medical images, and scanned documents. Data center colocation involves rental of secure space in the cloud from a vendor, an approach that allows a hospital to share power capacity and proven security protocols, reducing costs.
Point-Cloud Compression for Vehicle-Based Mobile Mapping Systems Using Portable Network Graphics
NASA Astrophysics Data System (ADS)
Kohira, K.; Masuda, H.
2017-09-01
A mobile mapping system is effective for capturing dense point-clouds of roads and roadside objects. Point-clouds of urban areas, residential areas, and arterial roads are useful for maintenance of infrastructure, map creation, and automatic driving. However, the data size of point-clouds measured over large areas is enormous. A large storage capacity is required to store such point-clouds, and heavy loads are placed on the network if point-clouds are transferred through it. Therefore, it is desirable to reduce the data size of point-clouds without deterioration of quality. In this research, we propose a novel point-cloud compression method for vehicle-based mobile mapping systems. In our compression method, point-clouds are mapped onto 2D pixels using GPS time and the parameters of the laser scanner. Then, the images are encoded in the Portable Network Graphics (PNG) format and compressed using the PNG algorithm. In our experiments, our method could efficiently compress point-clouds without deteriorating the quality.
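A much simplified sketch of the mapping step described above follows: points are arranged in a 2D grid indexed by GPS time (columns) and scanner angle (rows) and written out as a PNG, whose lossless compression then shrinks the result. The 8-bit range quantization, the 3600-row grid, and the synthetic data are assumptions made for brevity; the paper's method maps the scanner parameters without this loss of precision.

```python
# Toy illustration: map (GPS time, scan angle, range) points onto a 2D image
# and let PNG's lossless compression do the size reduction. The 8-bit range
# quantization and grid dimensions are simplifying assumptions.
import numpy as np
from PIL import Image

def points_to_image(times, angles, ranges, n_rows=3600):
    cols = np.unique(times, return_inverse=True)[1]          # one column per GPS time stamp
    rows = (angles / 360.0 * n_rows).astype(int) % n_rows     # firing angle -> row index
    img = np.zeros((n_rows, cols.max() + 1), dtype=np.uint8)
    r8 = np.clip(ranges / ranges.max() * 255, 0, 255).astype(np.uint8)
    img[rows, cols] = r8                                      # lossy 8-bit range, sketch only
    return img

# synthetic demo data: 100 time stamps x 50 firings per sweep
t = np.repeat(np.arange(100), 50)
a = np.tile(np.linspace(0, 359.9, 50), 100)
r = np.random.uniform(1, 80, t.size)
Image.fromarray(points_to_image(t, a, r)).save("scan.png")    # PNG handles compression
```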
A Combination Therapy of JO-I and Chemotherapy in Ovarian Cancer Models
2013-10-01
which consists of a 3PAR storage backend and is sharing data via a highly available NetApp storage gateway and 2 high throughput commodity storage...Environment is configured as self-service Enterprise cloud and currently hosts more than 700 virtual machines. The network infrastructure consists of...technology infrastructure and information system applications designed to integrate, automate, and standardize operations. These systems fuse state of
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duro, Francisco Rodrigo; Blas, Javier Garcia; Isaila, Florin
The increasing volume of scientific data and the limited scalability and performance of storage systems are currently presenting a significant limitation for the productivity of the scientific workflows running on both high-performance computing (HPC) and cloud platforms. Clearly needed is better integration of storage systems and workflow engines to address this problem. This paper presents and evaluates a novel solution that leverages codesign principles for integrating Hercules—an in-memory data store—with a workflow management system. We consider four main aspects: workflow representation, task scheduling, task placement, and task termination. As a result, the experimental evaluation on both cloud and HPC systems demonstrates significant performance and scalability improvements over existing state-of-the-art approaches.
Maratt, Joseph D; Srinivasan, Ramesh C; Dahl, William J; Schilling, Peter L; Urquhart, Andrew G
2012-08-01
As digital radiography becomes more prevalent, several systems for digital preoperative planning have become available. The purpose of this study was to evaluate the accuracy and efficiency of an inexpensive, cloud-based digital templating system, which is comparable with acetate templating. However, cloud-based templating is substantially faster and more convenient than acetate templating or locally installed software. Although this is a practical solution for this particular medical application, regulatory changes are necessary before the tremendous advantages of cloud-based storage and computing can be realized in medical research and clinical practice. Copyright 2012, SLACK Incorporated.
cadcVOFS: A FUSE Based File System Layer for VOSpace
NASA Astrophysics Data System (ADS)
Kavelaars, J.; Dowler, P.; Jenkins, D.; Hill, N.; Damian, A.
2012-09-01
The CADC is now making extensive use of the VOSpace protocol for user-managed storage. The VOSpace standard allows a diverse set of rich data services to be delivered to users via a simple protocol. We have recently developed cadcVOFS, a FUSE-based file-system layer for VOSpace. cadcVOFS provides a filesystem layer on top of VOSpace so that standard Unix tools (such as ‘find’, ‘emacs’, ‘awk’, etc.) can be used directly on the data objects stored in VOSpace. Once mounted, the VOSpace appears as a network storage volume inside the operating system. Within the CADC Cloud Computing project (CANFAR) we have used VOSpace as the method for retrieving and storing processing inputs and products. The abstraction of storage is an important component of Cloud Computing, and the high use level of our VOSpace service reflects this.
Partial Storage Optimization and Load Control Strategy of Cloud Data Centers
2015-01-01
We present a novel approach to solving cloud storage issues and provide a fast load balancing algorithm. Our approach is based on partitioning and concurrent dual-direction download of files from multiple cloud nodes. Partitions of the files are saved on the cloud rather than the full files, which provides a good optimization of cloud storage usage. Only partial replication is used in this algorithm to ensure the reliability and availability of the data. Our focus is to improve performance and optimize storage usage by providing DaaS on the cloud. This algorithm solves the problem of having to fully replicate large data sets, which uses up a lot of precious space on the cloud nodes. Reducing the space needed will help in reducing the cost of providing such space. Moreover, performance is also increased, since multiple cloud servers will collaborate to provide the data to the cloud clients in a faster manner. PMID:25973444
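The partitioning and dual-direction idea can be sketched as two concurrent ranged HTTP downloads that meet in the middle of the file, as below. The node URLs, file size, and the use of plain HTTP Range requests are illustrative assumptions; the paper's algorithm additionally decides which partitions to replicate on which cloud nodes.

```python
# Sketch of concurrent dual-direction download: one thread pulls the front half
# of a file from one cloud node while another pulls the back half from a second
# node. URLs and size are placeholders; servers must honor HTTP Range requests.
import requests
from threading import Thread

def fetch_range(url, start, end, out, key):
    r = requests.get(url, headers={"Range": f"bytes={start}-{end}"})
    r.raise_for_status()
    out[key] = r.content

def dual_download(url_front, url_back, size):
    mid = size // 2
    parts = {}
    t1 = Thread(target=fetch_range, args=(url_front, 0, mid - 1, parts, "front"))
    t2 = Thread(target=fetch_range, args=(url_back, mid, size - 1, parts, "back"))
    t1.start(); t2.start(); t1.join(); t2.join()
    return parts["front"] + parts["back"]

# Example usage (requires two real replica URLs and the known file size):
# data = dual_download("https://node-a.example.com/f.bin",
#                      "https://node-b.example.com/f.bin", size=10_000_000)
```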
Partial storage optimization and load control strategy of cloud data centers.
Al Nuaimi, Klaithem; Mohamed, Nader; Al Nuaimi, Mariam; Al-Jaroodi, Jameela
2015-01-01
We present a novel approach to solving cloud storage issues and provide a fast load balancing algorithm. Our approach is based on partitioning and concurrent dual-direction download of files from multiple cloud nodes. Partitions of the files are saved on the cloud rather than the full files, which provides a good optimization of cloud storage usage. Only partial replication is used in this algorithm to ensure the reliability and availability of the data. Our focus is to improve performance and optimize storage usage by providing DaaS on the cloud. This algorithm solves the problem of having to fully replicate large data sets, which uses up a lot of precious space on the cloud nodes. Reducing the space needed will help in reducing the cost of providing such space. Moreover, performance is also increased, since multiple cloud servers will collaborate to provide the data to the cloud clients in a faster manner.
Adventures in Private Cloud: Balancing Cost and Capability at the CloudSat Data Processing Center
NASA Astrophysics Data System (ADS)
Partain, P.; Finley, S.; Fluke, J.; Haynes, J. M.; Cronk, H. Q.; Miller, S. D.
2016-12-01
Since the beginning of the CloudSat Mission in 2006, the CloudSat Data Processing Center (DPC) at the Cooperative Institute for Research in the Atmosphere (CIRA) has been ingesting data from the satellite and other A-Train sensors, producing data products, and distributing them to researchers around the world. The computing infrastructure was specifically designed to fulfill the requirements as specified at the beginning of what was nominally a two-year mission. The environment consisted of servers dedicated to specific processing tasks in a rigid workflow to generate the required products. To the benefit of science and with credit to the mission engineers, CloudSat has lasted well beyond its planned lifetime and is still collecting data ten years later. Over that period the requirements on the data processing system have greatly expanded and opportunities for providing value-added services have presented themselves. But while demands on the system have increased, the initial design allowed for very little expansion in terms of scalability and flexibility. The design did change to include virtual machine processing nodes and distributed workflows, but infrastructure management was still a time-consuming task when system modification was required to run new tests or implement new processes. To address the scalability, flexibility, and manageability of the system, Cloud computing methods and technologies are now being employed. The use of a public cloud like Amazon Elastic Compute Cloud or Google Compute Engine was considered but, among other issues, data transfer and storage costs become a problem, especially when demand fluctuates as a result of reprocessing and the introduction of new products and services. Instead, the existing system was converted to an on-premises private Cloud using the OpenStack computing platform and Ceph software-defined storage to reap the benefits of the Cloud computing paradigm. This work details the decisions that were made, the benefits that have been realized, the difficulties that were encountered, and the issues that still exist.
Analysis and Research on Spatial Data Storage Model Based on Cloud Computing Platform
NASA Astrophysics Data System (ADS)
Hu, Yong
2017-12-01
In this paper, the data processing and storage characteristics of cloud computing are analyzed and studied. On this basis, a cloud computing data storage model based on a BP neural network is proposed. In this data storage model, the choice of server cluster is made according to the different attributes of the data, yielding a spatial data storage model with a load balancing function that has certain feasibility and application advantages.
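As a toy stand-in for the attribute-driven cluster selection described above, the sketch below runs data attributes through a small feed-forward network and picks the highest-scoring cluster. The attribute set, network size, and random weights are assumptions made purely to show the data flow; the paper's BP neural network would be trained by back-propagation on real workload data.

```python
# Toy stand-in for a BP-neural-network cluster selector: a small feed-forward
# network scores candidate server clusters from data attributes. Weights are
# random here purely to show the data flow; a real model would be trained.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 8)), np.zeros(8)     # input attributes -> hidden
W2, b2 = rng.normal(size=(8, 4)), np.zeros(4)     # hidden -> one score per cluster

def choose_cluster(attrs):
    """attrs: [size_gb, accesses_per_day, bbox_area]; returns a cluster index."""
    h = np.tanh(attrs @ W1 + b1)
    scores = h @ W2 + b2
    return int(np.argmax(scores))

print(choose_cluster(np.array([12.0, 340.0, 2.5])))
```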
Zhu, Lingyun; Li, Lianjie; Meng, Chunyan
2014-12-01
There have been problems in existing multiple-physiological-parameter real-time monitoring systems, such as insufficient server capacity for physiological data storage and analysis (so that data consistency cannot be guaranteed), poor real-time performance, and other issues caused by the growing scale of data. We therefore proposed a new solution for multiple physiological parameters with clustered background data storage and processing based on cloud computing. Through our studies, a batch process for longitudinal analysis of patients' historical data was introduced. The process included resource virtualization in the IaaS layer of the cloud platform, construction of a real-time computing platform in the PaaS layer, reception and analysis of data streams in the SaaS layer, and the bottleneck problem of multi-parameter data transmission. The result was real-time transmission of physiological information together with storage and analysis of a large amount of data. The simulation test results showed that the remote multiple-physiological-parameter monitoring system based on the cloud platform had obvious advantages in processing time and load balancing over the traditional server model. This architecture solved problems that exist in traditional remote medical services, including long turnaround time, poor real-time analysis performance, and lack of extensibility. Technical support was thus provided to facilitate a "wearable wireless sensor plus mobile wireless transmission plus cloud computing service" mode moving towards home health monitoring of multiple physiological parameters.
Real-time terrain storage generation from multiple sensors towards mobile robot operation interface.
Song, Wei; Cho, Seoungjae; Xi, Yulong; Cho, Kyungeun; Um, Kyhyun
2014-01-01
A mobile robot mounted with multiple sensors is used to rapidly collect 3D point clouds and video images so as to allow accurate terrain modeling. In this study, we develop a real-time terrain storage generation and representation system including a nonground point database (PDB), ground mesh database (MDB), and texture database (TDB). A voxel-based flag map is proposed for incrementally registering large-scale point clouds in a terrain model in real time. We quantize the 3D point clouds into 3D grids of the flag map as a comparative table in order to remove the redundant points. We integrate the large-scale 3D point clouds into a nonground PDB and a node-based terrain mesh using the CPU. Subsequently, we program a graphics processing unit (GPU) to generate the TDB by mapping the triangles in the terrain mesh onto the captured video images. Finally, we produce a nonground voxel map and a ground textured mesh as a terrain reconstruction result. Our proposed methods were tested in an outdoor environment. Our results show that the proposed system was able to rapidly generate terrain storage and provide high resolution terrain representation for mobile mapping services and a graphical user interface between remote operators and mobile robots.
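A minimal sketch of the voxel-based flag map described above follows: incoming points are quantized to a 3D grid and only the first point per voxel is kept, removing redundant points before they enter the nonground PDB. The 0.1 m voxel size and the Python set standing in for the flag map are assumptions made for clarity.

```python
# Minimal sketch of a voxel-based flag map: quantize points to a 3D grid and
# keep only the first point landing in each previously empty voxel.
# Voxel size and the use of a Python set are simplifying assumptions.
import numpy as np

VOXEL = 0.1  # metres (assumed)

class FlagMap:
    def __init__(self):
        self.filled = set()                      # occupied voxel indices

    def register(self, points):
        """points: (N, 3) array; returns the non-redundant subset."""
        keys = np.floor(points / VOXEL).astype(np.int64)
        keep = []
        for p, k in zip(points, map(tuple, keys)):
            if k not in self.filled:
                self.filled.add(k)
                keep.append(p)
        return np.array(keep)

fm = FlagMap()
scan = np.random.uniform(0, 5, size=(10000, 3))   # synthetic incoming scan
print(len(fm.register(scan)), "points kept of", len(scan))
```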
Real-Time Terrain Storage Generation from Multiple Sensors towards Mobile Robot Operation Interface
Cho, Seoungjae; Xi, Yulong; Cho, Kyungeun
2014-01-01
A mobile robot mounted with multiple sensors is used to rapidly collect 3D point clouds and video images so as to allow accurate terrain modeling. In this study, we develop a real-time terrain storage generation and representation system including a nonground point database (PDB), ground mesh database (MDB), and texture database (TDB). A voxel-based flag map is proposed for incrementally registering large-scale point clouds in a terrain model in real time. We quantize the 3D point clouds into 3D grids of the flag map as a comparative table in order to remove the redundant points. We integrate the large-scale 3D point clouds into a nonground PDB and a node-based terrain mesh using the CPU. Subsequently, we program a graphics processing unit (GPU) to generate the TDB by mapping the triangles in the terrain mesh onto the captured video images. Finally, we produce a nonground voxel map and a ground textured mesh as a terrain reconstruction result. Our proposed methods were tested in an outdoor environment. Our results show that the proposed system was able to rapidly generate terrain storage and provide high resolution terrain representation for mobile mapping services and a graphical user interface between remote operators and mobile robots. PMID:25101321
Suciu, George; Suciu, Victor; Martian, Alexandru; Craciunescu, Razvan; Vulpe, Alexandru; Marcu, Ioana; Halunga, Simona; Fratu, Octavian
2015-11-01
Big data storage and processing are considered as one of the main applications for cloud computing systems. Furthermore, the development of the Internet of Things (IoT) paradigm has advanced the research on Machine to Machine (M2M) communications and enabled novel tele-monitoring architectures for E-Health applications. However, there is a need for converging current decentralized cloud systems, general software for processing big data and IoT systems. The purpose of this paper is to analyze existing components and methods of securely integrating big data processing with cloud M2M systems based on Remote Telemetry Units (RTUs) and to propose a converged E-Health architecture built on Exalead CloudView, a search based application. Finally, we discuss the main findings of the proposed implementation and future directions.
The Metadata Cloud: The Last Piece of a Distributed Data System Model
NASA Astrophysics Data System (ADS)
King, T. A.; Cecconi, B.; Hughes, J. S.; Walker, R. J.; Roberts, D.; Thieman, J. R.; Joy, S. P.; Mafi, J. N.; Gangloff, M.
2012-12-01
Distributed data systems have existed ever since systems were networked together. Over the years, the model for distributed data systems has evolved from basic file transfer to client-server to multi-tiered to grid and finally to cloud-based systems. Initially, metadata was tightly coupled to the data, either by embedding the metadata in the same file containing the data or by co-locating the metadata in commonly named files. As the sources of data have multiplied, data volumes have increased and services have specialized to improve efficiency, and a cloud system model has emerged. In a cloud system, computing and storage are provided as services, with accessibility emphasized over physical location. Computation and data clouds are common implementations. Effectively using the data and computation capabilities requires metadata. When metadata is stored separately from the data, a metadata cloud is formed. With a metadata cloud, information and knowledge about data resources can migrate efficiently from system to system, enabling services and allowing the data to remain efficiently stored until used. This is especially important with "Big Data", where movement of the data is limited by bandwidth. We examine how the metadata cloud completes a general distributed data system model and how standards play a role, and relate this to the existing types of cloud computing. We also look at the major science data systems in existence and compare each to the generalized cloud system model.
Investigation of storage options for scientific computing on Grid and Cloud facilities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garzoglio, Gabriele
In recent years, several new storage technologies, such as Lustre, Hadoop, OrangeFS, and BlueArc, have emerged. While several groups have run benchmarks to characterize them under a variety of configurations, more work is needed to evaluate these technologies for the use cases of scientific computing on Grid clusters and Cloud facilities. This paper discusses our evaluation of the technologies as deployed on a test bed at FermiCloud, one of the Fermilab infrastructure-as-a-service Cloud facilities. The test bed consists of 4 server-class nodes with 40 TB of disk space and up to 50 virtual machine clients, some running on the storage server nodes themselves. With this configuration, the evaluation compares the performance of some of these technologies when deployed on virtual machines and on bare metal nodes. In addition to running standard benchmarks such as IOZone to check the sanity of our installation, we have run I/O intensive tests using physics-analysis applications. This paper presents how the storage solutions perform in a variety of realistic use cases of scientific computing. One interesting difference among the storage systems tested is found in a decrease in total read throughput with increasing number of client processes, which occurs in some implementations but not others.
Factors Influencing the Adoption of Cloud Storage by Information Technology Decision Makers
ERIC Educational Resources Information Center
Wheelock, Michael D.
2013-01-01
This dissertation uses a survey methodology to determine the factors behind the decision to adopt cloud storage. The dependent variable in the study is the intent to adopt cloud storage. Four independent variables are utilized including need, security, cost-effectiveness and reliability. The survey includes a pilot test, field test and statistical…
Prior-Based Quantization Bin Matching for Cloud Storage of JPEG Images.
Liu, Xianming; Cheung, Gene; Lin, Chia-Wen; Zhao, Debin; Gao, Wen
2018-07-01
Millions of user-generated images are uploaded to social media sites like Facebook daily, which translates to a large storage cost. However, there exists an asymmetry in upload and download data: only a fraction of the uploaded images are subsequently retrieved for viewing. In this paper, we propose a cloud storage system that reduces the storage cost of all uploaded JPEG photos, at the expense of a controlled increase in computation, mainly during download of the requested image subset. Specifically, the system first selectively re-encodes code blocks of uploaded JPEG images using coarser quantization parameters for smaller storage sizes. Then, during download, the system exploits known signal priors (a sparsity prior and a graph-signal smoothness prior) for reverse mapping to recover the original fine quantization bin indices, with either a deterministic guarantee (lossless mode) or a statistical guarantee (near-lossless mode). For fast reverse mapping, we use small dictionaries and sparse graphs that are tailored for specific clusters of similar blocks, which are classified via a tree-structured vector quantizer. During image upload, cluster indices identifying the appropriate dictionaries and graphs for the re-quantized blocks are encoded as side information using a differential distributed source coding scheme to facilitate reverse mapping during image download. Experimental results show that our system can reap significant storage savings (up to 12.05%) at roughly the same image PSNR (within 0.18 dB).
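As a highly simplified illustration of the re-quantization and prior-based reverse mapping idea (not the paper's actual algorithm), the sketch below stores only coarse quantization indices and recovers a fine index by choosing the most probable fine bin inside each coarse bin under an assumed Laplacian prior. The 2:1 bin ratio and the prior are placeholders; the paper uses sparsity and graph-smoothness priors tailored to clusters of similar blocks.

```python
# Simplified sketch of quantization bin matching: store coarse bins, then
# reverse-map each coarse bin to a fine bin by maximizing an assumed prior.
import numpy as np

Q_FINE, Q_COARSE = 4, 8           # coarse step = 2x fine step (assumed)

def requantize(fine_idx):
    """What gets stored: coarse quantization indices of the same values."""
    return np.round(fine_idx * Q_FINE / Q_COARSE).astype(int)

def reverse_map(coarse_idx, prior_scale=10.0):
    """Recover a fine index per coarse bin using a Laplacian prior on the value."""
    recovered = []
    for c in coarse_idx:
        lo = int(np.ceil((c * Q_COARSE - Q_COARSE / 2) / Q_FINE))
        hi = int(np.floor((c * Q_COARSE + Q_COARSE / 2) / Q_FINE))
        cands = np.arange(lo, hi + 1)                       # fine bins inside the coarse bin
        prob = np.exp(-np.abs(cands * Q_FINE) / prior_scale)
        recovered.append(cands[np.argmax(prob)])
    return np.array(recovered)

fine = np.array([-5, -2, 0, 1, 3, 7])
print(reverse_map(requantize(fine)))   # approximate recovery (near-lossless flavor)
```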
Towards Efficient Scientific Data Management Using Cloud Storage
NASA Technical Reports Server (NTRS)
He, Qiming
2013-01-01
A software prototype allows users to back up and restore data to/from both public and private cloud storage such as Amazon's S3 and NASA's Nebula. Unlike other off-the-shelf tools, this software ensures user data security in the cloud (through encryption) and minimizes users' operating costs by using space- and bandwidth-efficient compression and incremental backup. Parallel data processing utilities have also been developed by using massively scalable cloud computing in conjunction with cloud storage. One of the innovations in this software is using modified open-source components to work with a private cloud like NASA Nebula. Another is porting the complex backup-to-cloud software to embedded Linux, running on home networking devices, in order to benefit more users.
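The backup pattern described above (client-side encryption, compression, and incremental backup to cloud storage) can be roughly sketched as below using boto3, zlib, and Fernet symmetric encryption. The bucket name, manifest layout, and hash-based change detection are assumptions for illustration; the prototype's actual components, including its modified open-source pieces and Nebula support, are not shown.

```python
# Rough sketch: compress, encrypt on the client, skip unchanged files
# (incremental backup via content hash), and upload to an S3-compatible store.
# Bucket name and manifest layout are assumptions for illustration.
import hashlib, zlib, json
import boto3
from cryptography.fernet import Fernet

s3 = boto3.client("s3")
BUCKET = "my-backup-bucket"                    # hypothetical bucket
fernet = Fernet(Fernet.generate_key())         # in practice, persist this key securely

def backup(paths, manifest_key="backup/manifest.json"):
    try:
        manifest = json.loads(
            s3.get_object(Bucket=BUCKET, Key=manifest_key)["Body"].read())
    except s3.exceptions.NoSuchKey:
        manifest = {}
    for path in paths:
        data = open(path, "rb").read()
        digest = hashlib.sha256(data).hexdigest()
        if manifest.get(path) == digest:
            continue                           # unchanged since last backup
        blob = fernet.encrypt(zlib.compress(data))
        s3.put_object(Bucket=BUCKET, Key=f"backup/{digest}", Body=blob)
        manifest[path] = digest
    s3.put_object(Bucket=BUCKET, Key=manifest_key,
                  Body=json.dumps(manifest).encode())

# Example usage (requires a real bucket, credentials, and file):
# backup(["/home/user/notes.txt"])
```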
Applications integration in a hybrid cloud computing environment: modelling and platform
NASA Astrophysics Data System (ADS)
Li, Qing; Wang, Ze-yuan; Li, Wei-hua; Li, Jun; Wang, Cheng; Du, Rui-yang
2013-08-01
With the development of application service providers and cloud computing, more and more small- and medium-sized business enterprises use software services and even infrastructure services provided by professional information service companies to replace all or part of their information systems (ISs). These information service companies provide applications, such as data storage, computing processes, document sharing and even management information system services, as public resources to support the business process management of their customers. However, no cloud computing service vendor can satisfy the full functional IS requirements of an enterprise. As a result, enterprises often have to simultaneously use systems distributed in different clouds as well as their intra-enterprise ISs. Thus, this article presents a framework to integrate applications deployed in public clouds and intra-enterprise ISs. A run-time platform is developed, and a cross-computing-environment process modelling technique is also developed to improve the feasibility of ISs under hybrid cloud computing environments.
The Best of Both Worlds: Developing a Hybrid Data System for the ASF DAAC
NASA Astrophysics Data System (ADS)
Arko, S. A.; Buechler, B.; Wolf, V. G.
2017-12-01
The Alaska Satellite Facility (ASF) at the University of Alaska Fairbanks hosts the NASA Distributed Active Archive Center (DAAC) specializing in synthetic aperture radar (SAR). Historically, the ASF DAAC has hosted hardware on-premises and developed DAAC-specific software to operate, manage, and maintain the DAAC data system. In the past year, ASF DAAC has been moving many of the standard DAAC operations into the Amazon Web Services (AWS) cloud. This includes data ingest, basic pre-processing, archiving, and distribution within the AWS environment. While the cloud offers nearly unbounded capacity for expansion and a great host of services, there can also be unexpected and unplanned costs. Additionally, these costs can be difficult to forecast even with historic data usage patterns and models for future usage. In an effort to maximize the effectiveness of the DAAC data system, while still managing and accurately forecasting costs, ASF DAAC has developed a hybrid, cloud and on-premises, data system. The goal of this project is to make extensive use of the AWS cloud and, when appropriate, utilize on-premises resources to help constrain costs. This hybrid system attempts to mimic, on premises, a cloud environment using Kubernetes container orchestration, so that software can be run in either location with little change. Combined with a hybrid data storage architecture, the new data system makes use of the great capacity of the cloud while maintaining an on-premises option. This presentation will describe the development of the hybrid data system, including the micro-services architecture and design, the container orchestration, and hybrid storage. Additionally, we will highlight the lessons learned through the development process and cost forecasting for current and future SAR-mission operations, and provide a discussion of the pros and cons of hybrid architectures versus all-cloud deployments. This development effort has led to a system that is capable and flexible for the future while allowing ASF DAAC to continue supporting the SAR community with the highest level of services.
NASA Astrophysics Data System (ADS)
Wang, Xi Vincent; Wang, Lihui
2017-08-01
Cloud computing is a new enabling technology that offers centralised computing, flexible data storage and scalable services. In the manufacturing context, it is possible to utilise Cloud technology to integrate and provide industrial resources and capabilities in the form of Cloud services. In this paper, a function block-based integration mechanism is developed to connect various types of production resources. A Cloud-based architecture is also deployed to offer a service pool which maintains these resources as production services. The proposed system provides a flexible and integrated information environment for the Cloud-based production system. As a specific type of manufacturing, Waste Electrical and Electronic Equipment (WEEE) remanufacturing experiences difficulties in system integration, information exchange and resource management. In this research, WEEE is selected as an example of the Internet of Things to demonstrate how the obstacles and bottlenecks are overcome with the help of a Cloud-based informatics approach. In the case studies, the WEEE recycle/recovery capabilities are also integrated and deployed as flexible Cloud services. Supporting mechanisms and technologies are presented and evaluated towards the end of the paper.
Stronger Consistency and Semantics for Low-Latency Geo-Replicated Storage
2013-06-01
Efficient Cryptography for the Next Generation Secure Cloud
ERIC Educational Resources Information Center
Kupcu, Alptekin
2010-01-01
Peer-to-peer (P2P) systems, and client-server type storage and computation outsourcing constitute some of the major applications that the next generation cloud schemes will address. Since these applications are just emerging, it is the perfect time to design them with security and privacy in mind. Furthermore, considering the high-churn…
NASA Technical Reports Server (NTRS)
Divinskaya, B. S.; Salman, Y. M.
1975-01-01
Peculiarities of the radar information about clouds are examined in comparison with visual data. An objective radar classification is presented and the relation of it to the meteorological classification is shown. The advisability of storage and summarization of the primary radar data for regime purposes is substantiated.
ERIC Educational Resources Information Center
Waters, John K.
2011-01-01
The vulnerability and inefficiency of backing up data on-site is prompting school districts to switch to more secure, less troublesome cloud-based options. District auditors are pushing for a better way to back up their data than the on-site, tape-based system that had been used for years. About three years ago, Hendrick School District in…
2015-07-01
...cloud covered) periods. The demonstration features a large (relative to the overall system power requirements) photovoltaic solar array, whose inverter ... microgrid with less expensive power storage instead of large-scale energy storage, and that the renewable energy with small-scale power storage can ...
Emerging Security Mechanisms for Medical Cyber Physical Systems.
Kocabas, Ovunc; Soyata, Tolga; Aktas, Mehmet K
2016-01-01
The following decade will witness a surge in remote health-monitoring systems that are based on body-worn monitoring devices. These Medical Cyber Physical Systems (MCPS) will be capable of transmitting the acquired data to a private or public cloud for storage and processing. Machine learning algorithms running in the cloud and processing this data can provide decision support to healthcare professionals. There is no doubt that the security and privacy of the medical data is one of the most important concerns in designing an MCPS. In this paper, we depict the general architecture of an MCPS consisting of four layers: data acquisition, data aggregation, cloud processing, and action. Due to the differences in hardware and communication capabilities of each layer, different encryption schemes must be used to guarantee data privacy within that layer. We survey conventional and emerging encryption schemes based on their ability to provide secure storage, data sharing, and secure computation. Our detailed experimental evaluation of each scheme shows that while the emerging encryption schemes enable exciting new features such as secure sharing and secure computation, they introduce several orders-of-magnitude computational and storage overhead. We conclude our paper by outlining future research directions to improve the usability of the emerging encryption schemes in an MCPS.
NASA Update for Unidata Stratcomm
NASA Technical Reports Server (NTRS)
Lynnes, Chris
2017-01-01
The NASA representative to the Unidata Strategic Committee presented a semiannual update on NASAs work with and use of Unidata technologies. The talk updated Unidata on the program of cloud computing prototypes underway for the Earth Observing System Data and Information System (EOSDIS). Also discussed was a trade study on the use of the Open source Project for a Network Data Access Protocol (OPeNDAP) with Web Object Storage in the cloud.
Duro, Francisco Rodrigo; Blas, Javier Garcia; Isaila, Florin; ...
2016-10-06
The increasing volume of scientific data and the limited scalability and performance of storage systems are currently presenting a significant limitation for the productivity of the scientific workflows running on both high-performance computing (HPC) and cloud platforms. Clearly needed is better integration of storage systems and workflow engines to address this problem. This paper presents and evaluates a novel solution that leverages codesign principles for integrating Hercules—an in-memory data store—with a workflow management system. We consider four main aspects: workflow representation, task scheduling, task placement, and task termination. As a result, the experimental evaluation on both cloud and HPC systems demonstrates significant performance and scalability improvements over existing state-of-the-art approaches.
Efficient Retrieval of Massive Ocean Remote Sensing Images via a Cloud-Based Mean-Shift Algorithm.
Yang, Mengzhao; Song, Wei; Mei, Haibin
2017-07-23
The rapid development of remote sensing (RS) technology has resulted in the proliferation of high-resolution images. There are challenges involved in not only storing large volumes of RS images but also in rapidly retrieving the images for ocean disaster analysis such as for storm surges and typhoon warnings. In this paper, we present an efficient retrieval of massive ocean RS images via a Cloud-based mean-shift algorithm. Distributed construction method via the pyramid model is proposed based on the maximum hierarchical layer algorithm and used to realize efficient storage structure of RS images on the Cloud platform. We achieve high-performance processing of massive RS images in the Hadoop system. Based on the pyramid Hadoop distributed file system (HDFS) storage method, an improved mean-shift algorithm for RS image retrieval is presented by fusion with the canopy algorithm via Hadoop MapReduce programming. The results show that the new method can achieve better performance for data storage than HDFS alone and WebGIS-based HDFS. Speedup and scaleup are very close to linear changes with an increase of RS images, which proves that image retrieval using our method is efficient.
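As a point of reference for the retrieval method described above, here is a minimal single-machine sketch of the plain mean-shift update; it is not the paper's distributed Hadoop MapReduce variant seeded with the canopy algorithm, and the toy feature vectors stand in for image descriptors.

```python
# Minimal single-machine sketch of the basic mean-shift update with a flat kernel.
# This only illustrates the core algorithm; the paper's version is distributed
# over Hadoop MapReduce and seeded with the canopy algorithm.
import numpy as np

def mean_shift(points: np.ndarray, bandwidth: float, iters: int = 50) -> np.ndarray:
    """Shift every point toward the mean of its neighbours within `bandwidth`."""
    modes = points.astype(float)
    for _ in range(iters):
        for i, p in enumerate(modes):
            dists = np.linalg.norm(points - p, axis=1)
            neighbours = points[dists <= bandwidth]
            if len(neighbours):
                modes[i] = neighbours.mean(axis=0)
    return modes  # points sharing (almost) the same mode belong to one cluster

# Toy feature vectors standing in for image descriptors.
features = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9]])
print(mean_shift(features, bandwidth=1.0))
```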
Cryptonite: A Secure and Performant Data Repository on Public Clouds
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kumbhare, Alok; Simmhan, Yogesh; Prasanna, Viktor
2012-06-29
Cloud storage has become immensely popular for maintaining synchronized copies of files and for sharing documents with collaborators. However, there is heightened concern about the security and privacy of Cloud-hosted data due to the shared infrastructure model and an implicit trust in the service providers. Emerging needs of secure data storage and sharing for domains like Smart Power Grids, which deal with sensitive consumer data, require the persistence and availability of Cloud storage but with client-controlled security and encryption, low key management overhead, and minimal performance costs. Cryptonite is a secure Cloud storage repository that addresses these requirements using a StrongBox model for shared key management. We describe the Cryptonite service and desktop client, discuss performance optimizations, and provide an empirical analysis of the improvements. Our experiments show that Cryptonite clients achieve a 40% improvement in file upload bandwidth over plaintext storage using the Azure Storage Client API despite the added security benefits, while our file download performance is 5 times faster than the baseline for files greater than 100MB.
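Cryptonite's StrongBox key management and its Azure-based client are not reproduced here; the sketch below only illustrates the general pattern of client-side encryption before upload, using the Fernet construction from the Python cryptography package and an S3 client. The bucket, object keys, and key handling are assumptions.

```python
# Generic sketch of client-side encryption before cloud upload. Cryptonite's
# actual StrongBox key management and Azure client are not reproduced here;
# bucket/key names and the symmetric-key handling are illustrative only.
import boto3
from cryptography.fernet import Fernet

def encrypt_and_upload(path: str, bucket: str, key: str, secret: bytes) -> None:
    with open(path, "rb") as f:
        ciphertext = Fernet(secret).encrypt(f.read())   # encrypt on the client
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=ciphertext)

def download_and_decrypt(bucket: str, key: str, secret: bytes) -> bytes:
    obj = boto3.client("s3").get_object(Bucket=bucket, Key=key)
    return Fernet(secret).decrypt(obj["Body"].read())   # decrypt on the client

secret = Fernet.generate_key()   # in practice this key must be managed securely
encrypt_and_upload("report.pdf", "my-secure-bucket", "docs/report.pdf", secret)
```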
A new data collaboration service based on cloud computing security
NASA Astrophysics Data System (ADS)
Ying, Ren; Li, Hua-Wei; Wang, Li na
2017-09-01
With the rapid development of cloud computing, the storage and usage of data have undergone revolutionary changes. Data owners can store data in the cloud. While bringing convenience, this also brings many new challenges to cloud data security. A key issue is how to support a secure data collaboration service that supports access to and updates of cloud data. This paper proposes a secure, efficient and extensible data collaboration service, which prevents data leaks in cloud storage, supports one-to-many encryption mechanisms, and also enables cloud data writing and fine-grained access control.
The direction of cloud computing for Malaysian education sector in 21st century
NASA Astrophysics Data System (ADS)
Jaafar, Jazurainifariza; Rahman, M. Nordin A.; Kadir, M. Fadzil A.; Shamsudin, Syadiah Nor; Saany, Syarilla Iryani A.
2017-08-01
In the 21st century, technology has turned the learning environment into a new form of education, making learning systems more effective and systematic. Nowadays, education institutions face many challenges in ensuring that the teaching and learning process runs smoothly and remains manageable. Some of the challenges in current education management are a lack of integrated systems, high maintenance costs, difficulty of configuration and deployment, and the complexity of storage provision. Digital learning is an instructional practice that uses technology to make the learning experience more effective and the education process more systematic and attractive. Digital learning can be considered one of the prominent applications implemented in a cloud computing environment. Cloud computing is a type of network resource that provides on-demand services, where users can access applications from any location at any time. It also promises to minimize maintenance costs and provides flexible data storage capacity. The aim of this article is to review the definition and types of cloud computing for improving digital learning management as required by 21st century education. The analysis of the digital learning context focuses on primary schools in Malaysia. Types of cloud applications and services in the education sector are also discussed in the article. Finally, a gap analysis and a direction for cloud computing in the education sector to face the 21st century challenges are suggested.
Searchable attribute-based encryption scheme with attribute revocation in cloud storage.
Wang, Shangping; Zhao, Duqiao; Zhang, Yaling
2017-01-01
Attribute-based encryption (ABE) is a good way to achieve flexible and secure access control to data; attribute revocation is an extension of attribute-based encryption, and keyword search is an indispensable part of cloud storage. The combination of both has an important application in cloud storage. In this paper, we construct a searchable attribute-based encryption scheme with attribute revocation in cloud storage. The keyword search in our scheme is attribute-based with access control: when the search succeeds, the cloud server returns the corresponding ciphertext to the user, and the user can then decrypt the ciphertext. Besides, our scheme supports multiple-keyword search, which makes the scheme more practical. Under the decisional bilinear Diffie-Hellman exponent (q-BDHE) and decisional Diffie-Hellman (DDH) assumptions in the selective security model, we prove that our scheme is secure.
Ahmed, Abdulghani Ali; Xue Li, Chua
2018-01-01
Cloud storage services allow users to store their data online, so that they can remotely access, maintain, manage, and back up data from anywhere via the Internet. Although helpful, this storage creates a challenge for digital forensic investigators and practitioners in collecting, identifying, acquiring, and preserving evidential data. This study proposes an investigation scheme for analyzing data remnants and determining probative artifacts in a cloud environment. Using pCloud as a case study, this research collected the data remnants available on end-user device storage following the storing, uploading, and accessing of data in the cloud storage. Data remnants are collected from several sources, including client software files, directory listing, prefetch, registry, network PCAP, browser, and memory and link files. Results demonstrate that the collected data remnants are beneficial in determining a sufficient number of artifacts about the investigated cybercrime. © 2017 American Academy of Forensic Sciences.
Archive Management of NASA Earth Observation Data to Support Cloud Analysis
NASA Technical Reports Server (NTRS)
Lynnes, Christopher; Baynes, Kathleen; McInerney, Mark A.
2017-01-01
NASA collects, processes and distributes petabytes of Earth Observation (EO) data from satellites, aircraft, in situ instruments and model output, with an order of magnitude increase expected by 2024. Cloud-based web object storage (WOS) of these data can simplify the execution of such an increase. More importantly, it can also facilitate user analysis of those volumes by making the data available to the massively parallel computing power in the cloud. However, storing EO data in cloud WOS has a ripple effect throughout the NASA archive system with unexpected challenges and opportunities. One challenge is modifying data servicing software (such as Web Coverage Service servers) to access and subset data that are no longer on a directly accessible file system, but rather in cloud WOS. Opportunities include refactoring of the archive software to a cloud-native architecture; virtualizing data products by computing on demand; and reorganizing data to be more analysis-friendly.
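One concrete consequence of moving archives into web object storage, as noted above, is that subsetting services must read byte ranges over HTTP rather than seek within a local file. The sketch below shows that access pattern with boto3; the bucket and granule key are hypothetical.

```python
# Illustrative sketch: reading a byte range of an object held in cloud web object
# storage, the kind of access a subsetting service needs when data are no longer
# on a directly accessible file system. Bucket and key names are hypothetical.
import boto3

s3 = boto3.client("s3")

def read_byte_range(bucket: str, key: str, start: int, length: int) -> bytes:
    resp = s3.get_object(
        Bucket=bucket,
        Key=key,
        Range=f"bytes={start}-{start + length - 1}",  # HTTP Range header semantics
    )
    return resp["Body"].read()

# e.g. pull only the first 64 KiB of a large granule instead of the whole file
header = read_byte_range("nasa-eo-archive", "granules/MOD021KM.A2024001.hdf", 0, 65536)
```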
Yang, Shu; Qiu, Yuyan; Shi, Bo
2016-09-01
This paper explores methods of building the internet of things for regional ECG monitoring, focusing on the implementation of an ECG monitoring center based on a cloud computing platform. It analyzes the implementation principles of automatic identification of arrhythmia types. It also studies the system architecture and key techniques of the cloud computing platform, including server load-balancing technology, reliable storage of massive small files, and the implementation of a quick search function.
NASA Astrophysics Data System (ADS)
Gallagher, J. H. R.; Jelenak, A.; Potter, N.; Fulker, D. W.; Habermann, T.
2017-12-01
Providing data services based on cloud computing technology that are equivalent to those developed for traditional computing and storage systems is critical for successful migration to cloud-based architectures for data production, scientific analysis and storage. OPeNDAP Web-service capabilities (comprising the Data Access Protocol (DAP) specification plus open-source software for realizing DAP in servers and clients) are among the most widely deployed means for achieving data-as-service functionality in the Earth sciences. OPeNDAP services are especially common in traditional data center environments where servers offer access to datasets stored in (very large) file systems, and a preponderance of the source data for these services is stored in the Hierarchical Data Format Version 5 (HDF5). Three candidate architectures for serving NASA satellite Earth Science HDF5 data via Hyrax running on Amazon Web Services (AWS) were developed and their performance examined for a set of representative use cases. The performance assessment was based on both runtime and incurred cost. The three architectures differ in how HDF5 files are stored in the Amazon Simple Storage Service (S3) and how the Hyrax server (as an EC2 instance) retrieves their data. The results for both serial and parallel access to HDF5 data in S3 will be presented. While the study focused on HDF5 data, OPeNDAP and the Hyrax data server, the architectures are generic and the analysis can be extrapolated to many different data formats, web APIs, and data servers.
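The study's three Hyrax/S3 architectures are not reproduced here; as a loosely related client-side illustration, the sketch below opens an HDF5 granule held in S3 through a file-like object so that only the byte ranges actually touched are fetched. The bucket, object key, and dataset path are assumptions.

```python
# One possible client-side pattern (not the Hyrax server internals): open an HDF5
# granule stored in S3 through a file-like object, so only the byte ranges h5py
# actually touches are fetched. Bucket, key and dataset names are assumed.
import h5py
import s3fs

fs = s3fs.S3FileSystem(anon=True)                 # public bucket assumed
with fs.open("example-bucket/granules/sample.h5", "rb") as remote:
    with h5py.File(remote, "r") as f:
        # read a small hyperslab instead of downloading the whole file
        subset = f["/science/temperature"][0:10, 0:10]
        print(subset.shape)
```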
NASA Astrophysics Data System (ADS)
Sangaline, E.; Lauret, J.
2014-06-01
The quantity of information produced in Nuclear and Particle Physics (NPP) experiments necessitates the transmission and storage of data across diverse collections of computing resources. Robust solutions such as XRootD have been used in NPP, but as the usage of cloud resources grows, the difficulties in the dynamic configuration of these systems become a concern. Hadoop File System (HDFS) exists as a possible cloud storage solution with a proven track record in dynamic environments. Though currently not extensively used in NPP, HDFS is an attractive solution offering both elastic storage and rapid deployment. We will present the performance of HDFS in both canonical I/O tests and for a typical data analysis pattern within the RHIC/STAR experimental framework. These tests explore the scaling with different levels of redundancy and numbers of clients. Additionally, the performance of FUSE and NFS interfaces to HDFS were evaluated as a way to allow existing software to function without modification. Unfortunately, the complicated data structures in NPP are non-trivial to integrate with Hadoop and so many of the benefits of the MapReduce paradigm could not be directly realized. Despite this, our results indicate that using HDFS as a distributed filesystem offers reasonable performance and scalability and that it excels in its ease of configuration and deployment in a cloud environment.
DPM — efficient storage in diverse environments
NASA Astrophysics Data System (ADS)
Hellmich, Martin; Furano, Fabrizio; Smith, David; Brito da Rocha, Ricardo; Álvarez Ayllón, Alejandro; Manzi, Andrea; Keeble, Oliver; Calvet, Ivan; Regala, Miguel Antonio
2014-06-01
Recent developments, including low-power devices, cluster file systems and cloud storage, represent an explosion in the possibilities for deploying and managing grid storage. In this paper we present how different technologies can be leveraged to build a storage service with differing cost, power, performance, scalability and reliability profiles, using the popular storage solution Disk Pool Manager (DPM/dmlite) as the enabling technology. The storage manager DPM is designed for these new environments, allowing users to scale up and down as they need it, and optimizing their computing centers' energy efficiency and costs. DPM runs on high-performance machines, profiting from multi-core and multi-CPU setups. It supports separating the database from the metadata server, the head node, largely reducing its hard disk requirements. Since version 1.8.6, DPM is released in EPEL and Fedora, simplifying distribution and maintenance, but also supporting the ARM architecture besides i386 and x86_64, allowing it to run on the smallest low-power machines such as the Raspberry Pi or the CuBox. This usage is facilitated by the possibility to scale horizontally using a main database and a distributed memcached-powered namespace cache. Additionally, DPM supports a variety of storage pools in the backend, most importantly HDFS, S3-enabled storage, and cluster file systems, allowing users to fit their DPM installation exactly to their needs. In this paper, we investigate the power efficiency and total cost of ownership of various DPM configurations. We develop metrics to evaluate the expected performance of a setup both in terms of namespace and disk access, considering the overall cost including equipment, power consumption, and data/storage fees. The setups tested range from the lowest scale using Raspberry Pis with only 700 MHz single cores and 100 Mbps network connections, over conventional multi-core servers, to typical virtual machine instances in cloud settings. We evaluate the combinations of different name server setups, for example load-balanced clusters, with different storage setups, from a classic local configuration to private and public clouds.
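The paper's actual cost and performance metrics are not given in the abstract; the sketch below only illustrates the kind of simple total-cost-of-ownership comparison such an evaluation might use, with invented hardware prices, power draws, and fees.

```python
# Hypothetical total-cost-of-ownership comparison of DPM-style storage setups.
# The formula and all prices are illustrative assumptions, not the paper's metrics.
def tco_per_year(hardware_cost: float, lifetime_years: float,
                 power_watts: float, kwh_price: float,
                 storage_fees_per_year: float = 0.0) -> float:
    hardware_per_year = hardware_cost / lifetime_years
    power_per_year = power_watts / 1000.0 * 24 * 365 * kwh_price
    return hardware_per_year + power_per_year + storage_fees_per_year

setups = {
    "raspberry-pi node": tco_per_year(50, 3, power_watts=5, kwh_price=0.20),
    "multi-core server": tco_per_year(4000, 5, power_watts=350, kwh_price=0.20),
    "cloud VM + object store": tco_per_year(0, 1, power_watts=0, kwh_price=0.20,
                                            storage_fees_per_year=1200),
}
for name, cost in setups.items():
    print(f"{name}: ~{cost:.0f} per year")
```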
Could Blobs Fuel Storage-Based Convergence between HPC and Big Data?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matri, Pierre; Alforov, Yevhen; Brandon, Alvaro
The increasingly growing data sets processed on HPC platforms raise major challenges for the underlying storage layer. A promising alternative to POSIX-IO-compliant file systems are simpler blobs (binary large objects), or object storage systems. Such systems offer lower overhead and better performance at the cost of largely unused features such as file hierarchies or permissions. Similarly, blobs are increasingly considered for replacing distributed file systems for big data analytics or as a base for storage abstractions such as key-value stores or time-series databases. This growing interest in such object storage on HPC and big data platforms raises the question: Are blobs the right level of abstraction to enable storage-based convergence between HPC and Big Data? In this paper we study the impact of blob-based storage for real-world applications on HPC and cloud environments. The results show that blob-based storage convergence is possible, leading to a significant performance improvement on both platforms.
Application research of Ganglia in Hadoop monitoring and management
NASA Astrophysics Data System (ADS)
Li, Gang; Ding, Jing; Zhou, Lixia; Yang, Yi; Liu, Lei; Wang, Xiaolei
2017-03-01
Hadoop has many applications in the fields of big data and cloud computing. The storage and application test bench for the seismic network at the Earthquake Administration of Tianjin is built on a Hadoop system, which uses the open-source software Ganglia for operation and monitoring. This paper reviews the function, the installation and configuration process, and the operating and monitoring effect of the Ganglia system in the Hadoop system. It also briefly introduces the idea and effect of using Nagios software to monitor the Hadoop system. This is valuable for the industry when building monitoring systems for cloud computing platforms.
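As a small illustration of the monitoring setup described above, the sketch below polls a Ganglia gmond daemon, which by default dumps its XML state to any client connecting on TCP port 8649, and prints a couple of standard metrics. The host name and the choice of metrics are assumptions, not the Tianjin test bench configuration.

```python
# Rough sketch of pulling cluster metrics from a Ganglia gmond daemon, which by
# default serves its XML state to any client connecting on TCP port 8649.
# Host name and metric names are assumptions for illustration.
import socket
import xml.etree.ElementTree as ET

def fetch_gmond_xml(host: str, port: int = 8649) -> ET.Element:
    chunks = []
    with socket.create_connection((host, port), timeout=10) as sock:
        while True:
            data = sock.recv(65536)
            if not data:          # gmond closes the connection after the dump
                break
            chunks.append(data)
    return ET.fromstring(b"".join(chunks))

root = fetch_gmond_xml("hadoop-master.example.org")
for host in root.iter("HOST"):
    for metric in host.iter("METRIC"):
        if metric.get("NAME") in ("load_one", "mem_free"):
            print(host.get("NAME"), metric.get("NAME"), metric.get("VAL"))
```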
Pinheiro, Alexandre; Dias Canedo, Edna; de Sousa Junior, Rafael Timoteo; de Oliveira Albuquerque, Robson; García Villalba, Luis Javier; Kim, Tai-Hoon
2018-03-02
Cloud computing is considered an interesting paradigm due to its scalability, availability and virtually unlimited storage capacity. However, it is challenging to organize a cloud storage service (CSS) that is safe from the client point-of-view and to implement this CSS in public clouds since it is not advisable to blindly consider this configuration as fully trustworthy. Ideally, owners of large amounts of data should trust their data to be in the cloud for a long period of time, without the burden of keeping copies of the original data, nor of accessing the whole content for verifications regarding data preservation. Due to these requirements, integrity, availability, privacy and trust are still challenging issues for the adoption of cloud storage services, especially when losing or leaking information can bring significant damage, be it legal or business-related. With such concerns in mind, this paper proposes an architecture for periodically monitoring both the information stored in the cloud and the service provider behavior. The architecture operates with a proposed protocol based on trust and encryption concepts to ensure cloud data integrity without compromising confidentiality and without overloading storage services. Extensive tests and simulations of the proposed architecture and protocol validate their functional behavior and performance.
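The proposed architecture and protocol are not reproduced here; the sketch below is only a minimal stand-in for the general idea of verifying cloud data integrity without retrieving the whole object, by keeping HMAC tags of blocks and spot-checking randomly chosen ones. The block size, key handling, and the fetch_block callback are illustrative assumptions.

```python
# Simplified illustration of spot-checking cloud data integrity without
# retrieving the whole object: before upload, the owner keeps HMAC tags of
# individual blocks; later it challenges random blocks and compares.
# This is NOT the protocol proposed in the paper, only a minimal stand-in.
import hashlib
import hmac
import os
import random

BLOCK = 1 << 20  # 1 MiB blocks (assumed)

def precompute_tags(path: str, key: bytes) -> list:
    tags = []
    with open(path, "rb") as f:
        while chunk := f.read(BLOCK):
            tags.append(hmac.new(key, chunk, hashlib.sha256).digest())
    return tags  # kept by the data owner; small compared with the data itself

def verify_random_block(fetch_block, tags: list, key: bytes) -> bool:
    """fetch_block(index) is assumed to return block `index` from the cloud copy."""
    i = random.randrange(len(tags))
    observed = hmac.new(key, fetch_block(i), hashlib.sha256).digest()
    return hmac.compare_digest(observed, tags[i])

key = os.urandom(32)
tags = precompute_tags("dataset.bin", key)
```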
Extended outlook: description, utilization, and daily applications of cloud technology in radiology.
Gerard, Perry; Kapadia, Neil; Chang, Patricia T; Acharya, Jay; Seiler, Michael; Lefkovitz, Zvi
2013-12-01
The purpose of this article is to discuss the concept of cloud technology, its role in medical applications and radiology, the role of the radiologist in using and accessing these vast resources of information, and privacy concerns and HIPAA compliance strategies. Cloud computing is the delivery of shared resources, software, and information to computers and other devices as a metered service. This technology has a promising role in the sharing of patient medical information and appears to be particularly suited for application in radiology, given the field's inherent need for storage and access to large amounts of data. The radiology cloud has significant strengths, such as providing centralized storage and access, reducing unnecessary repeat radiologic studies, and potentially allowing radiologic second opinions more easily. There are significant cost advantages to cloud computing because of a decreased need for infrastructure and equipment by the institution. Private clouds may be used to ensure secure storage of data and compliance with HIPAA. In choosing a cloud service, there are important aspects, such as disaster recovery plans, uptime, and security audits, that must be considered. Given that the field of radiology has become almost exclusively digital in recent years, the future of secure storage and easy access to imaging studies lies within cloud computing technology.
Cloud based intelligent system for delivering health care as a service.
Kaur, Pankaj Deep; Chana, Inderveer
2014-01-01
The promising potential of cloud computing and its convergence with technologies such as mobile computing, wireless networks, and sensor technologies allows for the creation and delivery of newer types of cloud services. In this paper, we advocate the use of cloud computing for the creation and management of cloud-based health care services. As a representative case study, we design a Cloud Based Intelligent Health Care Service (CBIHCS) that performs real-time monitoring of user health data for diagnosis of chronic illness such as diabetes. Advanced body sensor components are utilized to gather user-specific health data and store it in cloud-based storage repositories for subsequent analysis and classification. In addition, infrastructure-level mechanisms are proposed to provide dynamic resource elasticity for CBIHCS. Experimental results demonstrate that classification accuracy of 92.59% is achieved with our prototype system and that the predicted patterns of CPU usage offer better opportunities for adaptive resource elasticity. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Cardiovascular imaging environment: will the future be cloud-based?
Kawel-Boehm, Nadine; Bluemke, David A
2017-07-01
In cardiovascular CT and MR imaging, large datasets have to be stored, post-processed, analyzed and distributed. Beyond basic assessment of volume and function in cardiac magnetic resonance imaging, for example, more sophisticated quantitative analysis is requested, requiring specific software. Several institutions cannot afford various types of software or provide the expertise to perform sophisticated analysis. Areas covered: Various cloud services exist related to data storage and analysis specifically for cardiovascular CT and MR imaging. Instead of on-site data storage, cloud providers offer flexible storage services on a pay-per-use basis. To avoid the purchase and maintenance of specialized software for cardiovascular image analysis, e.g. to assess myocardial iron overload, MR 4D flow and fractional flow reserve, evaluation can be performed with cloud-based software by the consumer, or the complete analysis is performed by the cloud provider. However, challenges to widespread implementation of cloud services include regulatory issues regarding patient privacy and data security. Expert commentary: If patient privacy and data security are guaranteed, cloud imaging is a valuable option to cope with storage of large image datasets and to offer sophisticated cardiovascular image analysis for institutions of all sizes.
NASA Astrophysics Data System (ADS)
Watari, S.; Morikawa, Y.; Yamamoto, K.; Inoue, S.; Tsubouchi, K.; Fukazawa, K.; Kimura, E.; Tatebe, O.; Kato, H.; Shimojo, S.; Murata, K. T.
2010-12-01
In the Solar-Terrestrial Physics (STP) field, the spatio-temporal resolution of computer simulations is getting higher and higher because of the tremendous advancement of supercomputers. A more advanced technology is Grid Computing, which integrates distributed computational resources to provide scalable computing resources. In simulation research, it is effective for researchers to design their own physical models, perform calculations with a supercomputer, and analyze and visualize the results with familiar methods. A supercomputer, however, is far from an analysis and visualization environment. In general, a researcher analyzes and visualizes on a workstation (WS) managed at hand, because the installation and operation of software on a WS are easy. Therefore, it is necessary to copy the data from the supercomputer to the WS manually. The time needed for data transfer through a long-delay network actually disturbs high-accuracy simulations. In terms of usefulness, it is important to seamlessly integrate a supercomputer and an analysis and visualization environment with a researcher's familiar methods. NICT has been developing a cloud computing environment (NICT Space Weather Cloud). In the NICT Space Weather Cloud, disk servers are located near its supercomputer and the WSs used for data analysis and visualization. They are connected to JGN2plus, a high-speed network for research and development. Distributed virtual high-capacity storage is also constructed with Grid Datafarm (Gfarm v2). Huge data output from the supercomputer is transferred to the virtual storage through JGN2plus. A researcher can concentrate on research using familiar methods without regard to the distance between the supercomputer and the analysis and visualization environment. Currently, a total of 16 disk servers are set up at NICT headquarters (Koganei, Tokyo), the JGN2plus NOC (Otemachi, Tokyo), the Okinawa Subtropical Environment Remote-Sensing Center, and the Cybermedia Center, Osaka University. They are connected over JGN2plus and constitute 1 PB (physical size) of virtual storage with Gfarm v2. These disk servers are connected with the supercomputers of NICT and Osaka University. A system in which data output from the supercomputers is automatically transferred to the virtual storage has been built. The transfer rate is about 50 GB/h by actual measurement. It is estimated that this performance is reasonable for a certain simulation and analysis for reconstruction of the coronal magnetic field. This research also serves as an experiment of the system, and verification of its practicality is advanced at the same time. Herein we introduce an overview of the space weather cloud system we have developed so far. We also demonstrate several scientific results using the space weather cloud system, and introduce several web applications of the cloud as a service of the space weather cloud, named "e-SpaceWeather" (e-SW). The e-SW provides a variety of space weather online services from many aspects.
Federated data storage system prototype for LHC experiments and data intensive science
NASA Astrophysics Data System (ADS)
Kiryanov, A.; Klimentov, A.; Krasnopevtsev, D.; Ryabinkin, E.; Zarochentsev, A.
2017-10-01
The rapid increase of data volume from the experiments running at the Large Hadron Collider (LHC) has prompted the physics computing community to evaluate new data handling and processing solutions. Russian grid sites and universities' clusters scattered over a large area aim at the task of uniting their resources for future productive work, at the same time giving an opportunity to support large physics collaborations. In our project we address the fundamental problem of designing a computing architecture to integrate distributed storage resources for LHC experiments and other data-intensive science applications and to provide access to data from heterogeneous computing facilities. Studies include the development and implementation of a federated data storage prototype for Worldwide LHC Computing Grid (WLCG) centres of different levels and university clusters within one National Cloud. The prototype is based on computing resources located in Moscow, Dubna, Saint Petersburg, Gatchina and Geneva. This project intends to implement a federated distributed storage for all kinds of operations such as read/write/transfer and access via WAN from Grid centres, university clusters, supercomputers, academic and commercial clouds. The efficiency and performance of the system are demonstrated using synthetic and experiment-specific tests including real data processing and analysis workflows from the ATLAS and ALICE experiments, as well as compute-intensive bioinformatics applications (PALEOMIX) running on supercomputers. We present the topology and architecture of the designed system, report performance and statistics for different access patterns and show how federated data storage can be used efficiently by physicists and biologists. We also describe how sharing data on a widely distributed storage system can lead to a new computing model and reformations of computing style, for instance how a bioinformatics program running on supercomputers can read/write data from the federated storage.
Digital Photograph Security: What Plastic Surgeons Need to Know.
Thomas, Virginia A; Rugeley, Patricia B; Lau, Frank H
2015-11-01
Sharing and storing digital patient photographs occur daily in plastic surgery. Two major risks associated with the practice, data theft and Health Insurance Portability and Accountability Act (HIPAA) violations, have been dramatically amplified by high-speed data connections and digital camera ubiquity. The authors review what plastic surgeons need to know to mitigate those risks and provide recommendations for implementing an ideal, HIPAA-compliant solution for plastic surgeons' digital photography needs: smartphones and cloud storage. Through informal discussions with plastic surgeons, the authors identified the most common photograph sharing and storage methods. For each method, a literature search was performed to identify the risks of data theft and HIPAA violations. HIPAA violation risks were confirmed by the second author (P.B.R.), a compliance liaison and privacy officer. A comprehensive review of HIPAA-compliant cloud storage services was performed. When possible, informal interviews with cloud storage services representatives were conducted. The most common sharing and storage methods are not HIPAA compliant, and several are prone to data theft. The authors' review of cloud storage services identified six HIPAA-compliant vendors that have strong to excellent security protocols and policies. These options are reasonably priced. Digital photography and technological advances offer major benefits to plastic surgeons but are not without risks. A proper understanding of data security and HIPAA regulations needs to be applied to these technologies to safely capture their benefits. Cloud storage services offer efficient photograph sharing and storage with layers of security to ensure HIPAA compliance and mitigate data theft risk.
Cloud Based Metalearning System for Predictive Modeling of Biomedical Data
Vukićević, Milan
2014-01-01
Rapid growth and storage of biomedical data enabled many opportunities for predictive modeling and improvement of healthcare processes. On the other hand, analysis of such large amounts of data is a difficult and computationally intensive task for most existing data mining algorithms. This problem is addressed by proposing a cloud-based system that integrates a metalearning framework for ranking and selection of the best predictive algorithms for the data at hand and open-source big data technologies for analysis of biomedical data. PMID:24892101
Dynamic federation of grid and cloud storage
NASA Astrophysics Data System (ADS)
Furano, Fabrizio; Keeble, Oliver; Field, Laurence
2016-09-01
The Dynamic Federations project ("Dynafed") enables the deployment of scalable, distributed storage systems composed of independent storage endpoints. While the Uniform Generic Redirector at the heart of the project is protocol-agnostic, we have focused our effort on HTTP-based protocols, including S3 and WebDAV. The system has been deployed on testbeds covering the majority of the ATLAS and LHCb data, and supports geography-aware replica selection. The work done exploits the federation potential of HTTP to build systems that offer uniform, scalable, catalogue-less access to the storage and metadata ensemble and the possibility of seamless integration of other compatible resources such as those from cloud providers. Dynafed can exploit the potential of the S3 delegation scheme, effectively federating on the fly any number of S3 buckets from different providers and applying a uniform authorization to them. This feature has been used to deploy in production the BOINC Data Bridge, which uses the Uniform Generic Redirector with S3 buckets to harmonize the BOINC authorization scheme with the Grid/X509. The Data Bridge has been deployed in production with good results. We believe that the features of a loosely coupled federation of open-protocol-based storage elements open many possibilities of smoothly evolving the current computing models and of supporting new scientific computing projects that rely on massive distribution of data and that would appreciate systems that can more easily be interfaced with commercial providers and can work natively with Web browsers and clients.
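Dynafed's redirector itself is not shown here; the sketch below only illustrates the S3 delegation primitive it builds on, a time-limited pre-signed URL generated with boto3 against an arbitrary S3-compatible endpoint. The endpoint, credentials, bucket, and key are placeholders.

```python
# Sketch of the S3 delegation primitive a federator can hand out instead of
# exposing credentials: a time-limited pre-signed URL for a specific object.
# Endpoint, bucket and key are placeholders; this is not Dynafed's own code.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.provider.example",   # any S3-compatible endpoint
    aws_access_key_id="FEDERATION_KEY_ID",
    aws_secret_access_key="FEDERATION_SECRET",
)

url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "experiment-data", "Key": "run2024/file.root"},
    ExpiresIn=3600,  # the redirector decides how long the delegation lives
)
print(url)  # a plain HTTPS URL any HTTP/WebDAV client or browser can use
```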
Chelonia: A self-healing, replicated storage system
NASA Astrophysics Data System (ADS)
Kerr Nilsen, Jon; Toor, Salman; Nagy, Zsombor; Read, Alex
2011-12-01
Chelonia is a novel grid storage system designed to fill the requirements gap between those of large, sophisticated scientific collaborations which have adopted the grid paradigm for their distributed storage needs, and those of corporate business communities gravitating towards the cloud paradigm. Chelonia is an integrated system of heterogeneous, geographically dispersed storage sites which is easily and dynamically expandable and optimized for high availability and scalability. The architecture and implementation in terms of web services running inside the Advanced Resource Connector Hosting Environment Daemon (ARC HED) are described, and results of tests in both local-area and wide-area networks that demonstrate the fault tolerance, stability and scalability of Chelonia will be presented. In addition, example setups for production deployments for small and medium-sized VOs are described.
Research on phone contacts online status based on mobile cloud computing
NASA Astrophysics Data System (ADS)
Wang, Wen-jinga; Ge, Weib
2013-03-01
Because of the limited storage space and CPU processing power of mobile phones, it is difficult to realize complex applications on them. However, along with the development of cloud computing, we can place the computing and storage in the cloud and provide users with rich cloud services; helping users complete various functions through the browser has become the trend for future mobile communication. This article takes the online status of mobile phone contacts as an example to analyze the development and application of mobile cloud computing.
NASA Astrophysics Data System (ADS)
Nguyen, L.; Chee, T.; Palikonda, R.; Smith, W. L., Jr.; Bedka, K. M.; Spangenberg, D.; Vakhnin, A.; Lutz, N. E.; Walter, J.; Kusterer, J.
2017-12-01
Cloud Computing offers new opportunities for large-scale scientific data producers to utilize Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) IT resources to process and deliver data products in an operational environment where timely delivery, reliability, and availability are critical. The NASA Langley Research Center Atmospheric Science Data Center (ASDC) is building and testing a private and public facing cloud for users in the Science Directorate to utilize as an everyday production environment. The NASA SatCORPS (Satellite ClOud and Radiation Property Retrieval System) team processes and derives near real-time (NRT) global cloud products from operational geostationary (GEO) satellite imager datasets. To deliver these products, we will utilize the public facing cloud and OpenShift to deploy a load-balanced webserver for data storage, access, and dissemination. The OpenStack private cloud will host data ingest and computational capabilities for SatCORPS processing. This paper will discuss the SatCORPS migration towards, and usage of, the ASDC Cloud Services in an operational environment. Detailed lessons learned from use of prior cloud providers, specifically the Amazon Web Services (AWS) GovCloud and the Government Cloud administered by the Langley Managed Cloud Environment (LMCE) will also be discussed.
CSNS computing environment Based on OpenStack
NASA Astrophysics Data System (ADS)
Li, Yakang; Qi, Fazhi; Chen, Gang; Wang, Yanming; Hong, Jianshu
2017-10-01
Cloud computing allows for more flexible configuration of IT resources and optimized hardware utilization; it can also provide computing services according to real needs. We are applying this computing mode to the China Spallation Neutron Source (CSNS) computing environment. Firstly, the CSNS experiment and its computing scenarios and requirements are introduced in this paper. Secondly, the design and practice of a cloud computing platform based on OpenStack are demonstrated from the aspects of the cloud computing system framework, network, storage and so on. Thirdly, some improvements we made to OpenStack are discussed further. Finally, the current status of the CSNS cloud computing environment is summarized at the end of this paper.
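The CSNS platform internals are not described in the abstract; as a generic illustration of the OpenStack usage it refers to, the sketch below boots a compute instance with the openstacksdk. The cloud name, image, flavor, and network are assumed placeholders defined in a local clouds.yaml.

```python
# Illustrative sketch (not the CSNS production code): booting a compute instance
# on an OpenStack cloud with the openstacksdk. The cloud name, image, flavor and
# network below are assumed placeholders.
import openstack

conn = openstack.connect(cloud="csns")            # credentials come from clouds.yaml

image = conn.compute.find_image("CentOS-7-x86_64")
flavor = conn.compute.find_flavor("m1.large")
network = conn.network.find_network("physics-net")

server = conn.compute.create_server(
    name="csns-worker-01",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)     # block until the server is ACTIVE
print(server.name, server.status)
```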
NASA Astrophysics Data System (ADS)
Besnard, Laurent; Blain, Peter; Mancini, Sebastien; Proctor, Roger
2017-04-01
The Integrated Marine Observing System (IMOS) is a national project funded by the Australian government, established to deliver ocean observations to the marine and climate science community. Now in its 10th year, its mission is to undertake systematic and sustained observations and to turn them into data, products and analyses that can be freely used and reused for broad societal benefits. As IMOS has matured as an observing system, expectations of the system's availability and reliability have also increased, and IMOS is now seen as delivering 'operational' information. In responding to this expectation, IMOS has relocated its services to the commercial cloud service Amazon Web Services. This has enabled IMOS to improve the system architecture, utilizing more advanced features such as object storage (S3 - Simple Storage Service) and autoscaling, and introducing new checking procedures in a pipeline approach. This has improved data availability and resilience while protecting against human errors in data handling and providing a more efficient ingestion process.
Cloud-assisted mobile-access of health data with privacy and auditability.
Tong, Yue; Sun, Jinyuan; Chow, Sherman S M; Li, Pan
2014-03-01
Motivated by the privacy issues curbing the adoption of electronic healthcare systems and the wild success of cloud service models, we propose to build privacy into mobile healthcare systems with the help of the private cloud. Our system offers salient features including efficient key management, privacy-preserving data storage and retrieval, especially retrieval at emergencies, and auditability for misuse of health data. Specifically, we propose to integrate key management from a pseudorandom number generator for unlinkability, a secure indexing method for privacy-preserving keyword search which hides both search and access patterns based on redundancy, and the concept of attribute-based encryption with threshold signing for providing role-based access control with auditability to prevent potential misbehavior, in both normal and emergency cases.
A support architecture for reliable distributed computing systems
NASA Technical Reports Server (NTRS)
Mckendry, Martin S.
1986-01-01
The Clouds kernel design went through several design phases and is nearly complete. The object manager, the process manager, the storage manager, the communications manager, and the actions manager are examined.
Leveraging Cloud Computing to Improve Storage Durability, Availability, and Cost for MER Maestro
NASA Technical Reports Server (NTRS)
Chang, George W.; Powell, Mark W.; Callas, John L.; Torres, Recaredo J.; Shams, Khawaja S.
2012-01-01
The Maestro for MER (Mars Exploration Rover) software is the premiere operation and activity planning software for the Mars rovers, and it is required to deliver all of the processed image products to scientists on demand. These data span multiple storage arrays sized at 2 TB, and a backup scheme ensures data is not lost. In a catastrophe, these data would currently recover at 20 GB/hour, taking several days for a restoration. A seamless solution provides access to highly durable, highly available, scalable, and cost-effective storage capabilities. This approach also employs a novel technique that enables storage of the majority of data on the cloud and some data locally. This feature is used to store the most recent data locally in order to guarantee utmost reliability in case of an outage or disconnect from the Internet. This also obviates any changes to the software that generates the most recent data set, as it still has the same interface to the file system as it did before the updates.
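The Maestro storage code is not reproduced here; the sketch below only illustrates the read path implied by the abstract, trying a local copy of recent data first and falling back to durable cloud storage otherwise. The cache directory, bucket, and key layout are hypothetical.

```python
# Hypothetical sketch of the "recent data local, everything in the cloud" read
# path described above: try the on-site copy first, otherwise fall back to S3.
# Paths, bucket and key layout are illustrative, not the Maestro implementation.
import os
import boto3

LOCAL_CACHE = "/data/maestro/recent"
s3 = boto3.client("s3")

def read_product(relative_path: str, bucket: str = "mer-image-products") -> bytes:
    local_path = os.path.join(LOCAL_CACHE, relative_path)
    if os.path.exists(local_path):            # most recent products are kept on-site
        with open(local_path, "rb") as f:
            return f.read()
    # older products live only in highly durable cloud storage
    obj = s3.get_object(Bucket=bucket, Key=relative_path)
    return obj["Body"].read()

data = read_product("sol2500/pancam/left/thumb_001.png")
```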
Using S3 cloud storage with ROOT and CvmFS
NASA Astrophysics Data System (ADS)
Arsuaga-Ríos, María; Heikkilä, Seppo S.; Duellmann, Dirk; Meusel, René; Blomer, Jakob; Couturier, Ben
2015-12-01
Amazon S3 is a widely adopted web API for scalable cloud storage that could also fulfill storage requirements of the high-energy physics community. CERN has been evaluating this option using some key HEP applications such as ROOT and the CernVM filesystem (CvmFS) with S3 back-ends. In this contribution we present an evaluation of two versions of the Huawei UDS storage system stressed with a large number of clients executing HEP software applications. The performance of concurrently storing individual objects is presented alongside with more complex data access patterns as produced by the ROOT data analysis framework. Both Huawei UDS generations show a successful scalability by supporting multiple byte-range requests in contrast with Amazon S3 or Ceph which do not support these commonly used HEP operations. We further report the S3 integration with recent CvmFS versions and summarize the experience with CvmFS/S3 for publishing daily releases of the full LHCb experiment software stack.
Edge-Based Efficient Search over Encrypted Data Mobile Cloud Storage.
Guo, Yeting; Liu, Fang; Cai, Zhiping; Xiao, Nong; Zhao, Ziming
2018-04-13
Smart sensor-equipped mobile devices sense, collect, and process data generated by the edge network to achieve intelligent control, but such mobile devices usually have limited storage and computing resources. Mobile cloud storage provides a promising solution owing to its rich storage resources, great accessibility, and low cost. But it also brings a risk of information leakage. The encryption of sensitive data is the basic step to resist the risk. However, deploying a high complexity encryption and decryption algorithm on mobile devices will greatly increase the burden of terminal operation and the difficulty to implement the necessary privacy protection algorithm. In this paper, we propose ENSURE (EfficieNt and SecURE), an efficient and secure encrypted search architecture over mobile cloud storage. ENSURE is inspired by edge computing. It allows mobile devices to offload the computation intensive task onto the edge server to achieve a high efficiency. Besides, to protect data security, it reduces the information acquisition of untrusted cloud by hiding the relevance between query keyword and search results from the cloud. Experiments on a real data set show that ENSURE reduces the computation time by 15% to 49% and saves the energy consumption by 38% to 69% per query.
Latif, Rabia; Abbas, Haider; Assar, Saïd
2014-11-01
Wireless Body Area Networks (WBANs) have emerged as a promising technology that has shown enormous potential in improving the quality of healthcare, and has thus found a broad range of medical applications from ubiquitous health monitoring to emergency medical response systems. The huge amount of highly sensitive data collected and generated by WBAN nodes requires a scalable and secure storage and processing infrastructure. Given the limited resources of WBAN nodes for storage and processing, the integration of WBANs and cloud computing may provide a powerful solution. However, despite the benefits of cloud-assisted WBAN, several security issues and challenges remain. Among these, data availability is the most nagging security issue. The most serious threat to data availability is a distributed denial of service (DDoS) attack that directly affects the all-time availability of a patient's data. The existing solutions for standalone WBANs and sensor networks are not applicable in the cloud. The purpose of this review paper is to identify the most threatening types of DDoS attacks affecting the availability of a cloud-assisted WBAN and review the state-of-the-art detection mechanisms for the identified DDoS attacks.
Properties of the electron cloud in a high-energy positron and electron storage ring
Harkay, K. C.; Rosenberg, R. A.
2003-03-20
Low-energy, background electrons are ubiquitous in high-energy particle accelerators. Under certain conditions, interactions between this electron cloud and the high-energy beam can give rise to numerous effects that can seriously degrade the accelerator performance. These effects range from vacuum degradation to collective beam instabilities and emittance blowup. Although electron-cloud effects were first observed two decades ago in a few proton storage rings, they have in recent years been widely observed and intensely studied in positron and proton rings. Electron-cloud diagnostics developed at the Advanced Photon Source enabled for the first time detailed, direct characterization of the electron-cloud properties in a positron and electron storage ring. From in situ measurements of the electron flux and energy distribution at the vacuum chamber wall, electron-cloud production mechanisms and details of the beam-cloud interaction can be inferred. A significant longitudinal variation of the electron cloud is also observed, due primarily to geometrical details of the vacuum chamber. Furthermore, such experimental data can be used to provide realistic limits on key input parameters in modeling efforts, leading ultimately to greater confidence in predicting electron-cloud effects in future accelerators.
Optical fibre multi-parameter sensing with secure cloud based signal capture and processing
NASA Astrophysics Data System (ADS)
Newe, Thomas; O'Connell, Eoin; Meere, Damien; Yuan, Hongwei; Leen, Gabriel; O'Keeffe, Sinead; Lewis, Elfed
2016-05-01
Recent advancements in cloud computing technologies in the context of optical and optical fibre based systems are reported. The proliferation of real time and multi-channel based sensor systems represents significant growth in data volume. This coupled with a growing need for security presents many challenges and presents a huge opportunity for an evolutionary step in the widespread application of these sensing technologies. A tiered infrastructural system approach is adopted that is designed to facilitate the delivery of Optical Fibre-based "SENsing as a Service- SENaaS". Within this infrastructure, novel optical sensing platforms, deployed within different environments, are interfaced with a Cloud-based backbone infrastructure which facilitates the secure collection, storage and analysis of real-time data. Feedback systems, which harness this data to affect a change within the monitored location/environment/condition, are also discussed. The cloud based system presented here can also be used with chemical and physical sensors that require real-time data analysis, processing and feedback.
ERIC Educational Resources Information Center
Karabayeva, Kamilya Zhumartovna
2016-01-01
In the present article the author gives evidence of effective application of cloud storage and on-line applications in the educational process of the higher education institution, as well as considers the problems and prospects of using cloud technologies in the educational process, when creating a unified educational space in the foreign language…
Army Science Planning and Strategy Meeting: The Fog of Cyber War
2016-12-01
Only fragments of this record survive. Abstract fragment: "… computing, which, depending upon the situation, some refer to as a fog rather than a cloud. These seemingly disparate notions of fog merge when one …" The record cites: Chiang M., CYRUS: towards client-defined cloud storage, Proceedings of the Tenth European Conference on Computer Systems; 2015 Apr 21; Bordeaux. Report by Alexander Kott and Ananthram Swami, Computational and Information Sciences.
NASA Astrophysics Data System (ADS)
Hogenson, K.; Arko, S. A.; Buechler, B.; Hogenson, R.; Herrmann, J.; Geiger, A.
2016-12-01
A problem often faced by Earth science researchers is how to scale algorithms that were developed against a few datasets up to regional or global scales. One significant hurdle can be the processing and storage resources available for such a task, not to mention the administration of those resources. As a processing environment, the cloud offers nearly unlimited potential for compute and storage, with limited administration required. The goal of the Hybrid Pluggable Processing Pipeline (HyP3) project was to demonstrate the utility of the Amazon cloud to process large amounts of data quickly and cost-effectively, while remaining generic enough to incorporate new algorithms with limited administration time or expense. Principally built by three undergraduate students at the ASF DAAC, the HyP3 system relies on core Amazon services such as Lambda, the Simple Notification Service (SNS), Relational Database Service (RDS), Elastic Compute Cloud (EC2), Simple Storage Service (S3), and Elastic Beanstalk. The HyP3 user interface was written using Elastic Beanstalk, and the system uses SNS and Lambda to handle creating, instantiating, executing, and terminating EC2 instances automatically. Data are sent to S3 for delivery to customers and removed using standard data lifecycle management rules. In HyP3 all data processing is ephemeral; there are no persistent processes taking compute and storage resources or generating added cost. When complete, HyP3 will leverage the automatic scaling up and down of EC2 compute power to respond to event-driven demand surges correlated with natural disasters or reprocessing efforts. Massive simultaneous processing within EC2 will be able to match the demand spike in ways conventional physical computing power never could, and then tail off, incurring no costs when not needed. This presentation will focus on the development techniques and technologies that were used in developing the HyP3 system. Data and process flow will be shown, highlighting the benefits of the cloud for each step. Finally, the steps for integrating a new processing algorithm will be demonstrated. This is the true power of HyP3: allowing people to upload their own algorithms and execute them at archive-level scales.
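To make the event-driven pattern described above concrete, the following is a minimal sketch of an SNS-triggered Lambda handler that launches a short-lived EC2 worker with boto3. The AMI ID, instance type, message fields and processing command are hypothetical placeholders, not the actual HyP3 code.

# Hypothetical sketch: an SNS-triggered Lambda handler that launches a
# short-lived EC2 worker, in the spirit of the pipeline described above.
# The AMI ID, instance type and user-data script are placeholders.
import json
import boto3

ec2 = boto3.client("ec2")

def handler(event, context):
    # SNS delivers the processing request as a JSON message.
    request = json.loads(event["Records"][0]["Sns"]["Message"])
    granule = request["granule"]          # e.g. a scene identifier
    user_data = (
        "#!/bin/bash\n"
        f"run_processing --granule {granule} --output s3://example-bucket/\n"
        "shutdown -h now\n"               # instance shuts itself down when done
    )
    ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # placeholder AMI with the algorithm installed
        InstanceType="c5.xlarge",
        MinCount=1,
        MaxCount=1,
        UserData=user_data,
        InstanceInitiatedShutdownBehavior="terminate",
    )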
Bao, Shunxing; Damon, Stephen M; Landman, Bennett A; Gokhale, Aniruddha
2016-02-27
Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical-Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for-use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline.
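The paper's cost/benefit formulae are not reproduced in this abstract; the following is only an illustrative sketch of the kind of break-even arithmetic involved when comparing local serial execution with pay-per-hour cloud execution. All numbers and the idealized linear speed-up are assumptions.

# Illustrative break-even arithmetic for local serial execution versus cloud
# execution. The formula and the numbers are hypothetical, not the ones
# derived in the paper.
def cloud_is_cheaper(n_jobs, t_job_hours, hourly_rate_usd,
                     local_cost_per_hour_usd=0.0, n_cloud_nodes=10):
    """Compare rough wall-clock time and dollar cost of the two options."""
    local_hours = n_jobs * t_job_hours                   # serial on one workstation
    cloud_hours = local_hours / n_cloud_nodes            # ideal parallel speed-up
    local_cost = local_hours * local_cost_per_hour_usd   # e.g. power/amortization
    cloud_cost = n_jobs * t_job_hours * hourly_rate_usd  # pay per instance-hour
    return {"local_hours": local_hours, "cloud_hours": cloud_hours,
            "local_cost": local_cost, "cloud_cost": cloud_cost,
            "cloud_cheaper": cloud_cost < local_cost}

# Example: 200 diffusion tensor pipelines of 3 hours each on $0.10/h instances.
print(cloud_is_cheaper(n_jobs=200, t_job_hours=3, hourly_rate_usd=0.10,
                       local_cost_per_hour_usd=0.05))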
Golberg, Alexander; Linshiz, Gregory; Kravets, Ilia; Stawski, Nina; Hillson, Nathan J; Yarmush, Martin L; Marks, Robert S; Konry, Tania
2014-01-01
We report an all-in-one platform - ScanDrop - for the rapid and specific capture, detection, and identification of bacteria in drinking water. The ScanDrop platform integrates droplet microfluidics, a portable imaging system, and cloud-based control software and data storage. The cloud-based control software and data storage enables robotic image acquisition, remote image processing, and rapid data sharing. These features form a "cloud" network for water quality monitoring. We have demonstrated the capability of ScanDrop to perform water quality monitoring via the detection of an indicator coliform bacterium, Escherichia coli, in drinking water contaminated with feces. Magnetic beads conjugated with antibodies to E. coli antigen were used to selectively capture and isolate specific bacteria from water samples. The bead-captured bacteria were co-encapsulated in pico-liter droplets with fluorescently-labeled anti-E. coli antibodies, and imaged with an automated custom designed fluorescence microscope. The entire water quality diagnostic process required 8 hours from sample collection to online-accessible results compared with 2-4 days for other currently available standard detection methods.
Lee, Sun-Ho; Lee, Im-Yeong
2014-01-01
Data outsourcing services have emerged with the increasing use of digital information. They can be used to store data from various devices via networks that are easy to access. Unlike existing removable storage systems, storage outsourcing is available to many users because it has no storage limit and does not require a local storage medium. However, the reliability of storage outsourcing has become an important topic because many users employ it to store large volumes of data. To protect against unethical administrators and attackers, a variety of cryptography systems are used, such as searchable encryption and proxy reencryption. However, existing searchable encryption technology is inconvenient for use in storage outsourcing environments where users upload their data to be shared with others as necessary. In addition, some existing schemes are vulnerable to collusion attacks and have computing cost inefficiencies. In this paper, we analyze existing proxy re-encryption with keyword search.
NoSQL for Storage and Retrieval of Large LiDAR Data Collections
NASA Astrophysics Data System (ADS)
Boehm, J.; Liu, K.
2015-08-01
Developments in LiDAR technology over the past decades have made LiDAR a mature and widely accepted source of geospatial information. This in turn has led to an enormous growth in data volume. The central idea for a file-centric storage of LiDAR point clouds is the observation that large collections of LiDAR data are typically delivered as large collections of files, rather than single files of terabyte size. This split of the dataset, commonly referred to as tiling, was usually done to accommodate a specific processing pipeline. It therefore makes sense to preserve this split. A document-oriented NoSQL database can easily emulate this data partitioning by representing each tile (file) in a separate document. The document stores the metadata of the tile. The actual files are stored in a distributed file system emulated by the NoSQL database. We demonstrate the use of MongoDB, a highly scalable document-oriented NoSQL database, for storing large LiDAR files. MongoDB, like any NoSQL database, allows for queries on the attributes of the document. As a specialty, MongoDB also supports spatial queries. Hence we can perform spatial queries on the bounding boxes of the LiDAR tiles. Inserting and retrieving files on a cloud-based database is compared to native file system and cloud storage transfer speeds.
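A minimal sketch of this file-centric pattern is given below, with one metadata document per tile, the tile file itself kept in GridFS, and a spatial query against a 2dsphere index on the tile bounding boxes. The database, collection and field names are assumptions for illustration, not those used by the authors.

# Sketch of the file-centric pattern described above: one metadata document per
# LiDAR tile, the tile file itself in GridFS, and a spatial query on the tile
# bounding boxes. Database, collection and field names are made up here.
from pymongo import MongoClient, GEOSPHERE
import gridfs

client = MongoClient("mongodb://localhost:27017")
db = client["lidar"]
fs = gridfs.GridFS(db)                      # distributed file storage emulation
tiles = db["tiles"]
tiles.create_index([("bbox", GEOSPHERE)])   # enable spatial queries

def insert_tile(path, bbox_polygon, point_count):
    with open(path, "rb") as f:
        file_id = fs.put(f, filename=path)
    tiles.insert_one({"file_id": file_id,
                      "bbox": bbox_polygon,  # GeoJSON polygon of the tile extent
                      "points": point_count})

# Retrieve all tiles whose bounding box intersects an area of interest.
aoi = {"type": "Polygon",
       "coordinates": [[[-0.2, 51.4], [0.1, 51.4], [0.1, 51.6],
                        [-0.2, 51.6], [-0.2, 51.4]]]}
for doc in tiles.find({"bbox": {"$geoIntersects": {"$geometry": aoi}}}):
    data = fs.get(doc["file_id"]).read()    # stream the LAS/LAZ file back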
Visual Analysis of Cloud Computing Performance Using Behavioral Lines.
Muelder, Chris; Zhu, Biao; Chen, Wei; Zhang, Hongxin; Ma, Kwan-Liu
2016-02-29
Cloud computing is an essential technology for Big Data analytics and services. A cloud computing system is often composed of a large number of parallel computing and storage devices. Monitoring the usage and performance of such a system is important for efficient operations, maintenance, and security. Tracing every application on a large cloud system is untenable due to scale and privacy issues. But profile data can be collected relatively efficiently by regularly sampling the state of the system, including properties such as CPU load, memory usage, network usage, and others, creating a set of multivariate time series for each system. Adequate tools for studying such large-scale, multidimensional data are lacking. In this paper, we present a visual analysis approach to understanding and analyzing the performance and behavior of cloud computing systems. Our design is based on similarity measures and a layout method to portray the behavior of each compute node over time. When visualizing a large number of behavioral lines together, distinct patterns often appear, suggesting particular types of performance bottleneck. The resulting system provides multiple linked views, which allow the user to interactively explore the data by examining the data or a selected subset at different levels of detail. Our case studies, which use datasets collected from two different cloud systems, show that this visual approach is effective in identifying trends and anomalies of the systems.
Interoperating Cloud-based Virtual Farms
NASA Astrophysics Data System (ADS)
Bagnasco, S.; Colamaria, F.; Colella, D.; Casula, E.; Elia, D.; Franco, A.; Lusso, S.; Luparello, G.; Masera, M.; Miniello, G.; Mura, D.; Piano, S.; Vallero, S.; Venaruzzo, M.; Vino, G.
2015-12-01
The present work aims at optimizing the use of computing resources available at the grid Italian Tier-2 sites of the ALICE experiment at CERN LHC by making them accessible to interactive distributed analysis, thanks to modern solutions based on cloud computing. The scalability and elasticity of the computing resources via dynamic (“on-demand”) provisioning is essentially limited by the size of the computing site, reaching the theoretical optimum only in the asymptotic case of infinite resources. The main challenge of the project is to overcome this limitation by federating different sites through a distributed cloud facility. Storage capacities of the participating sites are seen as a single federated storage area, preventing the need of mirroring data across them: high data access efficiency is guaranteed by location-aware analysis software and storage interfaces, in a transparent way from an end-user perspective. Moreover, the interactive analysis on the federated cloud reduces the execution time with respect to grid batch jobs. The tests of the investigated solutions for both cloud computing and distributed storage on wide area network will be presented.
Scalable cloud without dedicated storage
NASA Astrophysics Data System (ADS)
Batkovich, D. V.; Kompaniets, M. V.; Zarochentsev, A. K.
2015-05-01
We present a prototype of a scalable computing cloud. It is intended to be deployed on a cluster without separate dedicated storage. The dedicated storage is replaced by distributed software storage. In addition, all cluster nodes are used both as computing nodes and as storage nodes. This solution increases utilization of the cluster resources as well as improving fault tolerance and performance of the distributed storage. Another advantage of this solution is high scalability with relatively low initial and maintenance costs. The solution is built from open source components such as OpenStack, Ceph, etc.
NASA Cloud-Based Climate Data Services
NASA Astrophysics Data System (ADS)
McInerney, M. A.; Schnase, J. L.; Duffy, D. Q.; Tamkin, G. S.; Strong, S.; Ripley, W. D., III; Thompson, J. H.; Gill, R.; Jasen, J. E.; Samowich, B.; Pobre, Z.; Salmon, E. M.; Rumney, G.; Schardt, T. D.
2012-12-01
Cloud-based scientific data services are becoming an important part of NASA's mission. Our technological response is built around the concept of specialized virtual climate data servers, repetitive cloud provisioning, image-based deployment and distribution, and virtualization-as-a-service (VaaS). A virtual climate data server (vCDS) is an Open Archival Information System (OAIS) compliant, iRODS-based data server designed to support a particular type of scientific data collection. iRODS is data grid middleware that provides policy-based control over collection-building, managing, querying, accessing, and preserving large scientific data sets. We have deployed vCDS Version 1.0 in the Amazon EC2 cloud using S3 object storage and are using the system to deliver a subset of NASA's Intergovernmental Panel on Climate Change (IPCC) data products to the latest CentOS federated version of Earth System Grid Federation (ESGF), which is also running in the Amazon cloud. vCDS-managed objects are exposed to ESGF through FUSE (Filesystem in User Space), which presents a POSIX-compliant filesystem abstraction to applications such as the ESGF server that require such an interface. A vCDS manages data as a distinguished collection for a person, project, lab, or other logical unit. A vCDS can manage a collection across multiple storage resources using rules and microservices to enforce collection policies. And a vCDS can federate with other vCDSs to manage multiple collections over multiple resources, thereby creating what can be thought of as an ecosystem of managed collections. With the vCDS approach, we are trying to enable the full information lifecycle management of scientific data collections and make tractable the task of providing diverse climate data services. In this presentation, we describe our approach, experiences, lessons learned, and plans for the future. Figure captions (figures not shown): (A) vCDS/ESG system stack; (B) conceptual architecture for NASA cloud-based data services.
Optimizing Cloud Based Image Storage, Dissemination and Processing Through Use of MRF and LERC
NASA Astrophysics Data System (ADS)
Becker, Peter; Plesea, Lucian; Maurer, Thomas
2016-06-01
The volume and number of geospatial images being collected continue to increase exponentially with the ever increasing number of airborne and satellite imaging platforms, and the increasing rate of data collection. As a result, the cost of fast storage required to provide access to the imagery is a major cost factor in enterprise image management solutions to handle, process and disseminate the imagery and information extracted from the imagery. Cloud-based object storage offers significantly lower-cost and elastic storage for this imagery, but also adds some disadvantages in terms of greater latency for data access and lack of traditional file access. Although traditional file formats such as GeoTIFF, JPEG2000 and NITF can be downloaded from such object storage, their structure and available compression are not optimum and access performance is curtailed. This paper provides details on a solution utilizing a new open image format for storage of and access to geospatial imagery, optimized for cloud storage and processing. MRF (Meta Raster Format) is optimized for large collections of scenes such as those acquired from optical sensors. The format enables optimized data access from cloud storage, along with the use of new compression options which cannot easily be added to existing formats. The paper also provides an overview of LERC, a new image compression that can be used with MRF and provides very good lossless and controlled lossy compression.
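As a rough illustration of producing such a file, the following GDAL snippet converts a conventional GeoTIFF into MRF with LERC compression. It assumes a GDAL build that includes the MRF driver and LERC support; the file names, block size and creation options are illustrative only.

# Sketch of converting a conventional GeoTIFF into MRF with LERC compression
# using GDAL. This assumes a GDAL build that includes the MRF driver and LERC;
# file names and creation options are illustrative only.
from osgeo import gdal

gdal.Translate(
    "scene.mrf",                 # output: MRF index/metadata plus data file
    "scene.tif",                 # input: a traditional GeoTIFF scene
    format="MRF",
    creationOptions=[
        "COMPRESS=LERC",         # controlled-lossy / lossless LERC encoding
        "BLOCKSIZE=512",         # tile size chosen to suit ranged object-store reads
    ],
)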
Implementation of Grid Tier 2 and Tier 3 facilities on a Distributed OpenStack Cloud
NASA Astrophysics Data System (ADS)
Limosani, Antonio; Boland, Lucien; Coddington, Paul; Crosby, Sean; Huang, Joanna; Sevior, Martin; Wilson, Ross; Zhang, Shunde
2014-06-01
The Australian Government is making a AUD 100 million investment in Compute and Storage for the academic community. The Compute facilities are provided in the form of 30,000 CPU cores located at 8 nodes around Australia in a distributed virtualized Infrastructure as a Service facility based on OpenStack. The storage will eventually consist of over 100 petabytes located at 6 nodes. All will be linked via a 100 Gb/s network. This proceeding describes the development of a fully connected WLCG Tier-2 grid site as well as a general purpose Tier-3 computing cluster based on this architecture. The facility employs an extension to Torque to enable dynamic allocations of virtual machine instances. A base Scientific Linux virtual machine (VM) image is deployed in the OpenStack cloud and automatically configured as required using Puppet. Custom scripts are used to launch multiple VMs, integrate them into the dynamic Torque cluster and to mount remote file systems. We report on our experience in developing this nation-wide ATLAS and Belle II Tier 2 and Tier 3 computing infrastructure using the national Research Cloud and storage facilities.
An Intelligent Cloud Storage Gateway for Medical Imaging.
Viana-Ferreira, Carlos; Guerra, António; Silva, João F; Matos, Sérgio; Costa, Carlos
2017-09-01
Historically, medical imaging repositories have been supported by indoor infrastructures. However, the amount of diagnostic imaging procedures has continuously increased over the last decades, imposing several challenges associated with the storage volume, data redundancy and availability. Cloud platforms are focused on delivering hardware and software services over the Internet, becoming an appealing solution for repository outsourcing. Although this option may bring financial and technological benefits, it also presents new challenges. In medical imaging scenarios, communication latency is a critical issue that still hinders the adoption of this paradigm. This paper proposes an intelligent Cloud storage gateway that optimizes data access times. This is achieved through a new cache architecture that combines static rules and pattern recognition for eviction and prefetching. The evaluation results, obtained from experiments over a real-world dataset, show that cache hit ratios can reach around 80%, leading to reductions of image retrieval times by over 60%. The combined use of eviction and prefetching policies proposed can significantly reduce communication latency, even when using a small cache in comparison to the total size of the repository. Apart from the performance gains, the proposed system is capable of adjusting to specific workflows of different institutions.
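The sketch below illustrates the general idea of combining eviction and prefetching in a gateway cache, using plain LRU eviction and a naive "next image in the same series" prefetch rule as stand-ins for the static rules and pattern recognition described in the paper; the key format and fetch callback are assumptions.

# Minimal sketch of a gateway-side cache combining LRU eviction with a naive
# prefetch hook. The paper combines static rules with pattern recognition;
# here the "pattern" is just "next image in the same series", as a placeholder.
from collections import OrderedDict

class ImageCache:
    def __init__(self, capacity, fetch_fn):
        self.capacity = capacity
        self.fetch_fn = fetch_fn          # callable that retrieves an image from the cloud
        self.store = OrderedDict()        # key -> image bytes, in LRU order

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)   # mark as most recently used
        else:
            self._put(key, self.fetch_fn(key))
            self._prefetch(key)           # warm the cache for the likely next request
        return self.store[key]

    def _put(self, key, value):
        self.store[key] = value
        self.store.move_to_end(key)
        while len(self.store) > self.capacity:
            self.store.popitem(last=False)   # evict the least recently used entry

    def _prefetch(self, key):
        series, _, index = key.rpartition("/")
        if index.isdigit():
            nxt = f"{series}/{int(index) + 1}"
            if nxt not in self.store:
                self._put(nxt, self.fetch_fn(nxt))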
EyeMIAS: a cloud-based ophthalmic image reading and auxiliary diagnosis system
NASA Astrophysics Data System (ADS)
Wu, Di; Zhao, Heming; Yu, Kai; Chen, Xinjian
2018-03-01
Relying solely on ophthalmic equipment is unable to meet the present health needs. It is urgent to find an efficient way to provide a quick screening and early diagnosis on diabetic retinopathy and other ophthalmic diseases. The purpose of this study is to develop a cloud-base system for medical image especially ophthalmic image to store, view and process and accelerate the screening and diagnosis. In this purpose the system with web application, upload client, storage dependency and algorithm support is implemented. After five alpha tests, the system bore the thousands of large traffic access and generated hundreds of reports with diagnosis.
Cloud Optimized Image Format and Compression
NASA Astrophysics Data System (ADS)
Becker, P.; Plesea, L.; Maurer, T.
2015-04-01
Cloud-based image storage and processing requires re-evaluation of formats and processing methods. For the true value of the massive volumes of earth observation data to be realized, the image data needs to be accessible from the cloud. Traditional file formats such as TIF and NITF were developed in the heyday of the desktop and assumed fast, low-latency file access. Other formats such as JPEG2000 provide for streaming protocols for pixel data, but still require a server to have file access. These concepts no longer truly hold in cloud-based elastic storage and computation environments. This paper will provide details of a newly evolving image storage format (MRF) and compression that is optimized for cloud environments. Although the cost of storage continues to fall for large data volumes, there is still significant value in compression. For imagery data to be used in analysis and exploit the extended dynamic range of the new sensors, lossless or controlled lossy compression is of high value. Compression decreases the data volumes stored and reduces the data transferred, but the reduced data size must be balanced with the CPU required to decompress. The paper also outlines a new compression algorithm (LERC) for imagery and elevation data that optimizes this balance. Advantages of the compression include its simple-to-implement algorithm that enables it to be efficiently accessed using JavaScript. Combining this new cloud-based image storage format and compression will help resolve some of the challenges of big image data on the internet.
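As a hedged illustration of cloud-oriented access, the snippet below reads a window from an MRF object directly out of S3-compatible storage through GDAL's /vsis3/ virtual file system, so that only the needed blocks are transferred. The bucket, object key, region and window size are placeholders.

# Sketch of reading a cloud-hosted MRF directly from object storage with GDAL's
# /vsis3/ virtual file system, so only the requested blocks are transferred.
# Bucket, object key and window are placeholders.
from osgeo import gdal

gdal.SetConfigOption("AWS_REGION", "us-east-1")   # credentials come from the environment
ds = gdal.Open("/vsis3/example-bucket/imagery/scene.mrf")
band = ds.GetRasterBand(1)
# Read a 512 x 512 window; GDAL issues HTTP range requests for just those blocks.
window = band.ReadAsArray(xoff=0, yoff=0, win_xsize=512, win_ysize=512)
print(window.shape)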
A Routing Mechanism for Cloud Outsourcing of Medical Imaging Repositories.
Godinho, Tiago Marques; Viana-Ferreira, Carlos; Bastião Silva, Luís A; Costa, Carlos
2016-01-01
Web-based technologies have been increasingly used in picture archive and communication systems (PACS), in services related to storage, distribution, and visualization of medical images. Nowadays, many healthcare institutions are outsourcing their repositories to the cloud. However, managing communications between multiple geo-distributed locations is still challenging due to the complexity of dealing with huge volumes of data and bandwidth requirements. Moreover, standard methodologies still do not take full advantage of outsourced archives, namely because their integration with other in-house solutions is troublesome. In order to improve the performance of distributed medical imaging networks, a smart routing mechanism was developed. This includes an innovative cache system based on splitting and dynamic management of digital imaging and communications in medicine objects. The proposed solution was successfully deployed in a regional PACS archive. The results obtained proved that it is better than conventional approaches, as it reduces remote access latency and also the required cache storage space.
Integrity Verification for Multiple Data Copies in Cloud Storage Based on Spatiotemporal Chaos
NASA Astrophysics Data System (ADS)
Long, Min; Li, You; Peng, Fei
Aiming to strike a balance between the security, efficiency and availability of data verification in cloud storage, a novel integrity verification scheme based on spatiotemporal chaos is proposed for multiple data copies. Spatiotemporal chaos is used for node calculation of the binary tree, and the location of the data in the cloud is verified. Meanwhile, dynamic operations can be made to the data. Furthermore, blind information is used to prevent a third-party auditor (TPA) from leaking the users’ data privacy in the public auditing process. Performance analysis and discussion indicate that the scheme is secure and efficient, and that it supports dynamic operations and the integrity verification of multiple copies of data. It has great potential to be implemented in cloud storage services.
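For orientation only, the following sketch shows a generic binary-tree (Merkle-style) integrity check over data blocks for multiple copies. A standard cryptographic hash stands in for the paper's spatiotemporal-chaos node calculation, and the blinding against the TPA and the dynamic-operation support are not reproduced.

# Generic sketch of binary-tree (Merkle-style) integrity verification over data
# blocks. The paper computes tree nodes with a spatiotemporal chaos map and adds
# blinding against the TPA; here a standard hash stands in for both, purely to
# illustrate the tree structure.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(blocks):
    level = [h(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:                       # duplicate last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# The user keeps the root; blocks returned by the cloud can be re-checked later.
copies = [[b"block-0", b"block-1", b"block-2", b"block-3"] for _ in range(3)]
roots = [merkle_root(c) for c in copies]
assert all(r == roots[0] for r in roots)         # all replicas still consistent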
Infrastructures for Distributed Computing: the case of BESIII
NASA Astrophysics Data System (ADS)
Pellegrino, J.
2018-05-01
BESIII is an electron-positron collision experiment hosted at BEPCII in Beijing and aimed at investigating Tau-Charm physics. BESIII has now been running for several years and has gathered more than 1 PB of raw data. In order to analyze these data and perform massive Monte Carlo simulations, a large amount of computing and storage resources is needed. The distributed computing system is based upon DIRAC and has been in production since 2012. It integrates computing and storage resources from different institutes and a variety of resource types such as cluster, grid, cloud or volunteer computing. About 15 sites from the BESIII Collaboration from all over the world have joined this distributed computing infrastructure, giving a significant contribution to the IHEP computing facility. Nowadays cloud computing is playing a key role in the HEP computing field, due to its scalability and elasticity. Cloud infrastructures take advantage of several tools, such as VMDirac, to manage virtual machines through cloud managers according to the job requirements. With the virtually unlimited resources from commercial clouds, the computing capacity could scale accordingly in order to deal with any burst demands. General computing models are discussed in the talk and addressed herewith, with particular focus on the BESIII infrastructure. Moreover, new computing tools and upcoming infrastructures will be addressed.
In-Storage Embedded Accelerator for Sparse Pattern Processing
2016-09-13
Only fragments of this record survive. Abstract fragments: "… computation. As a result, a very small processor could be used and still make full use of storage device bandwidth. When the host software sends …"; "We present a novel system architecture for sparse pattern …" The record cites Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski, Gunho Lee et al., "A view of cloud computing," Communications of the ACM 53, no. 4 (2010). Affiliation: MIT Computer Science & Artificial Intelligence Laboratory.
System and Method for Providing a Climate Data Persistence Service
NASA Technical Reports Server (NTRS)
Schnase, John L. (Inventor); Ripley, III, William David (Inventor); Duffy, Daniel Q. (Inventor); Thompson, John H. (Inventor); Strong, Savannah L. (Inventor); McInerney, Mark (Inventor); Sinno, Scott (Inventor); Tamkin, Glenn S. (Inventor); Nadeau, Denis (Inventor)
2018-01-01
A system, method and computer-readable storage devices for providing a climate data persistence service. A system configured to provide the service can include a climate data server that performs data and metadata storage and management functions for climate data objects, a compute-storage platform that provides the resources needed to support a climate data server, provisioning software that allows climate data server instances to be deployed as virtual climate data servers in a cloud computing environment, and a service interface, wherein persistence service capabilities are invoked by software applications running on a client device. The climate data objects can be in various formats, such as International Organization for Standards (ISO) Open Archival Information System (OAIS) Reference Model Submission Information Packages, Archive Information Packages, and Dissemination Information Packages. The climate data server can enable scalable, federated storage, management, discovery, and access, and can be tailored for particular use cases.
The structure of the clouds distributed operating system
NASA Technical Reports Server (NTRS)
Dasgupta, Partha; Leblanc, Richard J., Jr.
1989-01-01
A novel system architecture, based on the object model, is the central structuring concept used in the Clouds distributed operating system. This architecture makes Clouds attractive over a wide class of machines and environments. Clouds is a native operating system, designed and implemented at Georgia Tech, and runs on a set of general purpose computers connected via a local area network. The system architecture of Clouds is composed of a system-wide global set of persistent (long-lived) virtual address spaces, called objects, that contain persistent data and code. The object concept is implemented at the operating system level, thus presenting a single-level storage view to the user. Lightweight threads carry computational activity through the code stored in the objects. The persistent objects and threads give rise to a programming environment composed of shared permanent memory, dispensing with the need for hardware-derived concepts such as file systems and message systems. Though the hardware may be distributed and may have disks and networks, Clouds provides the applications with a logically centralized system, based on a shared, structured, single-level store. The current design of Clouds uses a minimalist philosophy with respect to both the kernel and the operating system. That is, the kernel and the operating system support a bare minimum of functionality. Clouds also adheres to the concept of separation of policy and mechanism. Most low-level operating system services are implemented above the kernel and most high-level services are implemented at the user level. From the measured performance of using the kernel mechanisms, we are able to demonstrate that efficient implementations are feasible for the object model on commercially available hardware. Clouds provides a rich environment for conducting research in distributed systems. Some of the topics addressed in this paper include distributed programming environments, consistency of persistent data and fault tolerance.
Electron-cloud updated simulation results for the PSR, and recent results for the SNS
NASA Astrophysics Data System (ADS)
Pivi, M.; Furman, M. A.
2002-05-01
Recent simulation results for the main features of the electron cloud in the storage ring of the Spallation Neutron Source (SNS) at Oak Ridge, and updated results for the Proton Storage Ring (PSR) at Los Alamos are presented in this paper. A refined model for the secondary emission process including the so called true secondary, rediffused and backscattered electrons has recently been included in the electron-cloud code.
Cloud Computing with iPlant Atmosphere.
McKay, Sheldon J; Skidmore, Edwin J; LaRose, Christopher J; Mercer, Andre W; Noutsos, Christos
2013-10-15
Cloud Computing refers to distributed computing platforms that use virtualization software to provide easy access to physical computing infrastructure and data storage, typically administered through a Web interface. Cloud-based computing provides access to powerful servers, with specific software and virtual hardware configurations, while eliminating the initial capital cost of expensive computers and reducing the ongoing operating costs of system administration, maintenance contracts, power consumption, and cooling. This eliminates a significant barrier to entry into bioinformatics and high-performance computing for many researchers. This is especially true of free or modestly priced cloud computing services. The iPlant Collaborative offers a free cloud computing service, Atmosphere, which allows users to easily create and use instances on virtual servers preconfigured for their analytical needs. Atmosphere is a self-service, on-demand platform for scientific computing. This unit demonstrates how to set up, access and use cloud computing in Atmosphere. Copyright © 2013 John Wiley & Sons, Inc.
2015-06-01
Only fragments of this record survive. Thesis: "Visualizations: A Tool to Achieve Optimized Operational Decision Making and Data Integration," by Paul C. Hudson and Jeffrey A. Rzasa, June 2015. Text fragments: "… Hadoop Distributed File System (HDFS) without any integration with Accumulo-based Knowledge Stores based on OWL/RDF. 4. Cloud Based: The Apache Software …" Cited references include: BTW, 7(12), pp. 227–241; Godin, A. & Akins, D. (2014), Extending DCGS-N naval tactical clouds from in-storage to in-memory for the integrated fires …
Two-Level Verification of Data Integrity for Data Storage in Cloud Computing
NASA Astrophysics Data System (ADS)
Xu, Guangwei; Chen, Chunlin; Wang, Hongya; Zang, Zhuping; Pang, Mugen; Jiang, Ping
Data storage in cloud computing can save capital expenditure and relieve the burden of storage management for users. As loss or corruption of stored files may happen, many researchers focus on the verification of data integrity. However, massive numbers of users often bring large numbers of verification tasks for the auditor. Moreover, users also need to pay extra fees for these verification tasks beyond the storage fee. Therefore, we propose a two-level verification of data integrity to alleviate these problems. The key idea is to have users routinely verify the data integrity and to have the auditor arbitrate challenges between the user and cloud provider according to the MACs and ϕ values. The extensive performance simulations show that the proposed scheme markedly decreases the auditor's verification tasks and the ratio of wrong arbitration.
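A minimal sketch of the first, user-side level of such a scheme is shown below: the user keeps an HMAC tag per block and routinely re-verifies blocks returned by the cloud. The ϕ values and the auditor's arbitration protocol from the paper are not reproduced; key handling and block layout are assumptions.

# Sketch of the user-side, first-level check: the user keeps an HMAC per block
# and routinely re-verifies blocks returned by the cloud. The paper's phi values
# and the auditor's arbitration step are not reproduced here.
import hmac
import hashlib
import os

key = os.urandom(32)                              # user-held secret key

def tag(block: bytes) -> bytes:
    return hmac.new(key, block, hashlib.sha256).digest()

# Before upload: compute and keep the tags locally (they are small).
blocks = [b"chunk-0", b"chunk-1", b"chunk-2"]
tags = [tag(b) for b in blocks]

# Routine verification: re-fetch a block from the cloud and compare tags.
def verify(index: int, returned_block: bytes) -> bool:
    return hmac.compare_digest(tags[index], tag(returned_block))

print(verify(1, b"chunk-1"))      # True if the cloud returned the block intact
print(verify(1, b"chunk-X"))      # False -> raise a challenge for arbitration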
Toward a Big Data Science: A challenge of "Science Cloud"
NASA Astrophysics Data System (ADS)
Murata, Ken T.; Watanabe, Hidenobu
2013-04-01
Over the past 50 years, along with the appearance and development of high-performance computers (and supercomputers), numerical simulation has come to be considered a third methodology for science, following the theoretical (first) and experimental and/or observational (second) approaches. The variety of data yielded by the second approach keeps growing, due to progress in experimental and observational technologies. The amount of data generated by the third methodology keeps getting larger, because of the tremendous development of supercomputers and their programming techniques. Most of the data files created by both experiments/observations and numerical simulations are saved in digital formats and analyzed on computers. Researchers (domain experts) are interested not only in how to carry out experiments and/or observations or perform numerical simulations, but also in what information (new findings) to extract from the data. However, data do not usually tell anything about the science by themselves; the science is implicitly hidden in the data. Researchers have to extract information from the data files to find new science. This is the basic concept of data-intensive (data-oriented) science for Big Data. As the scales of experiments and/or observations and numerical simulations get larger, new techniques and facilities are required to extract information from large numbers of data files. This technique, informatics, is a fourth methodology for new science. Any methodology must run on its own facilities: in space science, for example, the space environment is observed via spacecraft and numerical simulations are performed on supercomputers. The facility for informatics, which deals with large-scale data, is a computational cloud system for science. This paper proposes a cloud system for informatics that has been developed at NICT (National Institute of Information and Communications Technology), Japan. The NICT science cloud, which we named OneSpaceNet (OSN), is the first open cloud system for scientists who are going to carry out informatics for their own science. The science cloud is not intended for simple uses; many functions are expected of it, such as data standardization, data collection and crawling, a large and distributed data storage system, security and reliability, databases and meta-databases, data stewardship, long-term data preservation, data rescue and preservation, data mining, parallel processing, data publication and provision, the semantic web, 3D and 4D visualization, outreach and inreach, and capacity building. The figure (not shown here) is a schematic picture of the NICT science cloud. Both types of data, from observation and from simulation, are stored in the storage system of the science cloud. It should be noted that there are two types of observational data. One comes from archive sites outside the cloud and is downloaded through the Internet to the cloud. The other comes from equipment directly connected to the science cloud, an arrangement often called a sensor cloud. In the present talk, we first introduce the NICT science cloud. We next demonstrate its efficiency, showing several scientific results achieved with this cloud system. Through these discussions and demonstrations, the potential of the science cloud for any research field will be revealed.
Cloud Computing Boosts Business Intelligence of Telecommunication Industry
NASA Astrophysics Data System (ADS)
Xu, Meng; Gao, Dan; Deng, Chao; Luo, Zhiguo; Sun, Shaoling
Business Intelligence has become an attractive topic in today's data-intensive applications, especially in the telecommunication industry. Meanwhile, Cloud Computing, providing IT supporting infrastructure with excellent scalability, large-scale storage, and high performance, has become an effective way to implement parallel data processing and data mining algorithms. BC-PDM (Big Cloud based Parallel Data Miner) is a new MapReduce-based parallel data mining platform developed by CMRI (China Mobile Research Institute) to fit the urgent requirements of business intelligence in the telecommunication industry. In this paper, the architecture, functionality and performance of BC-PDM are presented, together with the experimental evaluation and case studies of its applications. The evaluation result demonstrates both the usability and the cost-effectiveness of a Cloud Computing based Business Intelligence system in applications of the telecommunication industry.
Infrastructure Systems for Advanced Computing in E-science applications
NASA Astrophysics Data System (ADS)
Terzo, Olivier
2013-04-01
In the e-science field there are growing needs for computing infrastructure that is more dynamic and customizable, with an "on demand" model of use that follows the exact request in terms of resources and storage capacities. The integration of grid and cloud infrastructure solutions allows us to offer services that can adapt availability by scaling resources up and down. The main challenge for e-science domains will be to implement infrastructure solutions for scientific computing that can adapt dynamically to the demand for computing resources, with a strong emphasis on optimizing the use of those resources to reduce investment costs. Instrumentation, data volumes, algorithms and analysis all increase the complexity of applications that require high processing power and storage for a limited time and often exceed the computational resources available in the majority of laboratories and research units of an organization. Very often it is necessary to adapt, rethink or even rewrite tools and algorithms, and to consolidate existing applications through a phase of reverse engineering, in order to adapt them to deployment on cloud infrastructure. For example, in areas such as rainfall monitoring, meteorological analysis, hydrometeorology, climatology, bioinformatics, next-generation sequencing, computational electromagnetics and radio occultation, the complexity of the analysis raises several issues such as processing time, the scheduling of processing tasks, storage of results and a multi-user environment. For these reasons, it is necessary to rethink the way e-science applications are written so that they are ready to exploit the potential of cloud computing services through the IaaS, PaaS and SaaS layers. Another important focus is on creating and using hybrid infrastructure, typically a federation between private and public clouds: when all resources owned by the organization are in use, a federated cloud infrastructure makes it easy to add resources from the public cloud to follow the computational and storage needs, and to release them when the processes are finished. With the hybrid model, the scheduling approach is important for managing both cloud models. Thanks to this infrastructure model, resources are available at any time for additional requests for IT capacity and can be used "on demand" for a limited time without having to purchase additional servers.
Data Privacy in Cloud-assisted Healthcare Systems: State of the Art and Future Challenges.
Sajid, Anam; Abbas, Haider
2016-06-01
The widespread deployment and utility of Wireless Body Area Networks (WBANs) in healthcare systems has required new technologies like the Internet of Things (IoT) and cloud computing, which are able to deal with the storage and processing limitations of WBANs. This amalgamation of WBAN-based healthcare systems with cloud-based healthcare systems gave rise to serious privacy concerns about sensitive healthcare data. Hence, there is a need for proactive identification and effective mitigation mechanisms for these patients' data privacy concerns that pose continuous threats to the integrity and stability of the healthcare environment. For this purpose, a systematic literature review has been conducted that presents a clear picture of the privacy concerns around patients' data in cloud-assisted healthcare systems and analyzes the mechanisms that have recently been proposed by the research community. The methodology used for conducting the review was based on the Kitchenham guidelines. Results from the review show that most of the patient data privacy techniques do not fully address the privacy concerns and therefore require more effort. The summary presented in this paper will help in setting research directions for the techniques and mechanisms that are needed to address patients' data privacy concerns in a balanced and lightweight manner by considering all the aspects and limitations of cloud-assisted healthcare systems.
Towards Dynamic Remote Data Auditing in Computational Clouds
Sookhak, Mehdi; Akhunzada, Adnan; Gani, Abdullah; Khurram Khan, Muhammad; Anuar, Nor Badrul
2014-01-01
Cloud computing is a significant shift of computational paradigm where computing as a utility and storing data remotely have a great potential. Enterprise and businesses are now more interested in outsourcing their data to the cloud to lessen the burden of local data storage and maintenance. However, the outsourced data and the computation outcomes are not continuously trustworthy due to the lack of control and physical possession of the data owners. To better streamline this issue, researchers have now focused on designing remote data auditing (RDA) techniques. The majority of these techniques, however, are only applicable for static archive data and are not subject to audit the dynamically updated outsourced data. We propose an effectual RDA technique based on algebraic signature properties for cloud storage system and also present a new data structure capable of efficiently supporting dynamic data operations like append, insert, modify, and delete. Moreover, this data structure empowers our method to be applicable for large-scale data with minimum computation cost. The comparative analysis with the state-of-the-art RDA schemes shows that the proposed scheme is secure and highly efficient in terms of the computation and communication overhead on the auditor and server. PMID:25121114
Celesti, Antonio; Fazio, Maria; Romano, Agata; Bramanti, Alessia; Bramanti, Placido; Villari, Massimo
2018-05-01
The Open Archival Information System (OAIS) is a reference model for organizing people and resources in a system, and it has already been adopted in care centers and medical systems to efficiently manage clinical data, medical personnel, and patients. Archival storage systems are typically implemented using traditional relational database systems, but relation-oriented technology strongly limits the efficiency of managing huge amounts of patients' clinical data, especially in emerging cloud-based systems, which are distributed. In this paper, we present an OAIS healthcare architecture for managing a huge number of HL7 clinical documents in a scalable way. Specifically, it is based on a NoSQL column-oriented database management system deployed in the cloud, thus benefiting from big tables and wide rows available over a virtual distributed infrastructure. We developed a prototype of the proposed architecture at the IRCCS and evaluated its efficiency in a real case study.
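The following sketch illustrates a wide-row layout for HL7 documents in a column-oriented store, using Apache Cassandra as a representative system. The keyspace, table and column names are invented for illustration and are not taken from the paper.

# Illustrative wide-row layout for HL7 clinical documents in a column-oriented
# store, using Apache Cassandra as a stand-in. Keyspace, table and column names
# are invented for this sketch.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect()
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS archive
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS archive.hl7_documents (
        patient_id text,
        created_at timestamp,
        doc_id     uuid,
        hl7_xml    text,
        PRIMARY KEY (patient_id, created_at, doc_id)   -- wide row per patient
    ) WITH CLUSTERING ORDER BY (created_at DESC, doc_id ASC)
""")
session.execute(
    "INSERT INTO archive.hl7_documents (patient_id, created_at, doc_id, hl7_xml) "
    "VALUES (%s, toTimestamp(now()), uuid(), %s)",
    ("patient-042", "<ClinicalDocument>...</ClinicalDocument>"),
)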
NASA Astrophysics Data System (ADS)
Piani, L.; Tachibana, S.; Hama, T.; Tanaka, H.; Endo, Y.; Sugawara, I.; Dessimoulie, L.; Kimura, Y.; Miyake, A.; Matsuno, J.; Tsuchiyama, A.; Fujita, K.; Nakatsubo, S.; Fukushi, H.; Mori, S.; Chigai, T.; Yurimoto, H.; Kouchi, A.
2017-03-01
Refractory organic compounds formed in molecular clouds are among the building blocks of the solar system objects and could be the precursors of organic matter found in primitive meteorites and cometary materials. However, little is known about the evolutionary pathways of molecular cloud organics from dense molecular clouds to planetary systems. In this study, we focus on the evolution of the morphological and viscoelastic properties of molecular cloud refractory organic matter. We found that the organic residue, experimentally synthesized at ~10 K from UV-irradiated H2O-CH3OH-NH3 ice, changed significantly in terms of its nanometer- to micrometer-scale morphology and viscoelastic properties after UV irradiation at room temperature. The dose of this irradiation was equivalent to that experienced after short residence in diffuse clouds (≤10^4 years) or irradiation in outer protoplanetary disks. The irradiated organic residues became highly porous and more rigid and formed amorphous nanospherules. These nanospherules are morphologically similar to organic nanoglobules observed in the least-altered chondrites, chondritic porous interplanetary dust particles, and cometary samples, suggesting that irradiation of refractory organics could be a possible formation pathway for such nanoglobules. The storage modulus (elasticity) of photo-irradiated organic residues is ~100 MPa irrespective of vibrational frequency, a value that is lower than the storage moduli of minerals and ice. Dust grains coated with such irradiated organics would therefore stick together efficiently, but growth to larger grains might be suppressed due to an increase in aggregate brittleness caused by the strong connections between grains.
sbtools: A package connecting R to cloud-based data for collaborative online research
Winslow, Luke; Chamberlain, Scott; Appling, Alison P.; Read, Jordan S.
2016-01-01
The adoption of high-quality tools for collaboration and reproducible research such as R and Github is becoming more common in many research fields. While Github and other version management systems are excellent resources, they were originally designed to handle code and scale poorly to large text-based or binary datasets. A number of scientific data repositories are coming online and are often focused on dataset archival and publication. To handle collaborative workflows using large scientific datasets, there is increasing need to connect cloud-based online data storage to R. In this article, we describe how the new R package sbtools enables direct access to the advanced online data functionality provided by ScienceBase, the U.S. Geological Survey’s online scientific data storage platform.
Opportunity and Challenges for Migrating Big Data Analytics in Cloud
NASA Astrophysics Data System (ADS)
Amitkumar Manekar, S.; Pradeepini, G., Dr.
2017-08-01
Big Data analytics is a buzzword nowadays. With ever more demanding and scalable data generation capabilities, data acquisition and storage have become crucial issues. Cloud storage is a widely used platform, and the technology will become crucial to executives handling data powered by analytics. The trend towards "big data-as-a-service" is now talked about everywhere. On the one hand, cloud-based big data analytics directly tackles ongoing issues of scale, speed, and cost; on the other, researchers are still working to solve security and other practical problems of big data migration to cloud-based platforms. This article focuses on finding possible ways to migrate big data to the cloud. Technology that supports coherent data migration and the possibility of performing big data analytics on a cloud platform is in demand for a new era of growth. This article also gives information about available technologies and techniques for migrating big data to the cloud.
2016-04-01
Only fragments of this record survive: "… the DOD will put DOD systems and data at a risk level comparable to that of their neighbors in the cloud. Just as a user browses a Web page on the …"; "… proxy servers for controlling user access to Web pages, and large-scale storage for data management. Each of these devices allows access to the …"; "… user to develop applications. Acunetics.com describes Web applications as 'computer programs allowing Website visitors to submit and retrieve data' …"
NASA Astrophysics Data System (ADS)
Cura, Rémi; Perret, Julien; Paparoditis, Nicolas
2017-05-01
In addition to more traditional geographical data such as images (rasters) and vectors, point cloud data are becoming increasingly available. Such data are appreciated for their precision and true three-dimensional (3D) nature. However, managing point clouds can be difficult due to scaling problems and the specificities of this data type. Several methods exist but are usually fairly specialised and solve only one aspect of the management problem. In this work, we propose a comprehensive and efficient point cloud management system based on a database server that works on groups of points (patches) rather than individual points. This system is specifically designed to cover the basic needs of point cloud users: fast loading, compressed storage, powerful patch and point filtering, easy data access and exporting, and integrated processing. Moreover, the proposed system fully integrates metadata (like sensor position) and can conjointly use point clouds with other geospatial data, such as images, vectors, topology and other point clouds. Point cloud (parallel) processing can be done in-base with fast prototyping capabilities. Lastly, the system is built on open source technologies; therefore it can be easily extended and customised. We test the proposed system with several billion points obtained from Lidar (aerial and terrestrial) and stereo-vision. We demonstrate loading speeds in the ~50 million pts/h per process range, transparent-to-the-user compression at ratios greater than 2:1 and up to 4:1, patch filtering in the 0.1 to 1 s range, and output in the 0.1 million pts/s per process range, along with classical processing methods, such as object detection.
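The patch-based storage idea described above can be illustrated with a minimal sketch: group points into spatial patches, then serialize and compress each patch independently so it can be filtered and loaded on its own. This is not the paper's PostgreSQL-based system; the grid-cell grouping, cell size, and zlib compression are illustrative assumptions.

    # Minimal sketch of patch-based point cloud storage (not the authors' system):
    # points are grouped into spatial "patches" by grid cell, then each patch is
    # serialized and compressed independently so it can be filtered/loaded alone.
    import zlib
    import numpy as np

    def make_patches(points, cell=10.0):
        """Group an (N, 3) float64 array of points into per-cell patches."""
        keys = np.floor(points[:, :2] / cell).astype(np.int64)   # 2D grid key
        patches = {}
        for key, pt in zip(map(tuple, keys), points):
            patches.setdefault(key, []).append(pt)
        # Compress each patch; redundant coordinates typically yield a few:1,
        # in line with the 2:1 to 4:1 ratios reported above.
        return {k: zlib.compress(np.asarray(v).tobytes()) for k, v in patches.items()}

    def load_patch(blob):
        """Decompress one patch back into an (n, 3) array."""
        return np.frombuffer(zlib.decompress(blob), dtype=np.float64).reshape(-1, 3)

    if __name__ == "__main__":
        pts = np.random.rand(100000, 3) * 100.0
        store = make_patches(pts)
        first = load_patch(next(iter(store.values())))
        print(len(store), "patches;", first.shape[0], "points in the first one")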
Globally distributed software defined storage (proposal)
NASA Astrophysics Data System (ADS)
Shevel, A.; Khoruzhnikov, S.; Grudinin, V.; Sadov, O.; Kairkanov, A.
2017-10-01
The volume of incoming data in HEP is growing, and so is the volume of data that must be held for a long time. Large volumes of data - big data - are distributed around the planet, so methods and approaches for organizing and managing globally distributed data storage are required. Several distributed storage systems exist for personal needs, such as own-cloud.org, pydio.com, seafile.com, and sparkleshare.org. At the enterprise level there are a number of systems - SWIFT (the distributed storage part of OpenStack), CEPH, and the like - which are mostly object storage. When the resources of several data centers are integrated, the organization of data links becomes a very important issue, especially if several parallel data links between data centers are used. The situation in data centers and in data links may vary from hour to hour. All of this means that each part of the distributed data storage has to be able to rearrange its usage of data links and storage servers in each data center. In addition, different requirements could appear for each customer of the distributed storage. The above topics are planned to be discussed in the data storage proposal.
Adaptive Resource Utilization Prediction System for Infrastructure as a Service Cloud.
Zia Ullah, Qazi; Hassan, Shahzad; Khan, Gul Muhammad
2017-01-01
Infrastructure as a Service (IaaS) cloud provides resources as a service from a pool of compute, network, and storage resources. Cloud providers can manage their resource usage by knowing future usage demand from the current and past usage patterns of resources. Resource usage prediction is of great importance for dynamic scaling of cloud resources to achieve efficiency in terms of cost and energy consumption while keeping quality of service. The purpose of this paper is to present a real-time resource usage prediction system. The system takes real-time utilization of resources and feeds the utilization values into several buffers based on the type of resource and the time span size. The buffers are read by an R-based statistical system, which checks whether the buffered data follow a Gaussian distribution. If they do, Autoregressive Integrated Moving Average (ARIMA) is applied; otherwise an Autoregressive Neural Network (AR-NN) is applied. In the ARIMA process, a model is selected based on the minimum Akaike Information Criterion (AIC) value. Similarly, in the AR-NN process, the network with the lowest Network Information Criterion (NIC) value is selected. We have evaluated our system with real traces of CPU utilization of an IaaS cloud of one hundred and twenty servers.
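The selection logic described above can be sketched as follows. This is a stand-in for the paper's R pipeline, not its implementation: the Shapiro-Wilk normality test, the candidate ARIMA orders, and the lag count for the neural-net autoregression are assumptions; scikit-learn's MLPRegressor stands in for the AR-NN/NIC step.

    # Sketch of the buffer-based model selection described above: if the buffered
    # utilization looks Gaussian, fit ARIMA and keep the minimum-AIC order;
    # otherwise fit a neural-net autoregression on lagged samples.
    import numpy as np
    from scipy.stats import shapiro
    from statsmodels.tsa.arima.model import ARIMA
    from sklearn.neural_network import MLPRegressor

    def forecast_next(buffer_values, lags=5):
        y = np.asarray(buffer_values, dtype=float)
        if shapiro(y).pvalue > 0.05:                   # roughly Gaussian -> ARIMA
            candidates = [(p, d, q) for p in range(3) for d in range(2) for q in range(3)]
            fits = [ARIMA(y, order=o).fit() for o in candidates]
            best = min(fits, key=lambda f: f.aic)      # minimum-AIC model
            return float(best.forecast(1)[0])
        # Non-Gaussian -> autoregressive neural network on lagged samples
        X = np.array([y[i:i + lags] for i in range(len(y) - lags)])
        t = y[lags:]
        nn = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0).fit(X, t)
        return float(nn.predict(y[-lags:].reshape(1, -1))[0])

    if __name__ == "__main__":
        cpu = 50 + 10 * np.sin(np.arange(120) / 6.0) + np.random.randn(120)
        print("next CPU utilization estimate:", round(forecast_next(cpu), 2))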
Adaptive Resource Utilization Prediction System for Infrastructure as a Service Cloud
Hassan, Shahzad; Khan, Gul Muhammad
2017-01-01
Infrastructure as a Service (IaaS) cloud provides resources as a service from a pool of compute, network, and storage resources. Cloud providers can manage their resource usage by knowing future usage demand from the current and past usage patterns of resources. Resource usage prediction is of great importance for dynamic scaling of cloud resources to achieve efficiency in terms of cost and energy consumption while keeping quality of service. The purpose of this paper is to present a real-time resource usage prediction system. The system takes real-time utilization of resources and feeds the utilization values into several buffers based on the type of resource and the time span size. The buffers are read by an R-based statistical system, which checks whether the buffered data follow a Gaussian distribution. If they do, Autoregressive Integrated Moving Average (ARIMA) is applied; otherwise an Autoregressive Neural Network (AR-NN) is applied. In the ARIMA process, a model is selected based on the minimum Akaike Information Criterion (AIC) value. Similarly, in the AR-NN process, the network with the lowest Network Information Criterion (NIC) value is selected. We have evaluated our system with real traces of CPU utilization of an IaaS cloud of one hundred and twenty servers. PMID:28811819
NASA Astrophysics Data System (ADS)
Snyder, P. L.; Brown, V. W.
2017-12-01
IBM has created a general purpose, data-agnostic solution that provides high performance, low data latency, high availability, scalability, and persistent access to the captured data, regardless of source or type. This capability is hosted on commercially available cloud environments and uses much faster, more efficient, reliable, and secure data transfer protocols than the more typically used FTP. The design incorporates completely redundant data paths at every level, including at the cloud data center level, in order to provide the highest assurance of data availability to the data consumers. IBM has been successful in building and testing a Proof of Concept instance on our IBM Cloud platform to receive and disseminate actual GOES-16 data as it is being downlinked. This solution leverages the inherent benefits of a cloud infrastructure configured and tuned for continuous, stable, high-speed data dissemination to data consumers worldwide at the downlink rate. It also is designed to ingest data from multiple simultaneous sources and disseminate data to multiple consumers. Nearly linear scalability is achieved by adding servers and storage. The IBM Proof of Concept system has been tested with our partners to achieve in excess of 5 Gigabits/second over public internet infrastructure. In tests with live GOES-16 data, the system routinely achieved 2.5 Gigabits/second pass-through to The Weather Company from the University of Wisconsin-Madison SSEC. Simulated data was also transferred from the Cooperative Institute for Climate and Satellites-North Carolina to The Weather Company. The storage node allocated to our Proof of Concept system as tested was sized at 480 Terabytes of RAID protected disk as a worst case sizing to accommodate the data from four GOES-16 class satellites for 30 days in a circular buffer. This shows that an abundance of performance and capacity headroom exists in the IBM design that can be applied to additional missions.
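A quick back-of-the-envelope check of the 480 TB circular-buffer sizing quoted above. The per-satellite daily volume and sustained rate below are derived from the stated figures, not numbers given in the abstract, so treat them only as rough estimates.

    # Rough consistency check of the quoted sizing: 480 TB for four GOES-16-class
    # satellites over a 30-day circular buffer. The implied per-satellite volume
    # and sustained rate are derived estimates, not figures from the abstract.
    buffer_tb = 480.0
    satellites = 4
    days = 30

    tb_per_sat_per_day = buffer_tb / (satellites * days)          # = 4.0 TB/day
    gbit_per_s = tb_per_sat_per_day * 1e12 * 8 / 86400 / 1e9      # ~0.37 Gbit/s

    print(f"{tb_per_sat_per_day:.1f} TB/day per satellite")
    print(f"~{gbit_per_s:.2f} Gbit/s sustained per satellite")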
Bioinformatics and Microarray Data Analysis on the Cloud.
Calabrese, Barbara; Cannataro, Mario
2016-01-01
High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, on-demand anytime and anywhere access to resources and applications, and thus, it may represent the key technology for facing those issues. In fact, in recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in industry. Despite this, cloud computing presents several issues regarding the security and privacy of data, which are particularly important when analyzing patient data, such as in personalized medicine. This chapter reviews the main academic and industrial cloud-based bioinformatics solutions, with a special focus on microarray data analysis, and underlines the main issues and problems related to the use of such platforms for the storage and analysis of patient data.
Large-Scale, Parallel, Multi-Sensor Atmospheric Data Fusion Using Cloud Computing
NASA Astrophysics Data System (ADS)
Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E.
2013-05-01
NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration analyses of important climate variables presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over 10 years of data. To efficiently assemble such datasets, we are utilizing Elastic Computing in the Cloud and parallel map/reduce-based algorithms. However, these problems are Data Intensive computing so the data transfer times and storage costs (for caching) are key issues. SciReduce is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in the Cloud. Unlike Hadoop, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Figure 1 shows the architecture of the full computational system, with SciReduce at the core. Multi-year datasets are automatically "sharded" by time and space across a cluster of nodes so that years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the cached input and intermediate datasets. We are using SciReduce to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a NASA MEASURES grant. We will present the architecture of SciReduce, describe the achieved "clock time" speedups in fusing datasets on our own nodes and in the Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer. We will also present a concept/prototype for staging NASA's A-Train Atmospheric datasets (Levels 2 & 3) in the Amazon Cloud so that any number of compute jobs can be executed "near" the multi-sensor data. Given such a system, multi-sensor climate studies over 10-20 years of data could be performed in an efficient way, with the researcher paying only his own Cloud compute bill.; Figure 1 -- Architecture.
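The programming model described above, map/reduce over "bundles of named numeric arrays" sharded by time, can be illustrated with a toy sketch. This is not SciReduce itself; the shard layout, variable name, and mean-field reduction are illustrative assumptions.

    # Toy illustration of map/reduce over bundles of named numeric arrays (the
    # programming model described above), not the SciReduce implementation.
    # Each "bundle" is a dict of numpy arrays for one time shard; map computes a
    # per-shard partial, reduce merges partials into a climatology-style mean.
    import numpy as np

    def map_shard(bundle):
        wv = bundle["water_vapor"]
        return {"sum": wv.sum(axis=0), "count": wv.shape[0]}      # partial sums

    def reduce_partials(partials):
        total = sum(p["sum"] for p in partials)
        count = sum(p["count"] for p in partials)
        return total / count                                      # gridded mean

    if __name__ == "__main__":
        shards = [{"water_vapor": np.random.rand(30, 180, 360)} for _ in range(12)]
        mean_field = reduce_partials([map_shard(s) for s in shards])
        print("mean water vapor grid:", mean_field.shape)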
Large-Scale, Parallel, Multi-Sensor Atmospheric Data Fusion Using Cloud Computing
NASA Astrophysics Data System (ADS)
Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E. J.
2013-12-01
NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the 'A-Train' platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration analyses of important climate variables presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (MERRA), stratify the comparisons using a classification of the 'cloud scenes' from CloudSat, and repeat the entire analysis over 10 years of data. To efficiently assemble such datasets, we are utilizing Elastic Computing in the Cloud and parallel map/reduce-based algorithms. However, these problems are Data Intensive computing so the data transfer times and storage costs (for caching) are key issues. SciReduce is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in the Cloud. Unlike Hadoop, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Figure 1 shows the architecture of the full computational system, with SciReduce at the core. Multi-year datasets are automatically 'sharded' by time and space across a cluster of nodes so that years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the cached input and intermediate datasets. We are using SciReduce to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a NASA MEASURES grant. We will present the architecture of SciReduce, describe the achieved 'clock time' speedups in fusing datasets on our own compute nodes and in the public Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer. We will also present a concept/prototype for staging NASA's A-Train Atmospheric datasets (Levels 2 & 3) in the Amazon Cloud so that any number of compute jobs can be executed 'near' the multi-sensor data. Given such a system, multi-sensor climate studies over 10-20 years of data could be performed in an efficient way, with the researcher paying only his own Cloud compute bill. SciReduce Architecture
Cloud computing applications for biomedical science: A perspective.
Navale, Vivek; Bourne, Philip E
2018-06-01
Biomedical research has become a digital data-intensive endeavor, relying on secure and scalable computing, storage, and network infrastructure, which has traditionally been purchased, supported, and maintained locally. For certain types of biomedical applications, cloud computing has emerged as an alternative to locally maintained traditional computing approaches. Cloud computing offers users pay-as-you-go access to services such as hardware infrastructure, platforms, and software for solving common biomedical computational problems. Cloud computing services offer secure on-demand storage and analysis and are differentiated from traditional high-performance computing by their rapid availability and scalability of services. As such, cloud services are engineered to address big data problems and enhance the likelihood of data and analytics sharing, reproducibility, and reuse. Here, we provide an introductory perspective on cloud computing to help the reader determine its value to their own research.
Cloud computing applications for biomedical science: A perspective
2018-01-01
Biomedical research has become a digital data–intensive endeavor, relying on secure and scalable computing, storage, and network infrastructure, which has traditionally been purchased, supported, and maintained locally. For certain types of biomedical applications, cloud computing has emerged as an alternative to locally maintained traditional computing approaches. Cloud computing offers users pay-as-you-go access to services such as hardware infrastructure, platforms, and software for solving common biomedical computational problems. Cloud computing services offer secure on-demand storage and analysis and are differentiated from traditional high-performance computing by their rapid availability and scalability of services. As such, cloud services are engineered to address big data problems and enhance the likelihood of data and analytics sharing, reproducibility, and reuse. Here, we provide an introductory perspective on cloud computing to help the reader determine its value to their own research. PMID:29902176
NASA Astrophysics Data System (ADS)
Farroha, Bassam S.; Farroha, Deborah L.
2011-06-01
The new corporate approach to efficient processing and storage is migrating from in-house service-center services to the newly coined approach of Cloud Computing. This approach advocates thin clients and providing services by the service provider over time-shared resources. The concept is not new; however, the implementation approach presents a strategic shift in the way organizations provision and manage their IT resources. The requirements on some of the data sets targeted to be run on the cloud vary depending on the data type, originator, user, and confidentiality level. Additionally, the systems that fuse such data would have to deal with classifying the product and clearing the computing resources prior to allowing a new application to be executed. This indicates that we could end up with a multi-level security system that needs to follow specific rules and send its output to protected networks and systems in order to avoid data spills or contaminated resources. The paper discusses these requirements and their potential impact on cloud architecture. Additionally, the paper discusses the unexpected advantages of the cloud framework in providing a sophisticated environment for information sharing and data mining.
Investigation into Cloud Computing for More Robust Automated Bulk Image Geoprocessing
NASA Technical Reports Server (NTRS)
Brown, Richard B.; Smoot, James C.; Underwood, Lauren; Armstrong, C. Duane
2012-01-01
Geospatial resource assessments frequently require timely geospatial data processing that involves large multivariate remote sensing data sets. In particular, for disasters, response requires rapid access to large data volumes, substantial storage space and high performance processing capability. The processing and distribution of this data into usable information products requires a processing pipeline that can efficiently manage the required storage, computing utilities, and data handling requirements. In recent years, with the availability of cloud computing technology, cloud processing platforms have made available a powerful new computing infrastructure resource that can meet this need. To assess the utility of this resource, this project investigates cloud computing platforms for bulk, automated geoprocessing capabilities with respect to data handling and application development requirements. This presentation is of work being conducted by the Applied Sciences Program Office at NASA-Stennis Space Center. A prototypical set of image manipulation and transformation processes that incorporates sample Unmanned Airborne System data was developed to create value-added products and tested for implementation on the "cloud". This project outlines the steps involved in creating and testing open-source process code on a local prototype platform, and then transitioning this code, with its associated environment requirements, to an analogous but memory- and processor-enhanced cloud platform. A data processing cloud was used to store both standard digital camera panchromatic and multi-band image data, which were subsequently subjected to standard image processing functions such as NDVI (Normalized Difference Vegetation Index), NDMI (Normalized Difference Moisture Index), band stacking, reprojection, and other similar data processes. Cloud infrastructure service providers were evaluated by taking these locally tested processing functions, and then applying them to a given cloud-enabled infrastructure to assess and compare environment setup options and enabled technologies. This project reviews findings that were observed when cloud platforms were evaluated for bulk geoprocessing capabilities based on data handling and application development requirements.
Dynamic access control model for privacy preserving personalized healthcare in cloud environment.
Son, Jiseong; Kim, Jeong-Dong; Na, Hong-Seok; Baik, Doo-Kwon
2015-01-01
When sharing and storing healthcare data in a cloud environment, access control is a central issue for preserving data privacy, as a patient's personal health data may be accessed without permission by many stakeholders. Specifically, dynamic authorization for access to data is required because personal health data is stored in cloud storage via wearable devices. Therefore, we propose a dynamic access control model for preserving the privacy of personal healthcare data in a cloud environment. The proposed model considers context information for dynamic access. According to the proposed model, access control can be dynamically determined by changing the context information; this means that even for a subject with the same role in the cloud, access permission is defined differently depending on the context information and access condition. Furthermore, we experimentally evaluate the ability of the proposed model to provide correct responses by representing dynamic access decisions with real-life personalized healthcare system scenarios.
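A minimal sketch of a context-dependent access decision in the spirit of the model above: the same role receives a different decision as the context changes. The roles, context attributes, and rules shown are illustrative assumptions, not the paper's policy definitions.

    # Minimal sketch of a dynamic, context-aware access decision (illustrative
    # roles, context attributes and rules; not the paper's actual policy model).
    from dataclasses import dataclass

    @dataclass
    class Context:
        role: str          # e.g. "doctor", "insurer"
        location: str      # e.g. "hospital", "external"
        emergency: bool    # access condition that can change at runtime

    def can_read_health_record(ctx: Context) -> bool:
        # Same role, different decision depending on the current context.
        if ctx.role == "doctor":
            return ctx.location == "hospital" or ctx.emergency
        if ctx.role == "insurer":
            return False                  # never gets raw personal health data
        return False

    if __name__ == "__main__":
        print(can_read_health_record(Context("doctor", "external", emergency=False)))  # False
        print(can_read_health_record(Context("doctor", "external", emergency=True)))   # True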
Large-Scale, Multi-Sensor Atmospheric Data Fusion Using Hybrid Cloud Computing
NASA Astrophysics Data System (ADS)
Wilson, Brian; Manipon, Gerald; Hua, Hook; Fetzer, Eric
2014-05-01
NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration analyses of important climate variables presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over 10 years of data. To efficiently assemble such datasets, we are utilizing Elastic Computing in the Cloud and parallel map-reduce-based algorithms. However, these problems involve data-intensive computing, so the data transfer times and storage costs (for caching) are key issues. SciReduce is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in a hybrid Cloud (private eucalyptus & public Amazon). Unlike Hadoop, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Multi-year datasets are automatically "sharded" by time and space across a cluster of nodes so that years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the cached input and intermediate datasets. We are using SciReduce to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a NASA MEASURES grant. We will present the architecture of SciReduce, describe the achieved "clock time" speedups in fusing datasets on our own nodes and in the Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer. We will also present a concept and prototype for staging NASA's A-Train Atmospheric datasets (Levels 2 & 3) in the Amazon Cloud so that any number of compute jobs can be executed "near" the multi-sensor data. Given such a system, multi-sensor climate studies over 10-20 years of data could be performed in an efficient way, with the researcher paying only his own Cloud compute bill.
Sector and Sphere: the design and implementation of a high-performance data cloud
Gu, Yunhong; Grossman, Robert L.
2009-01-01
Cloud computing has demonstrated that processing very large datasets over commodity clusters can be done simply, given the right programming model and infrastructure. In this paper, we describe the design and implementation of the Sector storage cloud and the Sphere compute cloud. By contrast with the existing storage and compute clouds, Sector can manage data not only within a data centre, but also across geographically distributed data centres. Similarly, the Sphere compute cloud supports user-defined functions (UDFs) over data both within and across data centres. As a special case, MapReduce-style programming can be implemented in Sphere by using a Map UDF followed by a Reduce UDF. We describe some experimental studies comparing Sector/Sphere and Hadoop using the Terasort benchmark. In these studies, Sector is approximately twice as fast as Hadoop. Sector/Sphere is open source. PMID:19451100
Kravets, Ilia; Stawski, Nina; Hillson, Nathan J.; Yarmush, Martin L.; Marks, Robert S.; Konry, Tania
2014-01-01
We report an all-in-one platform – ScanDrop – for the rapid and specific capture, detection, and identification of bacteria in drinking water. The ScanDrop platform integrates droplet microfluidics, a portable imaging system, and cloud-based control software and data storage. The cloud-based control software and data storage enables robotic image acquisition, remote image processing, and rapid data sharing. These features form a “cloud” network for water quality monitoring. We have demonstrated the capability of ScanDrop to perform water quality monitoring via the detection of an indicator coliform bacterium, Escherichia coli, in drinking water contaminated with feces. Magnetic beads conjugated with antibodies to E. coli antigen were used to selectively capture and isolate specific bacteria from water samples. The bead-captured bacteria were co-encapsulated in pico-liter droplets with fluorescently-labeled anti-E. coli antibodies, and imaged with an automated custom designed fluorescence microscope. The entire water quality diagnostic process required 8 hours from sample collection to online-accessible results compared with 2–4 days for other currently available standard detection methods. PMID:24475107
Efficient operating system level virtualization techniques for cloud resources
NASA Astrophysics Data System (ADS)
Ansu, R.; Samiksha; Anju, S.; Singh, K. John
2017-11-01
Cloud computing is an advancing technology which provides the services of Infrastructure, Platform and Software. Virtualization and utility computing are the keys to cloud computing. The number of cloud users is increasing day by day, so it is essential to make resources available on demand to satisfy user requirements. The technique by which resources, namely storage, processing power, memory, and network or I/O, are abstracted is known as virtualization. Various virtualization techniques are available for executing operating systems: full system virtualization and paravirtualization. In full virtualization, the whole hardware architecture is duplicated virtually, and no modifications are required in the guest OS as the OS deals with the VM hypervisor directly. In paravirtualization, modification of the OS is required so that it can run in parallel with other OSes; for the guest OS to access the hardware, the host OS must provide a Virtual Machine Interface. OS virtualization has many advantages, such as transparent application migration, server consolidation, online OS maintenance, and improved security. This paper outlines both virtualization techniques and discusses the issues in OS-level virtualization.
NASA Astrophysics Data System (ADS)
Ham, J. M.
2016-12-01
New microprocessor boards, open-source sensors, and cloud infrastructure developed for the Internet of Things (IoT) can be used to create low-cost monitoring systems for environmental research. This project describes two applications in soil science and hydrology: 1) remote monitoring of the soil temperature regime near oil and gas operations to detect the thermal signature associated with the natural source zone degradation of hydrocarbon contaminants in the vadose zone, and 2) remote monitoring of soil water content near the surface as part of a global citizen science network. In both cases, prototype data collection systems were built around the cellular (2G/3G) "Electron" microcontroller (www.particle.io). This device allows connectivity to the cloud using a low-cost global SIM and data plan. The systems have cellular connectivity in over 100 countries and data can be logged to the cloud for storage. Users can view data in real time over any internet connection or via their smart phone. For both projects, data logging, storage, and visualization were done using IoT services like ThingSpeak (thingspeak.com). The soil thermal monitoring system was tested on experimental plots in Colorado, USA, to evaluate the accuracy and reliability of different temperature sensors and 3D-printed housings. The soil water experiment included comparing open-source capacitance-based sensors to commercial versions. Results demonstrate the power of leveraging IoT technology for field research.
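The logging path described above (microcontroller samples pushed to an IoT service) can be emulated from Python. The sketch posts one sample to ThingSpeak's public update endpoint; the channel write key and the field-to-variable mapping are placeholders, not the project's actual channel configuration.

    # Sketch of logging one sample to a ThingSpeak channel, mirroring the cloud
    # logging path described above (the microcontroller normally does this over
    # cellular). The write API key and field assignments are placeholders.
    import requests

    THINGSPEAK_UPDATE = "https://api.thingspeak.com/update"
    WRITE_API_KEY = "YOUR_CHANNEL_WRITE_KEY"   # placeholder

    def log_sample(soil_temp_c, soil_vwc):
        resp = requests.get(THINGSPEAK_UPDATE, params={
            "api_key": WRITE_API_KEY,
            "field1": soil_temp_c,    # soil temperature, deg C
            "field2": soil_vwc,       # volumetric water content
        }, timeout=10)
        resp.raise_for_status()
        return int(resp.text)         # ThingSpeak returns the new entry id (0 = rejected)

    if __name__ == "__main__":
        print("entry id:", log_sample(18.4, 0.23))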
Expeditionary Oblong Mezzanine
2016-03-01
...providing infrastructure as a service (IaaS) and software as a service (SaaS) cloud computing technologies. IaaS is a way of providing computing services...such as servers, storage, and network equipment services (Mell & Grance, 2009). SaaS is a means of providing software and applications as an on
A PACS archive architecture supported on cloud services.
Silva, Luís A Bastião; Costa, Carlos; Oliveira, José Luis
2012-05-01
Diagnostic imaging procedures have continuously increased over the last decade and this trend may continue in the coming years, creating a great impact on the storage and retrieval capabilities of current PACS. Moreover, many smaller centers do not have the financial resources or requirements that justify the acquisition of a traditional infrastructure. Alternative solutions, such as cloud computing, may help address this emerging need. A tremendous amount of ubiquitous computational power, such as that provided by Google and Amazon, is used every day as a normal commodity. Taking advantage of this new paradigm, an architecture for a Cloud-based PACS archive that provides data privacy, integrity, and availability is proposed. The solution is independent of the cloud provider, and the core modules were successfully instantiated in examples of two cloud computing providers. Operational metrics for several medical imaging modalities were tabulated and compared for Google Storage, Amazon S3, and LAN PACS. A PACS-as-a-Service archive that provides storage of medical studies using the Cloud was developed. The results show that the solution is robust and that it is possible to store, query, and retrieve all desired studies in a similar way as in a local PACS approach. Cloud computing is an emerging solution that promises high scalability of infrastructures, software, and applications, according to a "pay-as-you-go" business model. The presented architecture uses the cloud to set up medical data repositories and can have a significant impact on healthcare institutions by reducing IT infrastructures.
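A hedged sketch of the cloud back-end idea above: pushing one anonymised DICOM object to an S3-compatible bucket and pulling it back. The bucket name and key layout are assumptions, and this is not the paper's actual archive modules; credentials come from the standard boto3 configuration.

    # Sketch of storing one anonymised DICOM object in an S3-compatible bucket,
    # the kind of cloud back end discussed above (not the paper's modules).
    import boto3

    s3 = boto3.client("s3")

    def archive_study_file(path, study_uid, sop_uid, bucket="pacs-archive-demo"):
        key = f"studies/{study_uid}/{sop_uid}.dcm"     # hypothetical key layout
        s3.upload_file(path, bucket, key)              # copy the object into the cloud
        return key

    def retrieve_study_file(key, dest, bucket="pacs-archive-demo"):
        s3.download_file(bucket, key, dest)

    if __name__ == "__main__":
        k = archive_study_file("anon_image.dcm", "1.2.840.999.1", "1.2.840.999.1.1")
        retrieve_study_file(k, "restored.dcm")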
The JASMIN Cloud: specialised and hybrid to meet the needs of the Environmental Sciences Community
NASA Astrophysics Data System (ADS)
Kershaw, Philip; Lawrence, Bryan; Churchill, Jonathan; Pritchard, Matt
2014-05-01
Cloud computing provides enormous opportunities for the research community. The large public cloud providers provide near-limitless scaling capability. However, adapting Cloud to scientific workloads is not without its problems. The commodity nature of the public cloud infrastructure can be at odds with the specialist requirements of the research community. Issues such as trust, ownership of data, WAN bandwidth and costing models make additional barriers to more widespread adoption. Alongside the application of public cloud for scientific applications, a number of private cloud initiatives are underway in the research community of which the JASMIN Cloud is one example. Here, cloud service models are being effectively super-imposed over more established services such as data centres, compute cluster facilities and Grids. These have the potential to deliver the specialist infrastructure needed for the science community coupled with the benefits of a Cloud service model. The JASMIN facility based at the Rutherford Appleton Laboratory was established in 2012 to support the data analysis requirements of the climate and Earth Observation community. In its first year of operation, the 5PB of available storage capacity was filled and the hosted compute capability used extensively. JASMIN has modelled the concept of a centralised large-volume data analysis facility. Key characteristics have enabled success: peta-scale fast disk connected via low latency networks to compute resources and the use of virtualisation for effective management of the resources for a range of users. A second phase is now underway funded through NERC's (Natural Environment Research Council) Big Data initiative. This will see significant expansion to the resources available with a doubling of disk-based storage to 12PB and an increase of compute capacity by a factor of ten to over 3000 processing cores. This expansion is accompanied by a broadening in the scope for JASMIN, as a service available to the entire UK environmental science community. Experience with the first phase demonstrated the range of user needs. A trade-off is needed between access privileges to resources, flexibility of use and security. This has influenced the form and types of service under development for the new phase. JASMIN will deploy a specialised private cloud organised into "Managed" and "Unmanaged" components. In the Managed Cloud, users have direct access to the storage and compute resources for optimal performance but for reasons of security, via a more restrictive PaaS (Platform-as-a-Service) interface. The Unmanaged Cloud is deployed in an isolated part of the network but co-located with the rest of the infrastructure. This enables greater liberty to tenants - full IaaS (Infrastructure-as-a-Service) capability to provision customised infrastructure - whilst at the same time protecting more sensitive parts of the system from direct access using these elevated privileges. The private cloud will be augmented with cloud-bursting capability so that it can exploit the resources available from public clouds, making it effectively a hybrid solution. A single interface will overlay the functionality of both the private cloud and external interfaces to public cloud providers giving users the flexibility to migrate resources between infrastructures as requirements dictate.
A Hadoop-based Molecular Docking System
NASA Astrophysics Data System (ADS)
Dong, Yueli; Guo, Quan; Sun, Bin
2017-10-01
Molecular docking always faces the challenge of managing tens of TB datasets. It is necessary to improve the efficiency of the storage and docking. We proposed the molecular docking platform based on Hadoop for virtual screening, it provides the preprocessing of ligand datasets and the analysis function of the docking results. A molecular cloud database that supports mass data management is constructed. Through this platform, the docking time is reduced, the data storage is efficient, and the management of the ligand datasets is convenient.
Rethinking key–value store for parallel I/O optimization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kougkas, Anthony; Eslami, Hassan; Sun, Xian-He
2015-01-26
Key-value stores are being widely used as the storage system for large-scale internet services and cloud storage systems. However, they are rarely used in HPC systems, where parallel file systems are the dominant storage solution. In this study, we examine the architecture differences and performance characteristics of parallel file systems and key-value stores. We propose using key-value stores to optimize overall Input/Output (I/O) performance, especially for workloads that parallel file systems cannot handle well, such as the cases with intense data synchronization or heavy metadata operations. We conducted experiments with several synthetic benchmarks, an I/O benchmark, and a real application. We modeled the performance of these two systems using collected data from our experiments, and we provide a predictive method to identify which system offers better I/O performance given a specific workload. The results show that we can optimize the I/O performance in HPC systems by utilizing key-value stores.
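The workload contrast drawn above (parallel file systems struggling with many small, metadata-heavy writes vs. a key-value store absorbing them into one container) can be sketched as follows. Python's built-in dbm stands in for a real key-value back end; it is not one of the systems evaluated in the paper.

    # Illustration of the workload contrast discussed above: many small records
    # written as individual files versus the same records written into a single
    # key-value store (one container, far fewer metadata operations).
    import dbm
    import os
    import tempfile

    records = {f"rank{i}:step{j}": b"x" * 256 for i in range(16) for j in range(64)}

    # One-file-per-record: every record costs a file create/open metadata operation.
    file_dir = tempfile.mkdtemp()
    for key, value in records.items():
        with open(os.path.join(file_dir, key.replace(":", "_")), "wb") as f:
            f.write(value)

    # Key-value store: the same records share one container.
    kv_path = os.path.join(tempfile.mkdtemp(), "store")
    with dbm.open(kv_path, "c") as db:
        for key, value in records.items():
            db[key] = value

    print(len(os.listdir(file_dir)), "small files vs one key-value store")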
A Cost-Benefit Study of Doing Astrophysics On The Cloud: Production of Image Mosaics
NASA Astrophysics Data System (ADS)
Berriman, G. B.; Good, J. C.; Deelman, E.; Singh, G.; Livny, M.
2009-09-01
Utility grids such as the Amazon EC2 and Amazon S3 clouds offer computational and storage resources that can be used on-demand for a fee by compute- and data-intensive applications. The cost of running an application on such a cloud depends on the compute, storage and communication resources it will provision and consume. Different execution plans of the same application may result in significantly different costs. We studied via simulation the cost performance trade-offs of different execution and resource provisioning plans by creating, under the Amazon cloud fee structure, mosaics with the Montage image mosaic engine, a widely used data- and compute-intensive application. Specifically, we studied the cost of building mosaics of 2MASS data that have sizes of 1, 2 and 4 square degrees, and a 2MASS all-sky mosaic. These are examples of mosaics commonly generated by astronomers. We also study these trade-offs in the context of the storage and communication fees of Amazon S3 when used for long-term application data archiving. Our results show that by provisioning the right amount of storage and compute resources cost can be significantly reduced with no significant impact on application performance.
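The trade-off studied above reduces to a simple cost model over provisioned compute, storage, and data transfer, evaluated per execution plan. A minimal sketch follows; the unit rates and the two example plans are placeholders, not Amazon's actual fee schedule or the paper's measured runs.

    # Simple cost model of the kind explored above: total cost of a mosaic run is
    # compute hours plus storage and data-transfer charges (placeholder rates).
    def run_cost(cpu_hours, storage_gb_months, transfer_out_gb,
                 cpu_rate=0.10, storage_rate=0.10, transfer_rate=0.15):
        return (cpu_hours * cpu_rate
                + storage_gb_months * storage_rate
                + transfer_out_gb * transfer_rate)

    # Two hypothetical execution plans for the same mosaic: more parallel nodes
    # finish faster but may stage more intermediate data.
    plan_a = run_cost(cpu_hours=8,  storage_gb_months=2.0, transfer_out_gb=5.0)
    plan_b = run_cost(cpu_hours=10, storage_gb_months=0.5, transfer_out_gb=5.0)
    print(f"plan A: ${plan_a:.2f}  plan B: ${plan_b:.2f}")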
Cloud-based NEXRAD Data Processing and Analysis for Hydrologic Applications
NASA Astrophysics Data System (ADS)
Seo, B. C.; Demir, I.; Keem, M.; Goska, R.; Weber, J.; Krajewski, W. F.
2016-12-01
The real-time and full historical archive of NEXRAD Level II data, covering the entire United States from 1991 to present, recently became available on Amazon cloud S3. This provides a new opportunity to rebuild the Hydro-NEXRAD software system that enabled users to access vast amounts of NEXRAD radar data in support of a wide range of research. The system processes basic radar data (Level II) and delivers radar-rainfall products based on the user's custom selection of features such as space and time domain, river basin, rainfall product space and time resolution, and rainfall estimation algorithms. The cloud-based new system can eliminate prior challenges faced by Hydro-NEXRAD data acquisition and processing: (1) temporal and spatial limitation arising from the limited data storage; (2) archive (past) data ingestion and format conversion; and (3) separate data processing flow for the past and real-time Level II data. To enhance massive data processing and computational efficiency, the new system is implemented and tested for the Iowa domain. This pilot study begins by ingesting rainfall metadata and implementing Hydro-NEXRAD capabilities on the cloud using the new polarimetric features, as well as the existing algorithm modules and scripts. The authors address the reliability and feasibility of cloud computation and processing, followed by an assessment of response times from an interactive web-based system.
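A minimal sketch of pulling Level II volumes for one radar and day from the public Amazon S3 archive mentioned above. The bucket name and key layout follow the publicly documented NOAA archive but should be treated as assumptions here, and the radar site (KDVN, in the Iowa domain) is only an example.

    # Minimal sketch of reading archived NEXRAD Level II volumes from the public
    # Amazon S3 bucket referenced above (bucket name/key layout assumed).
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))  # public bucket
    BUCKET = "noaa-nexrad-level2"

    def list_volumes(radar="KDVN", year=2016, month=6, day=15):
        prefix = f"{year:04d}/{month:02d}/{day:02d}/{radar}/"
        resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=prefix)
        return [obj["Key"] for obj in resp.get("Contents", [])]

    if __name__ == "__main__":
        keys = list_volumes()
        print(len(keys), "Level II volumes found")
        if keys:
            s3.download_file(BUCKET, keys[0], "volume.ar2v")   # fetch one for processing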
A keyword searchable attribute-based encryption scheme with attribute update for cloud storage.
Wang, Shangping; Ye, Jian; Zhang, Yaling
2018-01-01
Ciphertext-policy attribute-based encryption (CP-ABE) scheme is a new type of data encryption primitive, which is very suitable for data cloud storage for its fine-grained access control. Keyword-based searchable encryption schemes enable users to quickly find interesting data stored in the cloud server without revealing any information about the searched keywords. In this work, we provide a keyword searchable attribute-based encryption scheme with attribute update for cloud storage, which is a combination of an attribute-based encryption scheme and a keyword searchable encryption scheme. The new scheme supports user attribute updates; in particular, when a user's attribute needs to be updated, only that user's secret key related to the attribute needs to be updated, while other users' secret keys and the ciphertexts related to this attribute need not be updated, with the help of the cloud server. In addition, we outsource the operations with high computation cost to the cloud server to reduce the user's computational burden. Moreover, our scheme is proven to be semantically secure against chosen ciphertext-policy and chosen-plaintext attacks in the general bilinear group model. Our scheme is also proven to be semantically secure against chosen-keyword attacks under the bilinear Diffie-Hellman (BDH) assumption.
A keyword searchable attribute-based encryption scheme with attribute update for cloud storage
Wang, Shangping; Zhang, Yaling
2018-01-01
Ciphertext-policy attribute-based encryption (CP-ABE) scheme is a new type of data encryption primitive, which is very suitable for data cloud storage for its fine-grained access control. Keyword-based searchable encryption schemes enable users to quickly find interesting data stored in the cloud server without revealing any information about the searched keywords. In this work, we provide a keyword searchable attribute-based encryption scheme with attribute update for cloud storage, which is a combination of an attribute-based encryption scheme and a keyword searchable encryption scheme. The new scheme supports user attribute updates; in particular, when a user's attribute needs to be updated, only that user's secret key related to the attribute needs to be updated, while other users' secret keys and the ciphertexts related to this attribute need not be updated, with the help of the cloud server. In addition, we outsource the operations with high computation cost to the cloud server to reduce the user's computational burden. Moreover, our scheme is proven to be semantically secure against chosen ciphertext-policy and chosen-plaintext attacks in the general bilinear group model. Our scheme is also proven to be semantically secure against chosen-keyword attacks under the bilinear Diffie-Hellman (BDH) assumption. PMID:29795577
GTZ: a fast compression and cloud transmission tool optimized for FASTQ files.
Xing, Yuting; Li, Gen; Wang, Zhenguo; Feng, Bolun; Song, Zhuo; Wu, Chengkun
2017-12-28
The dramatic development of DNA sequencing technology is generating real big data, craving more storage and bandwidth. To speed up data sharing and bring data to computing resources faster and more cheaply, it is necessary to develop a compression tool that can support efficient compression and transmission of sequencing data onto cloud storage. This paper presents GTZ, a compression and transmission tool optimized for FASTQ files. As a reference-free lossless FASTQ compressor, GTZ treats different lines of FASTQ separately, utilizes adaptive context modelling to estimate their characteristic probabilities, and compresses data blocks with arithmetic coding. GTZ can also be used to compress multiple files or directories at once. Furthermore, as a tool to be used in the cloud computing era, it is capable of saving compressed data locally or transmitting data directly into the cloud by choice. We evaluated the performance of GTZ on some diverse FASTQ benchmarks. Results show that in most cases, it outperforms many other tools in terms of compression ratio, speed and stability. GTZ is a tool that enables efficient lossless FASTQ data compression and simultaneous data transmission onto the cloud. It emerges as a useful tool for NGS data storage and transmission in the cloud environment. GTZ is freely available online at: https://github.com/Genetalks/gtz .
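The core idea stated above (treat the different FASTQ line types as separate streams before entropy coding) can be sketched as follows. zlib stands in for GTZ's adaptive context model plus arithmetic coder, so the layout is illustrative only and the ratios will not match those reported; the input path is a placeholder.

    # Sketch of the stream-separation idea described above: FASTQ identifier,
    # sequence and quality lines are compressed as separate streams. zlib stands
    # in for GTZ's context modelling and arithmetic coding.
    import zlib

    def compress_fastq(path):
        ids, seqs, quals = [], [], []
        with open(path) as fh:
            while True:
                header = fh.readline()
                if not header:
                    break
                ids.append(header)
                seqs.append(fh.readline())
                fh.readline()                  # '+' separator line, regenerable
                quals.append(fh.readline())
        streams = ("".join(ids), "".join(seqs), "".join(quals))
        return [zlib.compress(s.encode(), level=9) for s in streams]

    if __name__ == "__main__":
        blocks = compress_fastq("reads.fastq")
        print("compressed stream sizes:", [len(b) for b in blocks])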
Cloud-based crowd sensing: a framework for location-based crowd analyzer and advisor
NASA Astrophysics Data System (ADS)
Aishwarya, K. C.; Nambi, A.; Hudson, S.; Nadesh, R. K.
2017-11-01
Cloud computing is an emerging field of computer science that integrates large and powerful computing and storage systems for personal as well as enterprise requirements. Mobile cloud computing extends this concept to mobile handheld devices. Crowdsensing, or more precisely mobile crowdsensing, is the process of sharing resources such as data, memory and bandwidth from an available group of mobile handheld devices to perform a single task for collective benefit. In this paper, we propose a framework that uses crowdsensing to build a location-based crowd analyzer and advisor that advises the user on whether to go to a place. This is ongoing research on a new concept towards which cloud computing is shifting, and it is open to further expansion in the near future.
Understanding the Performance and Potential of Cloud Computing for Scientific Applications
Sadooghi, Iman; Martin, Jesus Hernandez; Li, Tonglin; ...
2015-02-19
Commercial clouds bring a great opportunity to the scientific computing area. Scientific applications usually require significant resources; however, not all scientists have access to sufficient high-end computing systems, many of which can be found in the Top500 list. Cloud Computing has gained the attention of scientists as a competitive resource to run HPC applications at a potentially lower cost. But as a different infrastructure, it is unclear whether clouds are capable of running scientific applications with a reasonable performance per money spent. This work studies the performance of public clouds and places this performance in context to price. We evaluate the raw performance of different services of the AWS cloud in terms of the basic resources, such as compute, memory, network and I/O. We also evaluate the performance of scientific applications running in the cloud. This paper aims to assess the ability of the cloud to perform well, as well as to evaluate the cost of the cloud running scientific applications. We developed a full set of metrics and conducted a comprehensive performance evaluation over the Amazon cloud. We evaluated EC2, S3, EBS and DynamoDB among the many Amazon AWS services. We evaluated the memory sub-system performance with CacheBench, the network performance with iperf, processor and network performance with the HPL benchmark application, and shared storage with NFS and PVFS in addition to S3. We also evaluated a real scientific computing application through the Swift parallel scripting system at scale. Armed with both detailed benchmarks to gauge expected performance and a detailed monetary cost analysis, we expect this paper will be a recipe cookbook for scientists to help them decide where to deploy and run their scientific applications between public clouds, private clouds, or hybrid clouds.
Understanding the Performance and Potential of Cloud Computing for Scientific Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sadooghi, Iman; Martin, Jesus Hernandez; Li, Tonglin
Commercial clouds bring a great opportunity to the scientific computing area. Scientific applications usually require significant resources; however, not all scientists have access to sufficient high-end computing systems, many of which can be found in the Top500 list. Cloud Computing has gained the attention of scientists as a competitive resource to run HPC applications at a potentially lower cost. But as a different infrastructure, it is unclear whether clouds are capable of running scientific applications with a reasonable performance per money spent. This work studies the performance of public clouds and places this performance in context to price. We evaluate the raw performance of different services of the AWS cloud in terms of the basic resources, such as compute, memory, network and I/O. We also evaluate the performance of scientific applications running in the cloud. This paper aims to assess the ability of the cloud to perform well, as well as to evaluate the cost of the cloud running scientific applications. We developed a full set of metrics and conducted a comprehensive performance evaluation over the Amazon cloud. We evaluated EC2, S3, EBS and DynamoDB among the many Amazon AWS services. We evaluated the memory sub-system performance with CacheBench, the network performance with iperf, processor and network performance with the HPL benchmark application, and shared storage with NFS and PVFS in addition to S3. We also evaluated a real scientific computing application through the Swift parallel scripting system at scale. Armed with both detailed benchmarks to gauge expected performance and a detailed monetary cost analysis, we expect this paper will be a recipe cookbook for scientists to help them decide where to deploy and run their scientific applications between public clouds, private clouds, or hybrid clouds.
GIFT-Cloud: A data sharing and collaboration platform for medical imaging research.
Doel, Tom; Shakir, Dzhoshkun I; Pratt, Rosalind; Aertsen, Michael; Moggridge, James; Bellon, Erwin; David, Anna L; Deprest, Jan; Vercauteren, Tom; Ourselin, Sébastien
2017-02-01
Clinical imaging data are essential for developing research software for computer-aided diagnosis, treatment planning and image-guided surgery, yet existing systems are poorly suited for data sharing between healthcare and academia: research systems rarely provide an integrated approach for data exchange with clinicians; hospital systems are focused towards clinical patient care with limited access for external researchers; and safe haven environments are not well suited to algorithm development. We have established GIFT-Cloud, a data and medical image sharing platform, to meet the needs of GIFT-Surg, an international research collaboration that is developing novel imaging methods for fetal surgery. GIFT-Cloud also has general applicability to other areas of imaging research. GIFT-Cloud builds upon well-established cross-platform technologies. The Server provides secure anonymised data storage, direct web-based data access and a REST API for integrating external software. The Uploader provides automated on-site anonymisation, encryption and data upload. Gateways provide a seamless process for uploading medical data from clinical systems to the research server. GIFT-Cloud has been implemented in a multi-centre study for fetal medicine research. We present a case study of placental segmentation for pre-operative surgical planning, showing how GIFT-Cloud underpins the research and integrates with the clinical workflow. GIFT-Cloud simplifies the transfer of imaging data from clinical to research institutions, facilitating the development and validation of medical research software and the sharing of results back to the clinical partners. GIFT-Cloud supports collaboration between multiple healthcare and research institutions while satisfying the demands of patient confidentiality, data security and data ownership. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Platform for High-Assurance Cloud Computing
2016-06-01
to create today’s standard cloud computing applications and services. Additionally, our SuperCloud (a related but distinct project under the same MRC funding) reduces vendor lock-in and permits application to migrate, to follow...managing key-value storage with strong assurance properties. This first accomplishment allows us to climb the cloud technical stack, by offering
Above the cloud computing orbital services distributed data model
NASA Astrophysics Data System (ADS)
Straub, Jeremy
2014-05-01
Technology miniaturization and system architecture advancements have created an opportunity to significantly lower the cost of many types of space missions by sharing capabilities between multiple spacecraft. Historically, most spacecraft have been atomic entities that (aside from their communications with and tasking by ground controllers) operate in isolation. Several notable examples exist; however, these are purpose-designed systems that collaborate to perform a single goal. The above the cloud computing (ATCC) concept aims to create ad-hoc collaboration between service provider and consumer craft. Consumer craft can procure processing, data transmission, storage, imaging and other capabilities from provider craft. Because of onboard storage limitations, communications link capability limitations and limited windows of communication, data relevant to or required for various operations may span multiple craft. This paper presents a model for the identification, storage and accessing of this data. This model includes appropriate identification features for this highly distributed environment. It also deals with business model constraints such as data ownership, retention and the rights of the storing craft to access, resell, transmit or discard the data in its possession. The model ensures data integrity and confidentiality (to the extent applicable to a given data item), deals with unique constraints of the orbital environment and tags data with business model (contractual) obligation data.
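A sketch of how a stored item in such a distributed model might be tagged with ownership, retention, and resale terms. The field names, the hash-based identifier, and the discard rule are illustrative assumptions, not the paper's schema.

    # Sketch of tagging a stored data item with the business-model metadata the
    # model above calls for (ownership, retention, resale rights). Field names
    # and the content-hash identifier are illustrative assumptions.
    import hashlib
    import time
    from dataclasses import dataclass, field

    @dataclass
    class OrbitalDataItem:
        owner_craft: str                  # consumer craft that owns the data
        storing_craft: str                # provider craft currently holding it
        payload: bytes
        retain_until: float               # epoch seconds; discard allowed afterwards
        resale_allowed: bool = False
        item_id: str = field(init=False)

        def __post_init__(self):
            # Content-derived identifier so replicas on other craft stay consistent.
            self.item_id = hashlib.sha256(self.payload).hexdigest()[:16]

        def storing_craft_may_discard(self, now=None):
            return (now or time.time()) > self.retain_until

    item = OrbitalDataItem("consumer-1", "provider-7", b"raw image tile",
                           retain_until=time.time() + 3600)
    print(item.item_id, item.storing_craft_may_discard())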
A Cloud Computing Based Patient Centric Medical Information System
NASA Astrophysics Data System (ADS)
Agarwal, Ankur; Henehan, Nathan; Somashekarappa, Vivek; Pandya, A. S.; Kalva, Hari; Furht, Borko
This chapter discusses an emerging concept of a cloud computing based Patient Centric Medical Information System framework that will allow various authorized users to securely access patient records from various Care Delivery Organizations (CDOs) such as hospitals, urgent care centers, doctors, laboratories, and imaging centers, among others, from any location. Such a system must seamlessly integrate all patient records, including images such as CT scans and MRIs, which can easily be accessed from any location and reviewed by any authorized user. In such a scenario, the storage and transmission of medical records will have to be conducted in a totally secure and safe environment with a very high standard of data integrity, protecting patient privacy and complying with all Health Insurance Portability and Accountability Act (HIPAA) regulations.
Dynamic Collaboration Infrastructure for Hydrologic Science
NASA Astrophysics Data System (ADS)
Tarboton, D. G.; Idaszak, R.; Castillo, C.; Yi, H.; Jiang, F.; Jones, N.; Goodall, J. L.
2016-12-01
Data and modeling infrastructure is becoming increasingly accessible to water scientists. HydroShare is a collaborative environment that currently offers water scientists the ability to access modeling and data infrastructure in support of data intensive modeling and analysis. It supports the sharing of and collaboration around "resources" which are social objects defined to include both data and models in a structured standardized format. Users collaborate around these objects via comments, ratings, and groups. HydroShare also supports web services and cloud based computation for the execution of hydrologic models and analysis and visualization of hydrologic data. However, the quantity and variety of data and modeling infrastructure available that can be accessed from environments like HydroShare is increasing. Storage infrastructure can range from one's local PC to campus or organizational storage to storage in the cloud. Modeling or computing infrastructure can range from one's desktop to departmental clusters to national HPC resources to grid and cloud computing resources. How does one orchestrate this vast number of data and computing infrastructure without needing to correspondingly learn each new system? A common limitation across these systems is the lack of efficient integration between data transport mechanisms and the corresponding high-level services to support large distributed data and compute operations. A scientist running a hydrology model from their desktop may require processing a large collection of files across the aforementioned storage and compute resources and various national databases. To address these community challenges a proof-of-concept prototype was created integrating HydroShare with RADII (Resource Aware Data-centric collaboration Infrastructure) to provide software infrastructure to enable the comprehensive and rapid dynamic deployment of what we refer to as "collaborative infrastructure." In this presentation we discuss the results of this proof-of-concept prototype which enabled HydroShare users to readily instantiate virtual infrastructure marshaling arbitrary combinations, varieties, and quantities of distributed data and computing infrastructure in addressing big problems in hydrology.
Cloud Engineering Principles and Technology Enablers for Medical Image Processing-as-a-Service.
Bao, Shunxing; Plassard, Andrew J; Landman, Bennett A; Gokhale, Aniruddha
2017-04-01
Traditional in-house, laboratory-based medical imaging studies use hierarchical data structures (e.g., NFS file stores) or databases (e.g., COINS, XNAT) for storage and retrieval. The resulting performance from these approaches is, however, impeded by standard network switches since they can saturate network bandwidth during transfer from storage to processing nodes for even moderate-sized studies. To that end, a cloud-based "medical image processing-as-a-service" offers promise in utilizing the ecosystem of Apache Hadoop, which is a flexible framework providing distributed, scalable, fault tolerant storage and parallel computational modules, and HBase, which is a NoSQL database built atop Hadoop's distributed file system. Despite this promise, HBase's load distribution strategy of region split and merge is detrimental to the hierarchical organization of imaging data (e.g., project, subject, session, scan, slice). This paper makes two contributions to address these concerns by describing key cloud engineering principles and technology enhancements we made to the Apache Hadoop ecosystem for medical imaging applications. First, we propose a row-key design for HBase, which is a necessary step that is driven by the hierarchical organization of imaging data. Second, we propose a novel data allocation policy within HBase to strongly enforce collocation of hierarchically related imaging data. The proposed enhancements accelerate data processing by minimizing network usage and localizing processing to machines where the data already exist. Moreover, our approach is amenable to the traditional scan, subject, and project-level analysis procedures, and is compatible with standard command line/scriptable image processing software. Experimental results for an illustrative sample of imaging data reveal that our new HBase policy results in a three-fold time improvement in conversion of classic DICOM to NIfTI file formats when compared with the default HBase region split policy, and nearly a six-fold improvement over a commonly available network file system (NFS) approach even for relatively small file sets. Moreover, file access latency is lower than that of network attached storage.
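As an illustration of the row-key idea described above (the paper's actual key schema is not reproduced here, so the field order, widths and separator below are assumptions), a hierarchical key can be built so that plain lexicographic byte ordering keeps related rows adjacent and therefore collocatable:

```python
def hbase_row_key(project: str, subject: int, session: int, scan: int, slice_no: int) -> bytes:
    """Build a row key whose lexicographic order follows the imaging hierarchy
    (project > subject > session > scan > slice), so rows that belong to the
    same project/subject/session/scan stay adjacent in the table."""
    return "{}|{:06d}|{:04d}|{:04d}|{:05d}".format(
        project, subject, session, scan, slice_no).encode("utf-8")

keys = [hbase_row_key("oasis", 12, 1, 2, s) for s in range(3)]
assert keys == sorted(keys)   # hierarchy is preserved under byte ordering
print(keys[0])                # b'oasis|000012|0001|0002|00000'
```

Zero-padding the numeric fields is what keeps the byte order consistent with the logical hierarchy; a placement or region-split policy (as the paper proposes) can then keep each key prefix on a single machine.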
Thermal buffering of receivers for parabolic dish solar thermal power plants
NASA Technical Reports Server (NTRS)
Manvi, R.; Fujita, T.; Gajanana, B. C.; Marcus, C. J.
1980-01-01
A parabolic dish solar thermal power plant comprises a field of parabolic dish power modules where each module is composed of a two-axis tracking parabolic dish concentrator which reflects sunlight (insolation) into the aperture of a cavity receiver at the focal point of the dish. The heat generated by the solar flux entering the receiver is removed by a heat transfer fluid. In the dish power module, this heat is used to drive a small heat engine/generator assembly which is directly connected to the cavity receiver at the focal point. A computer analysis is performed to assess the thermal buffering characteristics of receivers containing sensible and latent heat thermal energy storage. Parametric variations of the thermal inertia of the integrated receiver-buffer storage systems coupled with different fluid flow rate control strategies are carried out to delineate the effect of buffer storage, the transient response of the receiver-storage systems, and the corresponding fluid outlet temperature. It is concluded that the addition of phase change buffer storage will substantially improve system operational characteristics during periods of rapidly fluctuating insolation due to cloud passage.
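For orientation, the storage capacity of such a combined sensible/latent buffer follows the standard textbook relation (not taken from the paper itself):

$$ Q = m\,c_{p,\mathrm{s}}\,(T_m - T_i) \;+\; m\,L \;+\; m\,c_{p,\mathrm{l}}\,(T_f - T_m) $$

where $m$ is the storage-medium mass, $c_{p,\mathrm{s}}$ and $c_{p,\mathrm{l}}$ are the solid- and liquid-phase specific heats, $L$ is the latent heat of fusion, and $T_i$, $T_m$, $T_f$ are the initial, melting and final temperatures. The middle term is the extra buffering capacity that a phase-change material adds over a purely sensible design, which is why it helps ride through short insolation transients.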
Move It or Lose It: Cloud-Based Data Storage
ERIC Educational Resources Information Center
Waters, John K.
2010-01-01
There was a time when school districts showed little interest in storing or backing up their data to remote servers. Nothing seemed less secure than handing off data to someone else. But in the last few years the buzz around cloud storage has grown louder, and the idea that data backup could be provided as a service has begun to gain traction in…
WE-B-BRD-01: Innovation in Radiation Therapy Planning II: Cloud Computing in RT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moore, K; Kagadis, G; Xing, L
As defined by the National Institute of Standards and Technology, cloud computing is “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.” Despite the omnipresent role of computers in radiotherapy, cloud computing has yet to achieve widespread adoption in clinical or research applications, though the transition to such “on-demand” access is underway. As this transition proceeds, new opportunities for aggregate studies and efficient use of computational resources are set against new challenges in patient privacy protection, data integrity, and management of clinical informatics systems. In this Session, current and future applications of cloud computing and distributed computational resources will be discussed in the context of medical imaging, radiotherapy research, and clinical radiation oncology applications. Learning Objectives: 1. Understand basic concepts of cloud computing. 2. Understand how cloud computing could be used for medical imaging applications. 3. Understand how cloud computing could be employed for radiotherapy research. 4. Understand how clinical radiotherapy software applications would function in the cloud.
Provenance based data integrity checking and verification in cloud environments
Haq, Inam Ul; Jan, Bilal; Khan, Fakhri Alam; Ahmad, Awais
2017-01-01
Cloud computing is a recent tendency in IT that moves computing and data away from desktop and hand-held devices into large scale processing hubs and data centers respectively. It has been proposed as an effective solution for data outsourcing and on demand computing to control the rising cost of IT setups and management in enterprises. However, with Cloud platforms user’s data is moved into remotely located storages such that users lose control over their data. This unique feature of the Cloud is facing many security and privacy challenges which need to be clearly understood and resolved. One of the important concerns that needs to be addressed is to provide the proof of data integrity, i.e., correctness of the user’s data stored in the Cloud storage. The data in Clouds is physically not accessible to the users. Therefore, a mechanism is required where users can check if the integrity of their valuable data is maintained or compromised. For this purpose some methods are proposed like mirroring, checksumming and using third party auditors amongst others. However, these methods use extra storage space by maintaining multiple copies of data or the presence of a third party verifier is required. In this paper, we address the problem of proving data integrity in Cloud computing by proposing a scheme through which users are able to check the integrity of their data stored in Clouds. In addition, users can track the violation of data integrity if occurred. For this purpose, we utilize a relatively new concept in the Cloud computing called “Data Provenance”. Our scheme is capable to reduce the need of any third party services, additional hardware support and the replication of data items on client side for integrity checking. PMID:28545151
Provenance based data integrity checking and verification in cloud environments.
Imran, Muhammad; Hlavacs, Helmut; Haq, Inam Ul; Jan, Bilal; Khan, Fakhri Alam; Ahmad, Awais
2017-01-01
Cloud computing is a recent tendency in IT that moves computing and data away from desktop and hand-held devices into large scale processing hubs and data centers respectively. It has been proposed as an effective solution for data outsourcing and on demand computing to control the rising cost of IT setups and management in enterprises. However, with Cloud platforms user's data is moved into remotely located storages such that users lose control over their data. This unique feature of the Cloud is facing many security and privacy challenges which need to be clearly understood and resolved. One of the important concerns that needs to be addressed is to provide the proof of data integrity, i.e., correctness of the user's data stored in the Cloud storage. The data in Clouds is physically not accessible to the users. Therefore, a mechanism is required where users can check if the integrity of their valuable data is maintained or compromised. For this purpose some methods are proposed like mirroring, checksumming and using third party auditors amongst others. However, these methods use extra storage space by maintaining multiple copies of data or the presence of a third party verifier is required. In this paper, we address the problem of proving data integrity in Cloud computing by proposing a scheme through which users are able to check the integrity of their data stored in Clouds. In addition, users can track the violation of data integrity if occurred. For this purpose, we utilize a relatively new concept in the Cloud computing called "Data Provenance". Our scheme is capable to reduce the need of any third party services, additional hardware support and the replication of data items on client side for integrity checking.
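The abstract does not spell out the construction of the scheme, but the general idea of provenance-based integrity checking can be sketched as a hash chain over provenance records, where the latest entry commits to both the data and the full history; the Python sketch below is illustrative only and is not the authors' protocol:

```python
import hashlib
import json
import time

def _h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def record_provenance(log, blob: bytes, actor: str, action: str):
    """Append a provenance entry chained to the previous one, so later tampering
    with either the data or the recorded history breaks the chain."""
    prev = log[-1]["entry_hash"] if log else ""
    entry = {"actor": actor, "action": action, "time": time.time(),
             "data_hash": _h(blob), "prev": prev}
    entry["entry_hash"] = _h(json.dumps(entry, sort_keys=True).encode())
    log.append(entry)
    return log

def verify(log, blob: bytes) -> bool:
    """Re-walk the chain and check that the latest entry matches the stored data."""
    prev = ""
    for e in log:
        body = {k: v for k, v in e.items() if k != "entry_hash"}
        if e["prev"] != prev or _h(json.dumps(body, sort_keys=True).encode()) != e["entry_hash"]:
            return False
        prev = e["entry_hash"]
    return bool(log) and log[-1]["data_hash"] == _h(blob)

log = record_provenance([], b"outsourced record v1", "alice", "upload")
print(verify(log, b"outsourced record v1"))   # True
print(verify(log, b"tampered record"))        # False
```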
NASA Astrophysics Data System (ADS)
Arko, S. A.; Hogenson, R.; Geiger, A.; Herrmann, J.; Buechler, B.; Hogenson, K.
2016-12-01
In the coming years there will be an unprecedented amount of SAR data available on a free and open basis to research and operational users around the globe. The Alaska Satellite Facility (ASF) DAAC hosts, through an international agreement, data from the Sentinel-1 spacecraft and will be hosting data from the upcoming NASA ISRO SAR (NISAR) mission. To more effectively manage and exploit these vast datasets, ASF DAAC has begun moving portions of the archive to the cloud and utilizing cloud services to provide higher-level processing on the data. The Hybrid Pluggable Processing Pipeline (HyP3) project is designed to support higher-level data processing in the cloud and extend the capabilities of researchers to larger scales. Built upon a set of core Amazon cloud services, the HyP3 system allows users to request data processing using a number of canned algorithms or their own algorithms once they have been uploaded to the cloud. The HyP3 system automatically accesses the ASF cloud-based archive through the DAAC RESTful application programming interface and processes the data on Amazon's elastic compute cluster (EC2). Final products are distributed through Amazon's simple storage service (S3) and are available for user download. This presentation will provide an overview of ASF DAAC's activities moving the Sentinel-1 archive into the cloud and developing the integrated HyP3 system, covering both the benefits and difficulties of working in the cloud. Additionally, we will focus on the utilization of HyP3 for higher-level processing of SAR data. Two example algorithms, for sea-ice tracking and change detection, will be discussed as well as the mechanism for integrating new algorithms into the pipeline for community use.
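A rough sketch of the workflow described above (submit a processing request to a RESTful API, then pull the finished product from S3) is given below; the endpoint URL, request fields and response field are hypothetical placeholders rather than the real HyP3 interface:

```python
import boto3
import requests

# Hypothetical endpoint and field names -- the real ASF/HyP3 API may differ.
API = "https://hyp3.example.asf.alaska.edu/jobs"

def submit_job(granule: str, algorithm: str, token: str) -> str:
    """Ask the pipeline to run a named algorithm on one granule."""
    resp = requests.post(API,
                         json={"granule": granule, "process": algorithm},
                         headers={"Authorization": f"Bearer {token}"},
                         timeout=30)
    resp.raise_for_status()
    return resp.json()["job_id"]          # assumed response field

def fetch_product(bucket: str, key: str, dest: str) -> None:
    """Finished products are assumed to be staged in S3 for user download."""
    boto3.client("s3").download_file(bucket, key, dest)
```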
Security and privacy preserving approaches in the eHealth clouds with disaster recovery plan.
Sahi, Aqeel; Lai, David; Li, Yan
2016-11-01
Cloud computing was introduced as an alternative storage and computing model in the health sector as well as other sectors to handle large amounts of data. Many healthcare companies have moved their electronic data to the cloud in order to reduce in-house storage, IT development and maintenance costs. However, storing the healthcare records in a third-party server may cause serious storage, security and privacy issues. Therefore, many approaches have been proposed to preserve security as well as privacy in cloud computing projects. Cryptographic-based approaches were presented as one of the best ways to ensure the security and privacy of healthcare data in the cloud. Nevertheless, the cryptographic-based approaches which are used to transfer health records safely remain vulnerable regarding security, privacy, or the lack of any disaster recovery strategy. In this paper, we review the related work on security and privacy preserving as well as disaster recovery in the eHealth cloud domain. Then we propose two approaches, the Security-Preserving approach and the Privacy-Preserving approach, and a disaster recovery plan. The Security-Preserving approach is a robust means of ensuring the security and integrity of Electronic Health Records, and the Privacy-Preserving approach is an efficient authentication approach which protects the privacy of Personal Health Records. Finally, we discuss how the integrated approaches and the disaster recovery plan can ensure the reliability and security of cloud projects. Copyright © 2016 Elsevier Ltd. All rights reserved.
Charting a Security Landscape in the Clouds: Data Protection and Collaboration in Cloud Storage
2016-07-01
Cloud computing is perhaps the most revolutionary force in the information technology industry today. This field encompasses many different domains... characteristic shared by all cloud computing tasks is that they involve storing data in the cloud. In this report, we therefore aim to describe and rank the... CONCLUSION: The advent of cloud computing has caused government organizations to rethink their IT architectures so that they can take advantage of the
Toward a web-based real-time radiation treatment planning system in a cloud computing environment.
Na, Yong Hum; Suh, Tae-Suk; Kapp, Daniel S; Xing, Lei
2013-09-21
To exploit the potential dosimetric advantages of intensity modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT), an in-depth approach is required to provide efficient computing methods. This needs to incorporate clinically related organ specific constraints, Monte Carlo (MC) dose calculations, and large-scale plan optimization. This paper describes our first steps toward a web-based real-time radiation treatment planning system in a cloud computing environment (CCE). The Amazon Elastic Compute Cloud (EC2) with a master node (named m2.xlarge containing 17.1 GB of memory, two virtual cores with 3.25 EC2 Compute Units each, 420 GB of instance storage, 64-bit platform) is used as the backbone of cloud computing for dose calculation and plan optimization. The master node is able to scale the workers on an 'on-demand' basis. MC dose calculation is employed to generate accurate beamlet dose kernels by parallel tasks. The intensity modulation optimization uses total-variation regularization (TVR) and generates piecewise constant fluence maps for each initial beam direction in a distributed manner over the CCE. The optimized fluence maps are segmented into deliverable apertures. The shape of each aperture is iteratively rectified to be a sequence of arcs using the manufacturer's constraints. The output plan file from the EC2 is sent to the simple storage service. Three de-identified clinical cancer treatment plans have been studied for evaluating the performance of the new planning platform with 6 MV flattening filter free beams (40 × 40 cm²) from the Varian TrueBeam™ STx linear accelerator. A CCE leads to speed-ups of up to 14-fold for both dose kernel calculations and plan optimizations in the head and neck, lung, and prostate cancer cases considered in this study. The proposed system relies on a CCE that is able to provide an infrastructure for parallel and distributed computing. The resultant plans from the cloud computing are identical to PC-based IMRT and VMAT plans, confirming the reliability of the cloud computing platform. This cloud computing infrastructure has been established for radiation treatment planning. It substantially improves the speed of inverse planning and makes future on-treatment adaptive re-planning possible.
Toward a web-based real-time radiation treatment planning system in a cloud computing environment
NASA Astrophysics Data System (ADS)
Hum Na, Yong; Suh, Tae-Suk; Kapp, Daniel S.; Xing, Lei
2013-09-01
To exploit the potential dosimetric advantages of intensity modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT), an in-depth approach is required to provide efficient computing methods. This needs to incorporate clinically related organ specific constraints, Monte Carlo (MC) dose calculations, and large-scale plan optimization. This paper describes our first steps toward a web-based real-time radiation treatment planning system in a cloud computing environment (CCE). The Amazon Elastic Compute Cloud (EC2) with a master node (named m2.xlarge containing 17.1 GB of memory, two virtual cores with 3.25 EC2 Compute Units each, 420 GB of instance storage, 64-bit platform) is used as the backbone of cloud computing for dose calculation and plan optimization. The master node is able to scale the workers on an ‘on-demand’ basis. MC dose calculation is employed to generate accurate beamlet dose kernels by parallel tasks. The intensity modulation optimization uses total-variation regularization (TVR) and generates piecewise constant fluence maps for each initial beam direction in a distributed manner over the CCE. The optimized fluence maps are segmented into deliverable apertures. The shape of each aperture is iteratively rectified to be a sequence of arcs using the manufacturer’s constraints. The output plan file from the EC2 is sent to the simple storage service. Three de-identified clinical cancer treatment plans have been studied for evaluating the performance of the new planning platform with 6 MV flattening filter free beams (40 × 40 cm²) from the Varian TrueBeam™ STx linear accelerator. A CCE leads to speed-ups of up to 14-fold for both dose kernel calculations and plan optimizations in the head and neck, lung, and prostate cancer cases considered in this study. The proposed system relies on a CCE that is able to provide an infrastructure for parallel and distributed computing. The resultant plans from the cloud computing are identical to PC-based IMRT and VMAT plans, confirming the reliability of the cloud computing platform. This cloud computing infrastructure has been established for radiation treatment planning. It substantially improves the speed of inverse planning and makes future on-treatment adaptive re-planning possible.
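The two records above describe total-variation regularization (TVR) producing piecewise-constant fluence maps. The toy NumPy sketch below illustrates that idea on a one-dimensional problem with a plain subgradient step; it is not the authors' optimizer and ignores Monte Carlo dose kernels, aperture sequencing and clinical constraints:

```python
import numpy as np

def tv_reconstruct(A, d, lam=0.1, step=1e-3, iters=2000):
    """Estimate a piecewise-constant fluence map x from data d ~ A @ x by
    subgradient descent on ||A x - d||^2 + lam * sum_i |x_{i+1} - x_i|."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = 2 * A.T @ (A @ x - d)        # least-squares term
        diff = np.diff(x)
        tv_sub = np.zeros_like(x)           # subgradient of the TV penalty
        tv_sub[:-1] -= np.sign(diff)
        tv_sub[1:] += np.sign(diff)
        x -= step * (grad + lam * tv_sub)
        x = np.clip(x, 0, None)             # fluence must be non-negative
    return x

# Toy example: recover a piecewise-constant profile from slightly perturbed data.
rng = np.random.default_rng(0)
true_x = np.repeat([0.0, 1.0, 0.4, 0.0], 16)
A = np.eye(64) + 0.1 * rng.standard_normal((64, 64))
d = A @ true_x + 0.01 * rng.standard_normal(64)
print(np.round(tv_reconstruct(A, d), 2))
```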
Calibration of LOFAR data on the cloud
NASA Astrophysics Data System (ADS)
Sabater, J.; Sánchez-Expósito, S.; Best, P.; Garrido, J.; Verdes-Montenegro, L.; Lezzi, D.
2017-04-01
New scientific instruments are starting to generate an unprecedented amount of data. The Low Frequency Array (LOFAR), one of the Square Kilometre Array (SKA) pathfinders, is already producing data on a petabyte scale. The calibration of these data presents a huge challenge for final users: (a) extensive storage and computing resources are required; (b) the installation and maintenance of the software required for the processing is not trivial; and (c) the requirements of calibration pipelines, which are experimental and under development, are quickly evolving. After encountering some limitations in classical infrastructures like dedicated clusters, we investigated the viability of cloud infrastructures as a solution. We found that the installation and operation of LOFAR data calibration pipelines is not only possible, but can also be efficient in cloud infrastructures. The main advantages were: (1) the ease of software installation and maintenance, and the availability of standard APIs and tools, widely used in the industry; this reduces the requirement for significant manual intervention, which can have a highly negative impact in some infrastructures; (2) the flexibility to adapt the infrastructure to the needs of the problem, especially as those demands change over time; (3) the on-demand consumption of (shared) resources. We found that a critical factor (also in other infrastructures) is the availability of scratch storage areas of an appropriate size. We found no significant impediments associated with the speed of data transfer, the use of virtualization, the use of external block storage, or the memory available (provided a minimum threshold is reached). Finally, we considered the cost-effectiveness of a commercial cloud like Amazon Web Services. While a cloud solution is more expensive than the operation of a large, fully-utilized cluster completely dedicated to LOFAR data reduction, we found that its costs are competitive if the number of datasets to be analysed is not high, or if the costs of maintaining a system capable of calibrating LOFAR data become high. Coupled with the advantages discussed above, this suggests that a cloud infrastructure may be favourable for many users.
Making the most of cloud storage - a toolkit for exploitation by WLCG experiments
NASA Astrophysics Data System (ADS)
Alvarez Ayllon, Alejandro; Arsuaga Rios, Maria; Bitzes, Georgios; Furano, Fabrizio; Keeble, Oliver; Manzi, Andrea
2017-10-01
Understanding how cloud storage can be effectively used, either standalone or in support of its associated compute, is now an important consideration for WLCG. We report on a suite of extensions to familiar tools targeted at enabling the integration of cloud object stores into traditional grid infrastructures and workflows. Notable updates include support for a number of object store flavours in FTS3, Davix and gfal2, including mitigations for lack of vector reads; the extension of Dynafed to operate as a bridge between grid and cloud domains; protocol translation in FTS3; the implementation of extensions to DPM (also implemented by the dCache project) to allow 3rd party transfers over HTTP. The result is a toolkit which facilitates data movement and access between grid and cloud infrastructures, broadening the range of workflows suitable for cloud. We report on deployment scenarios and prototype experience, explaining how, for example, an Amazon S3 or Azure allocation can be exploited by grid workflows.
Parallel processing optimization strategy based on MapReduce model in cloud storage environment
NASA Astrophysics Data System (ADS)
Cui, Jianming; Liu, Jiayi; Li, Qiuyan
2017-05-01
Currently, many cloud storage workflows package files only after all packets have been received: the full store-and-forward procedure from the local transmitter to the server, with its packing and unpacking steps, consumes considerable time and keeps transmission efficiency low. A new parallel processing algorithm is proposed to optimize this transmission mode. Following the MapReduce model of operation, MPI technology is used to execute the Mapper and Reducer mechanisms in parallel. Simulation experiments on a Hadoop cloud computing platform show that the algorithm not only accelerates the file transfer rate but also shortens the waiting time of the Reducer mechanism. It breaks through the traditional sequential transmission constraint and reduces storage coupling, thereby improving transmission efficiency.
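A minimal sketch of the Mapper/Reducer-over-MPI idea using mpi4py, with a word-count stand-in for the file chunks (the paper's actual algorithm and data model are not reproduced here):

```python
# Run with: mpiexec -n 4 python mapreduce_mpi.py
from collections import Counter

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

if rank == 0:
    # Pretend these are file chunks arriving at the transmitter.
    chunks = ["alpha beta beta", "gamma alpha", "beta gamma gamma", "alpha"]
    work = [chunks[i::size] for i in range(size)]   # one slice of chunks per rank
else:
    work = None

local = comm.scatter(work, root=0)                  # distribute Mapper input

local_counts = Counter()                            # Mapper: count words per chunk
for chunk in local:
    local_counts.update(chunk.split())

gathered = comm.gather(local_counts, root=0)        # shuffle partial results to the Reducer

if rank == 0:                                       # Reducer: merge partial counts
    total = Counter()
    for c in gathered:
        total.update(c)
    print(dict(total))
```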
Cloud access to interoperable IVOA-compliant VOSpace storage
NASA Astrophysics Data System (ADS)
Bertocco, S.; Dowler, P.; Gaudet, S.; Major, B.; Pasian, F.; Taffoni, G.
2018-07-01
Handling, processing and archiving the huge amount of data produced by the new generation of experiments and instruments in Astronomy and Astrophysics are among the more exciting challenges to address in designing the future data management infrastructures and computing services. We investigated the feasibility of a data management and computation infrastructure, available world-wide, with the aim of merging the FAIR data management provided by IVOA standards with the efficiency and reliability of a cloud approach. Our work involved the Canadian Advanced Network for Astronomy Research (CANFAR) infrastructure and the European EGI federated cloud (EFC). We designed and deployed a pilot data management and computation infrastructure that provides IVOA-compliant VOSpace storage resources and wide access to interoperable federated clouds. In this paper, we detail the main user requirements covered, the technical choices and the implemented solutions and we describe the resulting Hybrid cloud Worldwide infrastructure, its benefits and limitations.
Capturing and analyzing wheelchair maneuvering patterns with mobile cloud computing.
Fu, Jicheng; Hao, Wei; White, Travis; Yan, Yuqing; Jones, Maria; Jan, Yih-Kuen
2013-01-01
Power wheelchairs have been widely used to provide independent mobility to people with disabilities. Despite great advancements in power wheelchair technology, research shows that wheelchair related accidents occur frequently. To ensure safe maneuverability, capturing wheelchair maneuvering patterns is fundamental to enable other research, such as safe robotic assistance for wheelchair users. In this study, we propose to record, store, and analyze wheelchair maneuvering data by means of mobile cloud computing. Specifically, the accelerometer and gyroscope sensors in smart phones are used to record wheelchair maneuvering data in real-time. Then, the recorded data are periodically transmitted to the cloud for storage and analysis. The analyzed results are then made available to various types of users, such as mobile phone users, traditional desktop users, etc. The combination of mobile computing and cloud computing leverages the advantages of both techniques and extends the smart phone's capabilities of computing and data storage via the Internet. We performed a case study to implement the mobile cloud computing framework using Android smart phones and Google App Engine, a popular cloud computing platform. Experimental results demonstrated the feasibility of the proposed mobile cloud computing framework.
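The record-locally-then-upload-periodically pattern described above can be sketched as follows; the endpoint URL and payload format are assumptions, and a production Android client would use the platform's sensor and scheduling APIs rather than plain Python:

```python
import json
import time
from collections import deque

import requests

UPLOAD_URL = "https://example-wheelchair-backend/api/maneuvers"   # hypothetical endpoint
_buffer = deque()

def on_sensor_reading(ax, ay, az, gx, gy, gz):
    """Called from the phone's accelerometer/gyroscope callback; buffer locally."""
    _buffer.append({"t": time.time(), "acc": [ax, ay, az], "gyro": [gx, gy, gz]})

def flush_to_cloud():
    """Periodically ship the buffered readings to the cloud for storage and analysis."""
    batch = list(_buffer)
    _buffer.clear()
    if batch:
        requests.post(UPLOAD_URL, data=json.dumps(batch),
                      headers={"Content-Type": "application/json"}, timeout=10)
```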
The JINR Tier1 Site Simulation for Research and Development Purposes
NASA Astrophysics Data System (ADS)
Korenkov, V.; Nechaevskiy, A.; Ososkov, G.; Pryahina, D.; Trofimov, V.; Uzhinskiy, A.; Voytishin, N.
2016-02-01
Complex distributed computing systems for data storage and processing are in common use in the majority of modern scientific centers. The design of such systems is usually based on recommendations obtained from a preliminary simulation model that is built and executed only once. However, big experiments last for years or decades, and their computing systems continue to develop, not only quantitatively but also qualitatively. Even with substantial effort invested in the design phase to understand a system's configuration, it is hard to develop the system further without additional research into its future evolution. Developers and operators face the problem of predicting system behaviour after planned modifications. A system for simulating grid and cloud services has been developed at LIT (JINR, Dubna). This simulation system is focused on improving the efficiency of grid/cloud infrastructure development by using quality-of-work indicators from a real system. Such software is very important for building new grid/cloud infrastructures for large scientific experiments, such as the JINR Tier1 site for WLCG. The simulation of selected Tier1 site processes is considered as an example of our approach.
A Secure and Verifiable Outsourced Access Control Scheme in Fog-Cloud Computing.
Fan, Kai; Wang, Junxiong; Wang, Xin; Li, Hui; Yang, Yintang
2017-07-24
With the rapid development of big data and Internet of things (IOT), the number of networking devices and data volume are increasing dramatically. Fog computing, which extends cloud computing to the edge of the network can effectively solve the bottleneck problems of data transmission and data storage. However, security and privacy challenges are also arising in the fog-cloud computing environment. Ciphertext-policy attribute-based encryption (CP-ABE) can be adopted to realize data access control in fog-cloud computing systems. In this paper, we propose a verifiable outsourced multi-authority access control scheme, named VO-MAACS. In our construction, most encryption and decryption computations are outsourced to fog devices and the computation results can be verified by using our verification method. Meanwhile, to address the revocation issue, we design an efficient user and attribute revocation method for it. Finally, analysis and simulation results show that our scheme is both secure and highly efficient.
Sharing Planetary-Scale Data in the Cloud
NASA Astrophysics Data System (ADS)
Sundwall, J.; Flasher, J.
2016-12-01
On 19 March 2015, Amazon Web Services (AWS) announced Landsat on AWS, an initiative to make data from the U.S. Geological Survey's Landsat satellite program freely available in the cloud. Because of Landsat's global coverage and long history, it has become a reference point for all Earth observation work and is considered the gold standard of natural resource satellite imagery. Within the first year of Landsat on AWS, the service served over a billion requests for Landsat imagery and metadata, globally. Availability of the data in the cloud has led to new product development by companies and startups including Mapbox, Esri, CartoDB, MathWorks, Development Seed, Trimble, Astro Digital, Blue Raster and Timbr.io. The model of staging data for analysis in the cloud established by Landsat on AWS has since been applied to high resolution radar data, European Space Agency satellite imagery, global elevation data and EPA air quality models. This session will provide an overview of lessons learned throughout these projects. It will demonstrate how cloud-based object storage is democratizing access to massive publicly-funded data sets that have previously only been available to people with access to large amounts of storage, bandwidth, and computing power. Technical discussion points will include: the differences between staging data for analysis using object storage versus file storage; using object stores to design simple RESTful APIs through thoughtful file naming conventions, header fields, and HTTP Range Requests; managing costs through data architecture and Amazon S3's "requester pays" feature; building tools that allow users to take their algorithm to the data in the cloud; and using serverless technologies to display dynamic frontends for massive data sets.
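To make the Range-request and "requester pays" points concrete, the boto3 sketch below reads only the first kilobyte of one Landsat object; the bucket and key are modelled loosely on the public Landsat-on-AWS layout and should be treated as placeholders:

```python
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Anonymous access is enough for a public bucket; bucket and key are illustrative.
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

resp = s3.get_object(
    Bucket="landsat-pds",
    Key="c1/L8/139/045/LC08_L1TP_139045_20170304_20170316_01_T1/"
        "LC08_L1TP_139045_20170304_20170316_01_T1_B4.TIF",
    Range="bytes=0-1023",              # HTTP Range request: fetch only the file header
)
print(len(resp["Body"].read()))        # 1024

# For buckets configured as "requester pays", the caller opts in to the transfer cost:
# s3.get_object(Bucket=..., Key=..., RequestPayer="requester")
```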
Impact of Data Placement on Resilience in Large-Scale Object Storage Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carns, Philip; Harms, Kevin; Jenkins, John
Distributed object storage architectures have become the de facto standard for high-performance storage in big data, cloud, and HPC computing. Object storage deployments using commodity hardware to reduce costs often employ object replication as a method to achieve data resilience. Repairing object replicas after failure is a daunting task for systems with thousands of servers and billions of objects, however, and it is increasingly difficult to evaluate such scenarios at scale on real-world systems. Resilience and availability are both compromised if objects are not repaired in a timely manner. In this work we leverage a high-fidelity discrete-event simulation model to investigate replica reconstruction on large-scale object storage systems with thousands of servers, billions of objects, and petabytes of data. We evaluate the behavior of CRUSH, a well-known object placement algorithm, and identify configuration scenarios in which aggregate rebuild performance is constrained by object placement policies. After determining the root cause of this bottleneck, we then propose enhancements to CRUSH and the usage policies atop it to enable scalable replica reconstruction. We use these methods to demonstrate a simulated aggregate rebuild rate of 410 GiB/s (within 5% of projected ideal linear scaling) on a 1,024-node commodity storage system. We also uncover an unexpected phenomenon in rebuild performance based on the characteristics of the data stored on the system.
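CRUSH itself is considerably more involved (hierarchical buckets, device weights, failure domains), but the core idea it shares with other algorithmic placement schemes, deterministic table-free replica placement, can be illustrated with rendezvous hashing; the sketch below is a simplified stand-in, not CRUSH or the paper's enhancements:

```python
import hashlib

def place_replicas(object_id: str, servers: list, n_replicas: int = 3):
    """Rendezvous (highest-random-weight) hashing: each object deterministically
    maps to n_replicas distinct servers without any central lookup table."""
    scored = sorted(
        servers,
        key=lambda s: hashlib.sha256(f"{object_id}:{s}".encode()).hexdigest(),
        reverse=True,
    )
    return scored[:n_replicas]

servers = [f"osd{i:03d}" for i in range(16)]
print(place_replicas("obj-42", servers))

# After a server failure, only objects that mapped to it need to be rebuilt elsewhere.
survivors = [s for s in servers if s != "osd003"]
print(place_replicas("obj-42", survivors))
```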
Files synchronization from a large number of insertions and deletions
NASA Astrophysics Data System (ADS)
Ellappan, Vijayan; Kumari, Savera
2017-11-01
Synchronization between different versions of files is becoming a major issue for many applications. To make applications more efficient, an economical algorithm is developed from the previously used "File Loading Algorithm". We extend this algorithm in three ways: first, it handles non-binary files; second, a backup is generated for uploaded files; and lastly, each file is synchronized across insertions and deletions. A user can reconstruct a file from its former version with minimal error, and the approach supports interactive communication without disturbance. The drawback of the previous system is overcome by using synchronization, in which multiple copies of each file/record are created and stored in a backup database and efficiently restored in case of any unwanted deletion or loss of data. That is, we introduce a protocol that user B may use to reconstruct file X from file Y with suitably low probability of error. Synchronization algorithms find numerous areas of use, including data storage, file sharing, source code control systems, and cloud applications. For example, cloud storage services such as Dropbox synchronize between local copies and cloud backups each time users make changes to local versions. Similarly, synchronization tools are necessary on mobile devices. Specialized synchronization algorithms are used for video and sound editing. Synchronization tools are also capable of performing data duplication.
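For illustration, an rsync-style block-matching sketch shows how a receiver can rebuild file X from its older copy Y plus a small delta; this toy version uses fixed-offset (non-rolling) block hashes and is not the protocol proposed in the paper:

```python
import hashlib

BLOCK = 4

def signatures(data: bytes):
    """Hash fixed-size blocks of the receiver's old file (file Y)."""
    return {hashlib.md5(data[i:i + BLOCK]).hexdigest(): i
            for i in range(0, len(data), BLOCK)}

def delta(new: bytes, sig):
    """Describe the sender's file X as copies from Y plus literal bytes."""
    ops, i = [], 0
    while i < len(new):
        h = hashlib.md5(new[i:i + BLOCK]).hexdigest()
        if h in sig:
            ops.append(("copy", sig[h])); i += BLOCK
        else:
            ops.append(("literal", new[i:i + 1])); i += 1
    return ops

def patch(old: bytes, ops):
    """Receiver side: rebuild X from Y and the delta."""
    out = bytearray()
    for kind, arg in ops:
        out += old[arg:arg + BLOCK] if kind == "copy" else arg
    return bytes(out)

old, new = b"aaaabbbbccccdddd", b"aaaaXXbbbbccccdddd"   # an insertion of "XX"
assert patch(old, delta(new, signatures(old))) == new
```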
NASA Astrophysics Data System (ADS)
Tiari, Saeed
A desirable feature of concentrated solar power (CSP) with integrated thermal energy storage (TES) unit is to provide electricity in a dispatchable manner during cloud transient and non-daylight hours. Latent heat thermal energy storage (LHTES) offers many advantages such as higher energy storage density, wider range of operating temperature and nearly isothermal heat transfer relative to sensible heat thermal energy storage (SHTES), which is the current standard for trough and tower CSP systems. Despite the advantages mentioned above, LHTES systems performance is often limited by low thermal conductivity of commonly used, low cost phase change materials (PCMs). Research and development of passive heat transfer devices, such as heat pipes (HPs) to enhance the heat transfer in the PCM has received considerable attention. Due to its high effective thermal conductivity, heat pipe can transport large amounts of heat with relatively small temperature difference. The objective of this research is to study the charging and discharging processes of heat pipe-assisted LHTES systems using computational fluid dynamics (CFD) and experimental testing to develop a method for more efficient energy storage system design. The results revealed that the heat pipe network configurations and the quantities of heat pipes integrated in a thermal energy storage system have a profound effect on the thermal response of the system. The optimal placement of heat pipes in the system can significantly enhance the thermal performance. It was also found that the inclusion of natural convection heat transfer in the CFD simulation of the system is necessary to have a realistic prediction of a latent heat thermal storage system performance. In addition, the effects of geometrical features and quantity of fins attached to the HPs have been studied.
An Interactive Web-Based Analysis Framework for Remote Sensing Cloud Computing
NASA Astrophysics Data System (ADS)
Wang, X. Z.; Zhang, H. M.; Zhao, J. H.; Lin, Q. H.; Zhou, Y. C.; Li, J. H.
2015-07-01
Spatiotemporal data, especially remote sensing data, are widely used in ecological, geographical, agricultural, and military research and applications. With the development of remote sensing technology, more and more remote sensing data are accumulated and stored in the cloud. Providing cloud users with an effective way to access and analyse these massive spatiotemporal data from web clients has become an urgent issue. In this paper, we propose a new scalable, interactive and web-based cloud computing solution for massive remote sensing data analysis. We build a spatiotemporal analysis platform to provide the end-user with a safe and convenient way to access massive remote sensing data stored in the cloud. The lightweight cloud storage system used to store public data and users' private data is constructed based on an open-source distributed file system. In it, massive remote sensing data are stored as public data, while the intermediate and input data are stored as private data. The elastic, scalable, and flexible cloud computing environment is built using Docker, an open-source, lightweight container technology for the Linux operating system. In the Docker containers, open-source software such as IPython, NumPy, GDAL, and GRASS GIS is deployed. Users can write scripts in the IPython Notebook web page through the web browser to process data, and the scripts will be submitted to the IPython kernel to be executed. By comparing the performance of remote sensing data analysis tasks executed in Docker containers, in KVM virtual machines and on physical machines, we conclude that the cloud computing environment built with Docker makes the greatest use of the host system resources and can handle more concurrent spatio-temporal computing tasks. Docker technology provides resource isolation mechanisms for IO, CPU, and memory, which offers a security guarantee when processing remote sensing data in the IPython Notebook. Users can write complex data processing code on the web directly, so they can design their own data processing algorithms.
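The kind of script a user might run inside the platform's IPython Notebook, using the GDAL and NumPy packages the container is described as shipping, could look like the NDVI sketch below (the file path and band numbers are illustrative):

```python
import numpy as np
from osgeo import gdal

# Illustrative path to a scene staged as public data in the cloud storage system.
ds = gdal.Open("/data/public/landsat_scene.tif")
red = ds.GetRasterBand(3).ReadAsArray().astype(np.float32)
nir = ds.GetRasterBand(4).ReadAsArray().astype(np.float32)

ndvi = (nir - red) / np.maximum(nir + red, 1e-6)   # avoid division by zero
print("mean NDVI:", float(ndvi.mean()))
```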
Securing the Data Storage and Processing in Cloud Computing Environment
ERIC Educational Resources Information Center
Owens, Rodney
2013-01-01
Organizations increasingly utilize cloud computing architectures to reduce costs and energy consumption both in the data warehouse and on mobile devices by better utilizing the computing resources available. However, the security and privacy issues with publicly available cloud computing infrastructures have not been studied to a sufficient depth…
Dynamic federations: storage aggregation using open tools and protocols
NASA Astrophysics Data System (ADS)
Furano, Fabrizio; Brito da Rocha, Ricardo; Devresse, Adrien; Keeble, Oliver; Álvarez Ayllón, Alejandro; Fuhrmann, Patrick
2012-12-01
A number of storage elements now offer standard protocol interfaces like NFS 4.1/pNFS and WebDAV, for access to their data repositories, in line with the standardization effort of the European Middleware Initiative (EMI). Also the LCG FileCatalogue (LFC) can offer such features. Here we report on work that seeks to exploit the federation potential of these protocols and build a system that offers a unique view of the storage and metadata ensemble and the possibility of integration of other compatible resources such as those from cloud providers. The challenge, here undertaken by the providers of dCache and DPM, and pragmatically open to other Grid and Cloud storage solutions, is to build such a system while being able to accommodate name translations from existing catalogues (e.g. LFCs), experiment-based metadata catalogues, or stateless algorithmic name translations, also known as “trivial file catalogues”. Such so-called storage federations of standard protocols-based storage elements give a unique view of their content, thus promoting simplicity in accessing the data they contain and offering new possibilities for resilience and data placement strategies. The goal is to consider HTTP and NFS4.1-based storage elements and metadata catalogues and make them able to cooperate through an architecture that properly feeds the redirection mechanisms that they are based upon, thus giving the functionalities of a “loosely coupled” storage federation. One of the key requirements is to use standard clients (provided by OS'es or open source distributions, e.g. Web browsers) to access an already aggregated system; this approach is quite different from aggregating the repositories at the client side through some wrapper API, like for instance GFAL, or by developing new custom clients. Other technical challenges that will determine the success of this initiative include performance, latency and scalability, and the ability to create worldwide storage federations that are able to redirect clients to repositories that they can efficiently access, for instance trying to choose the endpoints that are closer or applying other criteria. We believe that the features of a loosely coupled federation of open-protocols-based storage elements will open many possibilities of evolving the current computing models without disrupting them, and, at the same time, will be able to operate with the existing infrastructures, follow their evolution path and add storage centers that can be acquired as a third-party service.
A novel mobile-cloud system for capturing and analyzing wheelchair maneuvering data: A pilot study.
Fu, Jicheng; Jones, Maria; Liu, Tao; Hao, Wei; Yan, Yuqing; Qian, Gang; Jan, Yih-Kuen
2016-01-01
The purpose of this pilot study was to provide a new approach for capturing and analyzing wheelchair maneuvering data, which are critical for evaluating wheelchair users' activity levels. We proposed a mobile-cloud (MC) system, which incorporated the emerging mobile and cloud computing technologies. The MC system employed smartphone sensors to collect wheelchair maneuvering data and transmit them to the cloud for storage and analysis. A k-nearest neighbor (KNN) machine-learning algorithm was developed to mitigate the impact of sensor noise and recognize wheelchair maneuvering patterns. We conducted 30 trials in an indoor setting, where each trial contained 10 bouts (i.e., periods of continuous wheelchair movement). We also verified our approach in a different building. Different from existing approaches that require sensors to be attached to wheelchairs' wheels, we placed the smartphone into a smartphone holder attached to the wheelchair. Experimental results illustrate that our approach correctly identified all 300 bouts. Compared to existing approaches, our approach was easier to use while achieving similar accuracy in analyzing the accumulated movement time and maximum period of continuous movement (p > 0.8). Overall, the MC system provided a feasible way to ease the data collection process and generated accurate analysis results for evaluating activity levels.
A Novel Mobile-Cloud System for Capturing and Analyzing Wheelchair Maneuvering Data: A Pilot Study
Fu, Jicheng; Jones, Maria; Liu, Tao; Hao, Wei; Yan, Yuqing; Qian, Gang; Jan, Yih-Kuen
2016-01-01
The purpose of this pilot study was to provide a new approach for capturing and analyzing wheelchair maneuvering data, which are critical for evaluating wheelchair users’ activity levels. We proposed a mobile-cloud (MC) system, which incorporated the emerging mobile and cloud computing technologies. The MC system employed smartphone sensors to collect wheelchair maneuvering data and transmit them to the cloud for storage and analysis. A K-Nearest-Neighbor (KNN) machine-learning algorithm was developed to mitigate the impact of sensor noise and recognize wheelchair maneuvering patterns. We conducted 30 trials in an indoor setting, where each trial contained 10 bouts (i.e., periods of continuous wheelchair movement). We also verified our approach in a different building. Different from existing approaches that require sensors to be attached to wheelchairs’ wheels, we placed the smartphone into a smartphone holder attached to the wheelchair. Experimental results illustrate that our approach correctly identified all 300 bouts. Compared to existing approaches, our approach was easier to use while achieving similar accuracy in analyzing the accumulated movement time and maximum period of continuous movement (p > 0.8). Overall, the MC system provided a feasible way to ease the data collection process, and generated accurate analysis results for evaluating activity levels. PMID:26479684
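A minimal sketch of KNN-based bout recognition on windowed smartphone sensor data, using scikit-learn and synthetic stand-ins for the labelled training windows (the study's actual features and parameters are not given here):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def window_features(acc, gyro):
    """Summarise one window of accelerometer/gyroscope samples (n x 3 each)."""
    return np.hstack([acc.mean(0), acc.std(0), gyro.mean(0), gyro.std(0)])

rng = np.random.default_rng(1)
# Synthetic stand-ins for labelled training windows: 0 = stationary, 1 = moving.
X = np.vstack([window_features(rng.normal(0, s, (50, 3)),
                               rng.normal(0, s, (50, 3)))
               for s in [0.05] * 20 + [1.0] * 20])
y = np.array([0] * 20 + [1] * 20)

clf = KNeighborsClassifier(n_neighbors=5).fit(X, y)
new_window = window_features(rng.normal(0, 1.0, (50, 3)), rng.normal(0, 1.0, (50, 3)))
print(clf.predict([new_window]))   # expected: [1], i.e. a movement bout
```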
The Role of Standards in Cloud-Computing Interoperability
2012-10-01
services are not shared outside the organization. CloudStack, Eucalyptus, HP, Microsoft, OpenStack, Ubuntu, and VMware provide tools for building... center requirements • Developing usage models for cloud vendors • Independent IT consortium OpenStack http://www.openstack.org • Open-source... software for running private clouds • Currently consists of three core software projects: OpenStack Compute (Nova), OpenStack Object Storage (Swift
Strategic Implications of Cloud Computing for Modeling and Simulation (Briefing)
2016-04-01
of Promises with Cloud • Cost efficiency • Unlimited storage • Backup and recovery • Automatic software integration • Easy access to information... activities that wrap the actual exercise itself (e.g., travel for exercise support, data collection, integration, etc.). Cloud-based simulation would... requiring quick delivery rather than fewer large messages requiring high bandwidth. Cloud environments tend to be better at providing high-bandwidth
VidCat: an image and video analysis service for personal media management
NASA Astrophysics Data System (ADS)
Begeja, Lee; Zavesky, Eric; Liu, Zhu; Gibbon, David; Gopalan, Raghuraman; Shahraray, Behzad
2013-03-01
Cloud-based storage and consumption of personal photos and videos provides increased accessibility, functionality, and satisfaction for mobile users. One cloud service frontier that has recently been growing is that of personal media management. This work presents a system called VidCat that assists users in the tagging, organization, and retrieval of their personal media by faces and visual content similarity, time, and date information. Evaluations of the effectiveness of the copy detection and face recognition algorithms on standard datasets are also discussed. Finally, the system includes a set of application programming interfaces (APIs) allowing content to be uploaded, analyzed, and retrieved on any client with simple HTTP-based methods, as demonstrated with a prototype developed on the iOS and Android mobile platforms.
Beam position monitoring system at CESR
NASA Astrophysics Data System (ADS)
Billing, M. G.; Bergan, W. F.; Forster, M. J.; Meller, R. E.; Rendina, M. C.; Rider, N. T.; Sagan, D. C.; Shanks, J.; Sikora, J. P.; Stedinger, M. G.; Strohman, C. R.; Palmer, M. A.; Holtzapple, R. L.
2017-09-01
The Cornell Electron-positron Storage Ring (CESR) has been converted from a High Energy Physics electron-positron collider to operate as a dedicated synchrotron light source for the Cornell High Energy Synchrotron Source (CHESS) and to conduct accelerator physics research as a test accelerator, capable of studying topics relevant to future damping rings, colliders and light sources. Some of the specific topics that were targeted for the initial phase of operation of the storage ring in this mode, labeled CESRTA (CESR as a Test Accelerator), included 1) tuning techniques to produce low emittance beams, 2) the study of electron cloud development in a storage ring and 3) intra-beam scattering effects. The complete conversion of CESR to CESRTA occurred over a several year period and is described elsewhere. As a part of this conversion the CESR beam position monitoring (CBPM) system was completely upgraded to provide the needed instrumental capabilities for these studies. This paper describes the new CBPM system hardware, its function and representative measurements performed by the upgraded system.
Low latency network and distributed storage for next generation HPC systems: the ExaNeSt project
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Cretaro, P.; Frezza, O.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Paolucci, P. S.; Pastorelli, E.; Pisani, F.; Simula, F.; Vicini, P.; Navaridas, J.; Chaix, F.; Chrysos, N.; Katevenis, M.; Papaeustathiou, V.
2017-10-01
With processor architecture evolution, the HPC market has undergone a paradigm shift. The adoption of low-cost, Linux-based clusters extended the reach of HPC from its roots in modelling and simulation of complex physical systems to a broader range of industries, from biotechnology, cloud computing, computer analytics and big data challenges to manufacturing sectors. In this perspective, the near future HPC systems can be envisioned as composed of millions of low-power computing cores, densely packed — meaning cooling by appropriate technology — with a tightly interconnected, low latency and high performance network and equipped with a distributed storage architecture. Each of these features — dense packing, distributed storage and high performance interconnect — represents a challenge, made all the harder by the need to solve them at the same time. These challenges lie as stumbling blocks along the road towards Exascale-class systems; the ExaNeSt project acknowledges them and tasks itself with investigating ways around them.
Migrating Educational Data and Services to Cloud Computing: Exploring Benefits and Challenges
ERIC Educational Resources Information Center
Lahiri, Minakshi; Moseley, James L.
2013-01-01
"Cloud computing" is currently the "buzzword" in the Information Technology field. Cloud computing facilitates convenient access to information and software resources as well as easy storage and sharing of files and data, without the end users being aware of the details of the computing technology behind the process. This…
Estimates of forest canopy height and aboveground biomass using ICESat.
Michael A. Lefsky; David J. Harding; Michael Keller; Warren B. Cohen; Claudia C. Carabajal; Fernando Del Bom Espirito-Santo; Maria O. Hunter; Raimundo de Oliveira Jr.
2005-01-01
Exchange of carbon between forests and the atmosphere is a vital component of the global carbon cycle. Satellite laser altimetry has a unique capability for estimating forest canopy height, which has a direct and increasingly well understood relationship to aboveground carbon storage. While the Geoscience Laser Altimeter System (GLAS) onboard the Ice, Cloud and land...
An adaptive process-based cloud infrastructure for space situational awareness applications
NASA Astrophysics Data System (ADS)
Liu, Bingwei; Chen, Yu; Shen, Dan; Chen, Genshe; Pham, Khanh; Blasch, Erik; Rubin, Bruce
2014-06-01
Space situational awareness (SSA) and defense space control capabilities are top priorities for groups that own or operate man-made spacecraft. Also, with the growing amount of space debris, there is an increasing demand for contextual understanding that necessitates the capability of collecting and processing a vast amount of sensor data. Cloud computing, which features scalable and flexible storage and computing services, has been recognized as an ideal candidate that can meet the large-data contextual challenges of SSA. Cloud computing consists of physical service providers and middleware virtual machines together with infrastructure, platform, and software as a service (IaaS, PaaS, SaaS) models. However, the typical Virtual Machine (VM) abstraction is on a per-operating-system basis, which is too low-level and limits the flexibility of a mission application architecture. In response to this technical challenge, a novel adaptive process-based cloud infrastructure for SSA applications is proposed in this paper. In addition, the design rationale and a prototype are examined in detail. The SSA Cloud (SSAC) conceptual capability will potentially support space situation monitoring and tracking, object identification, and threat assessment. Lastly, the benefits of a more granular and flexible allocation of cloud computing resources are illustrated for data processing and implementation considerations within a representative SSA system environment. We show that container-based virtualization performs better than hypervisor-based virtualization technology in an SSA scenario.
Public Auditing with Privacy Protection in a Multi-User Model of Cloud-Assisted Body Sensor Networks
Li, Song; Cui, Jie; Zhong, Hong; Liu, Lu
2017-01-01
Wireless Body Sensor Networks (WBSNs) are gaining importance in the era of the Internet of Things (IoT). The modern medical system is a particular area where the WBSN techniques are being increasingly adopted for various fundamental operations. Despite such increasing deployments of WBSNs, issues such as the infancy in the size, capabilities and limited data processing capacities of the sensor devices restrain their adoption in resource-demanding applications. Though providing computing and storage supplements from cloud servers can potentially enrich the capabilities of the WBSNs devices, data security is one of the prevailing issues that affects the reliability of cloud-assisted services. Sensitive applications such as modern medical systems demand assurance of the privacy of the users’ medical records stored in distant cloud servers. Since it is economically impossible to set up private cloud servers for every client, auditing data security managed in the remote servers has necessarily become an integral requirement of WBSNs’ applications relying on public cloud servers. To this end, this paper proposes a novel certificateless public auditing scheme with integrated privacy protection. The multi-user model in our scheme supports groups of users to store and share data, thus exhibiting the potential for WBSNs’ deployments within community environments. Furthermore, our scheme enriches user experiences by offering public verifiability, forward security mechanisms and revocation of illegal group members. Experimental evaluations demonstrate the security effectiveness of our proposed scheme under the Random Oracle Model (ROM) by outperforming existing cloud-assisted WBSN models. PMID:28475110
Li, Song; Cui, Jie; Zhong, Hong; Liu, Lu
2017-05-05
Wireless Body Sensor Networks (WBSNs) are gaining importance in the era of the Internet of Things (IoT). The modern medical system is a particular area where the WBSN techniques are being increasingly adopted for various fundamental operations. Despite such increasing deployments of WBSNs, issues such as the infancy in the size, capabilities and limited data processing capacities of the sensor devices restrain their adoption in resource-demanding applications. Though providing computing and storage supplements from cloud servers can potentially enrich the capabilities of the WBSNs devices, data security is one of the prevailing issues that affects the reliability of cloud-assisted services. Sensitive applications such as modern medical systems demand assurance of the privacy of the users' medical records stored in distant cloud servers. Since it is economically impossible to set up private cloud servers for every client, auditing data security managed in the remote servers has necessarily become an integral requirement of WBSNs' applications relying on public cloud servers. To this end, this paper proposes a novel certificateless public auditing scheme with integrated privacy protection. The multi-user model in our scheme supports groups of users to store and share data, thus exhibiting the potential for WBSNs' deployments within community environments. Furthermore, our scheme enriches user experiences by offering public verifiability, forward security mechanisms and revocation of illegal group members. Experimental evaluations demonstrate the security effectiveness of our proposed scheme under the Random Oracle Model (ROM) by outperforming existing cloud-assisted WBSN models.
Global EOS: exploring the 300-ms-latency region
NASA Astrophysics Data System (ADS)
Mascetti, L.; Jericho, D.; Hsu, C.-Y.
2017-10-01
EOS, the CERN open-source distributed disk storage system, provides the high-performance storage solution for HEP analysis and the back-end for various work-flows. Recently EOS became the back-end of CERNBox, the cloud synchronisation service for CERN users. EOS can be used to take advantage of wide-area distributed installations: for the last few years CERN EOS has used a common deployment across two computer centres (Geneva-Meyrin and Budapest-Wigner) about 1,000 km apart (∼20-ms latency) with about 200 PB of disk (JBOD). In late 2015, the CERN-IT Storage group and AARNET (Australia) set up a challenging R&D project: a single EOS instance between CERN and AARNET with more than 300 ms latency (16,500 km apart). This paper reports on the successful deployment and operation of a distributed storage system spanning Europe (Geneva, Budapest), Australia (Melbourne) and, later, Asia (ASGC Taipei), allowing different types of data placement and data access across these four sites.
Space Science Cloud: a Virtual Space Science Research Platform Based on Cloud Model
NASA Astrophysics Data System (ADS)
Hu, Xiaoyan; Tong, Jizhou; Zou, Ziming
Through independent and cooperative science missions, the Strategic Pioneer Program (SPP) on Space Science, the new space science initiative in China approved by CAS and implemented by the National Space Science Center (NSSC), is dedicated to seeking new discoveries and breakthroughs in space science, thus deepening our understanding of the universe and planet Earth. In the framework of this program, in order to support the operations of space science missions and satisfy the demand of related research activities for e-Science, NSSC is developing a virtual space science research platform based on the cloud model, namely the Space Science Cloud (SSC). In order to support mission demonstration, SSC integrates an interactive satellite orbit design tool, a satellite structure and payload layout design tool, a payload observation coverage analysis tool, etc., to help scientists analyze and verify space science mission designs. Another important function of SSC is supporting mission operations, which run through the space satellite data pipelines. Mission operators can acquire and process observation data, then distribute the data products to other systems or issue the data and archives with the services of SSC. In addition, SSC provides useful data, tools and models for space researchers. Several databases in the field of space science are integrated and an efficient retrieval system is under development. Common tools for data visualization, deep processing (e.g., smoothing and filtering tools), analysis (e.g., FFT analysis and minimum variance analysis tools) and mining (e.g., a proton event correlation analysis tool) are also integrated to help researchers better utilize the data. The space weather models on SSC include a magnetic storm forecast model, a multi-station middle and upper atmospheric climate model, a solar energetic particle propagation model and so on. All the services mentioned above are based on the e-Science infrastructure of CAS, e.g. cloud storage and cloud computing. SSC provides its users with self-service storage and computing resources at the same time. At present, the prototyping of SSC is underway and the platform is expected to be put into trial operation in August 2014. We hope that as SSC develops, our vision of Digital Space may come true someday.
Remote-Sensing Data Distribution and Processing in the Cloud at the ASF DAAC
NASA Astrophysics Data System (ADS)
Stoner, C.; Arko, S. A.; Nicoll, J. B.; Labelle-Hamer, A. L.
2016-12-01
The Alaska Satellite Facility (ASF) Distributed Active Archive Center (DAAC) has been tasked to archive and distribute data from both SENTINEL-1 satellites and from the NASA-ISRO Synthetic Aperture Radar (NISAR) satellite in a cost-effective manner. In order to best support processing and distribution of these large data sets for users, the ASF DAAC enhanced our data system in a number of ways that will be detailed in this presentation. The SENTINEL-1 mission comprises a constellation of two polar-orbiting satellites, operating day and night performing C-band Synthetic Aperture Radar (SAR) imaging, enabling them to acquire imagery regardless of the weather. SENTINEL-1A was launched by the European Space Agency (ESA) in April 2014. SENTINEL-1B is scheduled to launch in April 2016. The NISAR satellite is designed to observe and take measurements of some of the planet's most complex processes, including ecosystem disturbances, ice-sheet collapse, and natural hazards such as earthquakes, tsunamis, volcanoes and landslides. NISAR will employ radar imaging, polarimetry, and interferometry techniques using SweepSAR technology for full-resolution wide-swath imaging. NISAR data files are large, making storage and processing a challenge for conventional store-and-download systems. To effectively process, store, and distribute petabytes of data in a high-performance computing environment, ASF took a long view with regard to technology choices and picked a path of maximum flexibility and software re-use. To that end, this software tools and services presentation will cover Web Object Storage (WOS) and the ability to seamlessly move from local sunk-cost hardware to a public cloud, such as Amazon Web Services (AWS). A prototype of the SENTINEL-1A system running in AWS, as well as a local hardware solution, will be examined to explain the pros and cons of each. In preparation for NISAR files, which will be even larger than SENTINEL-1A, ASF has embarked on a number of cloud initiatives, including processing in the cloud at scale, processing data on demand, and processing end-user computations on DAAC data in the cloud.
NASA Astrophysics Data System (ADS)
Nguyen, L.; Chee, T.; Minnis, P.; Palikonda, R.; Smith, W. L., Jr.; Spangenberg, D.
2016-12-01
The NASA LaRC Satellite ClOud and Radiative Property retrieval System (SatCORPS) processes and derives near real-time (NRT) global cloud products from operational geostationary satellite imager datasets. These products are being used in NRT to improve forecast models and aircraft icing warnings, and to support aircraft field campaigns. Next-generation satellites, such as the Japanese Himawari-8 and the upcoming NOAA GOES-R, present challenges for NRT data processing and product dissemination due to the increase in temporal and spatial resolution. The volume of data is expected to increase approximately tenfold. This increase in data volume will require additional IT resources to keep up with the processing demands to satisfy NRT requirements, and these resources are not readily available due to cost and other technical limitations. To anticipate and meet these computing resource requirements, we have employed a hybrid cloud computing environment to augment the generation of SatCORPS products. This paper will describe the workflow to ingest, process, and distribute SatCORPS products and the technologies used. Lessons learned from working on both the AWS Cloud and GovCloud will be discussed: benefits, similarities, and differences that could impact the decision to use cloud computing and storage. A detailed cost analysis will be presented. In addition, future cloud utilization, parallelization, and architecture layout will be discussed for GOES-R.
Portable Map-Reduce Utility for MIT SuperCloud Environment
2015-09-17
… The big data architecture, which is designed to address these challenges, is made of the computing resources, scheduler, central storage file system … databases, analytics software and web interfaces [1]. These components are common to many big data and supercomputing systems. The platform is …
Cloud-based adaptive exon prediction for DNA analysis.
Putluri, Srinivasareddy; Zia Ur Rahman, Md; Fathima, Shaik Yasmeen
2018-02-01
Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under the conventional flow of gene information, gene sequencing laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs will be minimised by the use of cloud services. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using the variable normalised least mean square algorithm and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of the various AEPs is done based on measures such as sensitivity, specificity and precision using standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database.
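To make the filtering idea above concrete, the following is a minimal sketch of a normalised LMS adaptive filter applied to a binary DNA indicator sequence. The nucleotide-to-indicator mapping, the period-3 reference signal and all parameter values are illustrative assumptions, not the exact configuration of the paper's AEPs.

```python
import numpy as np

def nlms_filter(x, d, taps=10, mu=0.5, eps=1e-6):
    """Normalised LMS: adapt weights w so the filter output tracks the desired signal d."""
    w = np.zeros(taps)
    y = np.zeros(len(d))
    for n in range(taps, len(d)):
        u = x[n - taps:n][::-1]              # most recent input samples
        y[n] = w @ u                         # filter output
        e = d[n] - y[n]                      # prediction error
        w += (mu / (eps + u @ u)) * e * u    # normalised weight update
    return y

# Binary indicator sequence marking 'A' positions in an (artificial) DNA string
seq = "ATGGCGATGAAGCTGATCGATCGGATG" * 20
x = np.array([1.0 if b == "A" else 0.0 for b in seq])
# Period-3 reference; stretches where the filter output follows it suggest exon-like periodicity
d = np.array([1.0 if i % 3 == 0 else 0.0 for i in range(len(x))])
y = nlms_filter(x, d)
print("mean squared tracking error:", np.mean((d[10:] - y[10:]) ** 2))
```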
FRIEDA: Flexible Robust Intelligent Elastic Data Management Framework
Ghoshal, Devarshi; Hendrix, Valerie; Fox, William; ...
2017-02-01
Scientific applications are increasingly using cloud resources for their data analysis workflows. However, managing data effectively and efficiently over these cloud resources is challenging due to the myriad storage choices with different performance and cost trade-offs, complex application choices, and the complexity associated with elasticity and failure rates in these environments. The different data access patterns of data-intensive scientific applications require a more flexible and robust data management solution than the ones currently in existence. FRIEDA is a Flexible Robust Intelligent Elastic Data Management framework that employs a range of data management strategies in cloud environments. FRIEDA can manage the storage and data lifecycle of applications in cloud environments. There are four stages in the data management lifecycle of FRIEDA: (i) storage planning, (ii) provisioning and preparation, (iii) data placement, and (iv) execution. FRIEDA defines a data control plane and an execution plane. The data control plane defines the data partition and distribution strategy, whereas the execution plane manages the execution of the application using a master-worker paradigm. FRIEDA also provides different data management strategies, either to partition the data in real time or to predetermine the data partitions prior to application execution.
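The control-plane/execution-plane split described above can be illustrated with a small stand-in. FRIEDA's actual interfaces are not shown here; the function names, the round-robin strategy and the use of a local process pool in place of cloud workers are assumptions for illustration only.

```python
from multiprocessing import Pool

def plan_partitions(files, n_workers, realtime=False):
    """Control plane: predetermine static partitions, or defer to real-time dispatch."""
    if realtime:
        return [[f] for f in files]                          # one task per file, handed out as workers free up
    return [files[i::n_workers] for i in range(n_workers)]   # static round-robin partitioning

def run_partition(partition):
    """Execution plane: a worker processes the files in its partition (stand-in analysis)."""
    return [(f, len(f)) for f in partition]

if __name__ == "__main__":
    files = [f"obs_{i:03d}.dat" for i in range(10)]
    partitions = plan_partitions(files, n_workers=3)
    with Pool(3) as pool:                                    # the master dispatches partitions to workers
        print(pool.map(run_partition, partitions))
```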
A compressive sensing based secure watermark detection and privacy preserving storage framework.
Qia Wang; Wenjun Zeng; Jun Tian
2014-03-01
Privacy is a critical issue when the data owners outsource data storage or processing to a third party computing service, such as the cloud. In this paper, we identify a cloud computing application scenario that requires simultaneously performing secure watermark detection and privacy preserving multimedia data storage. We then propose a compressive sensing (CS)-based framework using secure multiparty computation (MPC) protocols to address such a requirement. In our framework, the multimedia data and secret watermark pattern are presented to the cloud for secure watermark detection in a CS domain to protect the privacy. During CS transformation, the privacy of the CS matrix and the watermark pattern is protected by the MPC protocols under the semi-honest security model. We derive the expected watermark detection performance in the CS domain, given the target image, watermark pattern, and the size of the CS matrix (but without the CS matrix itself). The correctness of the derived performance has been validated by our experiments. Our theoretical analysis and experimental results show that secure watermark detection in the CS domain is feasible. Our framework can also be extended to other collaborative secure signal processing and data-mining applications in the cloud.
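As a rough illustration of why detection can work on compressive measurements at all, the toy example below correlates CS measurements of a watermarked signal with CS measurements of the watermark; random projections approximately preserve inner products. The dimensions, the Gaussian sensing matrix and the additive spread-spectrum embedding are assumptions for illustration, and the paper's MPC protocols that keep the CS matrix and watermark private are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4096, 512                                  # signal length, number of CS measurements
phi = rng.normal(0.0, 1.0 / np.sqrt(m), (m, n))   # Gaussian sensing matrix

watermark = rng.choice([-1.0, 1.0], n)            # spread-spectrum watermark pattern
host = rng.normal(0.0, 2.0, n)                    # host signal (e.g. a flattened image block)
marked = host + watermark                         # additive embedding

# Random projections approximately preserve inner products, so correlating in the
# measurement domain stands in for the usual pixel-domain correlation detector.
y_wm = phi @ watermark
stat_marked = (phi @ marked) @ y_wm / n
stat_clean = (phi @ host) @ y_wm / n
print(f"marked: {stat_marked:.2f}  unmarked: {stat_clean:.2f}")   # roughly 1.0 vs 0.0
```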
Improved Modeling Tools Development for High Penetration Solar
DOE Office of Scientific and Technical Information (OSTI.GOV)
Washom, Byron; Meagher, Kevin
2014-12-11
One of the significant objectives of the High Penetration Solar research is to help the DOE understand, anticipate, and minimize grid operation impacts as more solar resources are added to the electric power system. For Task 2.2, an effective, reliable approach to predicting solar energy availability for energy generation forecasts using the University of California, San Diego (UCSD) Sky Imager technology has been demonstrated. Granular cloud and ramp forecasts for the next 5 to 20 minutes over an area of 10 square miles were developed. Sky images taken every 30 seconds are processed to determine cloud locations and cloud motion vectors, yielding future cloud shadow locations with respect to distributed generation or utility solar power plants in the area. The performance of the method depends on cloud characteristics. On days with more advective cloud conditions, the developed method outperforms persistence forecasts by up to 30% (based on mean absolute error). On days with dynamic conditions, the method performs worse than persistence. Sky Imagers hold promise for ramp forecasting and ramp mitigation in conjunction with inverter controls and energy storage. The pre-commercial Sky Imager solar forecasting algorithm was documented with licensing information and was a SunShot website highlight.
Cloud Engineering Principles and Technology Enablers for Medical Image Processing-as-a-Service
Bao, Shunxing; Plassard, Andrew J.; Landman, Bennett A.; Gokhale, Aniruddha
2017-01-01
Traditional in-house, laboratory-based medical imaging studies use hierarchical data structures (e.g., NFS file stores) or databases (e.g., COINS, XNAT) for storage and retrieval. The resulting performance from these approaches is, however, impeded by standard network switches since they can saturate network bandwidth during transfer from storage to processing nodes for even moderate-sized studies. To that end, a cloud-based “medical image processing-as-a-service” offers promise in utilizing the ecosystem of Apache Hadoop, which is a flexible framework providing distributed, scalable, fault-tolerant storage and parallel computational modules, and HBase, which is a NoSQL database built atop Hadoop’s distributed file system. Despite this promise, HBase’s load distribution strategy of region split and merge is detrimental to the hierarchical organization of imaging data (e.g., project, subject, session, scan, slice). This paper makes two contributions to address these concerns by describing key cloud engineering principles and technology enhancements we made to the Apache Hadoop ecosystem for medical imaging applications. First, we propose a row-key design for HBase, which is a necessary step that is driven by the hierarchical organization of imaging data. Second, we propose a novel data allocation policy within HBase to strongly enforce collocation of hierarchically related imaging data. The proposed enhancements accelerate data processing by minimizing network usage and localizing processing to machines where the data already exist. Moreover, our approach is amenable to the traditional scan-, subject-, and project-level analysis procedures, and is compatible with standard command-line/scriptable image processing software. Experimental results for an illustrative sample of imaging data reveal that our new HBase policy results in a three-fold time improvement in conversion of classic DICOM to NIfTI file formats when compared with the default HBase region split policy, and nearly a six-fold improvement over a commonly available network file system (NFS) approach even for relatively small file sets. Moreover, file access latency is lower than with network-attached storage. PMID:28884169
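The row-key idea can be illustrated in a few lines: fixed-width, most-significant-component-first keys make lexicographic order coincide with the project/subject/session/scan/slice hierarchy, so related rows land in the same HBase region. The numeric identifiers, field widths and delimiter below are illustrative assumptions, not the paper's exact key schema.

```python
def make_row_key(project_id, subject_id, session_id, scan_id, slice_idx):
    """Build a row key whose lexicographic order follows the imaging hierarchy."""
    # Most significant component first; zero padding keeps string order equal to numeric order.
    return f"{project_id:04d}-{subject_id:06d}-{session_id:03d}-{scan_id:03d}-{slice_idx:05d}"

keys = [make_row_key(12, 345, 2, 1, i) for i in (3, 1, 2)]
print(sorted(keys))   # slices of the same scan sort, and therefore collocate, together
```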
How to Cloud for Earth Scientists: An Introduction
NASA Technical Reports Server (NTRS)
Lynnes, Chris
2018-01-01
This presentation is a tutorial on getting started with cloud computing for the purposes of Earth observation datasets. We first discuss some of the main advantages that cloud computing can provide for the Earth scientist: copious processing power, immense and affordable data storage, and rapid startup time. We also talk about some of the challenges of getting the most out of cloud computing, such as re-organizing the way data are analyzed and handling node failures.
A Secure and Verifiable Outsourced Access Control Scheme in Fog-Cloud Computing
Fan, Kai; Wang, Junxiong; Wang, Xin; Li, Hui; Yang, Yintang
2017-01-01
With the rapid development of big data and the Internet of Things (IoT), the number of networking devices and the volume of data are increasing dramatically. Fog computing, which extends cloud computing to the edge of the network, can effectively solve the bottleneck problems of data transmission and data storage. However, security and privacy challenges are also arising in the fog-cloud computing environment. Ciphertext-policy attribute-based encryption (CP-ABE) can be adopted to realize data access control in fog-cloud computing systems. In this paper, we propose a verifiable outsourced multi-authority access control scheme, named VO-MAACS. In our construction, most encryption and decryption computations are outsourced to fog devices and the computation results can be verified by using our verification method. Meanwhile, to address the revocation issue, we design an efficient user and attribute revocation method for it. Finally, analysis and simulation results show that our scheme is both secure and highly efficient. PMID:28737733
Risk in the Clouds?: Security Issues Facing Government Use of Cloud Computing
NASA Astrophysics Data System (ADS)
Wyld, David C.
Cloud computing is poised to become one of the most important and fundamental shifts in how computing is consumed and used. Forecasts show that government will play a lead role in adopting cloud computing - for data storage, applications, and processing power, as IT executives seek to maximize their returns on limited procurement budgets in these challenging economic times. After an overview of the cloud computing concept, this article explores the security issues facing public sector use of cloud computing and looks at the risks and benefits of shifting to cloud-based models. It concludes with an analysis of the challenges that lie ahead for government use of cloud resources.
NASA Technical Reports Server (NTRS)
Hung, R. J.; Tsao, Y. D.
1988-01-01
Rawinsonde data and geosynchronous satellite imagery were used to investigate the life cycles of severe convective storms at St. Anthony, Minnesota. It is found that the fully developed storm clouds, with overshooting cloud tops penetrating above the tropopause, collapsed about three minutes before the touchdown of the tornadoes. Results indicate that the probability of an outbreak of tornadoes causing greater damage increases with higher values of potential energy storage per unit area for overshooting cloud tops penetrating the tropopause. It is also found that clouds with a lower moisture content are less likely to grow into storm clouds than clouds with a higher moisture content.
High Performance Molecular Visualization: In-Situ and Parallel Rendering with EGL.
Stone, John E; Messmer, Peter; Sisneros, Robert; Schulten, Klaus
2016-05-01
Large scale molecular dynamics simulations produce terabytes of data that is impractical to transfer to remote facilities. It is therefore necessary to perform visualization tasks in-situ as the data are generated, or by running interactive remote visualization sessions and batch analyses co-located with direct access to high performance storage systems. A significant challenge for deploying visualization software within clouds, clusters, and supercomputers involves the operating system software required to initialize and manage graphics acceleration hardware. Recently, it has become possible for applications to use the Embedded-system Graphics Library (EGL) to eliminate the requirement for windowing system software on compute nodes, thereby eliminating a significant obstacle to broader use of high performance visualization applications. We outline the potential benefits of this approach in the context of visualization applications used in the cloud, on commodity clusters, and supercomputers. We discuss the implementation of EGL support in VMD, a widely used molecular visualization application, and we outline benefits of the approach for molecular visualization tasks on petascale computers, clouds, and remote visualization servers. We then provide a brief evaluation of the use of EGL in VMD, with tests using developmental graphics drivers on conventional workstations and on Amazon EC2 G2 GPU-accelerated cloud instance types. We expect that the techniques described here will be of broad benefit to many other visualization applications.
Seamless personal health information system in cloud computing.
Chung, Wan-Young; Fong, Ee May
2014-01-01
Noncontact ECG measurement has gained popularity in recent years due to its noninvasive nature and its convenience for daily-life application. This approach does not require any direct contact between the patient's skin and the sensor for physiological signal measurement. The noncontact ECG measurement is integrated with a mobile healthcare system for health status monitoring. The mobile phone acts as the personal health information system, displaying health status and body mass index (BMI) tracking. Besides that, it plays an important role as the medical guide, providing a medical knowledge database including a symptom checker and health fitness guidance. At the same time, the system also features some unique medical functions that cater to the living demands of patients or users, including regular medication reminders, alert alarms, medical guidance and appointment scheduling. Lastly, we demonstrate the mobile healthcare system with a web application for extended uses; health data are thus stored in the cloud on a web server system and web database storage. This allows remote health status monitoring easily and thereby promotes a cost-effective personal healthcare system.
INDIGO-DataCloud solutions for Earth Sciences
NASA Astrophysics Data System (ADS)
Aguilar Gómez, Fernando; de Lucas, Jesús Marco; Fiore, Sandro; Monna, Stephen; Chen, Yin
2017-04-01
INDIGO-DataCloud (https://www.indigo-datacloud.eu/) is a European Commission funded project aiming to develop a data and computing platform targeting scientific communities, deployable on multiple hardware platforms and provisioned over hybrid (private or public) e-infrastructures. The development of INDIGO solutions covers the different layers in cloud computing (IaaS, PaaS, SaaS) and provides tools to exploit resources like HPC or GPGPUs. INDIGO is oriented to support European scientific research communities, which are well represented in the project. Twelve different case studies have been analyzed in detail from different fields: Biological & Medical sciences, Social sciences & Humanities, Environmental and Earth sciences, and Physics & Astrophysics. INDIGO-DataCloud provides solutions to emerging challenges in Earth Science such as: - Enabling an easy deployment of community services at different cloud sites. Many Earth Science research infrastructures involve distributed observation stations across countries, and also have distributed data centers to support the corresponding data acquisition and curation. There is a need to easily deploy new data center services as the research infrastructure continues to expand. As an example, LifeWatch (ESFRI, Ecosystems and Biodiversity) uses INDIGO solutions to manage the deployment of services to perform complex hydrodynamics and water quality modelling over a cloud computing environment, predicting algae blooms, using Docker technology: TOSCA requirement description, Docker repository, Orchestrator for deployment, AAI (AuthN, AuthZ) and OneData (distributed storage system). - Supporting big data analysis. Nowadays, many Earth Science research communities produce large amounts of data and are challenged by the difficulties of processing and analysing it. A climate model intercomparison data analysis case study for the European Network for Earth System Modelling (ENES) community has been set up, based on the Ophidia big data analysis framework and the Kepler workflow management system. Such services normally involve a large and distributed set of data and computing resources. In this regard, this case study exploits the INDIGO PaaS for a flexible and dynamic allocation of resources at the infrastructural level. - Providing distributed data storage solutions. In order to allow scientific communities to perform heavy computation on huge datasets, INDIGO provides global data access solutions allowing researchers to access data in a distributed environment regardless of its location, and also to publish and share their research results with public or closed communities. INDIGO solutions that support access to distributed data storage (OneData) are being tested on EMSO infrastructure (Ocean Sciences and Geohazards) data. Another aspect of interest for the EMSO community is efficient data processing by exploiting INDIGO services like the PaaS Orchestrator. Further, for HPC exploitation, a new solution named Udocker has been implemented, enabling users to execute Docker containers on supercomputers without requiring administration privileges. This presentation will overview INDIGO solutions that are interesting and useful for Earth science communities and will show how they can be applied to other case studies.
NASA Astrophysics Data System (ADS)
Jiang, Guodong; Fan, Ming; Li, Lihua
2016-03-01
Mammography is the gold standard for breast cancer screening, reducing mortality by about 30%. The application of a computer-aided detection (CAD) system to assist a single radiologist is important to further improve mammographic sensitivity for breast cancer detection. In this study, the design and realization of a prototype remote diagnosis system for mammography based on a cloud platform were proposed. To build this system, technologies including medical image information construction, cloud infrastructure and a human-machine diagnosis model were utilized. Specifically, on one hand, the web platform for remote diagnosis was established with J2EE web technology and the back end was realized through the Hadoop open-source framework. On the other hand, the storage system was built with Hadoop Distributed File System (HDFS) technology, which enables users to easily develop and run applications on massive data and exploits the advantages of cloud computing, characterized by high efficiency, scalability and low cost. In addition, the CAD system was realized through the MapReduce framework. The diagnosis module in this system implemented algorithms that fuse machine and human intelligence. Specifically, we combined diagnoses from doctors' experience and traditional CAD by using a man-machine intelligent fusion model based on Alpha-Integration and a multi-agent algorithm. Finally, applications of this system at different levels of the platform are also discussed. This diagnosis system will be of great importance for balancing health resources, lowering medical expenses and improving diagnostic accuracy in basic medical institutes.
NASA Astrophysics Data System (ADS)
Basri, M.; Mawengkang, H.; Zamzami, E. M.
2018-03-01
Limited local storage is one reason to switch to cloud storage. The confidentiality and security of data stored in the cloud are very important. One way to maintain the confidentiality and security of such data is to use cryptographic techniques. The Data Encryption Standard (DES) is one of the block cipher algorithms used as a standard symmetric encryption algorithm. DES produces 8 cipher blocks that are combined into one ciphertext, but this ciphertext is weak against brute-force attacks. Therefore, the 8 cipher blocks are converted into 8 random images using the Least Significant Bit (LSB) algorithm, which embeds the DES cipher output so that it can later be extracted and merged back into one ciphertext.
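The embedding step can be sketched in a few lines: each cipher byte is spread over the least significant bits of cover-image pixels and recovered by reading those bits back. The stand-in 8-byte block, the grayscale cover and the bit layout below are illustrative assumptions; DES itself and the splitting across 8 separate images are not shown.

```python
import numpy as np

def lsb_embed(cover, payload):
    """Hide payload bytes in the least significant bit of each pixel of a grayscale cover image."""
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    flat = cover.flatten()
    if bits.size > flat.size:
        raise ValueError("cover image too small for payload")
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits   # overwrite the LSB plane
    return flat.reshape(cover.shape)

def lsb_extract(stego, n_bytes):
    bits = (stego.flatten()[:n_bytes * 8] & 1).astype(np.uint8)
    return np.packbits(bits).tobytes()

block = bytes.fromhex("0123456789abcdef")                  # stand-in for one 8-byte DES cipher block
cover = np.random.default_rng(1).integers(0, 256, (64, 64), dtype=np.uint8)
stego = lsb_embed(cover, block)
assert lsb_extract(stego, len(block)) == block             # the cipher block is recoverable
```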
2010-09-01
Cloud computing describes a new distributed computing paradigm for IT data and services that involves over-the-Internet provision of dynamically scalable and often virtualized resources. While cost reduction and flexibility in storage, services, and maintenance are important considerations when deciding whether or how to migrate data and applications to the cloud, large organizations like the Department of Defense need to consider the organization and structure of data on the cloud and the operations on such data in order to reap the full benefit of cloud computing.
Long Read Alignment with Parallel MapReduce Cloud Platform
Al-Absi, Ahmed Abdulhakim; Kang, Dae-Ki
2015-01-01
Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing gene sequencing tools for cloud platforms predominantly consider short-read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of the map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce the Burrows-Wheeler Aligner's Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts the widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy for the MapReduce phases and optimization of the Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 for BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman reduces the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms. PMID:26839887
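For context, the scoring recurrence at the heart of Smith-Waterman local alignment is only a few lines; the sketch below computes the best local alignment score for two short strings. The match/mismatch/gap values are illustrative, and BWA-SW's seeding heuristics and the MapReduce parallelisation described above are not reproduced here.

```python
import numpy as np

def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b (score matrix only)."""
    H = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = H[i - 1, j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i, j] = max(0, diag, H[i - 1, j] + gap, H[i, j - 1] + gap)
    return int(H.max())

print(smith_waterman("ACACACTA", "AGCACACA"))   # small illustrative pair
```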
ERIC Educational Resources Information Center
Aaron, Lynn S.; Roche, Catherine M.
2012-01-01
"Cloud computing" refers to the use of computing resources on the Internet instead of on individual personal computers. The field is expanding and has significant potential value for educators. This is discussed with a focus on four main functions: file storage, file synchronization, document creation, and collaboration--each of which has…
NASA Astrophysics Data System (ADS)
Moro, A. C.; Nadesh, R. K.
2017-11-01
The cloud computing paradigm has transformed the way we do business in today's world. Services in the cloud have come a long way from just providing basic storage or software on demand. One of the fastest growing factors in this is mobile cloud computing. With the option of offloading now available, mobile users can offload entire applications onto cloudlets. Given the problems regarding availability and the limited storage capacity of these mobile cloudlets, it becomes difficult for the mobile user to decide when to use local memory and when to use the cloudlets. Hence, we look at a fast algorithm that decides whether the mobile user should go to a cloudlet or rely on local memory, based on an offloading probability. We have partially implemented the algorithm that decides whether the task can be carried out locally or given to a cloudlet. As it becomes a burden on mobile devices to perform the complete computation, we look to offload this onto a cloud in our paper. Further, we use a file compression technique before sending the file to the cloud to reduce the load.
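A minimal decision sketch of the kind of rule described above is shown below. The threshold, the resource checks and the use of zlib as the pre-upload compression step are illustrative assumptions, not the paper's actual algorithm or parameters.

```python
import zlib

def choose_target(p_offload, cloudlet_available, local_free_mb, task_mb, threshold=0.5):
    """Pick where a task runs: cloudlet when reachable and the offloading probability is
    high enough, local memory when the task fits, otherwise the remote cloud."""
    if cloudlet_available and p_offload >= threshold:
        return "cloudlet"
    if local_free_mb >= task_mb:
        return "local"
    return "cloud"

def prepare_for_cloud(payload: bytes) -> bytes:
    return zlib.compress(payload)               # compress before upload to reduce transfer load

print(choose_target(0.7, True, 128, 300))       # -> cloudlet
print(choose_target(0.3, True, 128, 300))       # -> cloud
print(len(prepare_for_cloud(b"sensor log " * 1000)), "bytes after compression")
```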
Fine-grained Database Field Search Using Attribute-Based Encryption for E-Healthcare Clouds.
Guo, Cheng; Zhuang, Ruhan; Jie, Yingmo; Ren, Yizhi; Wu, Ting; Choo, Kim-Kwang Raymond
2016-11-01
An effectively designed e-healthcare system can significantly enhance the quality of access and experience of healthcare users, including facilitating medical and healthcare providers in ensuring a smooth delivery of services. Ensuring the security of patients' electronic health records (EHRs) in the e-healthcare system is an active research area. EHRs may be outsourced to a third party, such as a community healthcare cloud service provider, for storage due to cost-saving measures. Generally, encrypting the EHRs when they are stored in the system (i.e. data-at-rest) or prior to outsourcing the data is used to ensure data confidentiality. A searchable encryption (SE) scheme is a promising technique that can ensure the protection of private information without compromising on performance. In this paper, we propose a novel framework for controlling access to EHRs stored in semi-trusted cloud servers (e.g. a private cloud or a community cloud). To achieve fine-grained access control for EHRs, we leverage the ciphertext-policy attribute-based encryption (CP-ABE) technique to encrypt tables published by hospitals, including patients' EHRs, and the table is stored in the database with the primary key being the patient's unique identity. Our framework enables different users with different privileges to search on different database fields. Differing from previous attempts to secure the outsourcing of data, we emphasize control over the searches of the fields within the database. We demonstrate the utility of the scheme by evaluating it using datasets from the University of California, Irvine.
Building a cloud based distributed active archive data center
NASA Astrophysics Data System (ADS)
Ramachandran, Rahul; Baynes, Katie; Murphy, Kevin
2017-04-01
NASA's Earth Science Data System (ESDS) Program serves as a central cog in facilitating the implementation of NASA's Earth Science strategic plan. Since 1994, the ESDS Program has committed to the full and open sharing of Earth science data obtained from NASA instruments to all users. One of the key responsibilities of the ESDS Program is to continuously evolve the entire data and information system to maximize returns on the collected NASA data. An independent review was conducted in 2015 to holistically review the EOSDIS in order to identify gaps. The review recommendations were to investigate two areas: one, whether commercial cloud providers offer potential for storage, processing, and operational efficiencies, and two, the potential development of new data access and analysis paradigms. In response, ESDS has initiated several prototypes investigating the advantages and risks of leveraging cloud computing. This poster will provide an overview of one such prototyping activity, "Cumulus". Cumulus is being designed and developed as a "native" cloud-based data ingest, archive and management system that can be used for all future NASA Earth science data streams. The long term vision for Cumulus, its requirements, overall architecture, and implementation details, as well as lessons learned from the completion of the first phase of this prototype will be covered. We envision Cumulus will foster design of new analysis/visualization tools to leverage collocated data from all of the distributed DAACs as well as elastic cloud computing resources to open new research opportunities.
Buffering PV output during cloud transients with energy storage
NASA Astrophysics Data System (ADS)
Moumouni, Yacouba
This thesis considers the use of the major types of energy storage to mitigate the power output transients of grid-tied CPV systems caused by fast-moving cloud cover. The approach presented here is to buffer the intermittency of CPV output power with an energy storage device (used batteries) purchased cheaply from EV owners or battery leasers. When the CPV is connected to the grid with the proper energy storage, the main goal is to smooth out the intermittent solar power and the fluctuating load of the grid with a convenient control strategy. This thesis provides a detailed analysis with appropriate Matlab codes to put a constant amount of power onto the grid during the daytime on one hand and, on the other, to shift the less valuable off-peak electricity to the on-peak time, i.e. between 1 pm and 7 pm, when the electricity price is much better. In this study, a range of constant base power levels were assumed, including 15 kW, 20 kW, 21 kW, 22 kW, 23 kW, 24 kW and 25 kW. The hypothesis, based on an iterative solution, was that the capacity of the battery was increased in steps of 5 while the base supply was decreased by the same step size until satisfactory results were achieved. Hence, with the chosen battery capacity of 54 kWh, coupled with data from the Amonix CPV 7700 unit in Las Vegas for a 3-month period, it was found that 20 kW was the largest constant load the system could supply uninterruptedly to the utility company. Simulated results are presented to show the feasibility of the proposed scheme.
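The buffering idea can be illustrated with a simple energy-balance loop: deliver a constant base power, charge the battery when CPV output exceeds it, and discharge when clouds cut output. The 54 kWh capacity and 20 kW base level follow the abstract, but the lossless battery model, the one-minute time step and the synthetic CPV profile are assumptions for illustration only.

```python
import numpy as np

dt_h = 1 / 60                       # one-minute time steps, in hours
base_kw = 20.0                      # constant power delivered to the grid
capacity_kwh = 54.0                 # battery capacity used in the study
soc_kwh = capacity_kwh / 2          # start half full

rng = np.random.default_rng(0)
cpv_kw = np.clip(25 + 10 * rng.standard_normal(8 * 60), 0, 53)   # noisy synthetic CPV output

unserved_kwh = 0.0
for p in cpv_kw:
    new_soc = soc_kwh + (p - base_kw) * dt_h       # surplus charges, deficit discharges
    if new_soc < 0:
        unserved_kwh += -new_soc                   # shortfall an empty battery cannot cover
    soc_kwh = min(max(new_soc, 0.0), capacity_kwh)
print(f"energy the battery could not cover: {unserved_kwh:.2f} kWh")
```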
Privacy-preserving public auditing for data integrity in cloud
NASA Astrophysics Data System (ADS)
Shaik Saleem, M.; Murali, M.
2018-04-01
Cloud computing, which has attracted extensive attention from the research community and from industry research and development, provides a large pool of computing resources through virtualized sharing of storage, processing power, applications and services. Cloud users are provided with on-demand resources as they need them. A cloud user's outsourced files can easily be tampered with, since they are stored in the databases of third-party service providers; the user has no control over the data, so its integrity cannot be guaranteed. Providing security assurance for users' data has therefore become one of the primary concerns for cloud service providers. Cloud servers are not responsible for any data loss, as they do not provide security assurance for the cloud user's data. Remote data integrity checking (RDIC) allows a data storage server to prove that it is really storing an owner's data truthfully. RDIC comprises a security model and ID-based RDIC, which is responsible for the security of every server and ensures the data privacy of the cloud user against the third-party verifier. Generally, by running a two-party RDIC protocol, clients can themselves check the data integrity of their cloud. In the two-party scenario, the verification result, given either by the data owner or by the cloud server, may be considered one-sided. The public verifiability feature of RDIC gives all users the privilege to verify whether the original data has been modified or not. To ensure the transparency of publicly verifiable RDIC protocols, we assume there exists a TPA with the knowledge and efficiency to carry out the verification, as enabled by publicly verifiable RDIC protocols.
NASA Astrophysics Data System (ADS)
Eilers, J.
2013-09-01
The interface analysis for an observer of space objects makes a standard necessary. This standardized dataset serves as input for a cloud-based service aimed at a near real-time Space Situational Awareness (SSA) system. The system has all the advantages of a cloud-based solution, like redundancy, scalability and an easy way to distribute information. For the standard based on the interface analysis of the observer, the information can be separated into three parts. One part is the information about the observer, e.g. a ground station. The next part is the information about the sensors that are used by the observer. And the last part is the data from the detected object. The backbone of the SSA system is the cloud-based service, which includes the consistency check for the observed objects, a database for the objects, the algorithms and analysis, as well as the visualization of the results. This paper also provides an approximation of the needed computational power and data storage, and a financial approach to deliver this service to a broad community. In this context, cloud means that neither the user nor the observer has to think about the infrastructure of the calculation environment. The decision whether the IT infrastructure will be built by a conglomerate of different nations or rented on the market should be based on an efficiency analysis. Combinations are also possible, like starting on a rented cloud and then moving to a private cloud owned by the government. One of the advantages of a cloud solution is scalability. There are about 3,000 satellites in space, 900 of them active, and in total there are about 17,000 detected space objects orbiting Earth. For the computation, however, this is not an N(active)-to-N(all) problem: filtering by apogee/perigee reduces the candidate set, so instead of 15.3 million possible collision pairs, only approximately 2.3 million possible collisions must be computed. In general, this Space Situational Awareness system can be used as a tool by satellite system owners for collision avoidance.
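A quick back-of-the-envelope check of the pair counts quoted above (the figures are the abstract's, rounded; the retained fraction is simply inferred from the 2.3 million result):

```python
n_total_objects = 17_000       # catalogued space objects
n_active = 900                 # active satellites to screen
all_pairs = n_active * n_total_objects
print(all_pairs)               # 15,300,000 active-vs-catalogue pairs
filtered_pairs = 2_300_000     # pairs remaining after apogee/perigee screening
print(f"screening keeps roughly {filtered_pairs / all_pairs:.0%} of the pairs")
```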
Manganaris, George A; Drogoudi, Pavlina; Goulas, Vlasios; Tanou, Georgia; Georgiadou, Egli C; Pantelidis, George E; Paschalidis, Konstantinos A; Fotopoulos, Vasileios; Manganaris, Athanasios
2017-10-01
The aim of this study was to understand the antioxidant metabolic changes of peach (cvs. 'Royal Glory', 'Red Haven' and 'Sun Cloud') and nectarine fruits (cv. 'Big Top') exposed to different combinations of low-temperature storage (0, 2, 4 weeks storage at 0 °C, 90% R.H.) and additional ripening at room temperature (1, 3 and 5 d, shelf life, 20 °C) with an array of analytical, biochemical and molecular approaches. Initially, harvested fruit of the examined cultivars were segregated non-destructively at advanced and less pronounced maturity stages and qualitative traits, physiological parameters, phytochemical composition and antioxidant capacity were determined. 'Big Top' and 'Royal Glory' fruits were characterized by slower softening rate and less pronounced ripening-related alterations. The coupling of HPLC fingerprints, consisted of 7 phenolic compounds (chlorogenic, neochlorogenic acid, catechin, epicatechin, rutin, quecetin-3-O-glucoside, procyanidin B1) and spectrophotometric methods disclosed a great impact of genotype on peach bioactive composition, with 'Sun Cloud' generally displaying the highest contents. Maturity stage at harvest did not seem to affect fruit phenolic composition and no general guidelines for the impact of cold storage and shelf-life on individual phenolic compounds can be extrapolated. Subsequently, fruit of less pronounced maturity at harvest were used for further molecular analysis. 'Sun Cloud' was proven efficient in protecting plasmid pBR322 DNA against ROO attack throughout the experimental period and against HO attack after 2 and 4 weeks of cold storage. Interestingly, a general down-regulation of key genes implicated in the antioxidant apparatus with the prolongation of storage period was recorded; this was more evident for CAT, cAPX, Cu/ZnSOD2, perAPX3 and GPX8 genes. Higher antioxidant capacity of 'Sun Cloud' fruit could potentially be linked with compounds other than enzymatic antioxidants that further regulate peach fruit ripening. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
2017-10-01
The aim of this study was to understand the antioxidant metabolic changes of peach (cvs. 'Royal Glory', 'Red Haven' and 'Sun Cloud') and nectarine fruits (cv. 'Big Top') exposed to different combinations of low-temperature storage (0, 2, 4 weeks of storage at 0 °C, 90% R.H.) and additional ripening at room temperature (1, 3 and 5 d shelf life, 20 °C) with an array of analytical, biochemical and molecular approaches. Initially, harvested fruit of the examined cultivars were segregated non-destructively into advanced and less pronounced maturity stages, and qualitative traits, physiological parameters, phytochemical composition and antioxidant capacity were determined. 'Big Top' and 'Royal Glory' fruits were characterized by a slower softening rate and less pronounced ripening-related alterations. The coupling of HPLC fingerprints, consisting of 7 phenolic compounds (chlorogenic acid, neochlorogenic acid, catechin, epicatechin, rutin, quercetin-3-O-glucoside, procyanidin B1), with spectrophotometric methods disclosed a great impact of genotype on peach bioactive composition, with 'Sun Cloud' generally displaying the highest contents. Maturity stage at harvest did not seem to affect fruit phenolic composition, and no general guidelines for the impact of cold storage and shelf life on individual phenolic compounds can be extrapolated. Subsequently, fruit of less pronounced maturity at harvest were used for further molecular analysis. 'Sun Cloud' was proven efficient in protecting plasmid pBR322 DNA against peroxyl radical (ROO·) attack throughout the experimental period and against hydroxyl radical (HO·) attack after 2 and 4 weeks of cold storage. Interestingly, a general down-regulation of key genes implicated in the antioxidant apparatus with the prolongation of the storage period was recorded; this was more evident for the CAT, cAPX, Cu/ZnSOD2, perAPX3 and GPX8 genes. The higher antioxidant capacity of 'Sun Cloud' fruit could potentially be linked with compounds other than enzymatic antioxidants that further regulate peach fruit ripening. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Billing, M. G.; Conway, J. V.; Crittenden, J. A.
2016-04-28
Cornell's electron/positron storage ring (CESR) was modified over a series of accelerator shutdowns beginning in May 2008, which substantially improves its capability for research and development for particle accelerators. CESR's energy span from 1.8 to 5.6 GeV with both electrons and positrons makes it ideal for the study of a wide spectrum of accelerator physics issues and instrumentation related to present light sources and future lepton damping rings. Additionally, a number of these are also relevant for the beam physics of proton accelerators. This paper is the third in a series of four describing the conversion of CESR to the test accelerator, CESRTA. The first two papers discuss the overall plan for the conversion of the storage ring to an instrument capable of studying advanced accelerator physics issues [1] and the details of the vacuum system upgrades [2]. This paper focuses on the necessary development of new instrumentation, situated in four dedicated experimental regions, capable of studying such phenomena as electron clouds (ECs) and methods to mitigate EC effects. The fourth paper in this series describes the vacuum system modifications of the superconducting wigglers to accommodate the diagnostic instrumentation for the study of EC behavior within wigglers. Lastly, while the initial studies of CESRTA focused on questions related to the International Linear Collider damping ring design, CESRTA is a very versatile storage ring, capable of studying a wide range of accelerator physics and instrumentation questions.
Large-Scale, Multi-Sensor Atmospheric Data Fusion Using Hybrid Cloud Computing
NASA Astrophysics Data System (ADS)
Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E. J.
2015-12-01
NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, MODIS, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration analyses presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over 10 years of data. HySDS is a Hybrid-Cloud Science Data System that has been developed and applied under NASA AIST, MEaSUREs, and ACCESS grants. HySDS uses the SciFlow workflow engine to partition analysis workflows into parallel tasks (e.g. segmenting by time or space) that are pushed into a durable job queue. The tasks are "pulled" from the queue by worker Virtual Machines (VMs) and executed in an on-premise cloud (Eucalyptus or OpenStack), or at Amazon in the public cloud or GovCloud. In this way, years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on demand into the cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the transferred data. We are using HySDS to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a MEaSUREs grant. We will present the architecture of HySDS, describe the achieved "clock time" speedups in fusing datasets on our own nodes and in the Amazon cloud, and discuss the cloud cost tradeoffs for storage, compute, and data transfer. Our system demonstrates how one can pull A-Train variables (Levels 2 & 3) on demand into the Amazon cloud, and cache only those variables that are heavily used, so that any number of compute jobs can be executed "near" the multi-sensor data. Decade-long, multi-sensor studies can be performed without pre-staging data, with the researcher paying only his own cloud compute bill.
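The partition-and-queue pattern described above can be mimicked in a few lines: a decade of data is split into monthly tasks, pushed onto a queue, and pulled by workers until the queue drains. HySDS itself uses durable cloud queues and worker VMs; the local thread pool and the placeholder task body here are illustrative assumptions.

```python
import queue
import threading

tasks = queue.Queue()
for year in range(2003, 2013):
    for month in range(1, 13):
        tasks.put((year, month))              # one independent task per month of data

results = []

def worker():
    while True:
        try:
            year, month = tasks.get_nowait()
        except queue.Empty:
            return
        # stand-in for pulling that month's variables (e.g. via OPeNDAP) and comparing retrievals
        results.append((year, month, "done"))
        tasks.task_done()

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(results), "monthly tasks processed")   # 120 tasks for the ten-year record
```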
Design and implementation of website information disclosure assessment system.
Cho, Ying-Chiang; Pan, Jen-Yi
2015-01-01
Internet application technologies, such as cloud computing and cloud storage, have increasingly changed people's lives. Websites contain vast amounts of personal privacy information. In order to protect this information, network security technologies, such as database protection and data encryption, attract many researchers. The most serious problems concerning web vulnerability are e-mail address and network database leakages. These leakages have many causes. For example, malicious users can steal database contents, taking advantage of mistakes made by programmers and administrators. In order to mitigate this type of abuse, a website information disclosure assessment system is proposed in this study. This system utilizes a series of technologies, such as web crawler algorithms, SQL injection attack detection, and web vulnerability mining, to assess a website's information disclosure. Thirty websites, randomly sampled from the top 50 world colleges, were used to collect leakage information. This testing showed the importance of increasing the security and privacy of website information for academic websites.
On the Modeling and Management of Cloud Data Analytics
NASA Astrophysics Data System (ADS)
Castillo, Claris; Tantawi, Asser; Steinder, Malgorzata; Pacifici, Giovanni
A new era is dawning where vast amount of data is subjected to intensive analysis in a cloud computing environment. Over the years, data about a myriad of things, ranging from user clicks to galaxies, have been accumulated, and continue to be collected, on storage media. The increasing availability of such data, along with the abundant supply of compute power and the urge to create useful knowledge, gave rise to a new data analytics paradigm in which data is subjected to intensive analysis, and additional data is created in the process. Meanwhile, a new cloud computing environment has emerged where seemingly limitless compute and storage resources are being provided to host computation and data for multiple users through virtualization technologies. Such a cloud environment is becoming the home for data analytics. Consequently, providing good performance at run-time to data analytics workload is an important issue for cloud management. In this paper, we provide an overview of the data analytics and cloud environment landscapes, and investigate the performance management issues related to running data analytics in the cloud. In particular, we focus on topics such as workload characterization, profiling analytics applications and their pattern of data usage, cloud resource allocation, placement of computation and data and their dynamic migration in the cloud, and performance prediction. In solving such management problems one relies on various run-time analytic models. We discuss approaches for modeling and optimizing the dynamic data analytics workload in the cloud environment. All along, we use the Map-Reduce paradigm as an illustration of data analytics.
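To make the Map-Reduce paradigm referenced above concrete, here is a toy pass in plain Python: map emits (key, value) pairs, a shuffle groups them by key, and reduce aggregates each group. The click-log records and counting task are illustrative placeholders for the heavier analytics workloads the paper discusses.

```python
from collections import defaultdict

records = ["click page_a", "click page_b", "click page_a", "view page_a"]

def map_phase(record):
    event, page = record.split()
    yield (event, page), 1                  # emit one count per (event, page) pair

def reduce_phase(key, values):
    return key, sum(values)                 # aggregate all counts for a key

groups = defaultdict(list)
for rec in records:                         # map + shuffle
    for k, v in map_phase(rec):
        groups[k].append(v)
print([reduce_phase(k, vs) for k, vs in groups.items()])
```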
Storage element performance optimization for CMS analysis jobs
NASA Astrophysics Data System (ADS)
Behrmann, G.; Dahlblom, J.; Guldmyr, J.; Happonen, K.; Lindén, T.
2012-12-01
Tier-2 computing sites in the Worldwide Large Hadron Collider Computing Grid (WLCG) host CPU resources (Compute Element, CE) and storage resources (Storage Element, SE). The vast amount of data that needs to be processed from the Large Hadron Collider (LHC) experiments requires good and efficient use of the available resources. Achieving good CPU efficiency for the end users' analysis jobs requires that the performance of the storage system is able to scale with I/O requests from hundreds or even thousands of simultaneous jobs. In this presentation we report on the work on improving the SE performance at the Helsinki Institute of Physics (HIP) Tier-2 used for the Compact Muon Solenoid (CMS) experiment at the LHC. Statistics from CMS grid jobs are collected and stored in the CMS Dashboard for further analysis, which allows for easy performance monitoring by the sites and by the CMS collaboration. As part of the monitoring framework CMS uses the JobRobot, which sends 100 analysis jobs to each site every four hours. CMS also uses the HammerCloud tool for site monitoring and stress testing, which has since replaced the JobRobot. The performance of the analysis workflow submitted with JobRobot or HammerCloud can be used to track the performance due to site configuration changes, since the analysis workflow is kept the same for all sites and for months at a time. The CPU efficiency of the JobRobot jobs at HIP was increased by approximately 50%, to more than 90%, by tuning the SE and through improvements in the CMSSW and dCache software. The performance of the CMS analysis jobs improved significantly too. Similar work has been done at other CMS Tier sites, since on average the CPU efficiency for CMSSW jobs increased during 2011. Better monitoring of the SE allows faster detection of problems, so that the performance level can be kept high. The next storage upgrade at HIP consists of SAS disk enclosures which can be stress tested on demand with HammerCloud workflows, to make sure that the I/O performance is good.
Rautenberg, Philipp L.; Kumaraswamy, Ajayrama; Tejero-Cantero, Alvaro; Doblander, Christoph; Norouzian, Mohammad R.; Kai, Kazuki; Jacobsen, Hans-Arno; Ai, Hiroyuki; Wachtler, Thomas; Ikeno, Hidetoshi
2014-01-01
Neuroscience today deals with a “data deluge” derived from the availability of high-throughput sensors of brain structure and brain activity, and increased computational resources for detailed simulations with complex output. We report here (1) a novel approach to data sharing between collaborating scientists that brings together file system tools and cloud technologies, (2) a service implementing this approach, called NeuronDepot, and (3) an example application of the service to a complex use case in the neurosciences. The main drivers for our approach are to facilitate collaborations with a transparent, automated data flow that shields scientists from having to learn new tools or data structuring paradigms. Using NeuronDepot is simple: one-time data assignment from the originator and cloud based syncing—thus making experimental and modeling data available across the collaboration with minimum overhead. Since data sharing is cloud based, our approach opens up the possibility of using new software developments and hardware scalability which are associated with elastic cloud computing. We provide an implementation that relies on existing synchronization services and is usable from all devices via a reactive web interface. We are motivating our solution by solving the practical problems of the GinJang project, a collaboration of three universities across eight time zones with a complex workflow encompassing data from electrophysiological recordings, imaging, morphological reconstructions, and simulations. PMID:24971059
Motion/imagery secure cloud enterprise architecture analysis
NASA Astrophysics Data System (ADS)
DeLay, John L.
2012-06-01
Cloud computing with storage virtualization and new service-oriented architectures brings a new perspective to the aspect of a distributed motion imagery and persistent surveillance enterprise. Our existing research is focused mainly on content management, distributed analytics, and WAN distributed cloud networking performance issues of cloud-based technologies. The potential of leveraging cloud-based technologies for hosting motion imagery, imagery and analytics workflows for DOD and security applications is relatively unexplored. This paper will examine technologies for managing, storing, processing and disseminating motion imagery and imagery within a distributed network environment. Finally, we propose areas for future research in the area of distributed cloud content management enterprises.
Xu, Zhongxiao; Wu, Yuelong; Tian, Long; Chen, Lirong; Zhang, Zhiying; Yan, Zhihui; Li, Shujing; Wang, Hai; Xie, Changde; Peng, Kunchi
2013-12-13
Long-lived and high-fidelity memory for a photonic polarization qubit (PPQ) is crucial for constructing quantum networks. We present a millisecond storage system based on electromagnetically induced transparency, in which a moderate magnetic field is applied on a cold-atom cloud to lift Zeeman degeneracy and, thus, the PPQ states are stored as two magnetic-field-insensitive spin waves. In particular, the influence of magnetic-field-sensitive spin waves on the storage performance is almost entirely avoided. The measured average fidelities of the polarization states are 98.6% at 200 μs and 78.4% at 4.5 ms, respectively.
Moving image analysis to the cloud: A case study with a genome-scale tomographic study
NASA Astrophysics Data System (ADS)
Mader, Kevin; Stampanoni, Marco
2016-01-01
Over the last decade, the time required to measure a terabyte of microscopic imaging data has gone from years to minutes. This shift has moved many of the challenges away from experimental design and measurement to scalable storage, organization, and analysis. As many scientists and scientific institutions lack training and competencies in these areas, major bottlenecks have arisen and led to substantial delays and gaps between measurement, understanding, and dissemination. We present in this paper a framework for analyzing large 3D datasets using cloud-based computational and storage resources. We demonstrate its applicability by showing the setup and costs associated with the analysis of a genome-scale study of bone microstructure. We then evaluate the relative advantages and disadvantages associated with local versus cloud infrastructures.
NASA Astrophysics Data System (ADS)
Patel, M. N.; Young, K.; Halling-Brown, M. D.
2018-03-01
The demand for medical images for research is ever increasing owing to the rapid rise in novel machine learning approaches for early detection and diagnosis. The OPTIMAM Medical Image Database (OMI-DB) was created to provide a centralized, fully annotated dataset for research. The database contains both processed and unprocessed images, associated data, annotations and expert-determined ground truths. Since the inception of the database in early 2011, the volume of images and associated data collected has dramatically increased owing to automation of the collection pipeline and inclusion of new sites. Currently, these data are stored at each respective collection site and synced periodically to a central store. This leads to a large data footprint at each site, requiring large physical onsite storage, which is expensive. Here, we propose an update to the OMI-DB collection system, whereby all the data are automatically transferred to the cloud on collection. This change in the data collection paradigm reduces the reliance on physical servers at each site, allows greater scope for future expansion, removes the need for dedicated backups, and improves security. Moreover, with the number of applications accessing the data increasing rapidly as the dataset matures, cloud technology facilitates faster sharing of data and better auditing of data access. Such updates, although they may sound trivial, require substantial modification to the existing pipeline to ensure data integrity and security compliance. Here, we describe the extensions to the OMI-DB collection pipeline and discuss the relative merits of the new system.
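A minimal sketch of the "transfer to the cloud on collection" idea is given below, assuming an S3-style object store accessed through boto3; the bucket name, key prefix and checksum metadata scheme are invented for illustration and are not the actual OMI-DB pipeline.

```python
import hashlib
import boto3

def upload_on_collection(local_path, bucket="omi-db-example-bucket", prefix="incoming/"):
    """Sketch: push a newly collected image straight to object storage and
    attach a checksum so integrity can be verified on the server side."""
    sha256 = hashlib.sha256()
    with open(local_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha256.update(chunk)
    s3 = boto3.client("s3")
    key = prefix + local_path.rsplit("/", 1)[-1]
    s3.upload_file(local_path, bucket, key,
                   ExtraArgs={"Metadata": {"sha256": sha256.hexdigest()}})
    return key, sha256.hexdigest()
```

Pushing the object and its checksum in one step is what removes the need for a large local copy and a separate backup run at the collection site.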
Dalpé, Gratien; Joly, Yann
2014-09-01
Healthcare-related bioinformatics databases are increasingly offering the possibility to maintain, organize, and distribute DNA sequencing data. Different national and international institutions are currently hosting such databases that offer researchers website platforms where they can obtain sequencing data on which they can perform different types of analysis. Until recently, this process remained mostly one-dimensional, with most analysis concentrated on a limited amount of data. However, newer genome sequencing technology is producing a huge amount of data that current computer facilities are unable to handle. An alternative approach has been to start adopting cloud computing services for combining the information embedded in genomic and model system biology data, patient healthcare records, and clinical trials' data. In this new technological paradigm, researchers use virtual space and computing power from existing commercial or not-for-profit cloud service providers to access, store, and analyze data via different application programming interfaces. Cloud services are an alternative to the need of larger data storage; however, they raise different ethical, legal, and social issues. The purpose of this Commentary is to summarize how cloud computing can contribute to bioinformatics-based drug discovery and to highlight some of the outstanding legal, ethical, and social issues that are inherent in the use of cloud services. © 2014 Wiley Periodicals, Inc.
Cloud-based adaptive exon prediction for DNA analysis
Putluri, Srinivasareddy; Fathima, Shaik Yasmeen
2018-01-01
Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs will be minimised by the use of cloud services. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques were found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of the various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database. PMID:29515813
Silva, Luís A Bastião; Costa, Carlos; Oliveira, José Luis
2013-05-01
Healthcare institutions worldwide have adopted picture archiving and communication system (PACS) for enterprise access to images, relying on Digital Imaging Communication in Medicine (DICOM) standards for data exchange. However, communication over a wider domain of independent medical institutions is not well standardized. A DICOM-compliant bridge was developed for extending and sharing DICOM services across healthcare institutions without requiring complex network setups or dedicated communication channels. A set of DICOM routers interconnected through a public cloud infrastructure was implemented to support medical image exchange among institutions. Despite the advantages of cloud computing, new challenges were encountered regarding data privacy, particularly when medical data are transmitted over different domains. To address this issue, a solution was introduced by creating a ciphered data channel between the entities sharing DICOM services. Two main DICOM services were implemented in the bridge: Storage and Query/Retrieve. The performance measures demonstrated it is quite simple to exchange information and processes between several institutions. The solution can be integrated with any currently installed PACS-DICOM infrastructure. This method works transparently with well-known cloud service providers. Cloud computing was introduced to augment enterprise PACS by providing standard medical imaging services across different institutions, offering communication privacy and enabling creation of wider PACS scenarios with suitable technical solutions.
Where the Cloud Meets the Commons
ERIC Educational Resources Information Center
Ipri, Tom
2011-01-01
Changes presented by cloud computing--shared computing services, applications, and storage available to end users via the Internet--have the potential to seriously alter how libraries provide services, not only remotely, but also within the physical library, specifically concerning challenges facing the typical desktop computing experience.…
NASA Astrophysics Data System (ADS)
Holtzapple, R. L.; Billing, M. G.; Campbell, R. C.; Dugan, G. F.; Flanagan, J.; McArdle, K. E.; Miller, M. I.; Palmer, M. A.; Ramirez, G. A.; Sonnad, K. G.; Totten, M. M.; Tucker, S. L.; Williams, H. A.
2016-04-01
Electron cloud related emittance dilution and instabilities of bunch trains limit the performance of high intensity circular colliders. One of the key goals of the Cornell electron-positron storage ring Test Accelerator (CesrTA) research program is to improve our understanding of how the electron cloud alters the dynamics of bunches within the train. Single bunch beam diagnostics have been developed to measure the beam spectra and vertical beam size, two important dynamical effects of beams interacting with the electron cloud, for bunch trains on a turn-by-turn basis. Experiments have been performed at CesrTA to probe the interaction of the electron cloud with stored positron bunch trains. The purpose of these experiments was to characterize the dependence of beam-electron cloud interactions on the machine parameters such as bunch spacing, vertical chromaticity, and bunch current. The beam dynamics of the stored beam, in the presence of the electron cloud, was quantified using: 1) a gated beam position monitor (BPM) and spectrum analyzer to measure the bunch-by-bunch frequency spectrum of the bunch trains; 2) an x-ray beam size monitor to record the bunch-by-bunch, turn-by-turn vertical size of each bunch within the trains. In this paper we report on the observations from these experiments and analyze the effects of the electron cloud on the stability of bunches in a train under many different operational conditions.
NASA Astrophysics Data System (ADS)
Wong, Jianhui; Lim, Yun Seng; Morris, Stella; Morris, Ezra; Chua, Kein Huat
2017-04-01
The amount of small-scale renewable energy sources is anticipated to increase on low-voltage distribution networks for the improvement of energy efficiency and the reduction of greenhouse gas emissions. The growth of PV systems on low-voltage distribution networks can create voltage unbalance, voltage rise, and reverse power flow. Usually these issues involve little fluctuation; in Malaysia, however, they fluctuate severely because the country has a low clear-sky index. Large amounts of cloud often pass over the country, making the solar irradiance highly scattered. Therefore, the PV power output fluctuates substantially. These issues can lead to the malfunction of electronic-based equipment, reduction in the network efficiency and improper operation of the power protection system. Under current practice, the amount of PV capacity installed on the distribution network is constrained by the utility company. As a result, this can limit the reduction of the carbon footprint. Therefore, an energy storage system is proposed as a solution to these power quality issues. To ensure an effective operation of the distribution network with PV systems, a fuzzy control system is developed and implemented to govern the operation of an energy storage system. The fuzzy-driven energy storage system is able to mitigate the fluctuating voltage rise and voltage unbalance on the electrical grid by actively manipulating the flow of real power between the grid and the batteries. To verify the effectiveness of the proposed fuzzy-driven energy storage system, an experimental network integrated with a 7.2 kWp PV system was set up. Several case studies are performed to evaluate the response of the proposed solution in mitigating voltage rise and voltage unbalance and reducing the amount of reverse power flow under highly intermittent PV power output.
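A very small rule-based sketch of how a fuzzy controller can map grid voltage to a battery real-power command is shown below; the membership ranges, rule base and power rating are illustrative assumptions, not the controller described in the paper.

```python
def tri(x, a, b, c):
    # Triangular membership function on [a, c] with its peak at b.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def battery_power_command(voltage_pu, max_power_kw=3.6):
    """Map grid voltage (per unit) to a battery real-power command.
    Positive = charge (absorb PV surplus), negative = discharge."""
    low    = tri(voltage_pu, 0.90, 0.95, 1.00)
    normal = tri(voltage_pu, 0.97, 1.00, 1.03)
    high   = tri(voltage_pu, 1.00, 1.05, 1.10)
    # Rule base: HIGH voltage -> charge, LOW voltage -> discharge, NORMAL -> idle.
    # Defuzzify with a weighted average of the rule outputs.
    weight = low + normal + high
    if weight == 0.0:
        return 0.0
    return max_power_kw * (high * 1.0 + normal * 0.0 + low * -1.0) / weight

print(battery_power_command(1.06))  # voltage rise during a PV peak -> charge the batteries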
Universal Keyword Classifier on Public Key Based Encrypted Multikeyword Fuzzy Search in Public Cloud
Munisamy, Shyamala Devi; Chokkalingam, Arun
2015-01-01
Cloud computing has pioneered the emerging world by manifesting itself as a service through the internet and facilitates third party infrastructure and applications. While customers have no visibility on how their data is stored on the service provider's premises, it offers greater benefits in lowering infrastructure costs and delivering more flexibility and simplicity in managing private data. The opportunity to use cloud services on a pay-per-use basis provides comfort for private data owners in managing costs and data. With the pervasive usage of the internet, the focus has now shifted towards effective data utilization on the cloud without compromising security concerns. In the pursuit of increasing data utilization on public cloud storage, the key is to make effective data access through several fuzzy searching techniques. In this paper, we have discussed the existing fuzzy searching techniques and focused on reducing the searching time on the cloud storage server for effective data utilization. Our proposed Asymmetric Classifier Multikeyword Fuzzy Search method provides a classifier search server that creates a universal keyword classifier for the multiple keyword request, which greatly reduces the searching time by learning the search path pattern for all the keywords in the fuzzy keyword set. The objective of using the BTree fuzzy searchable index is to resolve typos and representation inconsistencies and also to facilitate effective data utilization. PMID:26380364
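The general idea of fuzzy keyword matching against a pre-built index can be sketched as follows; this uses a simple wildcard-variant index that tolerates single-character typos and is not the authors' asymmetric classifier or BTree index.

```python
def wildcard_variants(word):
    # Wildcard set covering one edited character (insert, delete or substitute).
    variants = {word}
    for i in range(len(word) + 1):
        variants.add(word[:i] + "*" + word[i:])          # '*' stands for an extra character
        if i < len(word):
            variants.add(word[:i] + "*" + word[i + 1:])  # '*' replaces an existing character
    return variants

def build_fuzzy_index(keywords):
    index = {}
    for kw in keywords:
        for v in wildcard_variants(kw):
            index.setdefault(v, set()).add(kw)
    return index

def fuzzy_search(index, query):
    hits = set()
    for v in wildcard_variants(query):
        hits |= index.get(v, set())
    return hits

index = build_fuzzy_index({"storage", "security", "search"})
print(fuzzy_search(index, "storrage"))  # one-character typo still resolves to {'storage'}
```

Precomputing the variant index is what keeps per-query work small; the paper's contribution layers a learned classifier and a BTree structure on top of this kind of fuzzy keyword set.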
Wang, Shangping; Zhang, Xiaoxue; Zhang, Yaling
2016-01-01
Ciphertext-policy attribute-based encryption (CP-ABE) focuses on the problem of access control, while keyword-based searchable encryption schemes focus on the problem of quickly finding the files in cloud storage that a user is interested in. Designing a searchable and attribute-based encryption scheme is a new challenge. In this paper, we propose an efficient multi-user searchable attribute-based encryption scheme with attribute revocation and grant for cloud storage. In the new scheme the attribute revocation and grant processes of users are delegated to a proxy server. Our scheme supports revoking and granting multiple attributes simultaneously. Moreover, the keyword search function is achieved in our proposed scheme. The security of our proposed scheme is reduced to the bilinear Diffie-Hellman (BDH) assumption. Furthermore, the scheme is proven to be secure under the security model of indistinguishability against selective ciphertext-policy and chosen plaintext attack (IND-sCP-CPA). Our scheme also achieves semantic security under indistinguishability against chosen keyword attack (IND-CKA) in the random oracle model. PMID:27898703
CADC and CANFAR: Extending the role of the data centre
NASA Astrophysics Data System (ADS)
Gaudet, Severin
2015-12-01
Over the past six years, the CADC has moved beyond the astronomy archive data centre to a multi-service system for the community. This evolution is based on two major initiatives. The first is the adoption of International Virtual Observatory Alliance (IVOA) standards in both the system and data architecture of the CADC, including a common characterization data model. The second is the Canadian Advanced Network for Astronomical Research (CANFAR), a digital infrastructure combining the Canadian national research network (CANARIE), cloud processing and storage resources (Compute Canada) and a data centre (Canadian Astronomy Data Centre) into a unified ecosystem for storage and processing for the astronomy community. This talk will describe the architecture and integration of IVOA and CANFAR services into CADC operations, the operational experiences, the lessons learned and future directions.
A new Information publishing system Based on Internet of things
NASA Astrophysics Data System (ADS)
Zhu, Li; Ma, Guoguang
2018-03-01
A new information publishing system based on the Internet of Things is proposed, composed of a four-level hierarchical structure: the screen identification layer, the network transport layer, the service management layer and the publishing application layer. In this architecture, the screen identification layer realizes an internet of screens in which geographically dispersed independent screens are connected to the internet by customized set-top boxes. The service management layer uses the MQTT protocol to implement a lightweight broker-based publish/subscribe messaging mechanism suited to constrained environments such as the Internet of Things, in order to relieve the bandwidth bottleneck. Meanwhile, cloud-based storage is used to store and manage the rapidly increasing volume of multimedia publishing information. The paper designs and realizes a prototype, SzIoScreen, and gives some related test results.
A hazy outlook for cloud computing.
Perna, Gabriel
2012-01-01
Because of competing priorities as well as cost, security, and implementation concerns, cloud-based storage development has gotten off to a slow start in healthcare. CIOs, CTOs, and other healthcare IT leaders are adopting a variety of strategies in this area, based on their organizations' needs, resources, and priorities.
Effect of acidification on carrot (Daucus carota) juice cloud stability.
Schultz, Alison K; Barrett, Diane M; Dungan, Stephanie R
2014-11-26
Effects of acidity on cloud stability in pasteurized carrot juice were examined over the pH range of 3.5-6.2. Cloud sedimentation, particle diameter, and ζ potential were measured at each pH condition to quantify juice cloud stability and clarification during 3 days of storage. Acidification below pH 4.9 resulted in a less negative ζ potential, an increased particle size, and an unstable cloud, leading to juice clarification. As the acidity increased, clarification occurred more rapidly and to a greater extent. Only a weak effect of ionic strength was observed when sodium salts were added to the juice, but the addition of calcium salts significantly reduced the cloud stability.
Trust Model to Enhance Security and Interoperability of Cloud Environment
NASA Astrophysics Data System (ADS)
Li, Wenjuan; Ping, Lingdi
Trust is one of the most important means to improve security and enable interoperability of current heterogeneous independent cloud platforms. This paper first analyzes several trust models used in large and distributed environments and then introduces a novel cloud trust model to solve security issues in a cross-cloud environment, in which cloud customers can choose different providers' services and in which resources in heterogeneous domains can cooperate. The model is domain-based. It groups one cloud provider's resource nodes into the same domain and sets a trust agent for it. It distinguishes between two different roles, cloud customer and cloud server, and designs different strategies for each. In our model, trust recommendation is treated as one type of cloud service, just like computation or storage. The model achieves both identity authentication and behavior authentication. The results of emulation experiments show that the proposed model can efficiently and safely construct trust relationships in a cross-cloud environment.
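A toy sketch of combining first-hand experience with cross-domain recommendations into an overall trust value is shown below; the Beta-style estimate and the weighting factor alpha are illustrative assumptions rather than the paper's exact model.

```python
def direct_trust(successes, failures):
    # Simple Beta-style estimate from a node's own interaction history.
    return (successes + 1) / (successes + failures + 2)

def recommended_trust(recommendations):
    # Recommendations from other domains' trust agents, weighted by how much
    # we trust each recommending agent: list of (agent_trust, recommended_value).
    if not recommendations:
        return 0.5
    total = sum(agent_trust for agent_trust, _ in recommendations)
    return sum(agent_trust * value for agent_trust, value in recommendations) / total

def overall_trust(successes, failures, recommendations, alpha=0.7):
    # alpha balances first-hand experience against cross-domain recommendations.
    return alpha * direct_trust(successes, failures) + (1 - alpha) * recommended_trust(recommendations)

# A customer evaluating a provider node in another domain:
print(overall_trust(successes=18, failures=2,
                    recommendations=[(0.9, 0.8), (0.6, 0.95)]))
```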
Climbing the Slope of Enlightenment during NASA's Arctic Boreal Vulnerability Experiment
NASA Astrophysics Data System (ADS)
Griffith, P. C.; Hoy, E.; Duffy, D.; McInerney, M.
2015-12-01
The Arctic Boreal Vulnerability Experiment (ABoVE) is a new field campaign sponsored by NASA's Terrestrial Ecology Program and designed to improve understanding of the vulnerability and resilience of Arctic and boreal social-ecological systems to environmental change (http://above.nasa.gov). ABoVE is integrating field-based studies, modeling, and data from airborne and satellite remote sensing. The NASA Center for Climate Simulation (NCCS) has partnered with the NASA Carbon Cycle and Ecosystems Office (CCEO) to create a high performance science cloud for this field campaign. The ABoVE Science Cloud combines high performance computing with emerging technologies and data management with tools for analyzing and processing geographic information to create an environment specifically designed for large-scale modeling, analysis of remote sensing data, copious disk storage for "big data" with integrated data management, and integration of core variables from in-situ networks. The ABoVE Science Cloud is a collaboration that is accelerating the pace of new Arctic science for researchers participating in the field campaign. Specific examples of the utilization of the ABoVE Science Cloud by several funded projects will be presented.
Zhou, Xiuze; Lin, Fan; Yang, Lvqing; Nie, Jing; Tan, Qian; Zeng, Wenhua; Zhang, Nian
2016-01-01
With the continuous expansion of the cloud computing platform scale and the rapid growth of users and applications, how to efficiently use system resources to improve the overall performance of cloud computing has become a crucial issue. To address this issue, this paper proposes a method that uses an analytic hierarchy process group decision (AHPGD) to evaluate the load state of server nodes. Training was carried out by using a hybrid hierarchical genetic algorithm (HHGA) for optimizing a radial basis function neural network (RBFNN). The AHPGD produces an aggregative load indicator for the virtual machines in the cloud, which becomes the input to the predictive RBFNN. This paper also proposes a new dynamic load balancing scheduling algorithm combined with a weighted round-robin algorithm, which uses the predicted periodical load values of nodes based on AHPGD and the RBFNN optimized by HHGA, then calculates the corresponding weight values of the nodes and updates them continually. Meanwhile, it keeps the advantages and avoids the shortcomings of the static weighted round-robin algorithm.
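The dynamic weighted round-robin idea can be sketched as follows, with the RBFNN load predictor stubbed out by a simple utilisation lookup; the weight scaling and node fields are illustrative assumptions.

```python
import itertools

def predicted_load(node):
    # Placeholder for the RBFNN prediction of a node's periodical load (0..1);
    # here we simply return the node's last reported utilisation.
    return node["utilisation"]

def update_weights(nodes, scale=10):
    # Lighter predicted load -> larger scheduling weight (recomputed each period).
    for node in nodes:
        node["weight"] = max(1, round(scale * (1.0 - predicted_load(node))))

def weighted_round_robin(nodes):
    # Yield node names in proportion to their current weights.
    while True:
        for node in nodes:
            for _ in range(node["weight"]):
                yield node["name"]

nodes = [{"name": "vm-a", "utilisation": 0.2},
         {"name": "vm-b", "utilisation": 0.7}]
update_weights(nodes)
dispatch = weighted_round_robin(nodes)
print(list(itertools.islice(dispatch, 11)))  # vm-a is scheduled ~8 times for every ~3 of vm-b
```

In the paper's scheme the weights would be refreshed whenever the predictor produces a new periodical load value, which is the step that distinguishes it from a static weighted round-robin.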
State DOT use of web-based data storage.
DOT National Transportation Integrated Search
2013-01-01
This study explores the experiences of state departments of transportation (DOT) in the use of web or : cloud-based data storage and related practices. The study provides results of a survey of State DOTs : and presents best practices of state govern...
Developing a Hadoop-based Middleware for Handling Multi-dimensional NetCDF
NASA Astrophysics Data System (ADS)
Li, Z.; Yang, C. P.; Schnase, J. L.; Duffy, D.; Lee, T. J.
2014-12-01
Climate observations and model simulations are collecting and generating vast amounts of climate data, and these data are ever-increasing and being accumulated at a rapid pace. Effectively managing and analyzing these data is essential for climate change studies. Hadoop, a distributed storage and processing framework for large data sets, has attracted increasing attention in dealing with the Big Data challenge. The maturity of the Infrastructure as a Service (IaaS) model of cloud computing further accelerates the adoption of Hadoop in solving Big Data problems. However, Hadoop is designed to process unstructured data such as texts, documents and web pages, and cannot effectively handle scientific data formats such as array-based NetCDF files and other binary formats. In this paper, we propose to build a Hadoop-based middleware for transparently handling big NetCDF data by 1) designing a distributed climate data storage mechanism based on a POSIX-enabled parallel file system to enable parallel big data processing with MapReduce, as well as to support data access by other systems; 2) modifying the Hadoop framework to transparently process NetCDF data in parallel without sequencing or converting the data into other file formats, or loading them to HDFS; and 3) seamlessly integrating Hadoop, cloud computing and climate data in a highly scalable and fault-tolerant framework.
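A rough sketch of processing an array variable in parallel chunks without converting it to another format is shown below; the NetCDF read is stubbed with a synthetic NumPy array (a real map task would read only its assigned hyperslab, e.g. with the netCDF4 package), and the chunking scheme is illustrative only.

```python
import numpy as np
from multiprocessing import Pool

def process_chunk(args):
    # In the real middleware each map task would open the NetCDF file directly
    # and read only its assigned slice; here the data is passed in for simplicity.
    data, start, stop = args
    chunk = data[start:stop]
    return chunk.sum(), chunk.size           # partial results (map)

def chunked_mean(data, n_chunks=4):
    bounds = np.linspace(0, len(data), n_chunks + 1, dtype=int)
    tasks = [(data, lo, hi) for lo, hi in zip(bounds[:-1], bounds[1:])]
    with Pool(n_chunks) as pool:
        partials = pool.map(process_chunk, tasks)
    total, count = map(sum, zip(*partials))  # combine partial results (reduce)
    return total / count

if __name__ == "__main__":
    temperature = np.random.rand(1_000_000)  # stand-in for a NetCDF variable
    print(chunked_mean(temperature))
```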
A secure online image trading system for untrusted cloud environments.
Munadi, Khairul; Arnia, Fitri; Syaryadhi, Mohd; Fujiyoshi, Masaaki; Kiya, Hitoshi
2015-01-01
In conventional image trading systems, images are usually stored unprotected on a server, rendering them vulnerable to untrusted server providers and malicious intruders. This paper proposes a conceptual image trading framework that enables secure storage and retrieval over Internet services. The process involves three parties: an image publisher, a server provider, and an image buyer. The aim is to facilitate secure storage and retrieval of original images for commercial transactions, while preventing untrusted server providers and unauthorized users from gaining access to the true contents. The framework exploits the Discrete Cosine Transform (DCT) coefficients and the moment invariants of images. Original images are visually protected in the DCT domain, and stored on a repository server. Small representations of the original images, called thumbnails, are generated and made publicly accessible for browsing. When a buyer is interested in a thumbnail, he/she sends a query to retrieve the visually protected image. The thumbnails and protected images are matched using the DC component of the DCT coefficients and the moment invariant feature. After the matching process, the server returns the corresponding protected image to the buyer. However, the image remains visually protected unless a key is granted. Our target application is the online market, where publishers sell their stock images over the Internet using public cloud servers.
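A minimal sketch of matching a public thumbnail to its protected counterpart by comparing DCT DC components is shown below; the moment-invariant feature is omitted, and the synthetic images and nearest-DC matching rule are illustrative assumptions rather than the paper's full scheme.

```python
import numpy as np
from scipy.fft import dctn

def dc_component(image):
    # The (0, 0) DCT coefficient is proportional to the image's mean intensity.
    return dctn(image, norm="ortho")[0, 0]

def match_thumbnail(thumbnail, protected_images):
    """Return the index of the protected image whose DC component is closest
    to that of the queried thumbnail (moment invariants omitted in this sketch)."""
    query_dc = dc_component(thumbnail)
    distances = [abs(dc_component(img) - query_dc) for img in protected_images]
    return int(np.argmin(distances))

rng = np.random.default_rng(0)
repository = [rng.random((64, 64)) for _ in range(5)]
thumbnail = repository[3] + rng.normal(0, 0.01, (64, 64))  # degraded public copy
print(match_thumbnail(thumbnail, repository))              # expected: 3
```

The point of matching on the DC component is that it survives the visual protection applied to the AC coefficients, so the server can pair queries with stored items without ever seeing the true content.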
Beating the tyranny of scale with a private cloud configured for Big Data
NASA Astrophysics Data System (ADS)
Lawrence, Bryan; Bennett, Victoria; Churchill, Jonathan; Juckes, Martin; Kershaw, Philip; Pepler, Sam; Pritchard, Matt; Stephens, Ag
2015-04-01
The Joint Analysis System, JASMIN, consists of five significant hardware components: a batch computing cluster, a hypervisor cluster, bulk disk storage, high performance disk storage, and access to a tape robot. Each of the computing clusters consists of a heterogeneous set of servers, supporting a range of possible data analysis tasks, and a unique network environment makes it relatively trivial to migrate servers between the two clusters. The high performance disk storage will include the world's largest (publicly visible) deployment of the Panasas parallel disk system. Initially deployed in April 2012, JASMIN has already undergone two major upgrades, culminating in a system which by April 2015 will have in excess of 16 PB of disk and 4000 cores. Layered on the basic hardware are a range of services, ranging from managed services, such as the curated archives of the Centre for Environmental Data Archival or the data analysis environment for the National Centres for Atmospheric Science and Earth Observation, to a generic Infrastructure as a Service (IaaS) offering for the UK environmental science community. Here we present examples of some of the big data workloads being supported in this environment, ranging from data management tasks, such as checksumming 3 PB of data held in over one hundred million files, to science tasks, such as re-processing satellite observations with new algorithms, or calculating new diagnostics on petascale climate simulation outputs. We will demonstrate how the provision of a cloud environment closely coupled to a batch computing environment, all sharing the same high performance disk system, allows massively parallel processing without the necessity to shuffle data excessively, even as it supports many different virtual communities, each with guaranteed performance. We will discuss the advantages of having a heterogeneous range of servers with available memory from tens of GB at the low end to (currently) two TB at the high end. There are some limitations of the JASMIN environment: the high performance disk environment is not fully available in the IaaS environment, and a planned ability to burst compute-heavy jobs into the public cloud is not yet fully available. There are load balancing and performance issues that need to be understood. We will conclude with projections for future usage, and our plans to meet those requirements.
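The checksumming workload mentioned above can be illustrated with a small parallel sketch; this is generic Python (hashlib plus a process pool) with a placeholder archive path, not the actual JASMIN tooling.

```python
import hashlib
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def checksum(path, block_size=1 << 20):
    # Stream the file so multi-GB objects never have to fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return str(path), digest.hexdigest()

def checksum_tree(root, workers=8):
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(checksum, files))

if __name__ == "__main__":
    for name, digest in checksum_tree("/path/to/archive").items():  # placeholder path
        print(digest, name)
```

At the scale quoted above (over one hundred million files) the interesting engineering is in scheduling such workers across the batch cluster while sharing the same parallel file system, which is exactly the coupling the JASMIN design emphasises.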
NASA Astrophysics Data System (ADS)
Creamean, J.; Ault, A. P.; White, A. B.; Neiman, P. J.; Minnis, P.; Prather, K. A.
2014-12-01
Aerosols that serve as cloud condensation nuclei (CCN) and ice nuclei (IN) have the potential to profoundly influence precipitation processes. Furthermore, changes in orographic precipitation have broad implications for reservoir storage and flood risks. As part of the CalWater I field campaign (2009-2011), the impacts of aerosol sources on precipitation were investigated in the California Sierra Nevada Mountains. In 2009, the precipitation collected on the ground was influenced by both local biomass burning and long-range transported dust and biological particles, while in 2010, by mostly local sources of biomass burning and pollution, and in 2011 by mostly long-range transport of dust and biological particles from distant sources. Although vast differences in the sources of residues were observed from year-to-year, dust and biological residues were omnipresent (on average, 55% of the total residues combined) and were associated with storms consisting of deep convective cloud systems and larger quantities of precipitation initiated in the ice phase. Further, biological residues were dominant during storms with relatively warm cloud temperatures (up to -15°C), suggesting biological components were more efficient IN than mineral dust. On the other hand, when precipitation quantities were lower, local biomass burning and pollution residues were observed (on average 31% and 9%, respectively), suggesting these residues potentially served as CCN at the base of shallow cloud systems and that lower level polluted clouds of storm systems produced less precipitation than non-polluted (i.e., marine) clouds. The direct connection of the sources of aerosols within clouds and precipitation type and quantity can be used in models to better assess how local emissions versus long-range transported dust and biological aerosols play a role in impacting regional weather and climate, ultimately with the goal of more accurate predictive weather forecast models and water resource management.
A Cloud Robotics Based Service for Managing RPAS in Emergency, Rescue and Hazardous Scenarios
NASA Astrophysics Data System (ADS)
Silvagni, Mario; Chiaberge, Marcello; Sanguedolce, Claudio; Dara, Gianluca
2016-04-01
Cloud robotics and cloud services are revolutionizing not only the ICT world but also the robotics industry, giving robots more computing capabilities, storage and connection bandwidth while opening new scenarios that blend the physical with the digital world. In this vision, new IT architectures are required to manage robots, retrieve data from them and create services to interact with users. Among all robots, this work is mainly focused on flying robots, better known as drones, UAVs (Unmanned Aerial Vehicles) or RPAS (Remotely Piloted Aircraft Systems). The cloud robotics approach shifts the concept of having a single local "intelligence" for every single UAV, as a unique device that carries out all computation and storage processes onboard, to a more powerful "centralized brain" located in the cloud. This breakthrough opens new scenarios where UAVs are agents, relying on remote servers for most of their computational load and data storage, creating a network of devices where they can share knowledge and information. UAVs are increasingly used in many applications as interesting and suitable devices for environmental monitoring. Many services can be built by fetching data from UAVs, such as telemetry, video streaming, pictures or sensor data. These services, as part of the IT architecture, can be accessed via the web by other devices or shared with other UAVs. As test cases of the proposed architecture, two examples are reported. The first concerns search and rescue or emergency management, where UAVs are required for monitoring and intervention. In case of emergency or aggression, the user requests the emergency service from the IT architecture, providing GPS coordinates and an identification number. The IT architecture uses a UAV (choosing among the available ones according to distance, service status, etc.) to reach him/her for monitoring and support operations. In the meantime, an officer will use the service to see the current position of the UAV, its telemetry and the video streaming from its camera. Data are stored for further use and documentation and can be shared with all the involved personnel or services. The second case refers to an imaging survey. An investigation area is selected using a map or a set of coordinates by a user who can be in the field or in a management facility. The cloud system elaborates these data and automatically computes a flight plan that considers the survey data requirements (e.g. picture ground resolution, overlap) as well as several environmental constraints (e.g. no-fly zones, possible hazardous areas, known obstacles). Once the flight plan is loaded into the selected UAV, the mission starts. During the mission, if suitable data network coverage is available, the UAV transmits acquired images (typically low-quality images to limit bandwidth) and shooting poses in order to perform a preliminary check during the mission and minimize survey failures; if not, all data are uploaded asynchronously after the mission. The cloud servers perform all the tasks related to image processing (mosaics, ortho-photos, geo-referencing, 3D models) and data management.
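The flight-plan computation described for the imaging-survey case can be illustrated with a small sketch that generates a lawnmower waypoint pattern from an area size, camera footprint and overlap; the parameter values are illustrative assumptions, and no-fly-zone and obstacle constraints are omitted.

```python
def survey_waypoints(width_m, height_m, footprint_m=40.0, overlap=0.7):
    """Generate a simple lawnmower pattern covering a width x height area.
    Line spacing shrinks as the requested image overlap grows."""
    spacing = footprint_m * (1.0 - overlap)
    waypoints, y, left_to_right = [], 0.0, True
    while y <= height_m:
        row = [(0.0, y), (width_m, y)]
        waypoints.extend(row if left_to_right else row[::-1])
        left_to_right = not left_to_right   # alternate direction each pass
        y += spacing
    return waypoints

plan = survey_waypoints(200.0, 100.0)
print(len(plan), "waypoints, first few:", plan[:4])
```

In the proposed architecture this computation would run on the cloud servers, with the resulting waypoint list uploaded to the selected UAV before the mission starts.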
NASA Astrophysics Data System (ADS)
Hua, H.; Owen, S. E.; Yun, S. H.; Agram, P. S.; Manipon, G.; Starch, M.; Sacco, G. F.; Bue, B. D.; Dang, L. B.; Linick, J. P.; Malarout, N.; Rosen, P. A.; Fielding, E. J.; Lundgren, P.; Moore, A. W.; Liu, Z.; Farr, T.; Webb, F.; Simons, M.; Gurrola, E. M.
2017-12-01
With the increased availability of open SAR data (e.g. Sentinel-1 A/B), new challenges are being faced with processing and analyzing the voluminous SAR datasets to make geodetic measurements. Upcoming SAR missions such as NISAR are expected to generate close to 100TB per day. The Advanced Rapid Imaging and Analysis (ARIA) project can now generate geocoded unwrapped phase and coherence products from Sentinel-1 TOPS mode data in an automated fashion, using the ISCE software. This capability is currently being exercised on various study sites across the United States and around the globe, including Hawaii, Central California, Iceland and South America. The automated and large-scale SAR data processing and analysis capabilities use cloud computing techniques to speed the computations and provide scalable processing power and storage. Aspects such as how to process these voluminous SLCs and interferograms at global scales, how to keep up with the large daily SAR data volumes, and how to handle the high data rates are being explored. Scene-partitioning approaches in the processing pipeline help in handling global-scale processing up to unwrapped interferograms, with stitching done at a late stage. We have built an advanced science data system with rapid search functions to enable access to the derived data products. Rapid image processing of Sentinel-1 data to interferograms and time series is already being applied to natural hazards including earthquakes, floods, volcanic eruptions, and land subsidence due to fluid withdrawal. We will present the status of the ARIA science data system for generating science-ready data products and the challenges that arise from processing SAR datasets to derived time series data products at large scales. For example, how do we perform large-scale data quality screening on interferograms? What approaches can be used to minimize compute, storage, and data movement costs for time series analysis in the cloud? We will also present some of our findings from applying machine learning and data analytics on the processed SAR data streams. Finally, we will present lessons learned on how to ease the SAR community onto interfacing with these cloud-based SAR science data systems.
An approximate dynamic programming approach to resource management in multi-cloud scenarios
NASA Astrophysics Data System (ADS)
Pietrabissa, Antonio; Priscoli, Francesco Delli; Di Giorgio, Alessandro; Giuseppi, Alessandro; Panfili, Martina; Suraci, Vincenzo
2017-03-01
The programmability and the virtualisation of network resources are crucial to deploy scalable Information and Communications Technology (ICT) services. The increasing demand of cloud services, mainly devoted to the storage and computing, requires a new functional element, the Cloud Management Broker (CMB), aimed at managing multiple cloud resources to meet the customers' requirements and, simultaneously, to optimise their usage. This paper proposes a multi-cloud resource allocation algorithm that manages the resource requests with the aim of maximising the CMB revenue over time. The algorithm is based on Markov decision process modelling and relies on reinforcement learning techniques to find online an approximate solution.
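A toy tabular Q-learning sketch of admitting or rejecting resource requests to maximise broker revenue is shown below; the state space, reward and transition model are invented for illustration and are far simpler than the paper's approximate dynamic programming formulation.

```python
import random

CAPACITY = 5                 # total resource units the broker controls
ACTIONS = (0, 1)             # 0 = reject the incoming request, 1 = accept it
Q = {(s, a): 0.0 for s in range(CAPACITY + 1) for a in ACTIONS}

def step(free_units, action):
    # Accepting a request earns revenue but consumes a unit; running requests
    # finish at random and release their unit.
    accepted = action == 1 and free_units > 0
    reward = 1.0 if accepted else 0.0
    if accepted:
        free_units -= 1
    if free_units < CAPACITY and random.random() < 0.3:
        free_units += 1
    return free_units, reward

alpha, gamma, epsilon = 0.1, 0.9, 0.1
state = CAPACITY
for _ in range(50_000):
    if random.random() < epsilon:
        action = random.choice(ACTIONS)                     # explore
    else:
        action = max(ACTIONS, key=lambda a: Q[(state, a)])  # exploit
    next_state, reward = step(state, action)
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    state = next_state

# Greedy policy learned per amount of free capacity (1 = accept).
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(CAPACITY + 1)})
```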
A Novel Market-Oriented Dynamic Collaborative Cloud Service Platform
NASA Astrophysics Data System (ADS)
Hassan, Mohammad Mehedi; Huh, Eui-Nam
In today's world the emerging Cloud computing (Weiss, 2007) offers a new computing model where resources such as computing power, storage, online applications and networking infrastructures can be shared as "services" over the internet. Cloud providers (CPs) are incentivized by the profits to be made by charging consumers for accessing these services. Consumers, such as enterprises, are attracted by the opportunity for reducing or eliminating costs associated with "in-house" provision of these services.
Maintenance Downtime May 8 - 11, 2015
Atmospheric Science Data Center
2015-05-06
... The ASDC will experience a partial outage to move from old storage to new storage. ANGe ingest will be paused and production processing on ... any inconvenience this may cause. The following data providers will be impacted: AFWA-MESH16 CloudSat FLASH GHRC NCEP ...
Using Cloud Computing infrastructure with CloudBioLinux, CloudMan and Galaxy
Afgan, Enis; Chapman, Brad; Jadan, Margita; Franke, Vedran; Taylor, James
2012-01-01
Cloud computing has revolutionized availability and access to computing and storage resources; making it possible to provision a large computational infrastructure with only a few clicks in a web browser. However, those resources are typically provided in the form of low-level infrastructure components that need to be procured and configured before use. In this protocol, we demonstrate how to utilize cloud computing resources to perform open-ended bioinformatics analyses, with fully automated management of the underlying cloud infrastructure. By combining three projects, CloudBioLinux, CloudMan, and Galaxy into a cohesive unit, we have enabled researchers to gain access to more than 100 preconfigured bioinformatics tools and gigabytes of reference genomes on top of the flexible cloud computing infrastructure. The protocol demonstrates how to setup the available infrastructure and how to use the tools via a graphical desktop interface, a parallel command line interface, and the web-based Galaxy interface. PMID:22700313
Parameterization of cloud lidar backscattering profiles by means of asymmetrical Gaussians
NASA Astrophysics Data System (ADS)
del Guasta, Massimo; Morandi, Marco; Stefanutti, Leopoldo
1995-06-01
A fitting procedure for cloud lidar data processing is shown that is based on the computation of the first three moments of the vertical-backscattering (or -extinction) profile. Single-peak clouds or single cloud layers are approximated to asymmetrical Gaussians. The algorithm is particularly stable with respect to noise and processing errors, and it is much faster than the equivalent least-squares approach. Multilayer clouds can easily be treated as a sum of single asymmetrical Gaussian peaks. The method is suitable for cloud-shape parametrization in noisy lidar signatures (like those expected from satellite lidars). It also permits an improvement of cloud radiative-property computations that are based on huge lidar data sets for which storage and careful examination of single lidar profiles cannot be carried out.
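A small sketch of the moment-based parameterization idea is given below: the first three moments of a single-peak profile are computed and used to place and shape an asymmetric Gaussian. The mapping from skewness to left/right widths is a simple illustrative choice, not the paper's exact parameterization.

```python
import numpy as np

def profile_moments(z, beta):
    """Area, centre, variance and skewness of a single-peak backscatter profile."""
    dz = z[1] - z[0]                       # assumes a uniform altitude grid
    area = np.sum(beta) * dz
    centre = np.sum(z * beta) * dz / area
    var = np.sum((z - centre) ** 2 * beta) * dz / area
    skew = np.sum((z - centre) ** 3 * beta) * dz / area / var ** 1.5
    return area, centre, var, skew

def asymmetric_gaussian(z, area, centre, var, skew):
    # Split the width between the two sides according to the skewness (illustrative rule).
    sigma = np.sqrt(var)
    sigma_left = sigma * (1.0 - 0.3 * skew)
    sigma_right = sigma * (1.0 + 0.3 * skew)
    width = np.where(z < centre, sigma_left, sigma_right)
    peak = 2.0 * area / (np.sqrt(2.0 * np.pi) * (sigma_left + sigma_right))
    return peak * np.exp(-0.5 * ((z - centre) / width) ** 2)

z = np.linspace(0.0, 12.0, 600)                                   # altitude, km
cloud = np.exp(-0.5 * ((z - 6.0) / 0.4) ** 2) * (1 + 0.5 * np.tanh(z - 6.0))
fit = asymmetric_gaussian(z, *profile_moments(z, cloud))
print(np.max(np.abs(fit - cloud)))                                # rough fit-quality check
```

Because only sums over the profile are needed, the cost per profile is linear and no iterative least-squares solve is required, which is what makes the approach attractive for very large lidar data sets.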
The State of Cloud-Based Biospecimen and Biobank Data Management Tools.
Paul, Shonali; Gade, Aditi; Mallipeddi, Sumani
2017-04-01
Biobanks are critical for collecting and managing high-quality biospecimens from donors with appropriate clinical annotation. The high-quality human biospecimens and associated data are required to better understand disease processes. Therefore, biobanks have become an important and essential resource for healthcare research and drug discovery. However, collecting and managing huge volumes of data (biospecimens and associated clinical data) necessitate that biobanks use appropriate data management solutions that can keep pace with the ever-changing requirements of research. To automate biobank data management, biobanks have been investing in traditional Laboratory Information Management Systems (LIMS). However, there are a myriad of challenges faced by biobanks in acquiring traditional LIMS. Traditional LIMS are cost-intensive and often lack the flexibility to accommodate changes in data sources and workflows. Cloud technology is emerging as an alternative that provides the opportunity to small and medium-sized biobanks to automate their operations in a cost-effective manner, even without IT personnel. Cloud-based solutions offer the advantage of heightened security, rapid scalability, dynamic allocation of services, and can facilitate collaboration between different research groups by using a shared environment on a "pay-as-you-go" basis. The benefits offered by cloud technology have resulted in the development of cloud-based data management solutions as an alternative to traditional on-premise software. After evaluating the advantages offered by cloud technology, several biobanks have started adopting cloud-based tools. Cloud-based tools provide biobanks with easy access to biospecimen data for real-time sharing with clinicians. Another major benefit realized by biobanks by implementing cloud-based applications is unlimited data storage on the cloud and automatic backups for protecting any data loss in the face of natural calamities.
Designing Albaha Internet of Farming Architecture
NASA Astrophysics Data System (ADS)
Alahmadi, A.
2017-04-01
Up to now, most farmers in Albaha, Saudi Arabia are still practicing traditional farming methods, which are not optimized in terms of water usage, product quality, etc. At the same time, ICT is nowadays becoming a key driver for innovation in farming. In this project, we propose a smart Internet of Farming system to assist farmers in Albaha in optimizing their farm productivity by providing them with accurate and timely predictions of when to harvest, fertilize, water and perform other activities related to farming/agriculture technology. The proposed system utilizes a wireless sensor cloud to remotely capture important data such as temperature, humidity and soil condition (moisture, water level), which are then sent to storage servers in the Albaha University cloud. An adaptive knowledge engine processes the captured data into knowledge, and the farmers can retrieve the knowledge using their smartphones via the Internet.
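A minimal sketch of a sensor node pushing readings to a cloud ingest endpoint is shown below; the endpoint URL, payload fields and use of plain HTTP are hypothetical assumptions for illustration, not the proposed Albaha system.

```python
import time
import requests

INGEST_URL = "https://cloud.example.edu/api/farm-readings"   # hypothetical endpoint

def read_sensors():
    # Stand-in for real temperature / humidity / soil-moisture drivers.
    return {"node_id": "plot-07", "temperature_c": 28.4,
            "humidity_pct": 41.0, "soil_moisture_pct": 17.5,
            "timestamp": time.time()}

def publish(reading):
    # Push one reading to the cloud store; the knowledge engine would consume it there.
    response = requests.post(INGEST_URL, json=reading, timeout=10)
    response.raise_for_status()

if __name__ == "__main__":
    publish(read_sensors())
```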
A cloud-based system for measuring radiation treatment plan similarity
NASA Astrophysics Data System (ADS)
Andrea, Jennifer
PURPOSE: Radiation therapy is used to treat cancer using carefully designed plans that maximize the radiation dose delivered to the target and minimize damage to healthy tissue, with the dose administered over multiple occasions. Creating treatment plans is a laborious process and presents an obstacle to more frequent replanning, which remains an unsolved problem. However, in between new plans being created, the patient's anatomy can change due to multiple factors including reduction in tumor size and loss of weight, which results in poorer patient outcomes. Cloud computing is a newer technology that is slowly being used for medical applications with promising results. The objective of this work was to design and build a system that could analyze a database of previously created treatment plans, which are stored with their associated anatomical information in studies, to find the one with the most similar anatomy to a new patient. The analyses would be performed in parallel on the cloud to decrease the computation time of finding this plan. METHODS: The system used SlicerRT, a radiation therapy toolkit for the open-source platform 3D Slicer, for its tools to perform the similarity analysis algorithm. Amazon Web Services was used for the cloud instances on which the analyses were performed, as well as for storage of the radiation therapy studies and messaging between the instances and a master local computer. A module was built in SlicerRT to provide the user with an interface to direct the system on the cloud, as well as to perform other related tasks. RESULTS: The cloud-based system out-performed previous methods of conducting the similarity analyses in terms of time, as it analyzed 100 studies in approximately 13 minutes, and produced the same similarity values as those methods. It also scaled up to larger numbers of studies to analyze in the database with a small increase in computation time of just over 2 minutes. CONCLUSION: This system successfully analyzes a large database of radiation therapy studies and finds the one that is most similar to a new patient, which represents a potential step forward in achieving feasible adaptive radiation therapy replanning.
Now and Next-Generation Sequencing Techniques: Future of Sequence Analysis Using Cloud Computing
Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav
2012-01-01
Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed “cloud computing”) has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows. PMID:23248640
CERNBox + EOS: end-user storage for science
NASA Astrophysics Data System (ADS)
Mascetti, L.; Gonzalez Labrador, H.; Lamanna, M.; Mościcki, JT; Peters, AJ
2015-12-01
CERNBox is a cloud synchronisation service for end-users: it allows syncing and sharing files on all major mobile and desktop platforms (Linux, Windows, MacOSX, Android, iOS) aiming to provide offline availability to any data stored in the CERN EOS infrastructure. The successful beta phase of the service confirmed the high demand in the community for an easily accessible cloud storage solution such as CERNBox. Integration of the CERNBox service with the EOS storage back-end is the next step towards providing “sync and share” capabilities for scientific and engineering use-cases. In this report we will present lessons learnt in offering the CERNBox service, key technical aspects of CERNBox/EOS integration and new, emerging usage possibilities. The latter includes the ongoing integration of “sync and share” capabilities with the LHC data analysis tools and transfer services.
NASA Technical Reports Server (NTRS)
Moses, John F.; Memarsadeghi, Nargess; Overoye, David; Littlefield, Bryan
2017-01-01
The Global Learning and Observation to Benefit the Environment (GLOBE) Data and Information System supports an international science and education program with capabilities to accept local environment observations, archive, display and visualize them along with global satellite observations. Since its inception twenty years ago, the Web and database system has been upgraded periodically to accommodate the changes in technology and the steady growth of GLOBE's education community and collection of observations. Recently, near the end-of-life of the system hardware, new commercial computer platform options were explored and a decision was made to utilize Cloud services. The GLOBE DIS has now been fully deployed and maintained using Amazon Cloud services for over two years. This paper reviews the early risks, actual challenges, and some unexpected findings resulting from the GLOBE DIS migration. We describe the plans, cost drivers and estimates, highlight adjustments that were made and suggest improvements. We present the trade studies for provisioning, load balancing, networks, processing, storage, as well as production, staging and backup systems. We outline the migration team's skills and the required level of effort for transition, and the resulting changes in the overall maintenance and operations activities. Examples include incremental adjustments to processing capacity and frequency of backups, and efforts previously expended on hardware maintenance that were refocused onto application-specific enhancements.
A European Federated Cloud: Innovative distributed computing solutions by EGI
NASA Astrophysics Data System (ADS)
Sipos, Gergely; Turilli, Matteo; Newhouse, Steven; Kacsuk, Peter
2013-04-01
The European Grid Infrastructure (EGI) is the result of pioneering work that has, over the last decade, built a collaborative production infrastructure of uniform services through the federation of national resource providers that supports multi-disciplinary science across Europe and around the world. This presentation will provide an overview of the recently established 'federated cloud computing services' that the National Grid Initiatives (NGIs), operators of EGI, offer to scientific communities. The presentation will explain the technical capabilities of the 'EGI Federated Cloud' and the processes whereby earth and space science researchers can engage with it. EGI's resource centres have been providing services for collaborative, compute- and data-intensive applications for over a decade. Besides the well-established 'grid services', several NGIs already offer privately run cloud services to their national researchers. Many of these researchers recently expressed the need to share these cloud capabilities within their international research collaborations - a model similar to the way the grid emerged through the federation of institutional batch computing and file storage servers. To facilitate the setup of a pan-European cloud service from the NGIs' resources, the EGI-InSPIRE project established a Federated Cloud Task Force in September 2011. The Task Force has a mandate to identify and test technologies for a multinational federated cloud that could be provisioned within EGI by the NGIs. A guiding principle for the EGI Federated Cloud is to remain technology neutral and flexible for both resource providers and users: • Resource providers are allowed to use any cloud hypervisor and management technology to join virtualised resources into the EGI Federated Cloud as long as the site is subscribed to the user-facing interfaces selected by the EGI community. • Users can integrate high level services - such as brokers, portals and customised Virtual Research Environments - with the EGI Federated Cloud as long as these services access cloud resources through the user-facing interfaces selected by the EGI community. The Task Force will be closed in May 2013. It already • Identified key enabling technologies by which a multinational, federated 'Infrastructure as a Service' (IaaS) type cloud can be built from the NGIs' resources; • Deployed a test bed to evaluate the integration of virtualised resources within EGI and to engage with early adopter use cases from different scientific domains; • Integrated cloud resources into the EGI production infrastructure through cloud specific bindings of the EGI information system, monitoring system, authentication system, etc.; • Collected and catalogued requirements concerning the federated cloud services from the feedback of early adopter use cases; • Provided feedback and requirements to relevant technology providers on their implementations and worked with these providers to address those requirements; • Identified issues that need to be addressed by other areas of EGI (such as portal solutions, resource allocation policies, marketing and user support) to reach a production system. The Task Force will publish a blueprint in April 2013. The blueprint will drive the establishment of a production level EGI Federated Cloud service after May 2013.
Distance Learning and Cloud Computing: "Just Another Buzzword or a Major E-Learning Breakthrough?"
ERIC Educational Resources Information Center
Romiszowski, Alexander J.
2012-01-01
"Cloud computing is a model for the enabling of ubiquitous, convenient, and on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and other services) that can be rapidly provisioned and released with minimal management effort or service provider interaction." This…
Computational biology in the cloud: methods and new insights from computing at scale.
Kasson, Peter M
2013-01-01
The past few years have seen both explosions in the size of biological data sets and the proliferation of new, highly flexible on-demand computing capabilities. The sheer amount of information available from genomic and metagenomic sequencing, high-throughput proteomics, experimental and simulation datasets on molecular structure and dynamics affords an opportunity for greatly expanded insight, but it creates new challenges of scale for computation, storage, and interpretation of petascale data. Cloud computing resources have the potential to help solve these problems by offering a utility model of computing and storage: near-unlimited capacity, the ability to burst usage, and cheap and flexible payment models. Effective use of cloud computing on large biological datasets requires dealing with non-trivial problems of scale and robustness, since performance-limiting factors can change substantially when a dataset grows by a factor of 10,000 or more. New computing paradigms are thus often needed. The use of cloud platforms also creates new opportunities to share data, reduce duplication, and to provide easy reproducibility by making the datasets and computational methods easily available.
NASA Astrophysics Data System (ADS)
van Lew, Baldur; Botha, Charl P.; Milles, Julien R.; Vrooman, Henri A.; van de Giessen, Martijn; Lelieveldt, Boudewijn P. F.
2015-03-01
The cohort size required in epidemiological imaging genetics studies often mandates the pooling of data from multiple hospitals. Patient data, however, is subject to strict privacy protection regimes, and physical data storage may be legally restricted to a hospital network. To enable biomarker discovery, fast data access and interactive data exploration must be combined with high-performance computing resources, while respecting privacy regulations. We present a system using fast and inherently secure light-paths to access distributed data, thereby obviating the need for a central data repository. A secure private cloud computing framework facilitates interactive, computationally intensive exploration of this geographically distributed, privacy-sensitive data. As a proof of concept, MRI brain imaging data hosted at two remote sites were processed in response to a user command at a third site. The system was able to automatically start virtual machines, run a selected processing pipeline and write results to a user-accessible database, while keeping the data locally stored in the hospitals. Individual tasks took approximately 50% longer compared to a locally hosted blade server, but the cloud infrastructure reduced the total elapsed time by a factor of 40 using 70 virtual machines in the cloud. We demonstrated that the combination of light-paths and a private cloud is a viable means of building an analysis infrastructure for secure data analysis. The system requires further work in the areas of error handling, load balancing and secure support of multiple users.
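A quick back-of-envelope check of the reported figures (my arithmetic, not the authors'): with each task roughly 1.5 times slower per VM and 70 VMs working in parallel, the ideal speedup is 70/1.5 ≈ 47, so the observed factor of 40 corresponds to roughly 85% parallel efficiency.

```python
# Back-of-envelope consistency check of the reported cloud speedup (figures from the abstract).
per_task_slowdown = 1.5   # individual tasks took ~50% longer than on the local blade server
n_vms = 70                # virtual machines used in the cloud
observed_speedup = 40     # reported reduction in total elapsed time
ideal_speedup = n_vms / per_task_slowdown
print(f"ideal speedup ~ {ideal_speedup:.1f}x, parallel efficiency ~ {observed_speedup / ideal_speedup:.0%}")
```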
Efficient LIDAR Point Cloud Data Managing and Processing in a Hadoop-Based Distributed Framework
NASA Astrophysics Data System (ADS)
Wang, C.; Hu, F.; Sha, D.; Han, X.
2017-10-01
Light Detection and Ranging (LiDAR) is one of the most promising technologies in surveying and mapping, city management, forestry, object recognition, computer vision engineering and others. However, it is challenging to efficiently store, query and analyze the high-resolution 3D LiDAR data due to its volume and complexity. In order to improve the productivity of LiDAR data processing, this study proposes a Hadoop-based framework to efficiently manage and process LiDAR data in a distributed and parallel manner, which takes advantage of Hadoop's storage and computing ability. At the same time, the Point Cloud Library (PCL), an open-source project for 2D/3D image and point cloud processing, is integrated with HDFS and MapReduce to conduct the LiDAR data analysis algorithms provided by PCL in a parallel fashion. The experiment results show that the proposed framework can efficiently manage and process big LiDAR data.
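To make the MapReduce side of such a framework concrete, here is a minimal Hadoop Streaming style mapper and reducer in Python that bin ASCII "x y z" LiDAR points into fixed-size tiles. The plain-text input format, 100 m tile size and per-tile point count are illustrative assumptions; a real job would hand each tile's points to a PCL routine rather than merely count them.

```python
#!/usr/bin/env python3
# mapper.py -- Hadoop Streaming mapper: bin LiDAR points (ASCII "x y z" lines) into 100 m tiles.
# The tile size and plain-text input format are illustrative assumptions, not from the paper.
import sys

TILE = 100.0  # tile edge length in metres

for line in sys.stdin:
    try:
        x, y, z = map(float, line.split()[:3])
    except ValueError:
        continue  # skip malformed lines
    key = f"{int(x // TILE)}_{int(y // TILE)}"
    print(f"{key}\t{x} {y} {z}")
```

```python
#!/usr/bin/env python3
# reducer.py -- count points per tile (a real job would instead run a PCL filter on each tile).
import sys

current, count = None, 0
for line in sys.stdin:
    key, _ = line.rstrip("\n").split("\t", 1)
    if key != current:
        if current is not None:
            print(f"{current}\t{count}")
        current, count = key, 0
    count += 1
if current is not None:
    print(f"{current}\t{count}")
```

Scripts of this kind are typically launched with the hadoop-streaming jar, passing the two files via the -mapper and -reducer options.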
Task 28: Web Accessible APIs in the Cloud Trade Study
NASA Technical Reports Server (NTRS)
Gallagher, James; Habermann, Ted; Jelenak, Aleksandar; Lee, Joe; Potter, Nathan; Yang, Muqun
2017-01-01
This study explored three candidate architectures for serving NASA Earth Science Hierarchical Data Format Version 5 (HDF5) data via Hyrax running on Amazon Web Services (AWS). We studied the cost and performance of each architecture using several representative Use-Cases. The objectives of the project were to: conduct a trade study to identify one or more high-performance integrated solutions for storing and retrieving NASA HDF5 and Network Common Data Format Version 4 (netCDF4) data in a cloud (web object store) environment, with the target environment being Amazon Web Services (AWS) Simple Storage Service (S3); conduct the level of software development needed to properly evaluate solutions in the trade study and to obtain the benchmarking metrics required as input to a government decision on potential follow-on prototyping; and develop a cloud cost model for the preferred data storage solution (or solutions) that accounts for different granulation and aggregation schemes as well as cost and performance trades.
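As a small illustration of direct object-store access to such data (a generic sketch, not one of the study's three candidate architectures; the bucket, key and dataset names are invented), an HDF5 granule in S3 can be opened with s3fs and read with h5py:

```python
# Read one dataset from an HDF5 granule stored in S3 (bucket, key and dataset names are hypothetical).
import h5py
import s3fs

fs = s3fs.S3FileSystem(anon=True)  # public bucket assumed; pass credentials otherwise
with fs.open("example-earthdata-bucket/granules/sample_granule.h5", "rb") as f:
    with h5py.File(f, "r") as h5:
        temps = h5["/science/temperature"][:]   # load the full array into memory
        print(temps.shape, temps.dtype)
```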
Design and Implementation of Website Information Disclosure Assessment System
Cho, Ying-Chiang; Pan, Jen-Yi
2015-01-01
Internet application technologies, such as cloud computing and cloud storage, have increasingly changed people’s lives. Websites contain vast amounts of personal privacy information. In order to protect this information, network security technologies, such as database protection and data encryption, attract many researchers. The most serious problems concerning web vulnerability are e-mail address and network database leakages. These leakages have many causes. For example, malicious users can steal database contents, taking advantage of mistakes made by programmers and administrators. In order to mitigate this type of abuse, a website information disclosure assessment system is proposed in this study. This system utilizes a series of technologies, such as web crawler algorithms, SQL injection attack detection, and web vulnerability mining, to assess a website’s information disclosure. Thirty websites, randomly sampled from the top 50 world colleges, were used to collect leakage information. This testing showed the importance of increasing the security and privacy of website information for academic websites. PMID:25768434
Design and Implement of Astronomical Cloud Computing Environment In China-VO
NASA Astrophysics Data System (ADS)
Li, Changhua; Cui, Chenzhou; Mi, Linying; He, Boliang; Fan, Dongwei; Li, Shanshan; Yang, Sisi; Xu, Yunfei; Han, Jun; Chen, Junyi; Zhang, Hailong; Yu, Ce; Xiao, Jian; Wang, Chuanjun; Cao, Zihuang; Fan, Yufeng; Liu, Liang; Chen, Xiao; Song, Wenming; Du, Kangyu
2017-06-01
Astronomy cloud computing environment is a cyber-infrastructure for astronomy research initiated by the Chinese Virtual Observatory (China-VO) under funding support from NDRC (National Development and Reform Commission) and CAS (Chinese Academy of Sciences). Based on virtualization technology, the astronomy cloud computing environment was designed and implemented by the China-VO team. It consists of five distributed nodes across the mainland of China. Astronomers can get computing and storage resources in this cloud computing environment. Through this environment, astronomers can easily search and analyze astronomical data collected by different telescopes and data centers, and avoid large-scale dataset transportation.
Retrieving and Indexing Spatial Data in the Cloud Computing Environment
NASA Astrophysics Data System (ADS)
Wang, Yonggang; Wang, Sheng; Zhou, Daliang
In order to solve the drawbacks of spatial data storage in common Cloud Computing platforms, we design and present a framework for retrieving, indexing, accessing and managing spatial data in the Cloud environment. An interoperable spatial data object model is provided based on the Simple Feature Coding Rules from the OGC, such as Well Known Binary (WKB) and Well Known Text (WKT). The classic spatial indexing algorithms, such as Quad-Tree and R-Tree, are re-designed for the Cloud Computing environment. Finally, we develop prototype software based on Google App Engine to implement the proposed model.
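The following sketch illustrates the kind of encoding and quadtree-style keying the abstract refers to, as a generic construction rather than the paper's actual design: a WKT point is parsed with Shapely, serialized to WKB for interoperable storage, and assigned a quadtree cell key that could serve as a row key in a cloud key-value store (the root cell and key depth are arbitrary choices).

```python
# Encode a WKT geometry to WKB and compute a simple quadtree cell key for its location.
# The [-180, 180] x [-90, 90] root cell and the key depth are illustrative choices.
from shapely import wkt

def quadkey(lon, lat, depth=12):
    """Return a quadtree key (one digit '0'-'3' per level) for a lon/lat point."""
    west, east, south, north = -180.0, 180.0, -90.0, 90.0
    key = []
    for _ in range(depth):
        mid_lon, mid_lat = (west + east) / 2, (south + north) / 2
        quadrant = 0
        if lon >= mid_lon:
            quadrant += 1
            west = mid_lon
        else:
            east = mid_lon
        if lat < mid_lat:
            quadrant += 2
            north = mid_lat
        else:
            south = mid_lat
        key.append(str(quadrant))
    return "".join(key)

geom = wkt.loads("POINT (116.391 39.907)")      # interoperable WKT input
record = {
    "rowkey": quadkey(geom.x, geom.y),          # quadtree-derived key enables prefix/range scans
    "wkb": geom.wkb.hex(),                      # compact, interoperable WKB payload
}
print(record["rowkey"], record["wkb"][:16], "...")
```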
AstroCloud, a Cyber-Infrastructure for Astronomy Research: Data Access and Interoperability
NASA Astrophysics Data System (ADS)
Fan, D.; He, B.; Xiao, J.; Li, S.; Li, C.; Cui, C.; Yu, C.; Hong, Z.; Yin, S.; Wang, C.; Cao, Z.; Fan, Y.; Mi, L.; Wan, W.; Wang, J.
2015-09-01
The data access and interoperability module connects the observation proposals, data, virtual machines and software. According to the unique identifier of the PI (principal investigator), an email address or an internal ID, data can be collected by the PI's proposals, or through the search interfaces, e.g. cone search. Files associated with the search results can easily be transferred to cloud storage, including the storage attached to virtual machines or several commercial platforms such as Dropbox. Benefiting from the standards of the IVOA (International Virtual Observatory Alliance), VOTable-formatted search results can be sent to various kinds of VO software. Later efforts will try to integrate more data and connect archives and some other astronomical resources.
A novel data storage logic in the cloud
Mátyás, Bence; Szarka, Máté; Járvás, Gábor; Kusper, Gábor; Argay, István; Fialowski, Alice
2016-01-01
Databases which store and manage long-term scientific information related to life science are used to store huge amounts of quantitative attributes. Introduction of a new entity attribute requires modification of the existing data tables and the programs that use these data tables. The solution is to increase the virtual data tables while the number of screens remains the same. The main objective of the present study was to introduce a logic called Joker Tao (JT) which provides universal data storage for cloud-based databases. It means all types of input data can be interpreted as an entity and attribute at the same time, in the same data table. PMID:29026521
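For background, a generic entity-attribute-value layout conveys the flavor of storing every value by entity and attribute in one universal table; the SQLite sketch below is my own illustration of that general idea under assumed column names, not the Joker Tao implementation itself.

```python
# Generic entity-attribute-value (EAV) storage: one table holds every attribute of every entity,
# so introducing a new attribute never requires ALTER TABLE. This is an illustrative sketch of
# the general idea, not the Joker Tao logic described in the paper.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE universal_store (
        entity    TEXT NOT NULL,   -- e.g. a sample, experiment or organism identifier
        attribute TEXT NOT NULL,   -- e.g. 'species', 'ph', 'dry_weight_g'
        value     TEXT,            -- stored as text; typing handled at the application layer
        PRIMARY KEY (entity, attribute)
    )
""")
rows = [
    ("sample_001", "species", "Triticum aestivum"),
    ("sample_001", "ph", "6.8"),
    ("sample_001", "dry_weight_g", "1.42"),
    ("sample_002", "species", "Zea mays"),   # a brand-new attribute would slot in the same way
]
conn.executemany("INSERT INTO universal_store VALUES (?, ?, ?)", rows)
for entity, attribute, value in conn.execute(
        "SELECT entity, attribute, value FROM universal_store WHERE entity = ?", ("sample_001",)):
    print(entity, attribute, value)
```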
Tagliaferri, Luca; Gobitti, Carlo; Colloca, Giuseppe Ferdinando; Boldrini, Luca; Farina, Eleonora; Furlan, Carlo; Paiar, Fabiola; Vianello, Federica; Basso, Michela; Cerizza, Lorenzo; Monari, Fabio; Simontacchi, Gabriele; Gambacorta, Maria Antonietta; Lenkowicz, Jacopo; Dinapoli, Nicola; Lanzotti, Vito; Mazzarotto, Renzo; Russi, Elvio; Mangoni, Monica
2018-07-01
The big data approach offers a powerful alternative to evidence-based medicine. This approach could guide cancer management thanks to the application of machine learning to large-scale data. The aim of the Thyroid CoBRA (Consortium for Brachytherapy Data Analysis) project is to develop a standardized web data collection system, focused on thyroid cancer. The Metabolic Radiotherapy Working Group of the Italian Association of Radiation Oncology (AIRO) endorsed the implementation of a consortium directed to thyroid cancer management and data collection. The agreement conditions, the ontology of the collected data and the related software services were defined by a multicentre ad hoc working group (WG). Six Italian cancer centres initially started the project and defined and signed the Thyroid COBRA consortium agreement. Three data set tiers were identified: Registry, Procedures and Research. The COBRA-Storage System (C-SS) appeared not to be time-consuming and to respect privacy, as data can be extracted directly from each centre's storage platforms through a secured connection that ensures reliable encryption of sensitive data. Automatic data archiving could be performed directly from the Hospital Image Storage System or the Radiotherapy Treatment Planning Systems. The C-SS architecture will allow "Cloud storage way" or "distributed learning" approaches for predictive model definition and the further development of clinical decision support tools. The development of the Thyroid COBRA data Storage System (C-SS) through a multicentre consortium approach appeared to be a feasible way to set up a complex, privacy-preserving data-sharing system oriented to the management of thyroid cancer and, in the near future, of every cancer type. Copyright © 2018 European Federation of Internal Medicine. Published by Elsevier B.V. All rights reserved.
Efficient and secure outsourcing of genomic data storage.
Sousa, João Sá; Lefebvre, Cédric; Huang, Zhicong; Raisaro, Jean Louis; Aguilar-Melchor, Carlos; Killijian, Marc-Olivier; Hubaux, Jean-Pierre
2017-07-26
Cloud computing is becoming the preferred solution for efficiently dealing with the increasing amount of genomic data. Yet, outsourcing storage and processing sensitive information, such as genomic data, comes with important concerns related to privacy and security. This calls for new sophisticated techniques that ensure data protection from untrusted cloud providers and that still enable researchers to obtain useful information. We present a novel privacy-preserving algorithm for fully outsourcing the storage of large genomic data files to a public cloud and enabling researchers to efficiently search for variants of interest. In order to protect data and query confidentiality from possible leakage, our solution exploits optimal encoding for genomic variants and combines it with homomorphic encryption and private information retrieval. Our proposed algorithm is implemented in C++ and was evaluated on real data as part of the 2016 iDash Genome Privacy-Protection Challenge. Results show that our solution outperforms the state-of-the-art solutions and enables researchers to search over millions of encrypted variants in a few seconds. As opposed to prior beliefs that sophisticated privacy-enhancing technologies (PETs) are unpractical for real operational settings, our solution demonstrates that, in the case of genomic data, PETs are very efficient enablers.
Cloud cover archiving on a global scale - A discussion of principles
NASA Technical Reports Server (NTRS)
Henderson-Sellers, A.; Hughes, N. A.; Wilson, M.
1981-01-01
Monitoring of climatic variability and climate modeling both require a reliable global cloud data set. Examination is made of the temporal and spatial variability of cloudiness in light of recommendations made by GARP in 1975 (and updated by JOC in 1978 and 1980) for cloud data archiving. An examination of the methods of comparing cloud cover frequency curves suggests that the use of the beta distribution not only facilitates objective comparison, but also reduces overall storage requirements. A specific study of the only current global cloud climatology (the U.S. Air Force's 3-dimensional nephanalysis) over the United Kingdom indicates that discussion of methods of validating satellite-based data sets is urgently required.
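To illustrate the storage argument (a synthetic example, not GARP or 3DNEPH data): fitting a beta distribution to a grid cell's cloud-cover frequency data reduces the archive for that cell to two parameters.

```python
# Fit a beta distribution to fractional cloud-cover observations so a grid cell can be
# archived as two parameters (a, b) instead of a full frequency histogram.
# The synthetic observations below are illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
cloud_fraction = np.clip(rng.beta(2.0, 5.0, size=1000), 1e-6, 1 - 1e-6)  # synthetic obs in (0, 1)

a, b, loc, scale = stats.beta.fit(cloud_fraction, floc=0, fscale=1)      # fix support to [0, 1]
print(f"archived parameters: a={a:.2f}, b={b:.2f}")
print(f"mean cloud cover: observed {cloud_fraction.mean():.3f}, fitted {a / (a + b):.3f}")
```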
NASA Astrophysics Data System (ADS)
Shamugam, Veeramani; Murray, I.; Leong, J. A.; Sidhu, Amandeep S.
2016-03-01
Cloud computing provides services on demand instantly, such as access to network infrastructure consisting of computing hardware, operating systems, network storage, databases and applications. Network usage and demands are growing at a very fast rate, and to meet the current requirements there is a need for automatic infrastructure scaling. Traditional networks are difficult to automate because of the distributed nature of their decision making processes for switching or routing, which are collocated on the same device. Managing complex environments using traditional networks is time-consuming and expensive, especially in the case of generating virtual machines, migration and network configuration. To mitigate these challenges, network operations require efficient, flexible, agile and scalable software defined networks (SDN). This paper discusses various issues in SDN and suggests how to mitigate the network management related issues. A private cloud prototype test bed was set up to implement SDN on the OpenStack platform in order to test and evaluate the network performance provided by the various configurations.
Role of the ATLAS Grid Information System (AGIS) in Distributed Data Analysis and Simulation
NASA Astrophysics Data System (ADS)
Anisenkov, A. V.
2018-03-01
In modern high-energy physics experiments, particular attention is paid to the global integration of information and computing resources into a unified system for efficient storage and processing of experimental data. Annually, the ATLAS experiment performed at the Large Hadron Collider at the European Organization for Nuclear Research (CERN) produces tens of petabytes of raw data from the recording electronics and several petabytes of data from the simulation system. For processing and storage of such super-large volumes of data, the computing model of the ATLAS experiment is based on a heterogeneous, geographically distributed computing environment, which includes the Worldwide LHC Computing Grid (WLCG) infrastructure and is able to meet the requirements of the experiment for processing huge data sets and providing a high degree of their accessibility (hundreds of petabytes). The paper considers the ATLAS Grid Information System (AGIS) used by the ATLAS collaboration to describe the topology and resources of the computing infrastructure, to configure and connect the high-level software systems of computer centers, and to describe and store all possible parameters, control, configuration, and other auxiliary information required for the effective operation of the ATLAS distributed computing applications and services. The role of the AGIS system in the development of a unified description of the computing resources provided by grid sites, supercomputer centers, and cloud computing into a consistent information model for the ATLAS experiment is outlined. This approach has allowed the collaboration to extend the computing capabilities of the WLCG project and integrate supercomputers and cloud computing platforms into the software components of the production and distributed analysis workload management system (PanDA, ATLAS).
CloudNeo: a cloud pipeline for identifying patient-specific tumor neoantigens.
Bais, Preeti; Namburi, Sandeep; Gatti, Daniel M; Zhang, Xinyu; Chuang, Jeffrey H
2017-10-01
We present CloudNeo, a cloud-based computational workflow for identifying patient-specific tumor neoantigens from next generation sequencing data. Tumor-specific mutant peptides can be detected by the immune system through their interactions with the human leukocyte antigen complex, and neoantigen presence has recently been shown to correlate with anti T-cell immunity and efficacy of checkpoint inhibitor therapy. However computing capabilities to identify neoantigens from genomic sequencing data are a limiting factor for understanding their role. This challenge has grown as cancer datasets become increasingly abundant, making them cumbersome to store and analyze on local servers. Our cloud-based pipeline provides scalable computation capabilities for neoantigen identification while eliminating the need to invest in local infrastructure for data transfer, storage or compute. The pipeline is a Common Workflow Language (CWL) implementation of human leukocyte antigen (HLA) typing using Polysolver or HLAminer combined with custom scripts for mutant peptide identification and NetMHCpan for neoantigen prediction. We have demonstrated the efficacy of these pipelines on Amazon cloud instances through the Seven Bridges Genomics implementation of the NCI Cancer Genomics Cloud, which provides graphical interfaces for running and editing, infrastructure for workflow sharing and version tracking, and access to TCGA data. The CWL implementation is at: https://github.com/TheJacksonLaboratory/CloudNeo. For users who have obtained licenses for all internal software, integrated versions in CWL and on the Seven Bridges Cancer Genomics Cloud platform (https://cgc.sbgenomics.com/, recommended version) can be obtained by contacting the authors. jeff.chuang@jax.org. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
An early warning system for high climate sensitivity? (Invited)
NASA Astrophysics Data System (ADS)
Pierrehumbert, R.
2010-12-01
The scientific case for the clear and present danger of global warming has been unassailable at least since the release of the Charney Report more than thirty years ago, if not longer. While prompt action to begin decarbonizing energy systems could still head off much of the potential warming, it is distinctly possible that emissions will continue unabated in the coming decades, leading to a doubling or more of pre-industrial carbon dioxide concentrations. At present, we are in the unenviable position of not even knowing how bad things will get if this scenario comes to pass, because of the uncertainty in climate sensitivity. If climate sensitivity is high, then the consequences will be dire, perhaps even catastrophic. As the world continues to warm in response to continued carbon dioxide emissions, will we at least be able to monitor the climate and provide an early warning that the planet is on a high-sensitivity track, if such turns out to be the case? At what point will we actually know the climate sensitivity? It has long been recognized that the prime contributor to uncertainty in climate sensitivity is uncertainty in cloud feedbacks. Study of paleoclimate and climate of the past century has not been able to resolve which models do cloud feedback most correctly, because of uncertainties in radiative forcing. In this talk, I will discuss monitoring requirements, and analysis techniques, that might have the potential to determine which climate models most faithfully represent climate feedbacks, and thus determine which models provide the best estimate of climate sensitivity. The endeavor is complicated by the distinction between transient climate response and equilibrium climate sensitivity. I will discuss the particular challenges posed by this issue, particularly in light of recent indications that the pattern of ocean heat storage may lead to different cloud feedbacks in the transient warming stage than apply once the system has reached equilibrium. Apart from this problem, the transient nature of climate response driven by increasing CO2 requires careful monitoring of ocean heat storage as well as top-of-atmosphere radiative budgets, if climate sensitivity is to be estimated. Water vapor feedback is not considered as uncertain as cloud feedback, but there is still a considerable potential for surprises. I will discuss microwave monitoring requirements for tracking water vapor feedback. At the other extreme, the longer term feedbacks that contribute to Earth System Sensitivity are even more uncertain than cloud feedbacks, particularly with regard to the terrestrial carbon cycle. Prospects for obtaining an early warning of a PETM-type organic carbon release seem bleak. Finally, I will discuss the particular challenge of obtaining an early warning of high climate sensitivity in the case that the climate system has a bifurcation.
Pervez, Zeeshan; Ahmad, Mahmood; Khattak, Asad Masood; Lee, Sungyoung; Chung, Tae Choong
2016-01-01
Privacy-aware search of outsourced data ensures relevant data access in the untrusted domain of a public cloud service provider. A subscriber of a public cloud storage service can determine the presence or absence of a particular keyword by submitting a search query in the form of a trapdoor. However, these trapdoor-based search queries are limited in functionality and cannot be used to identify secure outsourced data which contains semantically equivalent information. In addition, trapdoor-based methodologies are confined to pre-defined trapdoors and prevent subscribers from searching outsourced data with arbitrarily defined search criteria. To solve the problem of relevant data access, we have proposed an index-based privacy-aware search methodology that ensures semantic retrieval of data from an untrusted domain. This method ensures oblivious execution of a search query and leverages authorized subscribers to model conjunctive search queries without relying on predefined trapdoors. A security analysis of our proposed methodology shows that, in a conspired attack, unauthorized subscribers and untrusted cloud service providers cannot deduce any information that can lead to the potential loss of data privacy. A computational time analysis on commodity hardware demonstrates that our proposed methodology requires moderate computational resources to model a privacy-aware search query and for its oblivious evaluation on a cloud service provider. PMID:27571421
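For context on the trapdoor mechanism that the abstract contrasts its approach with, the toy sketch below models a keyword trapdoor as an HMAC of the keyword under a secret key; the key handling and index format are simplified assumptions, and this is the baseline scheme, not the paper's index-based semantic method.

```python
# Toy keyword-trapdoor search: the server stores HMAC(keyword) -> document ids and can answer
# membership queries without learning the keyword itself. This illustrates the baseline
# trapdoor approach discussed in the abstract, not the proposed index-based semantic scheme.
import hmac
import hashlib
from collections import defaultdict

SECRET_KEY = b"subscriber-secret-key"   # held only by the data owner / authorized subscribers

def trapdoor(keyword: str) -> str:
    return hmac.new(SECRET_KEY, keyword.lower().encode(), hashlib.sha256).hexdigest()

# Client side: build the encrypted index before outsourcing.
documents = {"doc1": "genomic privacy in the cloud", "doc2": "cloud storage cost model"}
encrypted_index = defaultdict(set)
for doc_id, text in documents.items():
    for word in text.split():
        encrypted_index[trapdoor(word)].add(doc_id)

# Server side: given only a trapdoor, return matching document ids.
def server_search(index, td):
    return index.get(td, set())

print(server_search(encrypted_index, trapdoor("privacy")))   # {'doc1'}
print(server_search(encrypted_index, trapdoor("weather")))   # set()
```

Its limitation is visible immediately: a trapdoor for "confidentiality" will not match documents indexed under "privacy", which is precisely the semantic gap the proposed methodology targets.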
NASA Astrophysics Data System (ADS)
Puzyrkov, Dmitry; Polyakov, Sergey; Podryga, Viktoriia; Markizov, Sergey
2018-02-01
At the present stage of computer technology development it is possible to study the properties and processes in complex systems at the molecular and even atomic levels, for example, by means of molecular dynamics methods. The most interesting problems are those related to the study of complex processes under real physical conditions. Solving such problems requires the use of high performance computing systems of various types, for example, GRID systems and HPC clusters. Given such time-consuming computational tasks, the need arises for software for automatic and unified monitoring of such computations. A complex computational task can be performed over different HPC systems. It requires output data synchronization between the storage chosen by a scientist and the HPC system used for computations. The design of the computational domain is also quite a problem. It requires complex software tools and algorithms for proper atomistic data generation on HPC systems. The paper describes a prototype of a cloud service intended for the design of large-volume atomistic systems for further detailed molecular dynamics calculations and for managing these computations, and presents the part of its concept aimed at initial data generation on the HPC systems.
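As a minimal illustration of the initial data generation step (a generic sketch under an assumed lattice, element and box size, not the service's actual generator), the following builds a simple cubic block of atoms and writes it in the XYZ format that most molecular dynamics codes accept:

```python
# Generate a simple cubic block of argon atoms and write it as an .xyz file.
# Lattice constant, element and box size are illustrative assumptions.
import itertools

A = 5.26          # lattice spacing in angstroms (roughly solid argon)
N = 10            # unit cells per edge -> 1000 atoms
ELEMENT = "Ar"

sites = [(i * A, j * A, k * A) for i, j, k in itertools.product(range(N), repeat=3)]

with open("argon_block.xyz", "w") as out:
    out.write(f"{len(sites)}\n")
    out.write(f"simple cubic {ELEMENT} block, a={A} A, {N}x{N}x{N} cells\n")
    for x, y, z in sites:
        out.write(f"{ELEMENT} {x:.4f} {y:.4f} {z:.4f}\n")
```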
A Hybrid Cloud Computing Service for Earth Sciences
NASA Astrophysics Data System (ADS)
Yang, C. P.
2016-12-01
Cloud Computing is becoming a norm for providing computing capabilities for advancing Earth sciences, including big Earth data management, processing, analytics, model simulations, and many other aspects. A hybrid spatiotemporal cloud computing service is built at the George Mason NSF spatiotemporal innovation center to meet these demands. This paper will report on the service, covering several aspects: 1) the hardware includes 500 computing servers and close to 2PB of storage, as well as connections to XSEDE Jetstream and the Caltech experimental cloud computing environment for sharing the resource; 2) the cloud service is geographically distributed across the east coast, west coast, and central region; 3) the cloud includes private clouds managed using OpenStack and Eucalyptus, with DC2 used to bridge these and the public AWS cloud for interoperability and for sharing computing resources when high demand surges; 4) the cloud service is used to support the NSF EarthCube program through the ECITE project, and ESIP through the ESIP cloud computing cluster, the semantics testbed cluster, and other clusters; 5) the cloud service is also available to the earth science communities to conduct geoscience research. A brief introduction about how to use the cloud service will be included.
Leveraging the Cloud for Robust and Efficient Lunar Image Processing
NASA Technical Reports Server (NTRS)
Chang, George; Malhotra, Shan; Wolgast, Paul
2011-01-01
The Lunar Mapping and Modeling Project (LMMP) is tasked to aggregate lunar data, from the Apollo era to the latest instruments on the LRO spacecraft, into a central repository accessible by scientists and the general public. A critical function of this task is to provide users with the best solution for browsing the vast amounts of imagery available. The image files LMMP manages range from a few gigabytes to hundreds of gigabytes in size, with new data arriving every day. Despite this ever-increasing amount of data, LMMP must make the data readily available in a timely manner for users to view and analyze. This is accomplished by tiling large images into smaller images using Hadoop, a distributed computing software platform implementation of the MapReduce framework, running on a small cluster of machines locally. Additionally, the software is implemented to use Amazon's Elastic Compute Cloud (EC2) facility. We also developed a hybrid solution to serve images to users by leveraging cloud storage using Amazon's Simple Storage Service (S3) for public data while keeping private information on our own data servers. By using Cloud Computing, we improve upon our local solution by reducing the need to manage our own hardware and computing infrastructure, thereby reducing costs. Further, by using a hybrid of local and cloud storage, we are able to provide data to our users more efficiently and securely. This paper examines the use of a distributed approach with Hadoop to tile images, an approach that provides significant improvements in image processing time, from hours to minutes. This paper describes the constraints imposed on the solution and the resulting techniques developed for the hybrid solution of a customized Hadoop infrastructure over local and cloud resources in managing this ever-growing data set. It examines the performance trade-offs of using the more plentiful resources of the cloud, such as those provided by S3, against the bandwidth limitations such use encounters with remote resources. As part of this discussion this paper will outline some of the technologies employed, the reasons for their selection, the resulting performance metrics and the direction the project is headed based upon the demonstrated capabilities thus far.
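A much-simplified, single-machine version of the tiling-and-publishing pattern described above might look as follows; the input filename, bucket name, tile size and key layout are invented, and the real LMMP pipeline performs this step inside Hadoop MapReduce tasks.

```python
# Cut a large image into fixed-size tiles and upload the public tiles to S3.
# Filename, bucket name, tile size and key layout are illustrative placeholders.
import io
import boto3
from PIL import Image

TILE = 512
BUCKET = "example-lunar-tiles"          # hypothetical public bucket
Image.MAX_IMAGE_PIXELS = None           # allow very large mosaics

s3 = boto3.client("s3")
img = Image.open("lunar_mosaic.tif")
width, height = img.size

for top in range(0, height, TILE):
    for left in range(0, width, TILE):
        tile = img.crop((left, top, min(left + TILE, width), min(top + TILE, height)))
        buf = io.BytesIO()
        tile.save(buf, format="PNG")
        key = f"tiles/{top // TILE}/{left // TILE}.png"
        s3.put_object(Bucket=BUCKET, Key=key, Body=buf.getvalue(), ContentType="image/png")
```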
ERIC Educational Resources Information Center
Paquet, Katherine G.
2013-01-01
Cloud computing may provide cost benefits for organizations by eliminating the overhead costs of software, hardware, and maintenance (e.g., license renewals, upgrading software, servers and their physical storage space, administration along with funding a large IT department). In addition to the promised savings, the organization may require…
Biomedical cloud computing with Amazon Web Services.
Fusaro, Vincent A; Patil, Prasad; Gafni, Erik; Wall, Dennis P; Tonellato, Peter J
2011-08-01
In this overview of biomedical computing in the cloud, we discussed two primary ways to use the cloud (a single instance or a cluster), provided a detailed example using NGS mapping, and highlighted the associated costs. While many users new to the cloud may assume that entry is as straightforward as uploading an application and selecting an instance type and storage options, we illustrated that there is substantial up-front effort required before an application can make full use of the cloud's vast resources. Our intention was to provide a set of best practices and to illustrate how those apply to a typical application pipeline for biomedical informatics, but also general enough for extrapolation to other types of computational problems. Our mapping example was intended to illustrate how to develop a scalable project and not to compare and contrast alignment algorithms for read mapping and genome assembly. Indeed, with a newer aligner such as Bowtie, it is possible to map the entire African genome using one m2.2xlarge instance in 48 hours for a total cost of approximately $48 in computation time. In our example, we were not concerned with data transfer rates, which are heavily influenced by the amount of available bandwidth, connection latency, and network availability. When transferring large amounts of data to the cloud, bandwidth limitations can be a major bottleneck, and in some cases it is more efficient to simply mail a storage device containing the data to AWS (http://aws.amazon.com/importexport/). More information about cloud computing, detailed cost analysis, and security can be found in the references.
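A minimal sketch of the single-instance usage pattern with boto3 is shown below; the AMI id, key pair and other settings are placeholders rather than values from the paper, and m2.2xlarge is the legacy instance type the abstract cites (its quoted $48 for 48 hours implies roughly $1 per instance-hour).

```python
# Launch a single EC2 instance for an analysis run and wait until it is running.
# The AMI id and key pair are placeholders, not values from the paper.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",     # placeholder AMI with the mapping pipeline installed
    InstanceType="m2.2xlarge",           # legacy high-memory type cited in the abstract
    KeyName="analysis-keypair",
    MinCount=1,
    MaxCount=1,
)
instance_id = resp["Instances"][0]["InstanceId"]
ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
print("instance ready:", instance_id)
```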
Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses
Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T
2014-01-01
Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. PMID:24462600
Facile Generation and Storage of Polycyclic Aromatic Hydrocarbon Ions in Astrophysical Ices
NASA Technical Reports Server (NTRS)
Gudipati, Murthy S.; Allamandola, Louis J.
2003-01-01
In situ ultraviolet-visible absorption and emission studies of vacuum ultraviolet (VUV) irradiated water-rich, cosmic ice analogs containing polycyclic aromatic hydrocarbons (PAHs) are described. VUV irradiation of 12 K water ices containing the PAHs naphthalene (H2O/C10H8 = 200) and 4-methylpyrene (H2O/C17H12 > 500) readily converts the PAHs into their cation form (PAH(+)). Under these conditions, PAH photoionization is the predominant reaction. These ions are trapped and stored in the ices at temperatures between 10 and 50 K, a temperature domain common to ices throughout interstellar clouds and the solar system. Unlike the approximately 15% ionization typical after VUV irradiation of PAHs isolated in rare-gas matrices, in water ice PAH photoionization and storage proceed efficiently and almost quantitatively, with a greater than 70% ionization yield. As the temperature is increased from 50 to 150 K, the PAH ion bands slowly diminish as the PAH ions ultimately react to form more complex organic species involving the water host. The chemical, spectroscopic, and physical properties of these ion-rich ices can be important in icy objects such as molecular clouds, comets, and planets. Several astrophysical applications are presented.
Molten Boron Phase-Change Thermal Energy Storage to Augment Solar Thermal Propulsion Systems
2011-07-22
during the 50 psi case included bubble clouds somewhat similar to the "popcorn" and "jellyfish" formations observed at ambient-pressure conditions ... between the "jellyfish" and "popcorn" was lost -- popcorn formations were generally longer-lived, often traversing a significant portion of the field of ... popcorn-like bubbles were generally swirling in/around the jellyfish formations. The turbulent wake of some of the jellyfish-like formations could
The CloudBoard Research Platform: an interactive whiteboard for corporate users
NASA Astrophysics Data System (ADS)
Barrus, John; Schwartz, Edward L.
2013-03-01
Over one million interactive whiteboards (IWBs) are sold annually worldwide, predominantly for classroom use, with few sales for corporate use. Unmet needs for corporate IWB use were investigated, and the CloudBoard Research Platform (CBRP) was developed to investigate and test technology for meeting these needs. The CBRP supports audio conferencing with shared remote drawing activity, casual capture of whiteboard activity for long-term storage and retrieval, use of standard formats such as PDF for easy import of documents via the web and email, and easy export of documents. Company RFID badges and key fobs provide secure access to documents at the board, and automatic logout occurs after a period of inactivity. Users manage their documents with a web browser. Analytics and remote device management are provided for administrators. The IWB hardware consists of off-the-shelf components (a Hitachi UST Projector, SMART Technologies, Inc. IWB hardware, Mac Mini, Polycom speakerphone, etc.) and a custom occupancy sensor. The three back-end servers provide the web interface, document storage, and stroke and audio streaming. Ease of use, security, and robustness sufficient for internal adoption were achieved. Five of the 10 boards installed at various Ricoh sites have been in daily or weekly use for the past year, and total system downtime was less than an hour in 2012. Since CBRP was installed, 65 registered users, 9 of whom use the system regularly, have created over 2600 documents.
Smartphone-coupled rhinolaryngoscopy at the point of care
NASA Astrophysics Data System (ADS)
Mink, Jonah; Bolton, Frank J.; Sebag, Cathy M.; Peterson, Curtis W.; Assia, Shai; Levitz, David
2018-02-01
Rhinolaryngoscopy remains difficult to perform in resource-limited settings due to the high cost of purchasing and maintaining equipment as well as the need for specialists to interpret exam findings. While the lack of expertise can be obviated by adopting telemedicine-based approaches, the capture, storage, and sharing of images/video is not a common native functionality of medical devices. Most rhinolaryngoscopy systems consist of an endoscope that interfaces with the patient's naso/oropharynx, and a tower of modules that record video/images. However, these expensive and bulky modules can be replaced by a smartphone that can fulfill the same functions but at a lower cost. To demonstrate this, a commercially available rhinolaryngoscope was coupled to a smartphone using a 3D-printed adapter. Software developed for other clinical applications was repurposed for ENT use, including an application that controls image and video capture, a HIPAA-compliant image/video storage and transfer cloud database, and customized software features developed to improve practitioner competency. Audio recording capabilities to assess speech pathology were also integrated into the smartphone rhinolaryngoscope system. The illumination module coupled onto the endoscope remained unchanged. The spatial resolution of the rhinolaryngoscope system was defined by the fiber diameter of endoscope fiber bundle, rather than the smartphone camera. The mobile rhinolaryngoscope system was used with appropriate patients by a general practitioner in an office setting. The general practitioner then consulted with an ENT specialist via the HIPAA compliant cloud database and workflow modules on difficult cases. These results suggest the smartphone-based rhinolaryngoscope holds promise for use in low-resource settings.
The impact of short-term stochastic variability in solar irradiance on optimal microgrid design
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schittekatte, Tim; Stadler, Michael; Cardoso, Gonçalo
2016-07-01
This paper proposes a new methodology to capture the impact of fast moving clouds on utility power demand charges observed in microgrids with photovoltaic (PV) arrays, generators, and electrochemical energy storage. It consists of a statistical approach to introduce sub-hourly events in the hourly economic accounting process. The methodology is implemented in the Distributed Energy Resources Customer Adoption Model (DER-CAM), a state of the art mixed integer linear model used to optimally size DER in decentralized energy systems. Results suggest that previous iterations of DER-CAM could undersize battery capacities. The improved model depicts more accurately the economic value of PV as well as the synergistic benefits of pairing PV with storage.
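To make the mechanism concrete (a toy numerical illustration with invented numbers, not DER-CAM itself): demand charges are typically billed on the highest short-interval average demand in a month, so a few minutes of PV output lost to a passing cloud can set the monthly peak unless storage rides through the dip.

```python
# Toy illustration of why sub-hourly cloud transients matter for demand charges.
# All numbers are made up for illustration; DER-CAM treats this statistically inside a MILP.
import numpy as np

minutes = np.arange(60)
load_kw = np.full(60, 400.0)                 # flat building load over one hour
pv_kw = np.full(60, 150.0)
pv_kw[20:25] = 20.0                          # a 5-minute cloud transient knocks out most PV output

net_no_storage = load_kw - pv_kw
battery_kw = np.where(pv_kw < 100.0, 120.0, 0.0)   # battery discharges to cover the dip
net_with_storage = load_kw - pv_kw - battery_kw

def billed_peak(net, window=15):
    """Peak of the rolling 15-minute averages, as many demand-charge tariffs are metered."""
    return max(net[i:i + window].mean() for i in range(0, len(net) - window + 1))

print(f"hourly-average view:       {net_no_storage.mean():6.1f} kW")
print(f"15-min peak, no storage:   {billed_peak(net_no_storage):6.1f} kW")
print(f"15-min peak, with storage: {billed_peak(net_with_storage):6.1f} kW")
```

With hourly accounting the transient is averaged away, while the 15-minute metered peak rises noticeably without storage, which is exactly the effect that can lead to undersized batteries.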
A Survey on Data Storage and Information Discovery in the WSANs-Based Edge Computing Systems
Liang, Junbin; Liu, Renping; Ni, Wei; Li, Yin; Li, Ran; Ma, Wenpeng; Qi, Chuanda
2018-01-01
In the post-Cloud era, the proliferation of Internet of Things (IoT) has pushed the horizon of Edge computing, which is a new computing paradigm with data processed at the edge of the network. As the important systems of Edge computing, wireless sensor and actuator networks (WSANs) play an important role in collecting and processing the sensing data from the surrounding environment as well as taking actions on the events happening in the environment. In WSANs, in-network data storage and information discovery schemes with high energy efficiency, high load balance and low latency are needed because of the limited resources of the sensor nodes and the real-time requirement of some specific applications, such as putting out a big fire in a forest. In this article, the existing schemes of WSANs on data storage and information discovery are surveyed with detailed analysis on their advancements and shortcomings, and possible solutions are proposed on how to achieve high efficiency, good load balance, and perfect real-time performances at the same time, hoping that it can provide a good reference for the future research of the WSANs-based Edge computing systems. PMID:29439442
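A classic example of the in-network, data-centric storage idea surveyed here is geographic-hash-table storage, in which an event type is hashed to a location and the reading is stored at the node nearest that point; the sketch below is a generic illustration (node layout, area size and hash choice are assumptions), not a scheme taken from the survey.

```python
# Minimal geographic-hash-table style sketch: an event type is hashed to a point in the
# deployment area and the reading is stored at (and later fetched from) the node nearest
# that point. Node layout, area size and hash choice are illustrative assumptions.
import hashlib
import random

AREA = 100.0                                   # square deployment area, side length in metres
random.seed(1)
nodes = [(random.uniform(0, AREA), random.uniform(0, AREA)) for _ in range(50)]

def hash_location(event_type: str):
    digest = hashlib.sha256(event_type.encode()).digest()
    x = int.from_bytes(digest[:4], "big") / 2**32 * AREA
    y = int.from_bytes(digest[4:8], "big") / 2**32 * AREA
    return x, y

def home_node(event_type: str):
    hx, hy = hash_location(event_type)
    return min(nodes, key=lambda n: (n[0] - hx) ** 2 + (n[1] - hy) ** 2)

store = {}

def put(event_type, reading):                  # producer side: store at the home node
    store.setdefault(home_node(event_type), {}).setdefault(event_type, []).append(reading)

def get(event_type):                           # consumer side: query the same home node
    return store.get(home_node(event_type), {}).get(event_type, [])

put("fire", {"temp_c": 93, "pos": (12.0, 44.5)})
print(home_node("fire"), get("fire"))
```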
A Survey on Data Storage and Information Discovery in the WSANs-Based Edge Computing Systems.
Ma, Xingpo; Liang, Junbin; Liu, Renping; Ni, Wei; Li, Yin; Li, Ran; Ma, Wenpeng; Qi, Chuanda
2018-02-10
In the post-Cloud era, the proliferation of Internet of Things (IoT) has pushed the horizon of Edge computing, which is a new computing paradigm in which data are processed at the edge of the network. As the important systems of Edge computing, wireless sensor and actuator networks (WSANs) play an important role in collecting and processing the sensing data from the surrounding environment as well as taking actions on the events happening in the environment. In WSANs, in-network data storage and information discovery schemes with high energy efficiency, high load balance and low latency are needed because of the limited resources of the sensor nodes and the real-time requirement of some specific applications, such as putting out a big fire in a forest. In this article, the existing schemes of WSANs on data storage and information discovery are surveyed with detailed analysis of their advancements and shortcomings, and possible solutions are proposed on how to achieve high efficiency, good load balance, and good real-time performance at the same time, hoping that this can provide a good reference for the future research of the WSANs-based Edge computing systems.
Infrared remote sensing of the vertical and horizontal distribution of clouds
NASA Technical Reports Server (NTRS)
Chahine, M. T.; Haskins, R. D.
1982-01-01
An algorithm has been developed to derive the horizontal and vertical distribution of clouds from the same set of infrared radiance data used to retrieve atmospheric temperature profiles. The method leads to the determination of the vertical atmospheric temperature structure and the cloud distribution simultaneously, providing information on heat sources and sinks, storage rates and transport phenomena in the atmosphere. Experimental verification of this algorithm was obtained using the 15-micron data measured by the NOAA-VTPR temperature sounder. After correcting for water vapor emission, the results show that the cloud cover derived from 15-micron data is less than that obtained from visible data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Le Pimpec, F.; /PSI, Villigen; Kirby, R.E.
In many accelerator storage rings running positively charged beams, ionization of residual gas and secondary electron emission (SEE) in the beam pipe will give rise to an electron cloud which can cause beam blow-up or loss of the circulating beam. A preventative measure that suppresses electron cloud formation is to ensure that the vacuum wall has a low secondary emission yield (SEY). The SEY of thin films of TiN, sputter deposited Non-Evaporable Getters and a novel TiCN alloy were measured under a variety of conditions, including the effect of re-contamination from residual gas.
Lost in the Cloud - New Challenges for Teaching GIS
NASA Astrophysics Data System (ADS)
Bellman, C. J.; Pupedis, G.
2016-06-01
As cloud-based services move towards becoming the dominant paradigm in many areas of information technology, GIS has also moved into `the Cloud', creating new opportunities for professionals and students alike, while at the same time presenting a range of new challenges and opportunities for GIS educators. Learning for many students in the geospatial science disciplines has been based on desktop software for GIS, building their skills from basic data handling and manipulation to advanced spatial analysis and database storage. Cloud-based systems challenge this paradigm in many ways, with some of the skills being replaced by clever and capable software tools, while the ubiquitous nature of the computing environment offers access and processing from anywhere, on any device. This paper describes our experiences over the past two years in developing and delivering a new course incorporating cloud-based technologies for GIS and illustrates the many benefits and pitfalls of a cloud-based approach to teaching. Throughout the course, students were encouraged to provide regular feedback on the course through the use of online journals. This allowed students to critique the approach to teaching, the learning materials available and to describe their own level of comfort and engagement with the material in an honest and non-confrontational manner. Many of the students did not have a strong information technology background and the journals provided great insight into the views of the students and the challenges they faced in mastering this technology.
HammerCloud: A Stress Testing System for Distributed Analysis
NASA Astrophysics Data System (ADS)
van der Ster, Daniel C.; Elmsheuser, Johannes; Úbeda García, Mario; Paladin, Massimo
2011-12-01
Distributed analysis of LHC data is an I/O-intensive activity which places large demands on the internal network, storage, and local disks at remote computing facilities. Commissioning and maintaining a site to provide an efficient distributed analysis service is therefore a challenge which can be aided by tools to help evaluate a variety of infrastructure designs and configurations. HammerCloud is one such tool; it is a stress testing service which is used by central operations teams, regional coordinators, and local site admins to (a) submit an arbitrary number of analysis jobs to a number of sites, (b) maintain at a steady state a predefined number of jobs running at the sites under test, (c) produce web-based reports summarizing the efficiency and performance of the sites under test, and (d) present a web interface for historical test results to both evaluate progress and compare sites. HammerCloud was built around the distributed analysis framework Ganga, exploiting its API for grid job management. HammerCloud has been employed by the ATLAS experiment for continuous testing of many sites worldwide, and also during large scale computing challenges such as STEP'09 and UAT'09, where the scale of the tests exceeded 10,000 concurrently running and 1,000,000 total jobs over multi-day periods. In addition, HammerCloud is being adopted by the CMS experiment; the plugin structure of HammerCloud allows the execution of CMS jobs using their official tool (CRAB).
GATECloud.net: a platform for large-scale, open-source text processing on the cloud.
Tablan, Valentin; Roberts, Ian; Cunningham, Hamish; Bontcheva, Kalina
2013-01-28
Cloud computing is increasingly being regarded as a key enabler of the 'democratization of science', because on-demand, highly scalable cloud computing facilities enable researchers anywhere to carry out data-intensive experiments. In the context of natural language processing (NLP), algorithms tend to be complex, which makes their parallelization and deployment on cloud platforms a non-trivial task. This study presents a new, unique, cloud-based platform for large-scale NLP research--GATECloud.net. It enables researchers to carry out data-intensive NLP experiments by harnessing the vast, on-demand compute power of the Amazon cloud. Important infrastructural issues are dealt with by the platform, completely transparently for the researcher: load balancing, efficient data upload and storage, deployment on the virtual machines, security and fault tolerance. We also include a cost-benefit analysis and usage evaluation.
A Cloud-Based System for Automatic Hazard Monitoring from Sentinel-1 SAR Data
NASA Astrophysics Data System (ADS)
Meyer, F. J.; Arko, S. A.; Hogenson, K.; McAlpin, D. B.; Whitley, M. A.
2017-12-01
Despite the all-weather capabilities of Synthetic Aperture Radar (SAR), and its high performance in change detection, the application of SAR for operational hazard monitoring was limited in the past. This has largely been due to high data costs, slow product delivery, and limited temporal sampling associated with legacy SAR systems. Only since the launch of ESA's Sentinel-1 sensors have routinely acquired and free-of-charge SAR data become available, allowing—for the first time—for a meaningful contribution of SAR to disaster monitoring. In this paper, we present recent technical advances of the Sentinel-1-based SAR processing system SARVIEWS, which was originally built to generate hazard products for volcano monitoring centers. We outline the main functionalities of SARVIEWS including its automatic database interface to Sentinel-1 holdings of the Alaska Satellite Facility (ASF), and its set of automatic processing techniques. Subsequently, we present recent system improvements that were added to SARVIEWS and allowed for a vast expansion of its hazard services; specifically: (1) In early 2017, the SARVIEWS system was migrated into the Amazon Cloud, providing access to cloud capabilities such as elastic scaling of compute resources and cloud-based storage; (2) we co-located SARVIEWS with ASF's cloud-based Sentinel-1 archive, enabling the efficient and cost effective processing of large data volumes; (3) we integrated SARVIEWS with ASF's HyP3 system (http://hyp3.asf.alaska.edu/), providing functionality such as subscription creation via API or map interface as well as automatic email notification; (4) we automated the production chains for seismic and volcanic hazards by integrating SARVIEWS with the USGS earthquake notification service (ENS) and the USGS eruption alert system. Email notifications from both services are parsed and subscriptions are automatically created when certain event criteria are met; (5) finally, SARVIEWS-generated hazard products are now being made available to the public via the SARVIEWS hazard portal. These improvements have led to the expansion of SARVIEWS toward a broader set of hazard situations, now including volcanoes, earthquakes, and severe weather. We provide details on newly developed techniques and show examples of disasters for which SARVIEWS was invoked.
RBioCloud: A Light-Weight Framework for Bioconductor and R-based Jobs on the Cloud.
Varghese, Blesson; Patel, Ishan; Barker, Adam
2015-01-01
Large-scale ad hoc analytics of genomic data is popular using the R programming language, supported by over 700 software packages provided by Bioconductor. More recently, analytical jobs are benefitting from on-demand computing and storage, their scalability and their low maintenance cost, all of which are offered by the cloud. While biologists and bioinformaticists can take an analytical job and execute it on their personal workstations, it remains challenging to seamlessly execute the job on the cloud infrastructure without extensive knowledge of the cloud dashboard. This paper explores how analytical jobs can be executed on the cloud with minimal effort, and how both the resources and the data required by a job can be managed. An open-source light-weight framework for executing R-scripts using Bioconductor packages, referred to as `RBioCloud', is designed and developed. RBioCloud offers a set of simple command-line tools for managing the cloud resources, the data and the execution of the job. Three biological test cases validate the feasibility of RBioCloud. The framework is available from http://www.rbiocloud.com.
Geometric Data Perturbation-Based Personal Health Record Transactions in Cloud Computing
Balasubramaniam, S.; Kavitha, V.
2015-01-01
Cloud computing is a new delivery model for information technology services and it typically involves the provision of dynamically scalable and often virtualized resources over the Internet. However, cloud computing raises concerns on how cloud service providers, user organizations, and governments should handle such information and interactions. Personal health records represent an emerging patient-centric model for health information exchange, and they are outsourced for storage by third parties, such as cloud providers. With these records, it is necessary for each patient to encrypt their own personal health data before uploading them to cloud servers. Current techniques for encryption primarily rely on conventional cryptographic approaches. However, key management issues remain largely unsolved with these cryptographic-based encryption techniques. We propose that personal health record transactions be managed using geometric data perturbation in cloud computing. In our proposed scheme, the personal health record database is perturbed using geometric data perturbation and outsourced to the Amazon EC2 cloud. PMID:25767826
Virtualized Networks and Virtualized Optical Line Terminal (vOLT)
NASA Astrophysics Data System (ADS)
Ma, Jonathan; Israel, Stephen
2017-03-01
The success of the Internet and the proliferation of the Internet of Things (IoT) devices is forcing telecommunications carriers to re-architecture a central office as a datacenter (CORD) so as to bring the datacenter economics and cloud agility to a central office (CO). The Open Network Operating System (ONOS) is the first open-source software-defined network (SDN) operating system which is capable of managing and controlling network, computing, and storage resources to support CORD infrastructure and network virtualization. The virtualized Optical Line Termination (vOLT) is one of the key components in such virtualized networks.
A history of radiation detection instrumentation.
Frame, Paul W
2004-08-01
A review is presented of the history of radiation detection instrumentation. Specific radiation detection systems that are discussed include the human senses, photography, calorimetry, color dosimetry, ion chambers, electrometers, electroscopes, proportional counters, Geiger Mueller counters, scalers and rate meters, barium platinocyanide, scintillation counters, semiconductor detectors, radiophotoluminescent dosimeters, thermoluminescent dosimeters, optically stimulated luminescent dosimeters, direct ion storage, electrets, cloud chambers, bubble chambers, and bubble dosimeters. Given the broad scope of this review, the coverage is limited to a few key events in the development of a given detection system and some relevant operating principles. The occasional anecdote is included for interest.
Translational Biomedical Informatics in the Cloud: Present and Future
Chen, Jiajia; Qian, Fuliang; Yan, Wenying; Shen, Bairong
2013-01-01
Next generation sequencing and other high-throughput experimental techniques of recent decades have driven the exponential growth in publicly available molecular and clinical data. This information explosion has prepared the ground for the development of translational bioinformatics. The scale and dimensionality of data, however, pose obvious challenges in data mining, storage, and integration. In this paper we demonstrate the utility and promise of cloud computing for tackling these big data problems. We also outline our vision that cloud computing could be an enabling tool to facilitate translational bioinformatics research. PMID:23586054
NASA Astrophysics Data System (ADS)
Casey, K. S.; Hausman, S. A.
2016-02-01
In the last year, the NOAA National Oceanographic Data Center (NODC) and its siblings, the National Climatic Data Center and National Geophysical Data Center, were merged into one organization, the NOAA National Centers for Environmental Information (NCEI). Combining its expertise under one management has helped NCEI accelerate its efforts to embrace and integrate private, public, and hybrid cloud environments into its range of data stewardship services. These services span a range of tiers, from basic, long-term preservation and access, through enhanced access and scientific quality control, to authoritative product development and international-level services. Throughout these tiers of stewardship, partnerships and pilot projects have been launched to identify technological and policy-oriented challenges, to establish solutions to these problems, and to highlight success stories for emulation during operational integration of the cloud into NCEI's data stewardship activities. Some of these pilot activities include data storage, access, and reprocessing in Amazon Web Services; the OneStop data discovery and access framework project; and a set of Cooperative Research and Development Agreements under the Big Data Project with Amazon, Google, IBM, Microsoft, and the Open Cloud Consortium. Progress in these efforts will be highlighted along with a future vision of how NCEI could leverage hybrid cloud deployments and federated systems across NOAA to enable effective data stewardship for its oceanographic, atmospheric, climatic, and geophysical Big Data.
Vortex based information storage in Bose-Einstein condensates
NASA Astrophysics Data System (ADS)
Dutton, Zachary; Ruostekoski, Janne
2004-05-01
Recent demonstrations of coherent optical storage in atomic clouds [1,2] have opened up new possibilities for both classical and quantum information storage. In parallel, there have been advances in the generation of Laguerre-Gaussian (LG) modes with angular momentum (optical vortices) [3] and applications of these modes to quantum information architectures based on alphabets larger than the traditional two-state systems. Here we theoretically consider the storage of such LG modes in atomic Rb-87 Bose-Einstein condensates (BECs). An LG mode writes its vortex phase pattern into a two-component BEC vortex state. The angular momentum information can then be stored in the BEC and later efficiently read back onto the optical field by switching a control field on. We study the fidelity of the writing, storage, and read-out processes. We also consider applying this method to the transfer of more complicated states, such as two-component vortex lattices, between two spatially distinct BECs. 1. C. Liu, Z. Dutton, C.H. Behroozi, and L.V. Hau, Nature 409, 490 (2001). 2. D.F. Phillips, A. Fleischhauer, A. Mair, R.L. Walsworth, and M.D. Lukin, Phys. Rev. Lett. 86, 783 (2001). 3. A. Vaziri, Gregor Weihs, and A. Zeilinger, cond-mat/0111033.
Miao, Yinbin; Ma, Jianfeng; Liu, Ximeng; Wei, Fushan; Liu, Zhiquan; Wang, Xu An
2016-11-01
Online personal health record (PHR) systems are increasingly inclined to shift data storage and search operations to the cloud server so as to enjoy elastic resources and lessen the computational burden of local storage. As multiple patients' data is always stored in the cloud server simultaneously, it is a challenge to guarantee the confidentiality of PHR data and allow data users to search encrypted data in an efficient and privacy-preserving way. To this end, we design a secure cryptographic primitive called attribute-based multi-keyword search over encrypted personal health records in a multi-owner setting to support both fine-grained access control and multi-keyword search via Ciphertext-Policy Attribute-Based Encryption. Formal security analysis proves our scheme is selectively secure against chosen-keyword attack. As a further contribution, we conduct empirical experiments over a real-world dataset to show its feasibility and practicality in a broad range of actual scenarios without incurring additional computational burden.
NASA Astrophysics Data System (ADS)
Gouwens, C.; Dragosavic, M.
The large reserves and increasing use of natural gas as a source of energy have resulted in its storage and transport becoming an urgent problem. Since the same mass of gas occupies only a fraction of its volume when liquefied, it is economical to store natural gas as a liquid. Liquefied natural gas (LNG) is stored in insulated tanks and also carried by ship at a temperature of -160 °C to -170 °C. If a serious accident allows the LNG to escape, a gas cloud forms. The consequences of a possible explosion of such a gas cloud are studied. The development of a leak, the escape and evaporation, the size and propagation of the gas cloud, the explosive pressures to be expected and the effects on the environment are investigated. Damage to buildings is examined, making use of the preliminary conclusions of the other sub-projects and especially the explosive pressures.
Protecting location privacy for outsourced spatial data in cloud storage.
Tian, Feng; Gui, Xiaolin; An, Jian; Yang, Pan; Zhao, Jianqiang; Zhang, Xuejun
2014-01-01
As cloud computing services and location-aware devices are fully developed, a large amount of spatial data needs to be outsourced to the cloud storage provider, so research on privacy protection for outsourced spatial data is getting increasing attention from academia and industry. As a kind of spatial transformation method, the Hilbert curve is widely used to protect the location privacy of spatial data. But sufficient security analysis for the standard Hilbert curve (SHC) has seldom been carried out. In this paper, we propose an index modification method for SHC (SHC∗) and a density-based space filling curve (DSC) to improve the security of SHC; they can partially violate the distance-preserving property of SHC, so as to achieve better security. We formally define the indistinguishability and attack model for measuring the privacy disclosure risk of spatial transformation methods. The evaluation results indicate that SHC∗ and DSC are more secure than SHC, and DSC achieves the best index generation performance.
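For readers unfamiliar with the transformation, the sketch below shows the standard Hilbert index mapping (SHC) that the proposed SHC∗ and DSC variants build on; the keyed curve parameters and the index modification itself are not reproduced here, so this is only background context, not the paper's scheme.

```python
def hilbert_index(order, x, y):
    """Map a point (x, y) on a 2^order x 2^order grid to its 1-D Hilbert index."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # rotate the quadrant so the curve orientation stays consistent
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        s >>= 1
    return d

# Outsourcing idea: the data owner stores only hilbert_index(order, x, y) plus an
# encrypted payload in the cloud; range/kNN queries are rewritten into index
# ranges on the client side, so the server never sees raw coordinates.
print(hilbert_index(3, 5, 2))   # index of point (5, 2) on an 8x8 grid
```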
SenSyF Experience on Integration of EO Services in a Generic, Cloud-Based EO Exploitation Platform
NASA Astrophysics Data System (ADS)
Almeida, Nuno; Catarino, Nuno; Gutierrez, Antonio; Grosso, Nuno; Andrade, Joao; Caumont, Herve; Goncalves, Pedro; Villa, Guillermo; Mangin, Antoine; Serra, Romain; Johnsen, Harald; Grydeland, Tom; Emsley, Stephen; Jauch, Eduardo; Moreno, Jose; Ruiz, Antonio
2016-08-01
SenSyF is a cloud-based data processing framework for EO-based services. It has been a pioneer in addressing Big Data issues from the Earth Observation point of view, and is a precursor of several of the technologies and methodologies that will be deployed in ESA's Thematic Exploitation Platforms and other related systems. The SenSyF system focuses on developing fully automated data management, together with access to a processing and exploitation framework, including Earth Observation-specific tools. SenSyF is both a development and validation platform for data-intensive applications using Earth Observation data. With SenSyF, scientific, institutional or commercial institutions developing EO-based applications and services can take advantage of distributed computational and storage resources, tailored for applications dependent on big Earth Observation data, without resorting to deep infrastructure and technological investments. This paper describes the integration process and the experience gathered from different EO Service providers during the project.
Health Informatics for Neonatal Intensive Care Units: An Analytical Modeling Perspective
Mench-Bressan, Nadja; McGregor, Carolyn; Pugh, James Edward
2015-01-01
The effective use of data within intensive care units (ICUs) has great potential to create new cloud-based health analytics solutions for disease prevention or earlier condition onset detection. The Artemis project aims to achieve the above goals in the area of neonatal ICUs (NICU). In this paper, we proposed an analytical model for the Artemis cloud project which will be deployed at McMaster Children’s Hospital in Hamilton. We collect not only physiological data but also the infusion pumps data that are attached to NICU beds. Using the proposed analytical model, we predict the amount of storage, memory, and computation power required for the system. Capacity planning and tradeoff analysis would be more accurate and systematic by applying the proposed analytical model in this paper. Numerical results are obtained using real inputs acquired from McMaster Children’s Hospital and a pilot deployment of the system at The Hospital for Sick Children (SickKids) in Toronto. PMID:27170907
Bioinformatics clouds for big data manipulation.
Dai, Lin; Gao, Xin; Guo, Yan; Xiao, Jingfa; Zhang, Zhang
2012-11-28
As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.
NASA Astrophysics Data System (ADS)
Giraud, Francois
1999-10-01
This dissertation investigates the application of neural network theory to the analysis of a 4-kW Utility-interactive Wind-Photovoltaic System (WPS) with battery storage. The hybrid system comprises a 2.5-kW photovoltaic generator and a 1.5-kW wind turbine. The wind power generator produces power at variable speed and variable frequency (VSVF). The wind energy is converted into dc power by a controlled, three-phase, full-wave, bridge rectifier. The PV power is maximized by a Maximum Power Point Tracker (MPPT), a dc-to-dc chopper, switching at a frequency of 45 kHz. The whole dc power of both subsystems is stored in the battery bank or conditioned by a single-phase self-commutated inverter to be sold to the utility at a predetermined amount. First, the PV is modeled using an Artificial Neural Network (ANN). To reduce model uncertainty, the open-circuit voltage VOC and the short-circuit current ISC of the PV are chosen as model input variables of the ANN. These input variables have the advantage of incorporating the effects of the quantifiable and non-quantifiable environmental variants affecting the PV power. Then, a simplified way to predict accurately the dynamic responses of the grid-linked WPS to gusty winds using a Recurrent Neural Network (RNN) is investigated. The RNN is a single-output feedforward backpropagation network with external feedback, which allows past responses to be fed back to the network input. In the third step, a Radial Basis Functions (RBF) Network is used to analyze the effects of clouds on the Utility-Interactive WPS. Using the irradiance as input signal, the network models the effects of random cloud movement on the output current, the output voltage, the output power of the PV system, as well as the electrical output variables of the grid-linked inverter. Fourthly, using RNN, the combined effects of a random cloud and wind gusts on the system are analyzed. For short period intervals, the wind speed and the solar radiation are considered as the sole sources of power, whose variations influence the system variables. Since both subsystems have different dynamics, their respective responses are expected to impact differently the whole system behavior. The dispatchability of the battery-supported system as well as its stability and reliability during gusts and/or cloud passage is also discussed. In the fifth step, the goal is to determine to what extent the overall power quality of the grid would be affected by a proliferation of Utility-interactive hybrid systems and whether recourse to bulk or individual filtering and voltage control is necessary. The final stage of the research includes a steady-state analysis of two-year operation (May 96--Apr 98) of the system, with a discussion on system reliability, on any loss of supply probability, and on the effects of the randomness in the wind and solar radiation upon the system design optimization.
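The external-feedback idea behind the RNN described above can be illustrated with a toy, linear stand-in (not the dissertation's network): a one-step model is fitted to a synthetic first-order response and then run in closed loop, feeding its own past output back as an input. All signals and coefficients are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
u = 8 + 2 * rng.standard_normal(500)            # synthetic gusty wind-speed signal (m/s)
y = np.zeros_like(u)
for t in range(1, len(u)):                      # "true" plant: first-order lag response
    y[t] = 0.85 * y[t - 1] + 0.15 * u[t]

# One-step model y[t] ~ w0*y[t-1] + w1*u[t] + b, fitted with teacher forcing
X = np.column_stack([y[:-1], u[1:], np.ones(len(u) - 1)])
w = np.linalg.lstsq(X, y[1:], rcond=None)[0]

# Closed-loop simulation: the *predicted* output is fed back to the input,
# which is the external-feedback (NARX-style) structure described above.
y_hat = np.zeros_like(y)
for t in range(1, len(u)):
    y_hat[t] = w[0] * y_hat[t - 1] + w[1] * u[t] + w[2]

print("closed-loop RMS error:", float(np.sqrt(np.mean((y_hat - y) ** 2))))
```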
Beam induced electron cloud resonances in dipole magnetic fields
Calvey, J. R.; Hartung, W.; Makita, J.; ...
2016-07-01
The buildup of low energy electrons in an accelerator, known as electron cloud, can be severely detrimental to machine performance. Under certain beam conditions, the beam can become resonant with the cloud dynamics, accelerating the buildup of electrons. This paper will examine two such effects: multipacting resonances, in which the cloud development time is resonant with the bunch spacing, and cyclotron resonances, in which the cyclotron period of electrons in a magnetic field is a multiple of bunch spacing. Both resonances have been studied directly in dipole fields using retarding field analyzers installed in the Cornell Electron Storage Ring. These measurements are supported by both analytical models and computer simulations.
A computational- And storage-cloud for integration of biodiversity collections
Matsunaga, A.; Thompson, A.; Figueiredo, R. J.; Germain-Aubrey, C.C; Collins, M.; Beeman, R.S; Macfadden, B.J.; Riccardi, G.; Soltis, P.S; Page, L. M.; Fortes, J.A.B
2013-01-01
A core mission of the Integrated Digitized Biocollections (iDigBio) project is the building and deployment of a cloud computing environment customized to support the digitization workflow and integration of data from all U.S. nonfederal biocollections. iDigBio chose to use cloud computing technologies to deliver a cyberinfrastructure that is flexible, agile, resilient, and scalable to meet the needs of the biodiversity community. In this context, this paper describes the integration of open source cloud middleware, applications, and third party services using standard formats, protocols, and services. In addition, this paper demonstrates the value of the digitized information from collections in a broader scenario involving multiple disciplines.
Remotely-sensed near real-time monitoring of reservoir storage in India
NASA Astrophysics Data System (ADS)
Tiwari, A. D.; Mishra, V.
2017-12-01
Real-time reservoir storage information at a high temporal resolution is crucial to mitigate the influence of extreme events like floods and droughts. Despite the large implications of near real-time reservoir monitoring in India for water resources and irrigation, remotely sensed monitoring systems have been lacking. Here we develop a remotely sensed near real-time monitoring system for 91 large reservoirs in India for the period from 2000 to 2017. For the reservoir storage estimation, we combined the Moderate Resolution Imaging Spectroradiometer (MODIS) 8-day 250 m Enhanced Vegetation Index (EVI) with elevation data from the Geoscience Laser Altimeter System (GLAS) onboard the Ice, Cloud, and land Elevation Satellite (ICESat). The highest temporal resolution of vegetation data available from MODIS is 16 days. To increase the temporal resolution to 8 days, we developed an 8-day composite of near-infrared, red, and blue band surface reflectance. The 8-day L3 global 250 m surface reflectance product provides only the NIR and red bands; therefore, the 8-day L3 global 500 m surface reflectance product, regridded to 250 m spatial resolution, was used for the blue band. An area-elevation relationship was derived using areas from an unsupervised classification of the MODIS imagery, followed by image enhancement, and elevation data from ICESat/GLAS. A trial-and-error method was used to obtain the area-elevation relationship for those reservoirs for which ICESat/GLAS data are not available. The reservoir storage estimates were compared with gauged storage data from 2002 to 2009 (training period) and then evaluated for the period 2010 to 2016. Our storage estimates were highly correlated with observations (R2 = 0.6 to 0.96), and the normalized root mean square error (NRMSE) ranged between 10% and 50%. We also developed a relationship between precipitation and reservoir storage that can be used for prediction of storage during the dry season.
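The core estimation steps lend themselves to a short sketch: fit an area-elevation relationship from paired MODIS-derived water areas and ICESat/GLAS elevations, then integrate it over elevation to obtain storage. The values and the quadratic fit below are illustrative assumptions, not the paper's data or its exact fitting procedure.

```python
import numpy as np

# Paired observations: surface area from classified MODIS scenes (km^2) and the
# matching water-surface elevation from ICESat/GLAS (m); values are made up.
area_km2 = np.array([12.0, 18.5, 25.1, 33.0, 41.2, 52.8])
elev_m   = np.array([310.0, 314.0, 317.5, 321.0, 324.5, 328.0])

# Area-elevation relationship: a low-order polynomial A(h) is often adequate.
area_of = np.poly1d(np.polyfit(elev_m, area_km2, 2))

def storage_above(h_min, h_obs, steps=200):
    """Storage (million m^3) between the dead-storage level h_min and the
    observed level h_obs, by trapezoidal integration of A(h) dh."""
    h = np.linspace(h_min, h_obs, steps)
    a = area_of(h)
    return float(np.sum(0.5 * (a[1:] + a[:-1]) * np.diff(h)))   # km^2 * m == 1e6 m^3

print(f"estimated storage: {storage_above(310.0, 326.0):.1f} MCM")
```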
The application of cloud computing to scientific workflows: a study of cost and performance.
Berriman, G Bruce; Deelman, Ewa; Juve, Gideon; Rynge, Mats; Vöckler, Jens-S
2013-01-28
The current model of transferring data from data centres to desktops for analysis will soon be rendered impractical by the accelerating growth in the volume of science datasets. Processing will instead often take place on high-performance servers co-located with data. Evaluations of how new technologies such as cloud computing would support such a new distributed computing model are urgently needed. Cloud computing is a new way of purchasing computing and storage resources on demand through virtualization technologies. We report here the results of investigations of the applicability of commercial cloud computing to scientific computing, with an emphasis on astronomy, including investigations of what types of applications can be run cheaply and efficiently on the cloud, and an example of an application well suited to the cloud: processing a large dataset to create a new science product.
Benefits of cloud computing for PACS and archiving.
Koch, Patrick
2012-01-01
The goal of cloud-based services is to provide easy, scalable access to computing resources and IT services. The healthcare industry requires a private cloud that adheres to government mandates designed to ensure privacy and security of patient data while enabling access by authorized users. Cloud-based computing in the imaging market has evolved from a service that provided cost effective disaster recovery for archived data to fully featured PACS and vendor neutral archiving services that can address the needs of healthcare providers of all sizes. Healthcare providers worldwide are now using the cloud to distribute images to remote radiologists while supporting advanced reading tools, deliver radiology reports and imaging studies to referring physicians, and provide redundant data storage. Vendor managed cloud services eliminate large capital investments in equipment and maintenance, as well as staffing for the data center--creating a reduction in total cost of ownership for the healthcare provider.
A hybrid cloud read aligner based on MinHash and kmer voting that preserves privacy
NASA Astrophysics Data System (ADS)
Popic, Victoria; Batzoglou, Serafim
2017-05-01
Low-cost clouds can alleviate the compute and storage burden of the genome sequencing data explosion. However, moving personal genome data analysis to the cloud can raise serious privacy concerns. Here, we devise a method named Balaur, a privacy preserving read mapper for hybrid clouds based on locality sensitive hashing and kmer voting. Balaur can securely outsource a substantial fraction of the computation to the public cloud, while being highly competitive in accuracy and speed with non-private state-of-the-art read aligners on short read data. We also show that the method is significantly faster than the state of the art in long read mapping. Therefore, Balaur can enable institutions handling massive genomic data sets to shift part of their analysis to the cloud without sacrificing accuracy or exposing sensitive information to an untrusted third party.
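A minimal sketch of the MinHash idea underlying this style of read mapping (not Balaur's actual implementation, which adds kmer voting and the secure hybrid-cloud protocol): reads and reference windows with similar k-mer sets tend to agree on many entries of their MinHash signatures, which is what makes locality sensitive hashing usable for candidate selection.

```python
import hashlib

def kmers(seq, k=16):
    """Set of all length-k substrings of a read or reference window."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def minhash(kmer_set, num_hashes=32):
    """Signature = per-seed minimum of a keyed hash over the k-mer set."""
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            int.from_bytes(hashlib.blake2b((str(seed) + km).encode(),
                                           digest_size=8).digest(), "big")
            for km in kmer_set))
    return sig

def signature_similarity(sig_a, sig_b):
    """Fraction of matching entries; an estimator of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

read = "ACGTACGTTAGGCCATTACGGATCCATGCAACGT"
window = "TTACGTACGTTAGGCCATTACGGATCCATGCAACGTAA"
print(signature_similarity(minhash(kmers(read)), minhash(kmers(window))))
```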
Wu, Yilun; Lu, Xicheng; Su, Jinshu; Chen, Peixin
2016-12-01
Preserving the privacy of electronic medical records (EMRs) is extremely important, especially when medical systems adopt cloud services to store patients' electronic medical records. Considering both the privacy and the utilization of EMRs, some medical systems apply searchable encryption to encrypt EMRs and enable authorized users to search over these encrypted records. Since individuals would like to share their EMRs with multiple persons, how to design an efficient searchable encryption scheme for sharable EMRs remains a significant challenge. In this paper, we propose a cost-efficient secure channel free searchable encryption (SCF-PEKS) scheme for sharable EMRs. Compared with existing SCF-PEKS solutions, our scheme reduces the storage overhead and achieves better computation performance. Moreover, our scheme can guard against keyword guessing attacks, which are neglected by most of the existing schemes. Finally, we implement both our scheme and a recent medical-based scheme to evaluate the performance. The evaluation results show that our scheme achieves much better performance than the latest one for sharable EMRs.
NASA Technical Reports Server (NTRS)
Chen, D. W.; Sengupta, S. K.; Welch, R. M.
1989-01-01
This paper compares the results of cloud-field classification derived from two simplified vector approaches, the Sum and Difference Histogram (SADH) and the Gray Level Difference Vector (GLDV), with the results produced by the Gray Level Cooccurrence Matrix (GLCM) approach described by Welch et al. (1988). It is shown that the SADH method produces accuracies equivalent to those obtained using the GLCM method, while the GLDV method fails to resolve error clusters. Compared to the GLCM method, the SADH method leads to a 31 percent saving in run time and a 50 percent saving in storage requirements, while the GLDV approach leads to a 40 percent saving in run time and an 87 percent saving in storage requirements.
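For context, here is a small sketch of Unser-style sum and difference histograms, the kind of SADH texture features compared above; the random image patch, displacement, and feature subset are illustrative only and not the paper's exact feature set.

```python
import numpy as np

def sadh_features(img, dx=1, dy=0, levels=256):
    """Texture features from the sum and difference histograms of pixel pairs
    separated by the displacement (dy, dx)."""
    a = img[:img.shape[0] - dy, :img.shape[1] - dx].astype(int)
    b = img[dy:, dx:].astype(int)
    s = (a + b).ravel()                                   # 0 .. 2*(levels-1)
    d = (a - b).ravel()                                   # -(levels-1) .. levels-1
    ps = np.bincount(s, minlength=2 * levels - 1) / s.size
    pd = np.bincount(d + levels - 1, minlength=2 * levels - 1) / d.size
    j = np.arange(2 * levels - 1) - (levels - 1)
    return {
        "mean":        0.5 * float(np.sum(np.arange(2 * levels - 1) * ps)),
        "contrast":    float(np.sum(j ** 2 * pd)),
        "homogeneity": float(np.sum(pd / (1.0 + j ** 2))),
        "entropy":     float(-np.sum(ps[ps > 0] * np.log(ps[ps > 0]))
                             - np.sum(pd[pd > 0] * np.log(pd[pd > 0]))),
    }

rng = np.random.default_rng(0)
cloud_patch = rng.integers(0, 256, size=(64, 64))         # stand-in for imagery
print(sadh_features(cloud_patch, dx=1, dy=1))
```

Only the two 1-D histograms are stored, rather than a full levels-by-levels co-occurrence matrix, which is where the run-time and storage savings quoted above come from.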
Towards a Multi-Mission, Airborne Science Data System Environment
NASA Astrophysics Data System (ADS)
Crichton, D. J.; Hardman, S.; Law, E.; Freeborn, D.; Kay-Im, E.; Lau, G.; Oswald, J.
2011-12-01
NASA earth science instruments are increasingly relying on airborne missions. However, traditionally, there has been limited common infrastructure support available to principal investigators in the area of science data systems. As a result, each investigator has been required to develop their own computing infrastructures for the science data system. Typically there is little software reuse and many projects lack sufficient resources to provide a robust infrastructure to capture, process, distribute and archive the observations acquired from airborne flights. At NASA's Jet Propulsion Laboratory (JPL), we have been developing a multi-mission data system infrastructure for airborne instruments called the Airborne Cloud Computing Environment (ACCE). ACCE encompasses the end-to-end lifecycle covering planning, provisioning of data system capabilities, and support for scientific analysis in order to improve the quality, cost effectiveness, and capabilities to enable new scientific discovery and research in earth observation. This includes improving data system interoperability across each instrument. A principal characteristic is being able to provide an agile infrastructure that is architected to allow for a variety of configurations of the infrastructure from locally installed compute and storage services to provisioning those services via the "cloud" from cloud computer vendors such as Amazon.com. Investigators often have different needs that require a flexible configuration. The data system infrastructure is built on the Apache's Object Oriented Data Technology (OODT) suite of components which has been used for a number of spaceborne missions and provides a rich set of open source software components and services for constructing science processing and data management systems. In 2010, a partnership was formed between the ACCE team and the Carbon in Arctic Reservoirs Vulnerability Experiment (CARVE) mission to support the data processing and data management needs. A principal goal is to provide support for the Fourier Transform Spectrometer (FTS) instrument which will produce over 700,000 soundings over the life of their three-year mission. The cost to purchase and operate a cluster-based system in order to generate Level 2 Full Physics products from this data was prohibitive. Through an evaluation of cloud computing solutions, Amazon's Elastic Compute Cloud (EC2) was selected for the CARVE deployment. As the ACCE infrastructure is developed and extended to form an infrastructure for airborne missions, the experience of working with CARVE has provided a number of lessons learned and has proven to be important in reinforcing the unique aspects of airborne missions and the importance of the ACCE infrastructure in developing a cost effective, flexible multi-mission capability that leverages emerging capabilities in cloud computing, workflow management, and distributed computing.
Secure Nearest Neighbor Query on Crowd-Sensing Data
Cheng, Ke; Wang, Liangmin; Zhong, Hong
2016-01-01
Nearest neighbor queries are fundamental in location-based services, and secure nearest neighbor queries mainly focus on how to securely and quickly retrieve the nearest neighbor from an outsourced cloud server. However, the previous big data system structure has changed because of crowd-sensing data. On the one hand, the sensing terminals acting as data owners are numerous and mutually distrustful, while, on the other hand, in most cases, the terminals find it difficult to perform many security operations due to computation and storage capability constraints. In light of the Multi-Owner and Multi-User (MOMU) situation in the crowd-sensing data cloud environment, this paper presents a secure nearest neighbor query scheme based on a proxy server architecture, which is constructed from protocols for secure two-party computation and a secure Voronoi diagram algorithm. It not only preserves data confidentiality and query privacy but also effectively resists collusion between the cloud server and the data owners or users. Finally, extensive theoretical and experimental evaluations are presented to show that our proposed scheme achieves a superior balance between security and query performance compared to other schemes. PMID:27669253
Comparative Analysis of Data Structures for Storing Massive Tins in a Dbms
NASA Astrophysics Data System (ADS)
Kumar, K.; Ledoux, H.; Stoter, J.
2016-06-01
Point cloud data are an important source of 3D geoinformation. Modern 3D data acquisition and processing techniques such as airborne laser scanning and multi-beam echosounding generate billions of 3D points for an area of just a few square kilometers. With the size of the point clouds exceeding the billion mark for even a small area, there is a need for their efficient storage and management. These point clouds are sometimes associated with attributes and constraints as well. Storing billions of 3D points is currently possible, as confirmed by the initial implementations in Oracle Spatial SDO_PC and the PostgreSQL Point Cloud extension. But to be able to analyse and extract useful information from point clouds, we need more than just points, i.e. we require the surface defined by these points in space. There are different ways to represent surfaces in GIS, including grids, TINs, boundary representations, etc. In this study, we investigate database solutions for the storage and management of massive TINs. The classical (face- and edge-based) and compact (star-based) data structures are discussed at length with reference to their structure, advantages and limitations in handling massive triangulations, and are compared with the current solution of PostGIS Simple Feature. The main test dataset is the TIN generated from the third national elevation model of the Netherlands (AHN3), with a point density of over 10 points/m2. The PostgreSQL/PostGIS DBMS is used for storing the generated TIN. The data structures are tested with the generated TIN models to account for their geometry, topology, storage, indexing, and loading time in a database. Our study is useful in identifying the limitations of the existing data structures for storing massive TINs and what is required to optimise these structures for managing massive triangulations in a database.
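A toy sketch of the compact, star-based idea discussed above (simplified and unordered, unlike the production structures): each vertex stores only its incident neighbour pairs, and triangles are enumerated on the fly rather than stored explicitly, which is what keeps the per-triangle storage low.

```python
from collections import defaultdict

def build_stars(triangles):
    """triangles: iterable of (v0, v1, v2) vertex ids with consistent (CCW)
    orientation. Returns vertex id -> list of directed neighbour pairs (its star)."""
    star = defaultdict(list)
    for a, b, c in triangles:
        star[a].append((b, c))
        star[b].append((c, a))
        star[c].append((a, b))
    return star

def triangles_from_stars(star):
    """Enumerate each triangle exactly once: emit (v, b, c) only when v is the
    smallest vertex id in the triangle, so nothing is duplicated."""
    for v, ring in star.items():
        for b, c in ring:
            if v < b and v < c:
                yield (v, b, c)

tin = [(0, 1, 2), (0, 2, 3), (2, 4, 3)]
stars = build_stars(tin)
print(dict(stars))
print(sorted(triangles_from_stars(stars)))
```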
Cloud Image Data Center for Healthcare Network in Taiwan.
Weng, Shao-Jen; Lai, Lai-Shiun; Gotcher, Donald; Wu, Hsin-Hung; Xu, Yeong-Yuh; Yang, Ching-Wen
2016-04-01
This paper investigates how a healthcare network in Taiwan uses a practical cloud image data center (CIDC) to communicate with its constituent hospital branches. A case study approach was used. The study was carried out in the central region of Taiwan, with four hospitals belonging to the Veterans Hospital healthcare network. The CIDC provides synchronous and asynchronous consultation among these branches. It provides storage, platforms, and services on demand to the hospitals. Any branch-client can pull up the patient's medical images from any hospital off this cloud. Patients can be examined at the branches, and the images and reports can be further evaluated by physicians in the main Taichung Veterans General Hospital (TVGH) to enhance the usage and efficiency of equipment in the various branches, thereby shortening the waiting time of patients. The performance of the CIDC over 5 years shows: (1) the total number of cross-hospital images accessed with CDC in the branches was 132,712; and (2) TVGH assisted the branches in keying in image reports using the CIDC 4,424 times; and (3) Implementation of the system has improved management, efficiency, speed and quality of care. Therefore, the results lead to the recommendation of continuing and expanding the cloud computing architecture to improve information sharing among branches in the healthcare network.
High Energy Density Capacitors
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
2010-07-01
BEEST Project: Recapping is developing a capacitor that could rival the energy storage potential and price of today’s best EV batteries. When power is needed, the capacitor rapidly releases its stored energy, similar to lightning being discharged from a cloud. Capacitors are an ideal substitute for batteries if their energy storage capacity can be improved. Recapping is addressing storage capacity by experimenting with the material that separates the positive and negative electrodes of its capacitors. These separators could significantly improve the energy density of electrochemical devices.
CEDIMS: cloud ethical DICOM image Mojette storage
NASA Astrophysics Data System (ADS)
Guédon, Jeanpierre; Evenou, Pierre; Tervé, Pierre; David, Sylvain; Béranger, Jérome
2012-02-01
DICOM images of patients will necessarily be stored in clouds. However, ethical constraints must apply. In this paper, a method which ensures the two following conditions is presented: 1) the medical information is not readable by the cloud owner, since it is distributed across several clouds; 2) the medical information can be retrieved from any sufficient subset of clouds. In order to obtain this result with real-time processing, the Mojette transform is used. This paper reviews the interesting features of the Mojette transform in terms of information theory. Since only portions of the original DICOM files are stored in each cloud, their contents are not reachable. For instance, we use 4 different public clouds to save 4 different projections of each file, with the additional condition that any 3 of the 4 projections are enough to reconstruct the original file. Thus, even if a cloud is unavailable when the user wants to load a DICOM file, the other 3 provide enough information for real-time reconstruction. The paper presents an implementation on 3 actual clouds. For ethical reasons, we use a DICOM image spread over 3 public clouds to show the obtained confidentiality and possible real-time recovery.
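The "any 3 of 4 pieces suffice" property can be illustrated with a much simpler stand-in than the Mojette transform, namely three data chunks plus one XOR parity chunk spread over four clouds; the sketch below is only an analogue of the redundancy property, not the paper's transform, and the payload is a placeholder rather than a real DICOM file.

```python
def split_3of4(data: bytes):
    """Split data into 3 equal chunks plus 1 XOR parity chunk (4 pieces total)."""
    n = -(-len(data) // 3)                     # chunk size (ceil division)
    chunks = [data[i * n:(i + 1) * n].ljust(n, b"\0") for i in range(3)]
    parity = bytes(a ^ b ^ c for a, b, c in zip(*chunks))
    return chunks + [parity], len(data)

def rebuild_3of4(pieces, length):
    """pieces: list of 4 entries (index 3 is parity); at most one may be None."""
    chunks, parity = pieces[:3], pieces[3]
    if None in chunks:                         # recover the missing data chunk via XOR
        i = chunks.index(None)
        others = [c for c in chunks if c is not None]
        chunks[i] = bytes(p ^ x ^ y for p, x, y in zip(parity, *others))
    return b"".join(chunks)[:length]

payload = b"DICM...pixel data goes here..."    # placeholder, not a real file
pieces, length = split_3of4(payload)
pieces[1] = None                               # one cloud is unreachable
assert rebuild_3of4(pieces, length) == payload
```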
Federal Register 2010, 2011, 2012, 2013, 2014
2010-05-18
... device to function as a cloud computing device similar to a network storage RAID array (HDDs strung... contract. This final determination, in HQ H082476, was issued at the request of Scale Computing under... response to your request dated October 15, 2009, made on behalf of Scale Computing (``Scale''). You ask for...
Organizational principles of cloud storage to support collaborative biomedical research.
Kanbar, Lara J; Shalish, Wissam; Robles-Rubio, Carlos A; Precup, Doina; Brown, Karen; Sant'Anna, Guilherme M; Kearney, Robert E
2015-08-01
This paper describes organizational guidelines and an anonymization protocol for the management of sensitive information in interdisciplinary, multi-institutional studies with multiple collaborators. This protocol is flexible, automated, and suitable for use in cloud-based projects as well as for publication of supplementary information in journal papers. A sample implementation of the anonymization protocol is illustrated for an ongoing study dealing with Automated Prediction of EXtubation readiness (APEX).
Processing NASA Earth Science Data on Nebula Cloud
NASA Technical Reports Server (NTRS)
Chen, Aijun; Pham, Long; Kempler, Steven
2012-01-01
Three applications were successfully migrated to Nebula, including S4PM, AIRS L1/L2 algorithms, and Giovanni MAPSS. Nebula has some advantages compared with local machines (e.g. performance, cost, scalability, bundling, etc.). Nebula still faces some challenges (e.g. stability, object storage, networking, etc.). Migrating applications to Nebula is feasible but time consuming. Lessons learned from our Nebula experience will benefit future Cloud Computing efforts at GES DISC.
Reviews on Security Issues and Challenges in Cloud Computing
NASA Astrophysics Data System (ADS)
An, Y. Z.; Zaaba, Z. F.; Samsudin, N. F.
2016-11-01
Cloud computing is an Internet-based computing service provided by a third party allowing the sharing of resources and data among devices. It is widely used in many organizations nowadays and is becoming more popular because it changes the way the Information Technology (IT) of an organization is organized and managed. It provides lots of benefits such as simplicity and lower costs, almost unlimited storage, minimal maintenance, easy utilization, backup and recovery, continuous availability, quality of service, automated software integration, scalability, flexibility and reliability, easy access to information, elasticity, quick deployment and a lower barrier to entry. While there is increasing use of cloud computing services in this new era, the security issues of cloud computing become a challenge. Cloud computing must be safe and secure enough to ensure the privacy of its users. This paper first lists the architecture of cloud computing, then discusses the most common security issues of using the cloud and some solutions to these issues, since security is one of the most critical aspects of cloud computing due to the sensitivity of users' data.
NASA Astrophysics Data System (ADS)
Li-Chee-Ming, J.; Armenakis, C.
2014-11-01
This paper presents the ongoing development of a small unmanned aerial mapping system (sUAMS) that in the future will track its trajectory and perform 3D mapping in near-real time. As both mapping and tracking algorithms require powerful computational capabilities and large data storage facilities, we propose to use the RoboEarth Cloud Engine (RCE) to offload heavy computation and data storage to secure computing environments in the cloud. While the RCE's capabilities have been demonstrated with terrestrial robots in indoor environments, this paper explores the feasibility of using the RCE in mapping and tracking applications in outdoor environments by a small UAMS. The experiments presented in this work assess the data processing strategies and evaluate the attainable tracking and mapping accuracies using the data obtained by the sUAMS. Testing was performed with an Aeryon Scout quadcopter. It flew over York University, up to approximately 40 metres above the ground. The quadcopter was equipped with a single-frequency GPS receiver providing positioning to about 3 metre accuracy, an AHRS (Attitude and Heading Reference System) estimating the attitude to about 3 degrees, and an FPV (First Person Viewing) camera. Video images captured from the onboard camera were processed using VisualSFM and SURE, which are being offered as an Application-as-a-Service via the RCE. The 3D virtual building model of York University was used as a known environment to georeference the point cloud generated from the sUAMS' sensor data. The estimated position and orientation parameters of the video camera show improved accuracy compared to the sUAMS' autopilot solution derived from the onboard GPS and AHRS. The paper presents the proposed approach and the results, along with their accuracies.
Using IKAROS as a data transfer and management utility within the KM3NeT computing model
NASA Astrophysics Data System (ADS)
Filippidis, Christos; Cotronis, Yiannis; Markou, Christos
2016-04-01
KM3NeT is a future European deep-sea research infrastructure hosting a new generation of neutrino detectors that - located at the bottom of the Mediterranean Sea - will open a new window on the universe and answer fundamental questions both in particle physics and astrophysics. IKAROS is a framework that enables creating scalable storage formations on demand and helps address several limitations that current file systems face when dealing with very large-scale infrastructures. It enables creating ad-hoc nearby storage formations and can use a huge number of I/O nodes in order to increase the available bandwidth (I/O and network). IKAROS unifies remote and local access in the overall data flow by permitting direct access to each I/O node. In this way the overall data flow can be handled at the network layer, limiting the interaction with the operating system. This approach allows virtually connecting, at the user level, the several different computing facilities used (Grids, Clouds, HPCs, Data Centers, local computing clusters and personal storage devices), on demand and based on need, by using well-known standards and protocols, like HTTP.
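As a rough illustration of what HTTP-level direct access to individual I/O nodes can look like from the client side, the sketch below fetches parts of a file from several nodes in parallel with Python. The node URLs and path layout are invented placeholders, not the actual IKAROS API.

import concurrent.futures
import requests

# Hypothetical I/O node endpoints; each node is assumed to serve its part
# of the dataset over plain HTTP.
IO_NODES = [
    "http://io-node-01.example.org/store/run042/part0",
    "http://io-node-02.example.org/store/run042/part1",
]

def fetch(url: str) -> bytes:
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return resp.content

# Pull the parts from several nodes in parallel to aggregate bandwidth.
with concurrent.futures.ThreadPoolExecutor(max_workers=len(IO_NODES)) as pool:
    parts = list(pool.map(fetch, IO_NODES))

payload = b"".join(parts)
print(len(payload), "bytes assembled from", len(IO_NODES), "I/O nodes")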
A Tale Of 160 Scientists, Three Applications, a Workshop and a Cloud
NASA Astrophysics Data System (ADS)
Berriman, G. B.; Brinkworth, C.; Gelino, D.; Wittman, D. K.; Deelman, E.; Juve, G.; Rynge, M.; Kinney, J.
2013-10-01
The NASA Exoplanet Science Institute (NExScI) hosts the annual Sagan Workshops, thematic meetings aimed at introducing researchers to the latest tools and methodologies in exoplanet research. The theme of the Summer 2012 workshop, held from July 23 to July 27 at Caltech, was to explore the use of exoplanet light curves to study planetary system architectures and atmospheres. A major part of the workshop was to use hands-on sessions to instruct attendees in the use of three open source tools for the analysis of light curves, especially from the Kepler mission. Each hands-on session involved the 160 attendees using their laptops to follow step-by-step tutorials given by experts. One of the applications, PyKE, is a suite of Python tools designed to reduce and analyze Kepler light curves; these tools can be invoked from the Unix command line or a GUI in PyRAF. The Transit Analysis Package (TAP) uses Markov Chain Monte Carlo (MCMC) techniques to fit light curves under the Interactive Data Language (IDL) environment, and Transit Timing Variations (TTV) uses IDL tools and Java-based GUIs to confirm and detect exoplanets from timing variations in light curve fitting. Rather than attempt to run these diverse applications on the inevitable wide range of environments on attendees' laptops, they were run instead on the Amazon Elastic Compute Cloud (EC2). The cloud offers features ideal for this type of short-term need: computing and storage services are made available on demand for as long as needed, and a processing environment can be customized and replicated as needed. The cloud environment included an NFS file server virtual machine (VM), 20 client VMs for use by attendees, and a VM to enable ftp downloads of the attendees' results. The file server was configured with a 1 TB Elastic Block Storage (EBS) volume (network-attached storage mounted as a device) containing the application software and attendees' home directories. The clients were configured to mount the applications and home directories from the server via NFS. All VMs were built with CentOS version 5.8. Attendees connected their laptops to one of the client VMs using the Virtual Network Computing (VNC) protocol, which enabled them to interact with a remote desktop GUI during the hands-on sessions. We will describe the mechanisms for handling security, failovers, and licensing of commercial software. In particular, IDL licenses were managed through a server at Caltech, connected to the IDL instances running on Amazon EC2 via a Secure Shell (ssh) tunnel. The system operated flawlessly during the workshop.
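For readers curious about the license-tunnelling arrangement, the following minimal Python sketch shows the general pattern of forwarding a license-server port over ssh from a cloud instance. The host names and port number are assumptions for illustration, not the actual Caltech configuration.

import subprocess

# Forward a local port to a (hypothetical) license server through an SSH
# gateway, so software on the cloud instance can check out licenses as if
# the license daemon were running locally.
tunnel = subprocess.Popen([
    "ssh", "-N",
    "-L", "1700:license-server.example.edu:1700",
    "user@gateway.example.edu",
])
# ... run the licensed application against localhost:1700, then:
# tunnel.terminate()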
NASA Technical Reports Server (NTRS)
Loeb, Norman G.; Wielicki, Bruce A.; Doelling, David R.
2008-01-01
There are some in the science community who believe that the response of the climate system to anthropogenic radiative forcing is unpredictable and that we should therefore "call off the quest." The key limitation in climate predictability is associated with cloud feedback. Narrowing the uncertainty in cloud feedback (and therefore climate sensitivity) requires optimal use of the best available observations to evaluate and improve climate model processes and constrain climate model simulations over longer time scales. The Clouds and the Earth's Radiant Energy System (CERES) is a satellite-based program that provides global cloud, aerosol and radiative flux observations for improving our understanding of cloud-aerosol-radiation feedbacks in the Earth's climate system. CERES is the successor to the Earth Radiation Budget Experiment (ERBE), which has widely been used to evaluate climate models both at short time scales (e.g., process studies) and at decadal time scales. A CERES instrument flew on the TRMM satellite and captured the dramatic 1998 El Nino, and four other CERES instruments are currently flying aboard the Terra and Aqua platforms. Plans are underway to fly the remaining copy of CERES on the upcoming NPP spacecraft (mid-2010 launch date). Every aspect of CERES represents a significant improvement over ERBE. While both CERES and ERBE measure broadband radiation, CERES calibration is a factor of 2 better than ERBE. In order to improve the characterization of clouds and aerosols within a CERES footprint, we use coincident higher-resolution imager observations (VIRS, MODIS or VIIRS) to provide a consistent cloud-aerosol-radiation dataset at climate accuracy. Improved radiative fluxes are obtained by using new CERES-derived Angular Distribution Models (ADMs) for converting measured radiances to fluxes. CERES radiative fluxes are a factor of 2 more accurate than ERBE overall, but the improvement by cloud type and at high latitudes can be as high as a factor of 5. Diurnal cycles are explicitly resolved by merging geostationary satellite observations with CERES and MODIS. Atmospheric state data are provided from a frozen version of the Global Modeling and Assimilation Office Data Assimilation System at the NASA Goddard Space Flight Center. In addition to improving the accuracy of top-of-atmosphere (TOA) radiative fluxes, CERES also produces radiative fluxes at the surface and at several levels in the atmosphere using radiative transfer modeling, constrained at the TOA by CERES (ERBE was limited to the TOA). In all, CERES uses 11 instruments on 7 spacecraft, all integrated to obtain climate accuracy in TOA to surface fluxes. This presentation will provide an overview of several new CERES datasets of interest to the climate community (including a new adjusted TOA flux dataset constrained by estimates of heat storage in the Earth system), show direct comparisons between CERES and ERBE, and provide a detailed error analysis of CERES fluxes at various time and space scales. We discuss how observations can be used to reduce uncertainties in cloud feedback and climate sensitivity and strongly argue why we should NOT "call off the quest".
Laser photovoltaic power system synergy for SEI applications
NASA Technical Reports Server (NTRS)
Landis, Geoffrey A.; Hickman, J. M.
1991-01-01
Solar arrays can provide reliable space power, but do not operate when there is no solar energy. Photovoltaic arrays can also convert laser energy with high efficiency. One proposal to reduce the mass of energy storage required is to illuminate the photovoltaic arrays with a ground-based laser system. It is proposed to locate large lasers on cloud-free sites at one or more ground locations, and use large lenses or mirrors with adaptive optical correction to reduce the beam spread due to diffraction or atmospheric turbulence. During the eclipse periods or lunar night, the lasers illuminate the solar arrays to a level sufficient to provide operating power.
Lebeda, Frank J; Zalatoris, Jeffrey J; Scheerer, Julia B
2018-02-07
This position paper summarizes the development and the present status of Department of Defense (DoD) and other government policies and guidances regarding cloud computing services. Given the heterogeneous and growing biomedical big datasets, cloud computing services offer an opportunity to mitigate the associated storage and analysis requirements. Having on-demand network access to a shared pool of flexible computing resources creates a consolidated system that should reduce potential duplications of effort in military biomedical research. Interactive, online literature searches were performed with Google, at the Defense Technical Information Center, and at two National Institutes of Health research portfolio information sites. References cited within some of the collected documents also served as literature resources. We gathered, selected, and reviewed DoD and other government cloud computing policies and guidances published from 2009 to 2017. These policies were intended to consolidate computer resources within the government and reduce costs by decreasing the number of federal data centers and by migrating electronic data to cloud systems. Initial White House Office of Management and Budget information technology guidelines were developed for cloud usage, followed by policies and other documents from the DoD, the Defense Health Agency, and the Armed Services. Security standards from the National Institute of Standards and Technology, the Government Services Administration, the DoD, and the Army were also developed. Government Services Administration and DoD Inspectors General monitored cloud usage by the DoD. A 2016 Government Accountability Office report characterized cloud computing as being economical, flexible and fast. A congressionally mandated independent study reported that the DoD was active in offering a wide selection of commercial cloud services in addition to its milCloud system. Our findings from the Department of Health and Human Services indicated that the security infrastructure in cloud services may be more compliant with the Health Insurance Portability and Accountability Act of 1996 regulations than traditional methods. To gauge the DoD's adoption of cloud technologies, proposed metrics included cost factors, ease of use, automation, availability, accessibility, security, and policy compliance. Since 2009, plans and policies were developed for the use of cloud technology to help consolidate and reduce the number of data centers, which was expected to reduce costs, improve environmental factors, enhance information technology security, and maintain mission support for service members. Cloud technologies were also expected to improve employee efficiency and productivity. Federal cloud computing policies within the last decade also offered increased opportunities to advance military healthcare. It was assumed that these opportunities would benefit consumers of healthcare and health science data by allowing more access to centralized cloud computer facilities to store, analyze, search and share relevant data, to enhance standardization, and to reduce potential duplications of effort. We recommend that cloud computing be considered by DoD biomedical researchers for increasing connectivity, presumably by facilitating communications and data sharing, among the various intra- and extramural laboratories.
We also recommend that policies and other guidances be updated to include developing additional metrics that will help stakeholders evaluate the above mentioned assumptions and expectations. Published by Oxford University Press on behalf of the Association of Military Surgeons of the United States 2018. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Private and Efficient Query Processing on Outsourced Genomic Databases.
Ghasemi, Reza; Al Aziz, Md Momin; Mohammed, Noman; Dehkordi, Massoud Hadian; Jiang, Xiaoqian
2017-09-01
Applications of genomic studies are spreading rapidly in many domains of science and technology such as healthcare, biomedical research, direct-to-consumer services, and legal and forensic applications. However, there are a number of obstacles that make it hard to access and process a big genomic database for these applications. First, sequencing a genome is a time-consuming and expensive process. Second, it requires large-scale computation and storage systems to process genomic sequences. Third, genomic databases are often owned by different organizations, and thus, not available for public usage. The cloud computing paradigm can be leveraged to facilitate the creation and sharing of big genomic databases for these applications. Genomic data owners can outsource their databases to a centralized cloud server to ease access to their databases. However, data owners are reluctant to adopt this model, as it requires outsourcing the data to an untrusted cloud service provider that may cause data breaches. In this paper, we propose a privacy-preserving model for outsourcing genomic data to a cloud. The proposed model enables query processing while providing privacy protection of genomic databases. Privacy of the individuals is guaranteed by permuting and adding fake genomic records in the database. These techniques allow the cloud to evaluate count and top-k queries securely and efficiently. Experimental results demonstrate that a count and a top-k query over 40 Single Nucleotide Polymorphisms (SNPs) in a database of 20 000 records take around 100 and 150 s, respectively.
Private and Efficient Query Processing on Outsourced Genomic Databases
Ghasemi, Reza; Al Aziz, Momin; Mohammed, Noman; Dehkordi, Massoud Hadian; Jiang, Xiaoqian
2017-01-01
Applications of genomic studies are spreading rapidly in many domains of science and technology such as healthcare, biomedical research, direct-to-consumer services, and legal and forensic applications. However, there are a number of obstacles that make it hard to access and process a big genomic database for these applications. First, sequencing a genome is a time-consuming and expensive process. Second, it requires large-scale computation and storage systems to process genomic sequences. Third, genomic databases are often owned by different organizations and thus not available for public usage. The cloud computing paradigm can be leveraged to facilitate the creation and sharing of big genomic databases for these applications. Genomic data owners can outsource their databases to a centralized cloud server to ease access to their databases. However, data owners are reluctant to adopt this model, as it requires outsourcing the data to an untrusted cloud service provider that may cause data breaches. In this paper, we propose a privacy-preserving model for outsourcing genomic data to a cloud. The proposed model enables query processing while providing privacy protection of genomic databases. Privacy of the individuals is guaranteed by permuting and adding fake genomic records in the database. These techniques allow the cloud to evaluate count and top-k queries securely and efficiently. Experimental results demonstrate that a count and a top-k query over 40 SNPs in a database of 20,000 records take around 100 and 150 seconds, respectively. PMID:27834660
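The following toy Python sketch illustrates, in plaintext only, the record-permutation and fake-record padding idea behind the count queries described above. The real scheme operates on encrypted data; the SNP name, record counts and data layout here are invented for illustration.

import random
from collections import Counter

# Build a toy database: real records plus fake padding records, then permute,
# so an observer cannot tell which rows are genuine or in what order they came.
real_records = [{"id": i, "rs123": random.choice("ACGT")} for i in range(100)]
fake_records = [{"id": None, "rs123": random.choice("ACGT")} for _ in range(30)]

database = real_records + fake_records
random.shuffle(database)  # permutation hides record order and origin

# Count query over one SNP; the data owner, who knows the fake records,
# subtracts their contribution to recover the true counts.
observed = Counter(rec["rs123"] for rec in database)
fake_contrib = Counter(rec["rs123"] for rec in fake_records)
true_counts = {a: observed[a] - fake_contrib[a] for a in "ACGT"}
print(true_counts)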
Key Lessons in Building "Data Commons": The Open Science Data Cloud Ecosystem
NASA Astrophysics Data System (ADS)
Patterson, M.; Grossman, R.; Heath, A.; Murphy, M.; Wells, W.
2015-12-01
Cloud computing technology has created a shift around data and data analysis by allowing researchers to push computation to data as opposed to having to pull data to an individual researcher's computer. Subsequently, cloud-based resources can provide unique opportunities to capture computing environments used both to access raw data in its original form and also to create analysis products which may be the source of data for tables and figures presented in research publications. Since 2008, the Open Cloud Consortium (OCC) has operated the Open Science Data Cloud (OSDC), which provides scientific researchers with computational resources for storing, sharing, and analyzing large (terabyte and petabyte-scale) scientific datasets. OSDC has provided compute and storage services to over 750 researchers in a wide variety of data intensive disciplines. Recently, internal users have logged about 2 million core hours each month. The OSDC also serves the research community by colocating these resources with access to nearly a petabyte of public scientific datasets in a variety of fields also accessible for download externally by the public. In our experience operating these resources, researchers are well served by "data commons," meaning cyberinfrastructure that colocates data archives, computing, and storage infrastructure and supports essential tools and services for working with scientific data. In addition to the OSDC public data commons, the OCC operates a data commons in collaboration with NASA and is developing a data commons for NOAA datasets. As cloud-based infrastructures for distributing and computing over data become more pervasive, we ask, "What does it mean to publish data in a data commons?" Here we present the OSDC perspective and discuss several services that are key in architecting data commons, including digital identifier services.
NASA Technical Reports Server (NTRS)
Habermann, Ted; Gallagher, James; Jelenak, Aleksandar; Potter, Nathan; Lee, Joe; Yang, Kent
2017-01-01
This study explored three candidate architectures with different types of objects and access paths for serving NASA Earth Science HDF5 data via Hyrax running on Amazon Web Services (AWS). We studied the cost and performance of each architecture using several representative use cases. The objectives of the study were to: (1) conduct a trade study to identify one or more high-performance integrated solutions for storing and retrieving NASA HDF5 and netCDF4 data in a cloud (web object store) environment, with Amazon Web Services (AWS) Simple Storage Service (S3) as the target environment; (2) conduct the level of software development needed to properly evaluate the solutions in the trade study and to obtain the benchmarking metrics required as input to a government decision on potential follow-on prototyping; and (3) develop a cloud cost model for the preferred data storage solution (or solutions) that accounts for different granulation and aggregation schemes as well as cost and performance trades. We describe the three architectures and the use cases along with performance results and recommendations for further work.
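As a minimal sketch of the simplest of such access paths (downloading a whole HDF5 object from S3 and opening it in memory), the Python snippet below uses boto3 and h5py. The bucket name, object key and dataset path are placeholders, and the architectures actually studied may use quite different object layouts.

import io
import boto3
import h5py

# Pull the whole HDF5 object from S3 and open it as an in-memory file.
# Bucket and key are hypothetical.
s3 = boto3.client("s3")
obj = s3.get_object(Bucket="example-earth-science-bucket", Key="granules/sample.h5")
buf = io.BytesIO(obj["Body"].read())

with h5py.File(buf, "r") as f:
    print(list(f.keys()))           # inspect top-level groups/datasets
    # data = f["/some/dataset"][:]  # read a dataset once its path is known

Whole-object download is the least granular option; the architectures compared in the study vary exactly in how (and how finely) the HDF5 content is mapped onto web objects.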
Archiving and access systems for remote sensing: Chapter 6
Faundeen, John L.; Percivall, George; Baros, Shirley; Baumann, Peter; Becker, Peter H.; Behnke, J.; Benedict, Karl; Colaiacomo, Lucio; Di, Liping; Doescher, Chris; Dominguez, J.; Edberg, Roger; Ferguson, Mark; Foreman, Stephen; Giaretta, David; Hutchison, Vivian; Ip, Alex; James, N.L.; Khalsa, Siri Jodha S.; Lazorchak, B.; Lewis, Adam; Li, Fuqin; Lymburner, Leo; Lynnes, C.S.; Martens, Matt; Melrose, Rachel; Morris, Steve; Mueller, Norman; Navale, Vivek; Navulur, Kumar; Newman, D.J.; Oliver, Simon; Purss, Matthew; Ramapriyan, H.K.; Rew, Russ; Rosen, Michael; Savickas, John; Sixsmith, Joshua; Sohre, Tom; Thau, David; Uhlir, Paul; Wang, Lan-Wei; Young, Jeff
2016-01-01
Focuses on major developments inaugurated by the Committee on Earth Observation Satellites, the Group on Earth Observations System of Systems, and the International Council for Science World Data System at the global level; initiatives at national levels to create data centers (e.g. the National Aeronautics and Space Administration (NASA) Distributed Active Archive Centers and other international space agency counterparts), and non-government systems (e.g. Center for International Earth Science Information Network). Other major elements focus on emerging tool sets, requirements for metadata, data storage and refresh methods, the rise of cloud computing, and questions about what and how much data should be saved. The sub-sections of the chapter address topics relevant to the science, engineering and standards used for state-of-the-art operational and experimental systems.
Data Mining as a Service (DMaaS)
NASA Astrophysics Data System (ADS)
Tejedor, E.; Piparo, D.; Mascetti, L.; Moscicki, J.; Lamanna, M.; Mato, P.
2016-10-01
Data Mining as a Service (DMaaS) is a software and computing infrastructure that allows interactive mining of scientific data in the cloud. It allows users to run advanced data analyses by leveraging the widely adopted Jupyter notebook interface. Furthermore, the system makes it easier to share results and scientific code, access scientific software, produce tutorials and demonstrations as well as preserve the analyses of scientists. This paper describes how a first pilot of the DMaaS service is being deployed at CERN, starting from the notebook interface that has been fully integrated with the ROOT analysis framework, in order to provide all the tools for scientists to run their analyses. Additionally, we characterise the service backend, which combines a set of IT services such as user authentication, virtual computing infrastructure, mass storage, file synchronisation, development portals or batch systems. The added value acquired by the combination of the aforementioned categories of services is discussed, focusing on the opportunities offered by the CERNBox synchronisation service and its massive storage backend, EOS.
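As a flavour of the kind of cell a notebook backed by such a service might run, here is a minimal PyROOT example; it only assumes that ROOT is available in the notebook kernel and is not taken from the DMaaS pilot itself.

# A minimal PyROOT cell: fill and summarise a histogram interactively.
import ROOT

h = ROOT.TH1F("h", "Random gaussian;x;entries", 100, -5, 5)
rng = ROOT.TRandom3(42)
for _ in range(10000):
    h.Fill(rng.Gaus(0, 1))

print("mean = %.3f, std dev = %.3f" % (h.GetMean(), h.GetStdDev()))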
Buffer thermal energy storage for a solar thermal powered 1-MWe electrical plant
NASA Astrophysics Data System (ADS)
Polzien, R. E.
The application of a latent heat thermal energy buffer storage (TEBS) subsystem to the small community solar thermal power experiment (SCSE) is discussed. The SCSE is a 1-MWe solar thermal electric plant consisting of multiple paraboloidal concentrators with an organic Rankine cycle power conversion unit mounted at the focus of each concentrator. The objective of the TEBS is to minimize plant shutdowns during intermittent cloud coverage, thereby improving the life expectancy of major subsystems. An SCSE plant performance model is used with time-varying insolation to show that 70 to 80 percent of the potential engine shutdowns may be averted with the TEBS system. Parametric variation of engine life dependency on start/stop cycles shows the potential for a 4 percent reduction in levelized bus bar energy cost using TEBS.
Effect of pectin methylesterase on carrot (Daucus carota) juice cloud stability.
Schultz, Alison K; Anthon, Gordon E; Dungan, Stephanie R; Barrett, Diane M
2014-02-05
To determine the effect of residual enzyme activity on carrot juice cloud, 0 to 1 U/g pectin methylesterase (PME) was added to pasteurized carrot juice. Cloud stability and particle diameters were measured to quantify juice cloud stability and clarification for 56 days of storage. All levels of PME addition resulted in clarification; higher amounts had a modest effect in causing more rapid clarification, due to a faster increase in particle size. The cloud initially exhibited a trimodal distribution of particle sizes. For enzyme-containing samples, particles in the smallest-sized mode initially aggregated to merge with the second peak over 5-10 days. This larger population then continued to aggregate more slowly over longer times. This observation of a more rapid destabilization process initially, followed by slower subsequent changes in the cloud, was also manifested in measurements of sedimentation extent and in turbidity tests. Optical microscopy showed that aggregation created elongated, fractal particle structures over time.
RAIN: A Bio-Inspired Communication and Data Storage Infrastructure.
Monti, Matteo; Rasmussen, Steen
2017-01-01
We summarize the results and perspectives from a companion article, where we presented and evaluated an alternative architecture for data storage in distributed networks. We name the bio-inspired architecture RAIN, and it offers a file storage service that, in contrast with current centralized cloud storage, has privacy by design, is open source, is more secure, is scalable, is more sustainable, has community ownership, is inexpensive, and is potentially faster, more efficient, and more reliable. We propose that a RAIN-style architecture could form the backbone of the Internet of Things, which will likely integrate multiple current and future infrastructures ranging from online services and cryptocurrency to parts of government administration.
A performance study of WebDav access to storages within the Belle II collaboration
NASA Astrophysics Data System (ADS)
Pardi, S.; Russo, G.
2017-10-01
WebDav and HTTP are becoming popular protocols for data access in the High Energy Physics community. The most widely used Grid and Cloud storage solutions provide such interfaces; in this scenario, tuning and performance evaluation become crucial to promoting the adoption of these protocols within the Belle II community. In this work, we present the results of a large-scale test activity carried out to evaluate the performance and reliability of the WebDav protocol and to study its possible adoption for user analysis. More specifically, we considered a pilot infrastructure composed of a set of storage elements configured with the WebDav interface, hosted at the Belle II sites. The performance tests include a comparison with xrootd and gridftp. As reference tests we used a set of analysis jobs running under the Belle II software framework, accessing the input data with the ROOT I/O library, in order to simulate realistic user activity as closely as possible. The final analysis shows that promising performance can be achieved with WebDav on different storage systems, and provides useful feedback for the Belle II community and for other high-energy physics experiments.
Flexible services for the support of research.
Turilli, Matteo; Wallom, David; Williams, Chris; Gough, Steve; Curran, Neal; Tarrant, Richard; Bretherton, Dan; Powell, Andy; Johnson, Matt; Harmer, Terry; Wright, Peter; Gordon, John
2013-01-28
Cloud computing has been increasingly adopted by users and providers to promote flexible, scalable and tailored access to computing resources. Nonetheless, the consolidation of this paradigm has uncovered some of its limitations. Initially devised by corporations with direct control over large amounts of computational resources, cloud computing is now being endorsed by organizations with limited resources or with a more articulated, less direct control over these resources. The challenge for these organizations is to leverage the benefits of cloud computing while dealing with limited and often widely distributed computing resources. This study focuses on the adoption of cloud computing by higher education institutions and addresses two main issues: flexible and on-demand access to a large amount of storage resources, and scalability across a heterogeneous set of cloud infrastructures. The proposed solutions leverage a federated approach to cloud resources in which users access multiple and largely independent cloud infrastructures through a highly customizable broker layer. This approach allows for a uniform authentication and authorization infrastructure, a fine-grained policy specification and the aggregation of accounting and monitoring. Within a loosely coupled federation of cloud infrastructures, users can access vast amounts of data without copying them across cloud infrastructures and can scale their resource provisions when the local cloud resources become insufficient.
NASA Astrophysics Data System (ADS)
Fedorov, D.; Miller, R. J.; Kvilekval, K. G.; Doheny, B.; Sampson, S.; Manjunath, B. S.
2016-02-01
Logistical and financial limitations of underwater operations are inherent in marine science, including biodiversity observation. Imagery is a promising way to address these challenges, but the diversity of organisms thwarts simple automated analysis. Recent developments in computer vision methods, such as convolutional neural networks (CNN), are promising for automated classification and detection tasks but are typically very computationally expensive and require extensive training on large datasets. Therefore, managing and connecting distributed computation, large storage and human annotations of diverse marine datasets is crucial for effective application of these methods. BisQue is a cloud-based system for management, annotation, visualization, analysis and data mining of underwater and remote sensing imagery and associated data. Designed to hide the complexity of distributed storage, large computational clusters, diversity of data formats and inhomogeneous computational environments behind a user-friendly web-based interface, BisQue is built around the idea of flexible and hierarchical annotations defined by the user. Such textual and graphical annotations can describe captured attributes and the relationships between data elements. Annotations are powerful enough to describe cells in fluorescent 4D images, fish species in underwater videos and kelp beds in aerial imagery. Presently we are developing BisQue-based analysis modules for automated identification of benthic marine organisms. Recent experiments with drop-out and CNN-based classification of several thousand annotated underwater images demonstrated an overall accuracy above 70% for the 15 best-performing species and above 85% for the top 5 species. Based on these promising results, we have extended BisQue with a CNN-based classification system allowing continuous training on user-provided data.
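For illustration only, a small convolutional classifier of the general kind referred to above can be sketched in a few lines of PyTorch; the layer sizes, 64x64 input crops and 15-class output are assumptions chosen to mirror the reported experiment, not the actual BisQue module.

import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Tiny CNN with dropout, of the kind used for species classification."""
    def __init__(self, num_classes: int = 15):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(0.5),                   # drop-out, as referenced above
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = SmallCNN()
dummy = torch.randn(4, 3, 64, 64)              # batch of 64x64 RGB crops
print(model(dummy).shape)                       # -> torch.Size([4, 15])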
Secure public cloud platform for medical images sharing.
Pan, Wei; Coatrieux, Gouenou; Bouslimi, Dalel; Prigent, Nicolas
2015-01-01
Cloud computing promises medical imaging services offering large storage and computing capabilities at limited cost. In this data outsourcing framework, one of the greatest issues to deal with is data security. To address it, we propose to secure a public cloud platform devoted to medical image sharing by defining and deploying a security policy that controls various security mechanisms. This policy is based on a risk assessment we conducted to identify security objectives, with a special interest in digital content protection. These objectives are addressed by means of different security mechanisms such as access and usage control policies, partial encryption and watermarking.
BAMSI: a multi-cloud service for scalable distributed filtering of massive genome data.
Ausmees, Kristiina; John, Aji; Toor, Salman Z; Hellander, Andreas; Nettelblad, Carl
2018-06-26
The advent of next-generation sequencing (NGS) has made whole-genome sequencing of cohorts of individuals a reality. Primary datasets of raw or aligned reads of this sort can get very large. For scientific questions where curated called variants are not sufficient, the sheer size of the datasets makes analysis prohibitively expensive. In order to make re-analysis of such data feasible without the need to have access to a large-scale computing facility, we have developed a highly scalable, storage-agnostic framework, an associated API and an easy-to-use web user interface to execute custom filters on large genomic datasets. We present BAMSI, a Software-as-a-Service (SaaS) solution for filtering the 1000 Genomes phase 3 set of aligned reads, with the possibility of extension and customization to other sets of files. Unique to our solution is the capability of simultaneously utilizing many different mirrors of the data to increase the speed of the analysis. In particular, if the data is available in private or public clouds - an increasingly common scenario for both academic and commercial cloud providers - our framework allows for seamless deployment of filtering workers close to the data. We show results indicating that such a setup improves the horizontal scalability of the system, and present a possible use case of the framework by performing an analysis of structural variation in the 1000 Genomes data set. BAMSI constitutes a framework for efficient filtering of large genomic data sets that is flexible in the use of compute as well as storage resources. The data resulting from the filter is assumed to be greatly reduced in size, and can easily be downloaded or routed into e.g. a Hadoop cluster for subsequent interactive analysis using Hive, Spark or similar tools. In this respect, our framework also suggests a general model for making very large datasets of high scientific value more accessible by offering the possibility for organizations to share the cost of hosting data on hot storage, without compromising the scalability of downstream analysis.
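A custom filter of the kind a BAMSI worker might execute could look like the following Python sketch using pysam; the BAM file name, region and quality thresholds are placeholders, and the snippet is not BAMSI code.

import pysam

# Filter a (locally mirrored) aligned-reads file: keep properly paired,
# high-quality reads in one region. Assumes an accompanying .bai index.
with pysam.AlignmentFile("HG00096.mapped.bam", "rb") as bam, \
     pysam.AlignmentFile("filtered.bam", "wb", template=bam) as out:
    for read in bam.fetch("20", 1_000_000, 2_000_000):
        if read.is_proper_pair and read.mapping_quality >= 30:
            out.write(read)

The point of the service is to run many such workers in parallel, each close to one mirror of the data, so the filtered output (much smaller than the input) is all that needs to travel.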
Data distribution method of workflow in the cloud environment
NASA Astrophysics Data System (ADS)
Wang, Yong; Wu, Junjuan; Wang, Ying
2017-08-01
Cloud computing provides workflow applications with the required high-efficiency computation and large storage capacity, but it also brings challenges to the protection of trade secrets and other private data. Because handling private data increases the data transmission time, this paper presents a new data allocation algorithm, based on a data collaborative damage degree, to improve the existing data allocation strategy: security-sensitive data remain in the private cloud while the public cloud handles the remaining computation; a static allocation method partitions only the non-confidential data in the initial stage, and the allocation scheme is then adjusted dynamically in the operational phase as new data are generated. The experimental results show that the improved method is effective in reducing the data transmission time.
Monitoring Reservoir Storage in South Asia from Satellite Remote Sensing
NASA Astrophysics Data System (ADS)
Zhang, S.; Gao, H.; Naz, B.
2013-12-01
Real-time reservoir storage information is essential for accurate flood monitoring and prediction in South Asia, where the fatality rate (by area) due to floods is among the highest in the world. However, South Asia is dominated by international river basins where communications among neighboring countries about reservoir storage and management are extremely limited. In this study, we use a suite of NASA satellite observations to achieve high-quality estimation of reservoir storage and storage variations in near real time in South Asia. The monitoring approach employs vegetation indices from the Moderate Resolution Imaging Spectroradiometer (MODIS) 16-day 250 m MOD13Q1 product and the surface elevation data from the Geoscience Laser Altimeter System (GLAS) on board the Ice, Cloud and land Elevation Satellite (ICESat). This approach contains four steps: 1) identifying the reservoirs with ICESat GLAS overpasses and extracting the elevation data for these locations; 2) using the K-means method for water classification from MODIS and applying a novel post-classification algorithm to enhance water area estimation accuracy; 3) deriving the relationship between the MODIS water surface area and the ICESat elevation; and 4) estimating the storage of reservoirs over time based on the elevation-area relationship and the MODIS water area time series. For evaluation purposes, we compared the satellite-based reservoir storage with gauge observations for 16 reservoirs in South Asia. The storage estimates were highly correlated with observations (R = 0.92 to 0.98), with values for the normalized root mean square error (NRMSE) ranging from 8.7% to 25.2%. Using this approach, storage and storage variations were estimated for 16 South Asian reservoirs from 2000 to 2012.
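Steps 3 and 4 of this approach can be sketched in a few lines of Python: fit a linear elevation-area relation and integrate it with the trapezoidal rule to estimate the storage change between two water levels. The sample numbers below are illustrative, not values from the study.

import numpy as np

# Paired observations: water-surface elevation (ICESat GLAS) and water area
# (MODIS classification). Values are made up for illustration.
elev = np.array([310.0, 312.5, 315.0, 318.0])   # m
area = np.array([42.0, 55.0, 63.0, 78.0])       # km^2

slope, intercept = np.polyfit(elev, area, 1)    # A(h) = slope*h + intercept

def storage_change(h1, h2):
    """Volume change (km^3) between water levels h1 and h2 (m),
    using the trapezoidal rule on the fitted A(h)."""
    a1 = slope * h1 + intercept
    a2 = slope * h2 + intercept
    return 0.5 * (a1 + a2) * (h2 - h1) * 1e-3   # km^2 * m -> km^3

print(storage_change(311.0, 316.0))

Once the relation is fitted, only the MODIS area time series (converted to an equivalent level through the inverse relation) is needed to track storage variations between satellite altimetry overpasses.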
Seqcrawler: biological data indexing and browsing platform.
Sallou, Olivier; Bretaudeau, Anthony; Roult, Aurelien
2012-07-24
Seqcrawler has its roots in software like SRS or Lucegene. It provides an indexing platform to ease the search of data and meta-data in biological data banks, and it can scale to cope with the current flow of data. While many biological bank search tools are available on the Internet, mainly provided by large organizations to search their data, there is a lack of free and open-source solutions for browsing one's own set of data with a flexible query system that can scale from a single computer to a cloud system. A personal index platform will help labs and bioinformaticians to search their meta-data but also to build a larger information system with custom subsets of data. The software is scalable from a single computer to a cloud-based infrastructure. It has been successfully tested in a private cloud with 3 index shards (pieces of index) hosting ~400 million sequence records (whole GenBank, UniProt, PDB and others) for a total size of 600 GB in a fault-tolerant (high-availability) architecture. It has also been successfully integrated with software to add extra meta-data from BLAST results to enhance users' result analysis. Seqcrawler provides a complete open-source search-and-store solution for labs or platforms needing to manage large amounts of data/meta-data with a flexible and customizable web interface. All components (search engine, visualization and data storage), though independent, share a common and coherent data system that can be queried with a simple HTTP interface. The solution scales easily and can also provide a high-availability infrastructure.
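Because the components share a simple HTTP query interface, a search can be issued with an ordinary HTTP GET; the Python sketch below shows the general pattern, with a hypothetical endpoint, query syntax and response layout that are not taken from the Seqcrawler documentation.

import requests

# Hypothetical search endpoint and parameters for a personal index instance.
resp = requests.get(
    "http://seqcrawler.example.org/search",
    params={"q": "kinase AND organism:human", "start": 0, "rows": 20},
    timeout=30,
)
resp.raise_for_status()
for hit in resp.json().get("hits", []):
    print(hit.get("id"), hit.get("description"))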
Estai, Mohamed; Kanagasingam, Yogesan; Xiao, Di; Vignarajan, Janardhan; Huang, Boyan; Kruger, Estie; Tennant, Marc
2016-09-01
It is widely considered that telemedicine can make positive contributions to dental practice. This study aimed to evaluate a cloud-based telemedicine application for screening for oral diseases. A telemedicine system, based on a store-and-forward method, was developed to work as a platform for data storage. An Android application was developed to facilitate entering demographic details and capturing oral photos. As a proof-of-concept, six volunteers were enrolled in a trial to obtain oral images using smartphone cameras. Following an onsite oral examination, images of participants' teeth were obtained by a trained dental assistant. Oral images were directly uploaded from the smartphone to a cloud-based server via a broadband network. The assessments of oral images by offsite dentists were compared with those carried out via face-to-face oral examinations. A complete set of 30 oral images was obtained from all six participants. Out of 192 teeth reviewed, the proportion of ungradable teeth was 8%. Sensitivity and specificity of teledental screening were 57% and 100%, respectively. The agreement estimated between the two examination modalities, and between the two teledental graders, was 70% and 62%, respectively. Findings indicate that the proposed system for screening of oral diseases can be implemented to provide a valid and reliable alternative to traditional oral screening. This study provided evidence that a robust system for store-and-forward screening for dental problems can be developed, and highlights the need for further testing to confirm the accuracy and reliability of the teledentistry system. © The Author(s) 2015.
A Cloud-based Infrastructure and Architecture for Environmental System Research
NASA Astrophysics Data System (ADS)
Wang, D.; Wei, Y.; Shankar, M.; Quigley, J.; Wilson, B. E.
2016-12-01
The present availability of high-capacity networks, low-cost computers and storage devices, and the widespread adoption of hardware virtualization and service-oriented architecture provide a great opportunity to enable data and computing infrastructure sharing between closely related research activities. By taking advantage of these approaches, along with the world-class high-performance computing and data infrastructure located at Oak Ridge National Laboratory, a cloud-based infrastructure and architecture has been developed to efficiently deliver essential data and informatics services and utilities to the environmental system research community, and it provides unique capabilities that allow terrestrial ecosystem research projects to share their software utilities (tools), data and even data submission workflows in a straightforward fashion. The infrastructure will minimize large disruptions from current project-based data submission workflows for better acceptance by existing projects, since many ecosystem research projects already have their own requirements or preferences for data submission and collection. The infrastructure will eliminate scalability problems with current project silos by providing unified data services and infrastructure. The infrastructure consists of two key components: (1) a collection of configurable virtual computing environments and user management systems that expedite data submission and collection from the environmental system research community, and (2) scalable data management services and systems, originated and developed by ORNL data centers.
NASA Astrophysics Data System (ADS)
Crinière, Antoine; Dumoulin, Jean; Mevel, Laurent; Andrade-Barroso, Guillermo
2016-04-01
Since late 2014, the Cloud2SM project has aimed to develop a robust information system able to support the long-term monitoring of civil engineering structures as well as to interface various sensors and data. Cloud2SM addresses three main goals: the management of distributed data and sensor networks, the asynchronous processing of the data over the network, and the local management of the sensors themselves [1]. Integrated into this project, Cloud2IR is an autonomous sensor system dedicated to the long-term monitoring of infrastructures. Past experiments have shown the need for, as well as the usefulness of, such a system [2]. Before Cloud2IR, an initially laboratory-oriented system was used, which required a heavyweight operating system [3]. Building on that experience, Cloud2IR benefits from the experimental knowledge acquired to define a lighter architecture based on generic standards, more appropriate to autonomous operation in the field, and which can later be included in a widely distributed architecture such as Cloud2SM. The sensor system can be divided into two parts. The sensor side is mainly composed of the various sensor drivers themselves, such as the infrared camera, the weather station or the pyranometers, and their different fixed configurations. In our case, as infrared cameras are slightly different from other kinds of sensors, the system additionally implements an RTSP server which can be used to set up the FOV as well as other measurement parameters. The second part can be seen as the data side, which is common to all sensors. It instantiates all the sensors through a generic interface and controls the data access loop (not the requesting). This side of the system is weakly coupled (data coupling) with the sensor side. It can be seen as a general framework able to aggregate sensor data of any type or size and automatically encapsulate them in generic data formats such as HDF5, or in cloud data formats such as the OGC SWE standard. This part is also responsible for the acquisition scenario, the local storage management and the network management, through SFTP or through SOAP for the OGC frames. The data side only needs an XML configuration file, and if a configuration change occurs the system is automatically restarted with the new values. Cloud2IR has been deployed in the field for several months at the SenseCity outdoor test bed in Marne-la-Vallée (France) [4]. The next step will be the full standardisation of the system and possibly the full separation between the sensor side and the data side, which can eventually become an external framework. References: [1] A. Crinière, J. Dumoulin, L. Mevel, G. Andrade-Barosso, M. Simonin. The Cloud2SM Project. European Geosciences Union General Assembly (EGU2015), Apr 2015, Vienna, Austria, 2015.
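As a rough sketch of the generic "data side" encapsulation into HDF5, the Python snippet below wraps one simulated infrared frame and a couple of weather readings into a single HDF5 container with h5py; the dataset names, attributes and file naming are illustrative assumptions, not the Cloud2IR schema.

import time
import numpy as np
import h5py

# Fake acquisition: a 320x240 14-bit infrared frame plus weather readings.
frame = np.random.randint(0, 2**14, size=(240, 320), dtype=np.uint16)

with h5py.File("acquisition_20160401T120000.h5", "w") as f:
    ir = f.create_dataset("infrared/frame", data=frame, compression="gzip")
    ir.attrs["timestamp"] = time.time()
    ir.attrs["emissivity"] = 0.95
    f.create_dataset("weather/air_temperature_C", data=12.3)
    f.create_dataset("weather/solar_irradiance_Wm2", data=640.0)

The appeal of a generic container like this is that the same aggregation loop can handle any sensor's output, with the per-sensor specifics confined to the driver and the XML configuration.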
OS2: Oblivious similarity based searching for encrypted data outsourced to an untrusted domain
Pervez, Zeeshan; Ahmad, Mahmood; Khattak, Asad Masood; Ramzan, Naeem
2017-01-01
Public cloud storage services are becoming prevalent, and myriad data sharing, archiving and collaborative services have emerged which harness the pay-as-you-go business model of the public cloud. To ensure privacy and confidentiality, encrypted data is often outsourced to such services, which further complicates the process of accessing relevant data by using search queries. Search-over-encrypted-data schemes solve this problem by exploiting cryptographic primitives and secure indexing to identify outsourced data that satisfy the search criteria. Almost all of these schemes rely on exact matching between the encrypted data and the search criteria. The few schemes which extend the notion of exact matching to similarity-based search lack realism, as they rely on trusted third parties or incur increased storage and computational complexity. In this paper we propose Oblivious Similarity based Search (OS2) for encrypted data. It enables authorized users to model their own encrypted search queries which are resilient to typographical errors. Unlike conventional methodologies, OS2 ranks the search results by using a similarity measure, offering a better search experience than exact matching. It utilizes an encrypted bloom filter and probabilistic homomorphic encryption to enable authorized users to access relevant data without revealing the results of the search query evaluation process to the untrusted cloud service provider. Encrypted bloom filter based search enables OS2 to reduce the search space to potentially relevant encrypted data, avoiding unnecessary computation on the public cloud. The efficacy of OS2 is evaluated on Google App Engine for various bloom filter lengths on different cloud configurations. PMID:28692697
Pervez, Zeeshan; Ahmad, Mahmood; Khattak, Asad Masood; Ramzan, Naeem; Khan, Wajahat Ali
2017-01-01
Public cloud storage services are becoming prevalent, and myriad data sharing, archiving and collaborative services have emerged which harness the pay-as-you-go business model of the public cloud. To ensure privacy and confidentiality, encrypted data is often outsourced to such services, which further complicates the process of accessing relevant data by using search queries. Search-over-encrypted-data schemes solve this problem by exploiting cryptographic primitives and secure indexing to identify outsourced data that satisfy the search criteria. Almost all of these schemes rely on exact matching between the encrypted data and the search criteria. The few schemes which extend the notion of exact matching to similarity-based search lack realism, as they rely on trusted third parties or incur increased storage and computational complexity. In this paper we propose Oblivious Similarity based Search (OS2) for encrypted data. It enables authorized users to model their own encrypted search queries which are resilient to typographical errors. Unlike conventional methodologies, OS2 ranks the search results by using a similarity measure, offering a better search experience than exact matching. It utilizes an encrypted bloom filter and probabilistic homomorphic encryption to enable authorized users to access relevant data without revealing the results of the search query evaluation process to the untrusted cloud service provider. Encrypted bloom filter based search enables OS2 to reduce the search space to potentially relevant encrypted data, avoiding unnecessary computation on the public cloud. The efficacy of OS2 is evaluated on Google App Engine for various bloom filter lengths on different cloud configurations.
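To convey the indexing idea only, here is a plaintext Bloom-filter sketch in Python; OS2 additionally encrypts the filter and relies on probabilistic homomorphic encryption for oblivious evaluation, which this illustration deliberately omits, and the filter length and hash count are arbitrary.

import hashlib

M, K = 256, 3   # filter length in bits, number of hash functions

def positions(keyword: str):
    """Derive K bit positions for a keyword from salted SHA-256 digests."""
    for i in range(K):
        h = hashlib.sha256(f"{i}:{keyword}".encode()).digest()
        yield int.from_bytes(h[:4], "big") % M

def build_filter(keywords):
    bits = [0] * M
    for kw in keywords:
        for p in positions(kw):
            bits[p] = 1
    return bits

def maybe_contains(bits, keyword):
    # True means "possibly present" (false positives allowed), False means absent.
    return all(bits[p] for p in positions(keyword))

index = build_filter(["genome", "cloud", "storage"])
print(maybe_contains(index, "cloud"), maybe_contains(index, "privacy"))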
Simple re-instantiation of small databases using cloud computing.
Tan, Tin Wee; Xie, Chao; De Silva, Mark; Lim, Kuan Siong; Patro, C Pawan K; Lim, Shen Jean; Govindarajan, Kunde Ramamoorthy; Tong, Joo Chuan; Choo, Khar Heng; Ranganathan, Shoba; Khan, Asif M
2013-01-01
Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear.
Simple re-instantiation of small databases using cloud computing
2013-01-01
Background Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. Results We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Conclusions Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear. PMID:24564380
The Czech National Grid Infrastructure
NASA Astrophysics Data System (ADS)
Chudoba, J.; Křenková, I.; Mulač, M.; Ruda, M.; Sitera, J.
2017-10-01
The Czech National Grid Infrastructure is operated by MetaCentrum, a CESNET department responsible for coordinating and managing activities related to distributed computing. CESNET, as the Czech National Research and Education Network (NREN), provides many e-infrastructure services, which are used by 94% of the scientific and research community in the Czech Republic. Computing and storage resources owned by different organizations are connected by a sufficiently fast network to provide transparent access to all resources. We describe in more detail the computing infrastructure, which is based on several different technologies and covers grid, cloud and map-reduce environments. While the largest part of the CPUs is still accessible via distributed TORQUE servers, providing an environment for long batch jobs, part of the infrastructure is available via standard EGI tools, a subset of NGI resources is provided to the EGI FedCloud environment with a cloud interface, and a Hadoop cluster is also provided by the same e-infrastructure. A broad spectrum of computing servers is offered; users can choose from standard 2-CPU servers to large SMP machines with up to 6 TB of RAM or servers with GPU cards. Different groups have different priorities on various resources; resource owners can even have exclusive access. The software is distributed via AFS. Storage servers offering up to tens of terabytes of disk space to individual users are connected via NFS4 on top of GPFS, and access to long-term HSM storage with petabyte capacity is also provided. An overview of available resources and recent usage statistics is given.
NASA Technical Reports Server (NTRS)
Starr, David O'C.; Benedetti, Angela; Boehm, Matt; Brown, Philip R. A.; Gierens, Klaus M.; Girard, Eric; Giraud, Vincent; Jakob, Christian; Jensen, Eric
2000-01-01
The GEWEX Cloud System Study (GCSS, GEWEX is the Global Energy and Water Cycle Experiment) is a community activity aiming to promote development of improved cloud parameterizations for application in the large-scale general circulation models (GCMs) used for climate research and for numerical weather prediction. The GCSS strategy is founded upon the use of cloud-system models (CSMs). These are "process" models with sufficient spatial and temporal resolution to represent individual cloud elements, but spanning a wide range of space and time scales to enable statistical analysis of simulated cloud systems. GCSS also employs single-column versions of the parametric cloud models (SCMs) used in GCMs. GCSS has working groups on boundary-layer clouds, cirrus clouds, extratropical layer cloud systems, precipitating deep convective cloud systems, and polar clouds.
A review on the state-of-the-art privacy-preserving approaches in the e-health clouds.
Abbas, Assad; Khan, Samee U
2014-07-01
Cloud computing is emerging as a new computing paradigm in the healthcare sector besides other business domains. Large numbers of health organizations have started shifting the electronic health information to the cloud environment. Introducing the cloud services in the health sector not only facilitates the exchange of electronic medical records among the hospitals and clinics, but also enables the cloud to act as a medical record storage center. Moreover, shifting to the cloud environment relieves the healthcare organizations of the tedious tasks of infrastructure management and also minimizes development and maintenance costs. Nonetheless, storing the patient health data in the third-party servers also entails serious threats to data privacy. Because of probable disclosure of medical records stored and exchanged in the cloud, the patients' privacy concerns should essentially be considered when designing the security and privacy mechanisms. Various approaches have been used to preserve the privacy of the health information in the cloud environment. This survey aims to encompass the state-of-the-art privacy-preserving approaches employed in the e-Health clouds. Moreover, the privacy-preserving approaches are classified into cryptographic and noncryptographic approaches, and a taxonomy of the approaches is also presented. Furthermore, the strengths and weaknesses of the presented approaches are reported and some open issues are highlighted.
A Custom Approach for a Flexible, Real-Time and Reliable Software Defined Utility.
Zaballos, Agustín; Navarro, Joan; Martín De Pozuelo, Ramon
2018-02-28
Information and communication technologies (ICTs) have enabled the evolution of traditional electric power distribution networks towards a new paradigm referred to as the smart grid. However, the different elements that compose the ICT plane of a smart grid are usually conceived as isolated systems that typically result in rigid hardware architectures, which are hard to interoperate, manage and adapt to new situations. In recent years, software-defined systems that take advantage of software and high-speed data network infrastructures have emerged as a promising alternative to classic ad hoc approaches in terms of integration, automation, real-time reconfiguration and resource reusability. The purpose of this paper is to propose the usage of software-defined utilities (SDUs) to address the latent deployment and management limitations of smart grids. More specifically, the implementation of a smart grid's data storage and management system prototype by means of SDUs is introduced, which demonstrates the feasibility of this alternative approach. This system features a hybrid cloud architecture able to meet the data storage requirements of electric utilities and adapt itself to their ever-evolving needs. The conducted experiments endorse the feasibility of this solution and encourage practitioners to focus their efforts in this direction.
NASA Technical Reports Server (NTRS)
Starr, David O'C.; Benedetti, Angela; Boehm, Matt; Brown, Philip R. A.; Gierens, Klaus M.; Girard, Eric; Giraud, Vincent; Jakob, Christian; Jensen, Eric; Khvorostyanov, Vitaly;
2000-01-01
The GEWEX Cloud System Study (GCSS, GEWEX is the Global Energy and Water Cycle Experiment) is a community activity aiming to promote development of improved cloud parameterizations for application in the large-scale general circulation models (GCMs) used for climate research and for numerical weather prediction (Browning et al, 1994). The GCSS strategy is founded upon the use of cloud-system models (CSMs). These are "process" models with sufficient spatial and temporal resolution to represent individual cloud elements, but spanning a wide range of space and time scales to enable statistical analysis of simulated cloud systems. GCSS also employs single-column versions of the parametric cloud models (SCMs) used in GCMs. GCSS has working groups on boundary-layer clouds, cirrus clouds, extratropical layer cloud systems, precipitating deep convective cloud systems, and polar clouds.
Williams, Patricia A H
2013-01-01
It is no small task to manage the protection of healthcare data and healthcare information systems. In an environment that demands adaptation to change for all information collection, storage and retrieval systems, including those for e-health and information systems, it is imperative that good information security governance is in place. This includes understanding and meeting legislative and regulatory requirements. This chapter provides three models to educate and guide organisations in this complex area, and to simplify the process of information security governance and ensure appropriate and effective measures are put in place. The approach is risk based, adapted and contextualized for healthcare. In addition, specific considerations of the impact of cloud services, secondary use of data, big data and mobile health are discussed.
Research and Development of Laser Diode Based Instruments for Applications in Space
NASA Technical Reports Server (NTRS)
Krainak, Michael; Abshire, James; Cornwell, Donald; Dragic, Peter; Duerksen, Gary; Switzer, Gregg
1999-01-01
Laser diode technology continues to advance at a very rapid rate due to commercial applications such as telecommunications and data storage. The advantages of laser diodes include a wide diversity of wavelengths, high efficiency, small size and weight, and high reliability. Semiconductor and fiber optical amplifiers permit efficient, high-power master oscillator power amplifier (MOPA) transmitter systems. Laser diode systems which incorporate monolithic or discrete (fiber optic) gratings permit single-frequency operation. We describe experimental and theoretical results of laser diode based instruments currently under development at NASA Goddard Space Flight Center, including miniature lidars for measuring clouds and aerosols, water vapor and wind for Earth and planetary (Mars Lander) use.
Klonoff, David C
2017-07-01
The Internet of Things (IoT) is generating an immense volume of data. With cloud computing, medical sensor and actuator data can be stored and analyzed remotely by distributed servers. The results can then be delivered via the Internet. The number of devices in IoT includes such wireless diabetes devices as blood glucose monitors, continuous glucose monitors, insulin pens, insulin pumps, and closed-loop systems. The cloud model for data storage and analysis is increasingly unable to process the data avalanche, and processing is being pushed out to the edge of the network closer to where the data-generating devices are. Fog computing and edge computing are two architectures for data handling that can offload data from the cloud, process it near the patient, and transmit information machine-to-machine or machine-to-human in milliseconds or seconds. Sensor data can be processed near the sensing and actuating devices with fog computing (with local nodes) and with edge computing (within the sensing devices). Compared to cloud computing, fog computing and edge computing offer five advantages: (1) greater data transmission speed, (2) less dependence on limited bandwidths, (3) greater privacy and security, (4) greater control over data generated in foreign countries where laws may limit use or permit unwanted governmental access, and (5) lower costs because more sensor-derived data are used locally and less data are transmitted remotely. Connected diabetes devices almost all use fog computing or edge computing because diabetes patients require a very rapid response to sensor input and cannot tolerate delays for cloud computing.
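A minimal sketch of the edge/fog pattern described above may help: readings are evaluated next to the device and only compact alerts are forwarded, rather than the raw stream. The thresholds and the transmit() helper are hypothetical illustrations, not part of any particular device.

from statistics import mean
from typing import List

LOW_MG_DL = 70    # assumed hypoglycemia alert threshold
HIGH_MG_DL = 250  # assumed hyperglycemia alert threshold

def transmit(message: dict) -> None:
    """Placeholder for machine-to-machine or machine-to-human delivery."""
    print("ALERT:", message)

def process_window(readings_mg_dl: List[float]) -> None:
    """Act locally within milliseconds; forward only summaries, not raw streams."""
    latest = readings_mg_dl[-1]
    if latest < LOW_MG_DL or latest > HIGH_MG_DL:
        transmit({"latest": latest, "window_mean": round(mean(readings_mg_dl), 1)})
    # Otherwise the raw data stay local, reducing bandwidth use and exposure.

process_window([110.0, 104.0, 95.0, 68.0])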
NASA Astrophysics Data System (ADS)
Hao, Huadong; Shi, Haolei; Yi, Pengju; Liu, Ying; Li, Cunjun; Li, Shuguang
2018-01-01
A volume metrology method based on an internal electro-optical distance-ranging instrument is established for large vertical energy storage tanks. After analyzing the mathematical model for vertical tank volume calculation, the key processing algorithms for the point cloud data, such as gross error elimination, filtering, streamlining, and radius calculation, are studied. The corresponding volume values at different liquid levels are calculated automatically by computing the cross-sectional area along the horizontal direction and integrating in the vertical direction. To design the comparison system, a vertical tank with a nominal capacity of 20,000 m3 is selected as the research object; the results show that the method has good repeatability and reproducibility. Using the conventional capacity measurement method as a reference, the relative deviation of the calculated volume is less than 0.1%, meeting the measurement requirements and demonstrating the feasibility and effectiveness of the approach.
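The slice-and-integrate step can be sketched as follows: bin the wall point cloud by height, estimate a radius per slice, and integrate the cross-sectional area in the vertical direction. The slice thickness and the simple mean-radius estimator are illustrative simplifications of the algorithms studied in the paper.

import numpy as np

def tank_volume(points: np.ndarray, slice_height: float = 0.1) -> float:
    """points: (N, 3) array of x, y, z wall coordinates in metres."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    cx, cy = x.mean(), y.mean()                     # rough tank axis
    r = np.hypot(x - cx, y - cy)                    # radial distance of each point
    edges = np.arange(z.min(), z.max() + slice_height, slice_height)
    idx = np.digitize(z, edges)
    heights, areas = [], []
    for i in range(1, len(edges)):
        mask = idx == i
        if mask.any():
            heights.append(edges[i - 1] + slice_height / 2)
            areas.append(np.pi * r[mask].mean() ** 2)   # cross-sectional area
    return float(np.trapz(areas, heights))              # integrate area over height

# Example with a synthetic cylinder of radius 10 m and height 20 m (about 6283 m3)
theta = np.random.uniform(0, 2 * np.pi, 20000)
zs = np.random.uniform(0, 20, 20000)
cloud = np.column_stack([10 * np.cos(theta), 10 * np.sin(theta), zs])
print(round(tank_volume(cloud), 1))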
Karapiperis, Christos; Kempf, Stefan J; Quintens, Roel; Azimzadeh, Omid; Vidal, Victoria Linares; Pazzaglia, Simonetta; Bazyka, Dimitry; Mastroberardino, Pier G; Scouras, Zacharias G; Tapio, Soile; Benotmane, Mohammed Abderrafi; Ouzounis, Christos A
2016-05-11
The underlying molecular processes representing stress responses to low-dose ionising radiation (LDIR) in mammals are just beginning to be understood. In particular, LDIR effects on the brain and their possible association with neurodegenerative disease are currently being explored using omics technologies. We describe a lightweight approach for the storage, analysis and distribution of relevant LDIR omics datasets. The data integration platform, called BRIDE, contains information from the literature as well as experimental information from transcriptomics and proteomics studies. It deploys a hybrid, distributed solution using both local storage and cloud technology. BRIDE can act as a knowledge broker for LDIR researchers, to facilitate molecular research on the systems biology of LDIR response in mammals. Its flexible design can capture a range of experimental information for genomics, epigenomics, transcriptomics, and proteomics. The data collection is available at:
EGI-EUDAT integration activity - Pair data and high-throughput computing resources together
NASA Astrophysics Data System (ADS)
Scardaci, Diego; Viljoen, Matthew; Vitlacil, Dejan; Fiameni, Giuseppe; Chen, Yin; Sipos, Gergely; Ferrari, Tiziana
2016-04-01
EGI (www.egi.eu) is a publicly funded e-infrastructure put together to give scientists access to more than 530,000 logical CPUs, 200 PB of disk capacity and 300 PB of tape storage to drive research and innovation in Europe. The infrastructure provides both high throughput computing and cloud compute/storage capabilities. Resources are provided by about 350 resource centres which are distributed across 56 countries in Europe, the Asia-Pacific region, Canada and Latin America. EUDAT (www.eudat.eu) is a collaborative Pan-European infrastructure providing research data services, training and consultancy for researchers, research communities, research infrastructures and data centres. EUDAT's vision is to enable European researchers and practitioners from any research discipline to preserve, find, access, and process data in a trusted environment, as part of a Collaborative Data Infrastructure (CDI) conceived as a network of collaborating, cooperating centres, combining the richness of numerous community-specific data repositories with the permanence and persistence of some of Europe's largest scientific data centres. EGI and EUDAT, in the context of their flagship projects, EGI-Engage and EUDAT2020, started in March 2015 a collaboration to harmonise the two infrastructures, including technical interoperability, authentication, authorisation and identity management, policy and operations. The main objective of this work is to provide end-users with seamless access to an integrated infrastructure offering both EGI and EUDAT services and, then, pairing data and high-throughput computing resources together. To define the roadmap of this collaboration, EGI and EUDAT selected a set of relevant user communities, already collaborating with both infrastructures, which could bring requirements and help to assign the right priorities to each of them. In this way, from the beginning, this activity has been driven by the end users. The identified user communities are relevant European research infrastructures in the fields of Earth Science (EPOS and ICOS), Bioinformatics (BBMRI and ELIXIR) and Space Physics (EISCAT-3D). The first outcome of this activity has been the definition of a generic use case that captures the typical user scenario with respect to the integrated use of the EGI and EUDAT infrastructures. This generic use case allows a user to instantiate a set of Virtual Machine images on the EGI Federated Cloud to perform computational jobs that analyse data previously stored on EUDAT long-term storage systems. The results of such analysis can be staged back to EUDAT storage and, if needed, assigned persistent identifiers (PIDs) for future use. The implementation of this generic use case requires the following integration activities between EGI and EUDAT: (1) harmonisation of the user authentication and authorisation models, (2) implementing interface connectors between the relevant EGI and EUDAT services, particularly EGI Cloud compute facilities and EUDAT long-term storage and PID systems. In the presentation, the collected user requirements and the implementation status of the universal use case will be shown. Furthermore, how the universal use case is currently applied to satisfy EPOS and ICOS needs will be described.
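The generic use case can be outlined as the workflow below. All of the helper functions are hypothetical placeholders introduced only to make the order of the steps explicit; they are not EGI or EUDAT client APIs.

# Hypothetical outline of the generic EGI-EUDAT use case; every helper below
# is a placeholder, not a real EGI or EUDAT client call.
def start_vm(image):
    print("launching", image, "on the EGI Federated Cloud")
    return "vm-01"

def stage_in(pid, vm):
    print("staging", pid, "from EUDAT long-term storage to", vm)
    return "/scratch/input.dat"

def run_analysis(vm, data_path):
    print("running analysis on", vm, "over", data_path)
    return "/scratch/result.dat"

def stage_out(vm, result_path):
    print("staging", result_path, "back to EUDAT storage")
    return "eudat://results/result.dat"

def mint_pid(stored_object):
    return "pid:0000/abcd"  # placeholder persistent identifier

def generic_use_case(input_pid):
    vm = start_vm("community-analysis-image")
    data = stage_in(input_pid, vm)
    result = run_analysis(vm, data)
    stored = stage_out(vm, result)
    return mint_pid(stored)

print(generic_use_case("pid:0000/1234"))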
Cloud computing for genomic data analysis and collaboration.
Langmead, Ben; Nellore, Abhinav
2018-04-01
Next-generation sequencing has made major strides in the past decade. Studies based on large sequencing data sets are growing in number, and public archives for raw sequencing data have been doubling in size every 18 months. Leveraging these data requires researchers to use large-scale computational resources. Cloud computing, a model whereby users rent computers and storage from large data centres, is a solution that is gaining traction in genomics research. Here, we describe how cloud computing is used in genomics for research and large-scale collaborations, and argue that its elasticity, reproducibility and privacy features make it ideally suited for the large-scale reanalysis of publicly available archived data, including privacy-protected data.
Menu-driven cloud computing and resource sharing for R and Bioconductor.
Bolouri, Hamid; Dulepet, Rajiv; Angerman, Michael
2011-08-15
We report CRdata.org, a cloud-based, free, open-source web server for running analyses and sharing data and R scripts with others. In addition to using the free, public service, CRdata users can launch their own private Amazon Elastic Computing Cloud (EC2) nodes and store private data and scripts on Amazon's Simple Storage Service (S3) with user-controlled access rights. All CRdata services are provided via point-and-click menus. CRdata is open-source and free under the permissive MIT License (opensource.org/licenses/mit-license.php). The source code is in Ruby (ruby-lang.org/en/) and available at: github.com/seerdata/crdata. hbolouri@fhcrc.org.
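For readers unfamiliar with the underlying AWS services, the sketch below shows the generic pattern of launching a private EC2 node and storing a private script on S3 with boto3. This is not CRdata code; the AMI ID, instance type, bucket name and file names are placeholders.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
s3 = boto3.client("s3", region_name="us-east-1")

# Launch one private compute node (the AMI ID and instance type are placeholders).
resp = ec2.run_instances(ImageId="ami-0123456789abcdef0",
                         InstanceType="t3.medium",
                         MinCount=1, MaxCount=1)
instance_id = resp["Instances"][0]["InstanceId"]
print("started node", instance_id)

# Store a private R script; the object stays private unless a policy or ACL
# grants access, mirroring user-controlled access rights.
s3.upload_file("analysis.R", "my-private-crdata-bucket", "scripts/analysis.R")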
NASA Technical Reports Server (NTRS)
Pearson, J. B.; Sims, Herb; Martin, James; Chakrabarti, Suman; Lewis, Raymond; Fant, Wallace
2003-01-01
The significant energy density of matter-antimatter annihilation is attractive to the designers of future space propulsion systems, with the potential to offer a highly compact source of power. Many propulsion concepts exist that could take advantage of matter-antimatter reactions, and current antiproton production rates are sufficient to support basic proof-of-principle evaluation of technology associated with antimatter-derived propulsion. One enabling technology for such experiments is portable storage of low energy antiprotons, allowing antiprotons to be trapped, stored, and transported for use at an experimental facility. To address this need, the Marshall Space Flight Center's Propulsion Research Center is developing a storage system referred to as the High Performance Antiproton Trap (HiPAT) with a design goal of containing 10(exp 12) particles for up to 18 days. The HiPAT makes use of an electromagnetic system (Penning-Malmberg design) consisting of a 4 Tesla superconductor, high voltage electrode structure, radio frequency (RF) network, and ultra high vacuum system. To evaluate the system, normal matter sources (both electron guns and ion sources) are used to generate charged particles. The electron beams ionize gas within the trapping region, producing ions in situ, whereas the ion sources produce the particles external to the trapping region and require dynamic capture. A wide range of experiments has been performed examining factors such as ion storage lifetimes, effect of RF energy on storage lifetime, and ability to routinely perform dynamic ion capture. Current efforts have been focused on improving the RF rotating wall system to permit longer storage times and non-destructive diagnostics of stored ions. Typical particle detection is performed by extracting trapped ions from HiPAT and destructively colliding them with a micro-channel plate detector (providing number and energy information). This improved RF system has been used to detect various plasma modes for both electron and ion plasmas in the two traps at MSFC, including axial, cyclotron, and diocotron modes. New diagnostics are also being added to HiPAT to measure the axial density distribution of the trapped cloud to match measured RF plasma modes to plasma conditions.
NASA Technical Reports Server (NTRS)
Creamean, J. M.; Ault, A. P.; White, A. B.; Neiman, P. J.; Ralph, F. M.; Minnis, Patrick; Prather, K. A.
2014-01-01
Aerosols that serve as cloud condensation nuclei (CCN) and ice nuclei (IN) have the potential to profoundly influence precipitation processes. Furthermore, changes in orographic precipitation have broad implications for reservoir storage and flood risks. As part of the CalWater I field campaign (2009-2011), the impacts of aerosol sources on precipitation were investigated in the California Sierra Nevada. In 2009, the precipitation collected on the ground was influenced by both local biomass burning (up to 79% of the insoluble residues found in precipitation) and long-range transported dust and biological particles (up to 80% combined), while in 2010, by mostly local sources of biomass burning and pollution (30-79% combined), and in 2011 by mostly long-range transport from distant sources (up to 100% dust and biological). Although vast differences in the sources of residues were observed from year to year, dust and biological residues were omnipresent (on average, 55% of the total residues combined) and were associated with storms consisting of deep convective cloud systems and larger quantities of precipitation initiated in the ice phase. Further, biological residues were dominant during storms with relatively warm cloud temperatures (up to -15 °C), suggesting these particles were more efficient IN compared to mineral dust. On the other hand, lower percentages of residues from local biomass burning and pollution were observed (on average 31% and 9%, respectively), yet these residues potentially served as CCN at the base of shallow cloud systems when precipitation quantities were low. The direct connection between the source of aerosols within clouds and precipitation type and quantity can be used in models to better assess how local emissions versus long-range transported dust and biological aerosols play a role in impacting regional weather and climate, ultimately with the goal of more accurate predictive weather forecast models and water resource management.
Monitoring small reservoirs' storage with satellite remote sensing in inaccessible areas
NASA Astrophysics Data System (ADS)
Avisse, Nicolas; Tilmant, Amaury; François Müller, Marc; Zhang, Hua
2017-12-01
In river basins with water storage facilities, the availability of regularly updated information on reservoir level and capacity is of paramount importance for the effective management of those systems. However, for the vast majority of reservoirs around the world, storage levels are either not measured or not readily available due to financial, political, or legal considerations. This paper proposes a novel approach using Landsat imagery and digital elevation models (DEMs) to retrieve information on storage variations in any inaccessible region. Unlike existing approaches, the method does not require any in situ measurement and is appropriate for monitoring small, and often undocumented, irrigation reservoirs. It consists of three recovery steps: (i) a 2-D dynamic classification of Landsat spectral band information to quantify the surface area of water, (ii) a statistical correction of DEM data to characterize the topography of each reservoir, and (iii) a 3-D reconstruction algorithm to correct for clouds and Landsat 7 Scan Line Corrector failure. The method is applied to quantify reservoir storage in the Yarmouk basin in southern Syria, where ground monitoring is impeded by the ongoing civil war. It is validated against available in situ measurements in neighbouring Jordanian reservoirs. Coefficients of determination range from 0.69 to 0.84, and the normalized root-mean-square error from 10 to 16 % for storage estimations on six Jordanian reservoirs with maximal water surface areas ranging from 0.59 to 3.79 km2.
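The storage-retrieval idea can be sketched by combining a classified water surface area with a DEM-derived area-elevation relation: find the level whose inundated area matches the observed one, then integrate depth over the flooded cells. The toy DEM, cell size and observed area below are illustrative inputs, not Yarmouk data.

import numpy as np

def storage_from_area(dem: np.ndarray, cell_area_m2: float, water_area_m2: float) -> float:
    """Invert the hypsometric curve: find the water level whose inundated
    area matches the observed area, then integrate depth over that area."""
    levels = np.linspace(dem.min(), dem.max(), 200)
    areas = np.array([(dem <= h).sum() * cell_area_m2 for h in levels])
    level = np.interp(water_area_m2, areas, levels)    # water surface elevation
    depth = np.clip(level - dem, 0.0, None)            # depth in flooded cells
    return float(depth.sum() * cell_area_m2)           # volume in cubic metres

# Toy bowl-shaped reservoir bed on a 30 m Landsat-like grid
yy, xx = np.mgrid[-50:50, -50:50]
dem = 0.005 * (xx**2 + yy**2)                          # elevation in metres
print(round(storage_from_area(dem, 30.0 * 30.0, 1.5e6)), "m3")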
Exploring Venus by Solar Airplane
NASA Technical Reports Server (NTRS)
Landis, Geoffrey A.
2001-01-01
A solar-powered airplane is proposed to explore the atmospheric environment of Venus. Venus has several advantages for a solar airplane. At the top of the cloud level, the solar intensity is comparable to or greater than terrestrial solar intensities. The Earthlike atmospheric pressure means that the power required for flight is lower for Venus than that of Mars, and the slow rotation of Venus allows an airplane to be designed for continuous sunlight, with no energy storage needed for night-time flight. These factors mean that Venus is perhaps the easiest planet in the solar system for flight of a long-duration solar airplane.
NASA Astrophysics Data System (ADS)
Zhang, Guang J.; Zurovac-Jevtic, Dance; Boer, Erwin R.
1999-10-01
A Lagrangian cloud classification algorithm is applied to the cloud fields in the tropical Pacific simulated by a high-resolution regional atmospheric model. The purpose of this work is to assess the model's ability to reproduce the observed spatial characteristics of the tropical cloud systems. The cloud systems are broadly grouped into three categories: deep clouds, mid-level clouds and low clouds. The deep clouds are further divided into mesoscale convective systems and non-mesoscale convective systems. It is shown that the model is able to simulate the total cloud cover for each category reasonably well. However, when the cloud cover is broken down into contributions from cloud systems of different sizes, it is shown that the simulated cloud size distribution is biased toward large cloud systems, with the contribution from relatively small cloud systems significantly under-represented in the model for both deep and mid-level clouds. The number distribution and area contribution to the cloud cover from mesoscale convective systems are very well simulated compared to the satellite observations, as are those of low clouds. The dependence of the cloud physical properties on cloud scale is examined. It is found that cloud liquid water path, rainfall, and ocean surface sensible and latent heat fluxes have a clear dependence on cloud type and scale. This is of particular interest to studies of the cloud effects on the surface energy budget and the hydrological cycle. The diurnal variation of the cloud population and area is also examined. The model exhibits a varying degree of success in simulating the diurnal variation of the cloud number and area. The observed early morning maximum cloud cover in deep convective cloud systems is qualitatively simulated. However, the afternoon secondary maximum is missing in the model simulation. The diurnal variation of the tropospheric temperature is well reproduced by the model, while the simulation of the diurnal variation of the moisture field is poor. The implication of this comparison between model simulation and observations for cloud parameterization is discussed.
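A minimal sketch of a cloud-size distribution diagnostic of the kind compared above: label contiguous cloudy pixels in a binary mask and bin the object areas. The grid spacing and threshold are assumptions; the paper's Lagrangian classification is considerably more elaborate.

import numpy as np
from scipy import ndimage

def cloud_size_distribution(cloud_fraction: np.ndarray, dx_km: float, threshold: float = 0.5):
    mask = cloud_fraction > threshold                  # binary cloud mask
    labels, n = ndimage.label(mask)                    # contiguous cloud objects
    sizes_km2 = np.bincount(labels.ravel())[1:] * dx_km**2
    bins = np.array([0, 100, 1000, 10000, np.inf])     # illustrative size classes
    counts, _ = np.histogram(sizes_km2, bins)
    names = ["<100 km2", "100-1000 km2", "1000-10000 km2", ">10000 km2"]
    return n, dict(zip(names, counts))

field = np.random.rand(200, 200)                       # stand-in for a simulated cloud field
print(cloud_size_distribution(field, dx_km=4.0))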
Research on the full life cycle management system of smart electric energy meter
NASA Astrophysics Data System (ADS)
Chen, Xiangqun; Huang, Rui; Shen, Liman; Guo, Dingying; Xiong, Dezhi; Xiao, Xiangqi; Liu, Mouhai; Renheng, Xu
2018-02-01
At present, the life-cycle management of China's smart electric energy meters starts from procurement and acceptance; the related monitoring and management of the manufacturing stage has not yet been carried out. This article applies RFID technology and a network cloud platform to a full life-cycle management system for smart electric energy meters, covering design and manufacturing, process control, measurement and calibration testing, storage management, user acceptance, site operation, maintenance and scrapping, and other aspects. On-line and off-line communication of smart electric energy meters is explored through active RFID communication functions, together with practical applications such as local data exchange and instrument calibration. This system provides technical support for power demand-side management and for improving the reliability evaluation system of smart electric energy meters.
NASA Astrophysics Data System (ADS)
Wu, D. L.; Esper, J.; Ehsan, N.; Piepmeier, J. R.; Racette, P.
2014-12-01
Ice clouds play a key role in the Earth's radiation budget, mostly through their strong regulation of infrared radiation exchange. Submillimeter wave remote sensing offers a unique capability to improve cloud ice measurements from space. At 874 GHz cloud scattering produces a larger brightness temperature depression from cirrus than lower frequencies, which can be used to retrieve vertically-integrated cloud ice water path (IWP) and ice particle size. The objective of the IceCube project is to retire risks of 874-GHz receiver technology by raising its TRL from 5 to 7. The project will demonstrate, on a 3-U CubeSat in a low Earth orbit (LEO) environment, the 874-GHz receiver system with noise equivalent differential temperature (NEDT) of ~0.2 K for 1-second integration and calibration error of 2.0 K or less as measured from deep-space observations. The Goddard Space Flight Center (GSFC) is partnering with Virginia Diodes, Inc (VDI) to qualify commercially available 874-GHz receiver technology for spaceflight, and demonstrate the radiometer performance. The instrument (submm-wave cloud radiometer, or SCR), along with the CubeSat system developed and integrated by GSFC, will be ready for launch in two years. The instrument subsystem includes a reflector antenna, sub-millimeter wave mixer, frequency multipliers and stable local oscillator, an intermediate frequency (IF) circuit with noise injection, and data-power boards. The mixer and frequency multipliers are procured from VDI with GSFC insight into fabrication and testing processes to ensure scalability to spaceflight beyond TRL 7. The remaining components are a combination of GSFC-designed and commercial off-the-shelf (COTS) at TRLs of 5 or higher. The spacecraft system is specified by GSFC and comprises COTS components including three-axis stabilizer and sun sensor, GPS receiver, deployable solar arrays, UHF radio, and 2 GB of on-board storage. The spacecraft and instrument are integrated and flight qualified through environmental testing at GSFC. The concept of operations is to fly the GSFC designed instrument/spacecraft in a LEO orbit and collect the 874-GHz radiance data for a period of at least 28+ days. Communication will be through the WFF's UHF ground station. Mission Operations and data processing and validation will be conducted at GSFC.
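For orientation, the quoted sensitivity can be checked against the ideal total-power radiometer relation NEDT = Tsys / sqrt(B * tau). The system temperature and predetection bandwidth below are assumed values chosen to reproduce a figure of roughly 0.2 K at 1 s; they are not published IceCube receiver specifications.

import math

def nedt(t_sys_k, bandwidth_hz, integration_s):
    """Ideal total-power radiometer sensitivity."""
    return t_sys_k / math.sqrt(bandwidth_hz * integration_s)

T_SYS = 15000.0   # K, assumed 874-GHz system temperature (not an instrument spec)
BANDWIDTH = 6e9   # Hz, assumed predetection bandwidth (not an instrument spec)

for tau in (0.1, 1.0, 10.0):
    print(f"tau = {tau:4.1f} s  ->  NEDT ~ {nedt(T_SYS, BANDWIDTH, tau):.2f} K")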
NASA Astrophysics Data System (ADS)
Xiong, Ting; He, Zhiwen
2017-06-01
Cloud computing, first proposed by Google in the United States, is centered on the Internet and provides a standard, open approach to shared network services. With the rapid development of higher education in China, there is a large gap between the educational resources provided by colleges and universities and the actual needs for teaching resources. Cloud computing, which uses Internet technology to provide sharing methods, has therefore become an important means of sharing digital educational resources in current higher education. Based on the cloud computing environment, this paper analyzes the existing problems in the sharing of digital educational resources among independent colleges in Jiangxi Province. Given the mass storage, efficient operation and low cost of cloud computing, the author explores and studies the design of a sharing model for the digital educational resources of higher education in independent colleges. Finally, the design of the sharing model is put into practical application.
NASA Technical Reports Server (NTRS)
Limaye, Ashutosh S.; Molthan, Andrew L.; Srikishen, Jayanthi
2010-01-01
The development of the Nebula Cloud Computing Platform at NASA Ames Research Center provides an open-source solution for the deployment of scalable computing and storage capabilities relevant to the execution of real-time weather forecasts and the distribution of high resolution satellite data to the operational weather community. Two projects at Marshall Space Flight Center may benefit from use of the Nebula system. The NASA Short-term Prediction Research and Transition (SPoRT) Center facilitates the use of unique NASA satellite data and research capabilities in the operational weather community by providing datasets relevant to numerical weather prediction, and satellite data sets useful in weather analysis. SERVIR provides satellite data products for decision support, emphasizing environmental threats such as wildfires, floods, landslides, and other hazards, with interests in numerical weather prediction in support of disaster response. The Weather Research and Forecast (WRF) model Environmental Modeling System (WRF-EMS) has been configured for Nebula cloud computing use via the creation of a disk image and deployment of repeated instances. Given the available infrastructure within Nebula and the "infrastructure as a service" concept, the system appears well-suited for the rapid deployment of additional forecast models over different domains, in response to real-time research applications or disaster response. Future investigations into Nebula capabilities will focus on the development of a web mapping server and load balancing configuration to support the distribution of high resolution satellite data sets to users within the National Weather Service and international partners of SERVIR.
NASA Technical Reports Server (NTRS)
Chakrabarti, S.; Martin, J. J.; Pearson, J. B.; Lewis, R. A.
2003-01-01
The NASA MSFC Propulsion Research Center (PRC) is conducting a research activity examining the storage of low energy antiprotons. The High Performance Antiproton Trap (HiPAT) is an electromagnetic system (Penning-Malmberg design) consisting of a 4 Tesla superconductor, a high voltage confinement electrode system, and an ultra high vacuum test section; designed with an ultimate goal of maintaining charged particles with a half-life of 18 days. Currently, this system is being experimentally evaluated using normal matter ions, which are cheap to produce, relatively easy to handle, and provide a good indication of overall trap behavior, with the exception of assessing annihilation losses. Computational particle-in-cell plasma modeling using the XOOPIC code is supplementing the experiments. Differing electrode voltage configurations are employed to contain charged particles, typically using flat, modified flat and harmonic potential wells. Ion cloud oscillation frequencies are obtained experimentally by amplification of signals induced on the electrodes by the particle motions. XOOPIC simulations show that for given electrode voltage configurations, the calculated charged particle oscillation frequencies are close to experimental measurements. As a two-dimensional axisymmetric code, XOOPIC cannot model azimuthal plasma variations, such as those induced by radio-frequency (RF) modulation of the central quadrupole electrode in experiments designed to enhance ion cloud containment. However, XOOPIC can model analytically varying electric potential boundary conditions and particle velocity initial conditions. Application of these conditions produces ion cloud axial and radial oscillation frequency modes of interest in achieving the goal of optimizing HiPAT for reliable containment of antiprotons.
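As a rough point of comparison for the measured modes, the ideal single-particle frequencies of a Penning trap can be computed from the magnetic field and the electrode potentials. The well depth and characteristic trap dimension below are assumed values; the abstract does not give the actual HiPAT electrode geometry.

import math

Q = 1.602176634e-19      # C, proton charge
M = 1.67262192e-27       # kg, proton mass
B = 4.0                  # T, HiPAT superconducting magnet
V0 = 100.0               # V, assumed trapping well depth
D = 0.02                 # m, assumed characteristic trap dimension

omega_c = Q * B / M                            # free cyclotron (rad/s)
omega_z = math.sqrt(2 * Q * V0 / (M * D**2))   # axial bounce (rad/s)
omega_minus = omega_z**2 / (2 * omega_c)       # magnetron drift (rad/s)

for name, w in [("cyclotron", omega_c), ("axial", omega_z), ("magnetron", omega_minus)]:
    print(f"{name:10s} f = {w / (2 * math.pi) / 1e6:10.3f} MHz")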
NASA Astrophysics Data System (ADS)
Capone, V.; Esposito, R.; Pardi, S.; Taurino, F.; Tortone, G.
2012-12-01
Over the last few years we have seen an increasing number of services and applications needed to manage and maintain cloud computing facilities. This is particularly true for computing in high energy physics, which often requires complex configurations and distributed infrastructures. In this scenario a cost effective rationalization and consolidation strategy is the key to success in terms of scalability and reliability. In this work we describe an IaaS (Infrastructure as a Service) cloud computing system, with high availability and redundancy features, which is currently in production at INFN-Naples and ATLAS Tier-2 data centre. The main goal we intended to achieve was a simplified method to manage our computing resources and deliver reliable user services, reusing existing hardware without incurring heavy costs. A combined usage of virtualization and clustering technologies allowed us to consolidate our services on a small number of physical machines, reducing electric power costs. As a result of our efforts we developed a complete solution for data and computing centres that can be easily replicated using commodity hardware. Our architecture consists of 2 main subsystems: a clustered storage solution, built on top of disk servers running GlusterFS file system, and a virtual machines execution environment. GlusterFS is a network file system able to perform parallel writes on multiple disk servers, providing this way live replication of data. High availability is also achieved via a network configuration using redundant switches and multiple paths between hypervisor hosts and disk servers. We also developed a set of management scripts to easily perform basic system administration tasks such as automatic deployment of new virtual machines, adaptive scheduling of virtual machines on hypervisor hosts, live migration and automated restart in case of hypervisor failures.
Implementation of a Big Data Accessing and Processing Platform for Medical Records in Cloud.
Yang, Chao-Tung; Liu, Jung-Chun; Chen, Shuo-Tsung; Lu, Hsin-Wen
2017-08-18
Big Data analysis has become a key factor in being innovative and competitive. Along with population growth worldwide and the trend of population aging in developed countries, the rate of national medical care usage has been increasing. Because individual medical data are usually scattered across different institutions and stored in varied formats, integrating these continually growing data is challenging. In order to give these data platforms scalable load capacity, they must be built on a sound platform architecture. Several issues must be considered when using cloud computing to quickly integrate big medical data into a database for easy analysis, searching, and filtering to obtain valuable information. This work builds a cloud storage system with HBase of Hadoop for storing and analyzing big data of medical records and improves the performance of importing data into the database. The data of medical records are stored in the HBase database platform for big data analysis. This system performs distributed computing on medical record data processing through Hadoop MapReduce programming, and provides functions including keyword search, data filtering, and basic statistics for the HBase database. This system uses the Put with the single-threaded method and the CompleteBulkload mechanism to import medical data. From the experimental results, we find that when the file size is less than 300 MB, the Put with single-threaded method is used, and when the file size is larger than 300 MB, the CompleteBulkload mechanism is used to improve the performance of data import into the database. This system provides a web interface that allows users to search data, filter out meaningful information through the web, and analyze and convert data into suitable forms that will be helpful for medical staff and institutions.
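The reported crossover can be captured in a few lines: files below roughly 300 MB go through the single-threaded Put path, larger files through CompleteBulkload. The two import helpers are placeholders standing in for the HBase client and bulk-load tooling, not the authors' code.

import os

BULKLOAD_THRESHOLD_BYTES = 300 * 1024 * 1024   # ~300 MB crossover from the experiments

def import_with_put(path):
    print(f"{path}: importing row by row with single-threaded Put")

def import_with_bulkload(path):
    print(f"{path}: generating HFiles and running CompleteBulkload")

def import_medical_records(path):
    size = os.path.getsize(path)
    if size < BULKLOAD_THRESHOLD_BYTES:
        import_with_put(path)
    else:
        import_with_bulkload(path)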
High Resolution Nature Runs and the Big Data Challenge
NASA Technical Reports Server (NTRS)
Webster, W. Phillip; Duffy, Daniel Q.
2015-01-01
NASA's Global Modeling and Assimilation Office at Goddard Space Flight Center is undertaking a series of very computationally intensive Nature Runs and a downscaled reanalysis. The nature runs use the GEOS-5 as an Atmospheric General Circulation Model (AGCM) while the reanalysis uses the GEOS-5 in Data Assimilation mode. This paper will present computational challenges from three runs, two of which are AGCM and one is downscaled reanalysis using the full DAS. The nature runs will be completed at two surface grid resolutions, 7 and 3 kilometers, with 72 vertical levels. The 7 km run spanned 2 years (2005-2006) and produced 4 PB of data, while the 3 km run will span one year and generate 4 PB of data. The downscaled reanalysis (MERRA-II Modern-Era Reanalysis for Research and Applications) will cover 15 years and generate 1 PB of data. In our efforts to address the big data challenges of climate science, we are moving toward a notion of Climate Analytics-as-a-Service (CAaaS), a specialization of the concept of business process-as-a-service that is an evolving extension of IaaS, PaaS, and SaaS enabled by cloud computing. In this presentation, we will describe two projects that demonstrate this shift. MERRA Analytic Services (MERRA/AS) is an example of cloud-enabled CAaaS. MERRA/AS enables MapReduce analytics over the MERRA reanalysis data collection by bringing together high-performance computing, scalable data management, and a domain-specific climate data services API. NASA's High-Performance Science Cloud (HPSC) is an example of the type of compute-storage fabric required to support CAaaS. The HPSC comprises a high-speed InfiniBand network, high-performance file systems and object storage, and virtual system environments specific to data-intensive science applications. These technologies are providing a new tier in the data and analytic services stack that helps connect earthbound, enterprise-level data and computational resources to new customers and new mobility-driven applications and modes of work. In our experience, CAaaS lowers the barriers and risk to organizational change, fosters innovation and experimentation, and provides the agility required to meet our customers' increasing and changing needs.
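The MapReduce style of analytics that MERRA/AS exposes can be illustrated with a toy example: map a partial reduction over data chunks, then reduce the partials into a single statistic. This is only an illustration of the pattern, not the MERRA/AS API.

from functools import reduce
import numpy as np

chunks = [np.random.rand(1000) for _ in range(8)]   # stand-ins for data granules

def mapper(chunk):
    return chunk.sum(), chunk.size                  # partial sum and count per chunk

def reducer(a, b):
    return a[0] + b[0], a[1] + b[1]                 # combine the partials

total, count = reduce(reducer, map(mapper, chunks))
print("global mean:", total / count)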
A Big Data Platform for Storing, Accessing, Mining and Learning Geospatial Data
NASA Astrophysics Data System (ADS)
Yang, C. P.; Bambacus, M.; Duffy, D.; Little, M. M.
2017-12-01
Big Data is becoming a norm in geoscience domains. A platform that is capable of efficiently managing, accessing, analyzing, mining, and learning from big data to extract new information and knowledge is desired. This paper introduces our latest effort in developing such a platform based on our past years' experience with cloud and high performance computing, analyzing big data, comparing big data containers, and mining big geospatial data for new information. The platform includes four layers: a) the bottom layer includes a computing infrastructure with proper network, computer, and storage systems; b) the 2nd layer is a cloud computing layer based on virtualization to provide on demand computing services for upper layers; c) the 3rd layer is big data containers that are customized for dealing with different types of data and functionalities; d) the 4th layer is a big data presentation layer that supports the efficient management, access, analysis, mining and learning of big geospatial data.
Hydrodynamics and Water Quality forecasting over a Cloud Computing environment: INDIGO-DataCloud
NASA Astrophysics Data System (ADS)
Aguilar Gómez, Fernando; de Lucas, Jesús Marco; García, Daniel; Monteoliva, Agustín
2017-04-01
Algae bloom due to eutrophication is a widespread problem for water reservoirs and lakes that directly impacts water quality. It can create a dead zone that lacks enough oxygen to support life, and it can also be harmful to humans, so it must be controlled in water masses used for supply, bathing or other purposes. Hydrodynamic and water quality modelling can contribute to forecasting the status of the water system in order to alert authorities before an algae bloom event occurs. It can be used to predict scenarios and find solutions to reduce the harmful impact of the blooms. High-resolution models need to process a large amount of data using a sufficiently robust computing infrastructure. INDIGO-DataCloud (https://www.indigo-datacloud.eu/) is a European Commission funded project that aims at developing a data and computing platform targeting scientific communities, deployable on multiple hardware and provisioned over hybrid (private or public) e-infrastructures. The project addresses the development of solutions for different case studies using different cloud-based alternatives. In the first INDIGO software release, a set of components is ready to manage the deployment of services to perform N Delft3D simulations (for calibration or scenario definition) over a cloud computing environment, using the Docker technology: TOSCA requirement description, Docker repository, Orchestrator, AAI (Authorization, Authentication) and OneData (Distributed Storage System). Moreover, the Future Gateway portal, based on Liferay, provides a user-friendly interface where the user can configure the simulations. Owing to the data-centric approach of INDIGO, the developed solutions can help manage the full data life cycle of a project, thanks to different tools to manage datasets and even metadata. Furthermore, the cloud environment provides a dynamic, scalable and easy-to-use framework for non-IT expert users. This framework is potentially capable of automating forecast processing through periodic tasks. For instance, a user can forecast every month the hydrodynamics and water quality status of a reservoir, starting from a base model and supplying new data gathered from instrumentation or observations. This interactive presentation aims to show the use of INDIGO solutions in a particular forecasting use case and to inspire others to use a cloud framework for their applications.
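As an illustration of the container-based execution model, the sketch below launches N simulation runs with the Docker SDK for Python. The image name, command and volume layout are placeholders and do not correspond to actual INDIGO-DataCloud components.

import docker

client = docker.from_env()

def launch_runs(n_runs):
    containers = []
    for i in range(n_runs):
        c = client.containers.run(
            image="example/delft3d:latest",                    # placeholder image
            command=f"run_scenario --id {i}",                  # placeholder entry point
            volumes={f"/data/scenario_{i}": {"bind": "/work", "mode": "rw"}},
            detach=True,
        )
        containers.append(c)
    return containers

for c in launch_runs(4):
    print(c.id[:12], c.status)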
The successful of finite element to invent particle cleaning system by air jet in hard disk drive
NASA Astrophysics Data System (ADS)
Jai-Ngam, Nualpun; Tangchaichit, Kaitfa
2018-02-01
Hard disk drive manufacturing faces great challenges with the increasing demand for high-capacity drives for cloud-based storage. Particle adhesion has also become increasingly important in HDDs to gain more reliability of storage capacity. The ability to clean surfaces is complicated by the need to remove such particles without damaging the surface. This research aims to improve particle cleaning in the HSA by using finite element analysis to develop an air flow model and then build a prototype air cleaning system to remove particles from the surface. Surface cleaning by air pressure can be applied as an alternative for the removal of solid particulate contaminants adhering to a solid surface. These technical and economic challenges have driven process development away from the traditional approach of chemical solvent cleaning. The focus of this study is to develop an alternative to scrubbing, ultrasonic and megasonic surface-cleaning principles, to serve as a foundation for the development of new processes that meet current state-of-the-art process requirements and minimize the waste from chemical cleaning for environmental safety.
Generating a Corpus of Mobile Forensic Images for Masquerading user Experimentation.
Guido, Mark; Brooks, Marc; Grover, Justin; Katz, Eric; Ondricek, Jared; Rogers, Marcus; Sharpe, Lauren
2016-11-01
The Periodic Mobile Forensics (PMF) system investigates user behavior on mobile devices. It applies forensic techniques to an enterprise mobile infrastructure, utilizing an on-device agent named TractorBeam. The agent collects changed storage locations for later acquisition, reconstruction, and analysis. TractorBeam provides its data to an enterprise infrastructure that consists of a cloud-based queuing service, relational database, and analytical framework for running forensic processes. During a 3-month experiment with Purdue University, TractorBeam was utilized in a simulated operational setting across 34 users to evaluate techniques to identify masquerading users (i.e., users other than the intended device user). The research team surmises that all masqueraders are undesirable to an enterprise, even when a masquerader lacks malicious intent. The PMF system reconstructed 821 forensic images, extracted one million audit events, and accurately detected masqueraders. Evaluation revealed that developed methods reduced storage requirements 50-fold. This paper describes the PMF architecture, performance of TractorBeam throughout the protocol, and results of the masquerading user analysis. © 2016 American Academy of Forensic Sciences.
NASA Astrophysics Data System (ADS)
Li, Ming; Yin, Hongxi; Xing, Fangyuan; Wang, Jingchao; Wang, Honghuan
2016-02-01
With the features of network virtualization and resource programmability, the Software Defined Optical Network (SDON) is considered the future development trend of optical networks, provisioning more flexible, efficient and open network functions and supporting intraconnection and interconnection of data centers. Meanwhile, a cloud platform can provide powerful computing, storage and management capabilities. In this paper, with the coordination of SDON and a cloud platform, a multi-domain SDON architecture based on a cloud control plane is proposed, which is composed of data centers with databases (DB), path computation elements (PCE), SDON controllers and an orchestrator. In addition, the structures of the multi-domain SDON orchestrator and the OpenFlow-enabled optical node are proposed to realize a combined centralized and distributed management and control platform. Finally, functional verification and demonstration are performed through our optical experimental network.
Design and deployment of an elastic network test-bed in IHEP data center based on SDN
NASA Astrophysics Data System (ADS)
Zeng, Shan; Qi, Fazhi; Chen, Gang
2017-10-01
High energy physics experiments produce huge amounts of raw data, but because of the shared nature of the network resources, there is no guarantee of available bandwidth for each experiment, which may cause link congestion problems. On the other hand, with the development of cloud computing technologies, IHEP has established a cloud platform based on OpenStack which can ensure the flexibility of the computing and storage resources, and more and more computing applications have been deployed on virtual machines created by OpenStack. However, under the traditional network architecture, network capacity cannot be requested elastically, which becomes the bottleneck restricting the flexible application of cloud computing. In order to solve the above problems, we propose an elastic cloud data center network architecture based on SDN, and we also design a high-performance controller cluster based on OpenDaylight. Finally, we present our current test results.
A Cloud-Based Infrastructure for Near-Real-Time Processing and Dissemination of NPP Data
NASA Astrophysics Data System (ADS)
Evans, J. D.; Valente, E. G.; Chettri, S. S.
2011-12-01
We are building a scalable cloud-based infrastructure for generating and disseminating near-real-time data products from a variety of geospatial and meteorological data sources, including the new National Polar-Orbiting Environmental Satellite System (NPOESS) Preparatory Project (NPP). Our approach relies on linking Direct Broadcast and other data streams to a suite of scientific algorithms coordinated by NASA's International Polar-Orbiter Processing Package (IPOPP). The resulting data products are directly accessible to a wide variety of end-user applications, via industry-standard protocols such as OGC Web Services, Unidata Local Data Manager, or OPeNDAP, using open source software components. The processing chain employs on-demand computing resources from Amazon.com's Elastic Compute Cloud and NASA's Nebula cloud services. Our current prototype targets short-term weather forecasting, in collaboration with NASA's Short-term Prediction Research and Transition (SPoRT) program and the National Weather Service. Direct Broadcast is especially crucial for NPP, whose current ground segment is unlikely to deliver data quickly enough for short-term weather forecasters and other near-real-time users. Direct Broadcast also allows full local control over data handling, from the receiving antenna to end-user applications: this provides opportunities to streamline processes for data ingest, processing, and dissemination, and thus to make interpreted data products (Environmental Data Records) available to practitioners within minutes of data capture at the sensor. Cloud computing lets us grow and shrink computing resources to meet large and rapid fluctuations in data availability (twice daily for polar orbiters) - and similarly large fluctuations in demand from our target (near-real-time) users. This offers a compelling business case for cloud computing: the processing or dissemination systems can grow arbitrarily large to sustain near-real time data access despite surges in data volumes or user demand, but that computing capacity (and hourly costs) can be dropped almost instantly once the surge passes. Cloud computing also allows low-risk experimentation with a variety of machine architectures (processor types; bandwidth, memory, and storage capacities, etc.) and of system configurations (including massively parallel computing patterns). Finally, our service-based approach (in which user applications invoke software processes on a Web-accessible server) facilitates access into datasets of arbitrary size and resolution, and allows users to request and receive tailored products on demand. To maximize the usefulness and impact of our technology, we have emphasized open, industry-standard software interfaces. We are also using and developing open source software to facilitate the widespread adoption of similar, derived, or interoperable systems for processing and serving near-real-time data from NPP and other sources.
Testing as a Service with HammerCloud
NASA Astrophysics Data System (ADS)
Medrano Llamas, Ramón; Barrand, Quentin; Elmsheuser, Johannes; Legger, Federica; Sciacca, Gianfranco; Sciabà, Andrea; van der Ster, Daniel
2014-06-01
HammerCloud was designed and created to meet the needs of the grid community to test resources and automate operations from a user perspective. Recent developments in the IT space propose a shift to software-defined data centres, in which every layer of the infrastructure can be offered as a service. Testing and monitoring are an integral part of the development, validation and operation of big systems, like the grid. This area is not escaping the paradigm shift, and Testing as a Service (TaaS) offerings, which allow testing any infrastructure service, such as the Infrastructure as a Service (IaaS) platforms being deployed in many grid sites, from both the functional and stress perspectives, are starting to be perceived as natural. This work reviews the recent developments in HammerCloud and its evolution towards a TaaS concept, in particular its deployment on the Agile Infrastructure platform at CERN and the testing of many IaaS providers across Europe in the context of experiment requirements. The first section reviews the architectural changes that a service running in the cloud needs, such as an orchestration service or new storage requirements, in order to provide functional and stress testing. The second section reviews the first tests of infrastructure providers from the perspective of the challenges discovered at the architectural level. Finally, the third section evaluates future requirements of scalability and features to increase testing productivity.
NASA Astrophysics Data System (ADS)
Grandi, C.; Italiano, A.; Salomoni, D.; Calabrese Melcarne, A. K.
2011-12-01
WNoDeS, an acronym for Worker Nodes on Demand Service, is software developed at CNAF-Tier1, the National Computing Centre of the Italian Institute for Nuclear Physics (INFN) located in Bologna. WNoDeS provides on demand, integrated access to both Grid and Cloud resources through virtualization technologies. Besides the traditional use of computing resources in batch mode, users need to have interactive and local access to a number of systems. WNoDeS can dynamically select these computers instantiating Virtual Machines, according to the requirements (computing, storage and network resources) of users through either the Open Cloud Computing Interface API, or through a web console. An interactive use is usually limited to activities in user space, i.e. where the machine configuration is not modified. In some other instances the activity concerns development and testing of services and thus implies the modification of the system configuration (and, therefore, root-access to the resource). The former use case is a simple extension of the WNoDeS approach, where the resource is provided in interactive mode. The latter implies saving the virtual image at the end of each user session so that it can be presented to the user at subsequent requests. This work describes how the LHC experiments at INFN-Bologna are testing and making use of these dynamically created ad-hoc machines via WNoDeS to support flexible, interactive analysis and software development at the INFN Tier-1 Computing Centre.
GEWEX Cloud Systems Study (GCSS)
NASA Technical Reports Server (NTRS)
Moncrieff, Mitch
1993-01-01
The Global Energy and Water Cycle Experiment (GEWEX) Cloud Systems Study (GCSS) program seeks to improve the physical understanding of sub-grid scale cloud processes and their representation in parameterization schemes. By improving the description and understanding of key cloud system processes, GCSS aims to develop the necessary parameterizations in climate and numerical weather prediction (NWP) models. GCSS will address these issues mainly through the development and use of cloud-resolving or cumulus ensemble models to generate realizations of a set of archetypal cloud systems. The focus of GCSS is on mesoscale cloud systems, including precipitating convectively-driven cloud systems like MCS's and boundary layer clouds, rather than individual clouds, and on their large-scale effects. Some of the key scientific issues confronting GCSS that particularly relate to research activities in the central U.S. are presented.
Methods for Quantitative Interpretation of Retarding Field Analyzer Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Calvey, J.R.; Crittenden, J.A.; Dugan, G.F.
2011-03-28
Over the course of the CesrTA program at Cornell, over 30 Retarding Field Analyzers (RFAs) have been installed in the CESR storage ring, and a great deal of data has been taken with them. These devices measure the local electron cloud density and energy distribution, and can be used to evaluate the efficacy of different cloud mitigation techniques. Obtaining a quantitative understanding of RFA data requires use of cloud simulation programs, as well as a detailed model of the detector itself. In a drift region, the RFA can be modeled by postprocessing the output of a simulation code, and one can obtain best fit values for important simulation parameters with a chi-square minimization method.
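To make the chi-square step concrete, here is a minimal sketch of fitting simulation parameters to measured RFA signals by chi-square minimization. The toy model function, data arrays and parameter names are hypothetical placeholders, not the CesrTA analysis code.

```python
# Illustrative chi-square fit of simulation parameters to measured RFA
# signals; all numbers and the model form are made-up placeholders.
import numpy as np
from scipy.optimize import minimize

voltages = np.array([0.0, 50.0, 100.0, 200.0, 400.0])   # retarding voltages (V)
measured = np.array([1.2, 0.9, 0.7, 0.4, 0.2])           # measured RFA signal (arb. units)
sigma = np.full_like(measured, 0.05)                      # assumed measurement uncertainties

def simulated_signal(params, v):
    """Toy stand-in for a post-processed cloud-simulation output."""
    amplitude, energy_scale = params
    return amplitude * np.exp(-v / energy_scale)

def chi2(params):
    residuals = (measured - simulated_signal(params, voltages)) / sigma
    return np.sum(residuals ** 2)

best = minimize(chi2, x0=[1.0, 150.0], method="Nelder-Mead")
print("Best-fit parameters:", best.x, "chi^2:", best.fun)
```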
CE-ACCE: The Cloud Enabled Advanced sCience Compute Environment
NASA Astrophysics Data System (ADS)
Cinquini, L.; Freeborn, D. J.; Hardman, S. H.; Wong, C.
2017-12-01
Traditionally, Earth Science data from NASA remote sensing instruments has been processed by building custom data processing pipelines (often based on a common workflow engine or framework) which are typically deployed and run on an internal cluster of computing resources. This approach has some intrinsic limitations: it requires each mission to develop and deploy a custom software package on top of the adopted framework; it makes use of dedicated hardware, network and storage resources, which must be specifically purchased, maintained and re-purposed at mission completion; and computing services cannot be scaled on demand beyond the capability of the available servers. More recently, the rise of Cloud computing, coupled with other advances in containerization technology (most prominently, Docker) and micro-services architecture, has enabled a new paradigm, whereby space mission data can be processed through standard system architectures, which can be seamlessly deployed and scaled on demand on either on-premise clusters or commercial Cloud providers. In this talk, we will present one such architecture named CE-ACCE ("Cloud Enabled Advanced sCience Compute Environment"), which we have been developing at the NASA Jet Propulsion Laboratory over the past year. CE-ACCE is based on the Apache OODT ("Object Oriented Data Technology") suite of services for full data lifecycle management, which are turned into a composable array of Docker images and complemented by a plug-in model for mission-specific customization. We have applied this infrastructure to both flying and upcoming NASA missions, such as ECOSTRESS and SMAP, and demonstrated deployment on the Amazon Cloud, either using simple EC2 instances or advanced AWS services such as Amazon Lambda and ECS (EC2 Container Services).
Design and Implementation of Cloud-Centric Configuration Repository for DIY IoT Applications
Ahmad, Shabir; Kim, Do Hyeun
2018-01-01
The Do-It-Yourself (DIY) vision for the design of a smart and customizable IoT application demands the involvement of the general public in its development process. The general public lacks the technical knowledge for programming state-of-the-art prototyping and development kits. The latest IoT kits, for example, Raspberry Pi, are revolutionizing the DIY paradigm for IoT, and more than ever, an intuitive DIY programming interface is required to enable the masses to interact with and customize the behavior of remote IoT devices on the Internet. However, in most cases, these DIY toolkits store the resultant configuration data in local storage and, thus, the configuration cannot be accessed remotely. This paper presents the novel implementation of such a system, which not only enables the general public to customize the behavior of remote IoT devices through a visual interface, but also makes the configuration available everywhere and anytime by leveraging the power of cloud-based platforms. The interface enables the visualization of the resources exposed by remote embedded devices in the form of graphical virtual objects (VOs). These VOs are used to create the service design through simple operations like drag-and-drop and the setting of properties. The resulting configuration is maintained as an XML document, which is ingested by the cloud platform, thus making it available to be used anywhere. HTTP is used for communication between the cloud and the IoT toolbox and between the cloud and real devices, while CoAP is used for communication between the toolbox and the actual resources. Finally, a smart home case study has been implemented and presented in order to assess the effectiveness of the proposed work. PMID:29415450
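As an illustration of the cloud ingestion step described in the abstract, the following sketch uploads an XML VO configuration over HTTP. The endpoint URL, XML element names, and response format are illustrative assumptions, not the paper's actual interface.

```python
# Minimal sketch, assuming a REST-style ingestion endpoint: uploading the
# XML service configuration produced by a DIY toolbox to a cloud
# repository over HTTP. URL, paths and element names are assumptions.
import requests

CLOUD_INGEST_URL = "https://iot-repo.example.org/configurations"  # assumed endpoint

vo_configuration = """<?xml version="1.0"?>
<serviceDesign name="smart-home-lighting">
  <virtualObject id="living-room-lamp" type="actuator">
    <property name="state">off</property>
  </virtualObject>
</serviceDesign>
"""

resp = requests.post(
    CLOUD_INGEST_URL,
    data=vo_configuration.encode("utf-8"),
    headers={"Content-Type": "application/xml"},
    timeout=10,
)
resp.raise_for_status()
print("Configuration stored, id:", resp.json().get("id"))  # assumes a JSON reply
```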
A Standardized Based Approach to Managing Atmosphere Studies For Wind Energy Research
NASA Astrophysics Data System (ADS)
Stephan, E.; Sivaraman, C.
2015-12-01
Atmosphere to Electrons (A2e) is a multi-year U.S. Department of Energy (DOE) research initiative targeting significant reductions in the cost of wind energy through an improved understanding of the complex physics governing wind flow into and through wind farms. Better insight into the flow physics has the potential to reduce wind farm energy losses by up to 20%, to reduce annual operational costs by hundreds of millions of dollars, and to improve project financing terms to more closely resemble traditional capital projects. The Data Archive and Portal (DAP) is a key capability of the A2e initiative. The DAP is a cloud-based distributed system known as the 'Wind Cloud' that functions as a repository for all A2e data. These data include numerous historic and ongoing field studies involving in situ and remote sensing instruments, simulations, and scientific analyses. Significantly, it is the integration and sharing of these diverse data sets through the DAP that is key to meeting the goals of A2e. The Wind Cloud will be accessible via an open and easy-to-navigate user interface that facilitates community data access, interaction, and collaboration. DAP management is working with the community, industry, and international standards bodies to develop standards for wind data and to capture important characteristics of all data in the Wind Cloud. Security will be provided to allow storage of proprietary data alongside publicly accessible data in the Wind Cloud, and the capability to generate anonymized data will be provided to facilitate the use of private data by non-privileged users (when appropriate). Finally, limited computing capabilities will be provided to facilitate co-located data analysis, validation, and generation of derived products in support of A2e science.
Makowski, Dale
2016-01-01
This paper sets out the basics for approaching the selection and implementation of a cloud-based communication system to support a business continuity programme, including: • consideration for how a cloud-based communication system can enhance a business continuity programme; • descriptions of some of the more popular features of a cloud-based communication system; • options to evaluate when selecting a cloud-based communication system; • considerations for how to design a system to be most effective for an organisation; • best practices for how to conduct the initial load of data to a cloud-based communication system; • best practices for how to conduct an initial validation of the data loaded to a cloud-based communication system; • considerations for how to keep contact information in the cloud-based communication system current and accurate; • best practices for conducting ongoing system testing; • considerations for how to conduct user training; • review of other potential uses of a cloud-based communication system; and • review of other tools and features many cloud-based communication systems may offer.
Advancing global marine biogeography research with open-source GIS software and cloud-computing
Fujioka, Ei; Vanden Berghe, Edward; Donnelly, Ben; Castillo, Julio; Cleary, Jesse; Holmes, Chris; McKnight, Sean; Halpin, Patrick
2012-01-01
Across many scientific domains, the ability to aggregate disparate datasets enables more meaningful global analyses. Within marine biology, the Census of Marine Life served as the catalyst for such a global data aggregation effort. Under the Census framework, the Ocean Biogeographic Information System (OBIS) was established to coordinate an unprecedented aggregation of global marine biogeography data. The OBIS data system now contains 31.3 million observations, freely accessible through a geospatial portal. The challenges of storing, querying, disseminating, and mapping a global data collection of this complexity and magnitude are significant. In the face of declining performance and expanding feature requests, a redevelopment of the OBIS data system was undertaken. Following an Open Source philosophy, the OBIS technology stack was rebuilt using PostgreSQL, PostGIS, GeoServer and OpenLayers. This approach has markedly improved the performance and online user experience while maintaining a standards-compliant and interoperable framework. Due to the distributed nature of the project and increasing needs for storage, scalability and deployment flexibility, the entire hardware and software stack was built on a Cloud Computing environment. The flexibility of the platform, combined with the power of the application stack, enabled rapid re-development of the OBIS infrastructure, and ensured complete standards-compliance.
A survey and taxonomy on energy efficient resource allocation techniques for cloud computing systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hameed, Abdul; Khoshkbarforoushha, Alireza; Ranjan, Rajiv
In a cloud computing paradigm, energy efficient allocation of different virtualized ICT resources (servers, storage disks, networks, and the like) is a complex problem due to the presence of heterogeneous application workloads (e.g., content delivery networks, MapReduce, web applications, and the like) having contentious allocation requirements in terms of ICT resource capacities (e.g., network bandwidth, processing speed, response time, etc.). Several recent papers have tried to address the issue of improving energy efficiency in allocating cloud resources to applications with varying degrees of success. However, to the best of our knowledge there is no published literature on this subject that clearly articulates the research problem and provides a research taxonomy for succinct classification of existing techniques. Hence, the main aim of this paper is to identify open challenges associated with energy efficient resource allocation. In this regard, the study first outlines the problem and the existing hardware- and software-based techniques available for this purpose. Furthermore, techniques already presented in the literature are summarized based on the energy-efficient research dimension taxonomy. The advantages and disadvantages of the existing techniques are comprehensively analyzed against the proposed research dimension taxonomy, namely: resource adaption policy, objective function, allocation method, allocation operation, and interoperability.
Email authentication using symmetric and asymmetric key algorithm encryption
NASA Astrophysics Data System (ADS)
Halim, Mohamad Azhar Abdul; Wen, Chuah Chai; Rahmi, Isredza; Abdullah, Nurul Azma; Rahman, Nurul Hidayah Ab.
2017-10-01
Protecting sensitive or classified data from unauthorized access by hackers and other parties is essential. Data are stored on devices such as USB drives, external hard disks, laptops and iPads, or in the cloud. Cloud computing brings both benefits and drawbacks: storing information off-site increases the risk of attack by hackers, while storage on portable devices increases the risk of loss or theft. A wide array of communication media, including email, is used to send data, but these technologies have severe weaknesses, such as a lack of integrity protection: a message can be altered in transit and delivered to the recipient with no indication that it was modified, so the recipient would not find out unless he or she checked with the sender. Without encryption, sniffing tools and software can be used to intercept and read the information, since it is transmitted in plaintext. Therefore, an electronic mail authentication scheme, the Hybrid Encryption System (HES), is proposed. The security of HES rests on a combination of asymmetric and symmetric key algorithms: RSA as the asymmetric algorithm and the Advanced Encryption Standard (AES) as the symmetric algorithm. Combining both algorithms in HES can provide confidentiality and authenticity for electronic documents sent from the sender to the recipient. In a nutshell, HES helps users protect their valuable documents and data from illegitimate third parties.
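The hybrid pattern the abstract describes (a symmetric AES cipher for the message body and RSA to wrap the symmetric key) can be sketched as below. This is a generic, minimal illustration using the third-party Python cryptography package; it is not the authors' HES implementation, and the key sizes and padding choices are common-practice assumptions.

```python
# Generic RSA + AES hybrid encryption sketch (not the paper's HES code).
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Recipient's RSA key pair (in practice the public key would be distributed
# beforehand, e.g. via a certificate).
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

message = b"Confidential e-mail body"

# 1. Encrypt the message with a fresh symmetric AES-256-GCM key.
aes_key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)
ciphertext = AESGCM(aes_key).encrypt(nonce, message, None)

# 2. Encrypt (wrap) the AES key with the recipient's RSA public key.
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = public_key.encrypt(aes_key, oaep)

# 3. The recipient unwraps the AES key with the RSA private key and decrypts.
recovered_key = private_key.decrypt(wrapped_key, oaep)
assert AESGCM(recovered_key).decrypt(nonce, ciphertext, None) == message
```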
DAΦNE operation with electron-cloud-clearing electrodes.
Alesini, D; Drago, A; Gallo, A; Guiducci, S; Milardi, C; Stella, A; Zobov, M; De Santis, S; Demma, T; Raimondi, P
2013-03-22
Electron-cloud (e-cloud) effects on beam dynamics are one of the major factors limiting the performance of high-intensity positron, proton, and ion storage rings. In the electron-positron collider DAΦNE, a horizontal beam instability caused by the e-cloud effect has been identified as one of the main limitations on the maximum stored positron beam current and as a source of beam quality deterioration. During the last machine shutdown, special electrodes were inserted in all dipole and wiggler magnets of the positron ring in order to mitigate this instability. This is the first installation of its kind worldwide: long metallic electrodes have been installed in all arcs of the collider's positron ring and are currently used during machine operation in collision. This has allowed a number of unprecedented measurements (e-cloud instability growth rates, transverse beam size variation, tune shifts along the bunch train) in which the e-cloud contribution is clearly evidenced by turning the electrodes on and off. In this Letter we briefly describe the novel design of the electrodes, while the main focus is on the experimental measurements. We report results that clearly indicate the effectiveness of the electrodes for e-cloud suppression.
Enhancing User Access to Australian marine data - the Australian Ocean Data Network
NASA Astrophysics Data System (ADS)
Proctor, R.; Mancini, S.; Blain, P. J.
2017-12-01
The Integrated Marine Observing System (IMOS) is a national project funded by the Australian government, established to deliver ocean observations to the marine and climate science community. Now in its 10th year, its mission is to undertake systematic and sustained observations and to turn them into data, products and analyses that can be freely used and reused for broad societal benefits. As IMOS has matured as an observing system, expectations of the system's availability and reliability have also increased, and IMOS is now seen as delivering 'operational' information; it does this through the Australian Ocean Data Network (AODN). The AODN runs its services on the commercial cloud service Amazon Web Services. This has enabled the AODN to improve the system architecture, utilizing more advanced features such as object storage (S3, Simple Storage Service) and autoscaling, and introducing new checking and logging procedures in a pipeline approach. This has improved data availability and resilience while protecting against human errors in data handling and providing a more efficient ingestion process. Many of these features are available through the AODN to the wider Australian marine and science community, enabling the 'family' of AODN to grow and thereby enabling rapid access to an increasing collection of ocean observations.
Secure Skyline Queries on Cloud Platform.
Liu, Jinfei; Yang, Juncheng; Xiong, Li; Pei, Jian
2017-04-01
Outsourcing data and computation to a cloud server provides a cost-effective way to support large-scale data storage and query processing. However, due to security and privacy concerns, sensitive data (e.g., medical records) need to be protected from the cloud server and other unauthorized users. One approach is to outsource encrypted data to the cloud server and have the cloud server perform query processing on the encrypted data only. It remains a challenging task to support various queries over encrypted data in a secure and efficient way such that the cloud server does not gain any knowledge about the data, query, and query result. In this paper, we study the problem of secure skyline queries over encrypted data. The skyline query is particularly important for multi-criteria decision making but also presents significant challenges due to its complex computations. We propose a fully secure skyline query protocol on data encrypted using semantically secure encryption. As a key subroutine, we present a new secure dominance protocol, which can also be used as a building block for other queries. Finally, we provide both serial and parallelized implementations and empirically study the protocols in terms of efficiency and scalability under different parameter settings, verifying the feasibility of our proposed solutions.
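For readers unfamiliar with skyline semantics, the plaintext sketch below shows the dominance relation and a naive skyline computation. It only illustrates what the query returns and deliberately does not reproduce the paper's secure, encrypted-domain protocol; the example data and the "smaller is better" interpretation are assumptions.

```python
# Plaintext skyline illustration (NOT the secure protocol from the paper).
# A point dominates another if it is at least as good in every attribute
# and strictly better in at least one; smaller values are assumed better.
from typing import List, Tuple

Point = Tuple[float, ...]

def dominates(p: Point, q: Point) -> bool:
    """True if p dominates q under minimization in every dimension."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# e.g. records as (cost, risk) pairs for multi-criteria decision making
records = [(3.0, 0.2), (2.0, 0.5), (4.0, 0.1), (2.5, 0.5), (2.0, 0.6)]
print(skyline(records))   # -> [(3.0, 0.2), (2.0, 0.5), (4.0, 0.1)]
```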
Electron cloud generation and trapping in a quadrupole magnet at the Los Alamos proton storage ring
NASA Astrophysics Data System (ADS)
Macek, Robert J.; Browman, Andrew A.; Ledford, John E.; Borden, Michael J.; O'Hara, James F.; McCrady, Rodney C.; Rybarcyk, Lawrence J.; Spickermann, Thomas; Zaugg, Thomas J.; Pivi, Mauro T. F.
2008-01-01
Recent beam physics studies on the two-stream e-p instability at the LANL proton storage ring (PSR) have focused on the role of the electron cloud generated in quadrupole magnets where primary electrons, which seed beam-induced multipacting, are expected to be largest due to grazing angle losses from the beam halo. A new diagnostic to measure electron cloud formation and trapping in a quadrupole magnet has been developed, installed, and successfully tested at PSR. Beam studies using this diagnostic show that the “prompt” electron flux striking the wall in a quadrupole is comparable to the prompt signal in the adjacent drift space. In addition, the “swept” electron signal, obtained using the sweeping feature of the diagnostic after the beam was extracted from the ring, was larger than expected and decayed slowly with an exponential time constant of 50 to 100μs. Other measurements include the cumulative energy spectra of prompt electrons and the variation of both prompt and swept electron signals with beam intensity. Experimental results were also obtained which suggest that a good fraction of the electrons observed in the adjacent drift space for the typical beam conditions in the 2006 run cycle were seeded by electrons ejected from the quadrupole.
An innovative privacy preserving technique for incremental datasets on cloud computing.
Aldeen, Yousra Abdul Alsahib S; Salleh, Mazleena; Aljeroudi, Yazan
2016-08-01
Cloud computing (CC) is a service-based delivery model offering enormous computing power and data storage across connected communication channels. It has given overwhelming technological impetus to the internet-mediated IT industry, where users can easily share private data for further analysis and mining, and user-friendly CC services make it economical to deploy a variety of applications. At the same time, simple data sharing has encouraged phishing attacks and malware-assisted security threats. Privacy-sensitive applications, such as health services built on the cloud for their economic and operational benefits, require enhanced security. Thus, strong cyberspace security and mitigation against phishing attacks are mandatory to protect overall data privacy. Typically, application datasets are anonymized to give owners better privacy, but without providing the full secrecy requirements for newly added records. Some proposed techniques address this issue by re-anonymizing the datasets from scratch. Full privacy protection over incremental datasets on CC is therefore far from being achieved, and the distribution of huge dataset volumes across multiple storage nodes further limits privacy preservation. In this view, we propose a new anonymization technique to attain better privacy protection with high data utility over distributed and incremental datasets on CC. The proficiency of data privacy preservation and improved confidentiality requirements is demonstrated through performance evaluation. Copyright © 2016 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Asbjornsen, H.; Alvarado-Barrientos, M. S.; Bruijnzeel, L. A.; Dawson, T. E.; Geissert, D.; Goldsmith, G. R.; Gomez-Cardenas, M.; Gomez-Tagle, A.; Gotsch, S. F.; Holwerda, F.; McDonnell, J. J.; Munoz Villers, L. E.; Tobon, C.
2013-05-01
Land use conversion and climate change threaten the hydrological services provided by tropical montane cloud forests (TMCFs), but knowledge about cloud forest ecohydrology and the effects of global change drivers is limited. Here, we present a synthesis of research that traced the hydrologic sources, fluxes and flowpaths under different land cover types (degraded pasture, regenerating forest, mature forest, pine reforestation) in a seasonally dry TMCF in Veracruz, Mexico. We used hydrological (cloud water interception, CWI; streamflow) and ecophysiological measurements (transpiration, E; foliar uptake, FU) in combination with stable isotope techniques to elucidate these ecohydrological processes. Results revealed that CWI was ≤2% of total annual rainfall due to low fog occurrence and wind speeds. Fog without rainfall reduced E by a factor of 4-5 relative to sunny conditions and by a factor of 2 relative to overcast conditions; the water 'gained' from fog suppression was ~80-100 mm year-1 relative to sunny conditions. At the canopy scale, FU resulted in the recovery of 9% of total E, suggesting a crucial role in alleviating water deficit, but not sufficient to offset the 17% water loss from nighttime E. Trees primarily utilized water from 30-50 cm soil depth, while water reaching the stream was derived from deep, 'old' water that was distinct from 'new' rainwater and plant water. Soils had high infiltration rates and water storage capacity, which contributed to the relatively low rainfall-runoff response, generated mainly from deep subsurface flowpaths. Conversion of mature forest to pasture or forest regeneration on former TMCF increased annual water yield by 600 mm and 300 mm, respectively, while planting pine on degraded pastures reduced water yield by 365 mm. Our results suggest that the ecophysiological effects of fog via suppressed E and FU have a greater impact on water yield than direct inputs from CWI in this TMCF. Rapid vertical rainfall percolation and recharge result in a largely groundwater-driven system in which streamflow dynamics are uncoupled from plant water uptake, and water storage and buffering capacity are exceptionally high. These factors, combined with the soil properties, meant that reductions in dry-season flow due to land use conversion to pasture were detected only towards the end of the dry season. Projected lifting of the cloud base associated with regional climate change, combined with declining rainfall, may significantly alter the ecohydrological functions of these TMCFs.
Satellite-based estimates of groundwater depletion in the Badain Jaran Desert, China
NASA Astrophysics Data System (ADS)
Jiao, Jiu Jimmy; Zhang, Xiaotao; Wang, Xusheng
2015-03-01
Despite prevailing dry conditions, groundwater-fed lakes are found among the earth's tallest sand dunes in the Badain Jaran Desert, China. Indirect evidence suggests that some lakes are shrinking. However, relatively few studies have been carried out to assess the regional groundwater conditions and the fate of the lakes due to the remoteness and severity of the desert environment. Here we use satellite information to demonstrate an ongoing slow decrease in both lake level and groundwater storage. Specifically, we use Ice, Cloud, and land Elevation Satellite altimetry data to quantify water levels of the lakes and show overall decreases from 2003 to 2009. We also use water storage changes from the Gravity Recovery and Climate Experiment and simulated soil and water changes from the Global Land Data Assimilation System to demonstrate long-term groundwater depletion in the desert. Rainfall increase driven by climate change has increased soil water and groundwater storage to a certain degree but not enough to compensate for the long-term decline. If countermeasures are not taken to control the pumping, many lakes will continue to shrink, causing an ecological and environmental disaster in the fragile desert oases.
InSAR Deformation Time Series Processed On-Demand in the Cloud
NASA Astrophysics Data System (ADS)
Horn, W. B.; Weeden, R.; Dimarchi, H.; Arko, S. A.; Hogenson, K.
2017-12-01
During this past year, ASF has developed a cloud-based on-demand processing system known as HyP3 (http://hyp3.asf.alaska.edu/), the Hybrid Pluggable Processing Pipeline, for Synthetic Aperture Radar (SAR) data. The system makes it easy for a user who doesn't have the time or inclination to install and use complex SAR processing software to leverage SAR data in their research or operations. One such processing algorithm is the generation of a deformation time series product, a series of images representing ground displacement over time, computed from a time series of interferometric SAR (InSAR) products. The set of software tools necessary to generate this useful product is difficult to install, configure, and use. Moreover, for a long time series with many images, the processing of just the interferograms can take days. Principally built by three undergraduate students at the ASF DAAC, the deformation time series processing relies on the new Amazon Batch service, which enables processing of jobs with complex interconnected dependencies in a straightforward and efficient manner. In the case of generating a deformation time series product from a stack of single-look complex SAR images, the system uses Batch to serialize the up-front processing, interferogram generation, optional tropospheric correction, and deformation time series generation. The most time-consuming portion is the interferogram generation, because even for a fairly small stack of images many interferograms need to be processed. By using AWS Batch, the interferograms are all generated in parallel; the entire process completes in hours rather than days. Additionally, the individual interferograms are saved in Amazon's cloud storage, so that when new data is acquired in the stack, an updated time series product can be generated with minimal additional processing. This presentation will focus on the development techniques and enabling technologies that were used in developing the time series processing in the ASF HyP3 system. Data and process flow from job submission through to order completion will be shown, highlighting the benefits of the cloud for each step.
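The fan-out/fan-in dependency pattern described above can be sketched with the AWS Batch API via boto3: interferogram jobs are submitted independently so they run in parallel, and the time-series job declares a dependency on all of them. Queue names, job definitions, and granule identifiers below are illustrative assumptions, not the actual HyP3 configuration.

```python
# Hedged sketch of parallel interferogram jobs feeding a dependent
# time-series job on AWS Batch. Names and identifiers are assumptions.
import boto3

batch = boto3.client("batch", region_name="us-west-2")   # region assumed

scene_pairs = [("S1A_20170101", "S1A_20170113"),
               ("S1A_20170113", "S1A_20170125")]          # placeholder granules

# Submit one interferogram job per scene pair; these run concurrently.
ifg_job_ids = []
for ref, sec in scene_pairs:
    job = batch.submit_job(
        jobName=f"ifg-{ref}-{sec}",
        jobQueue="insar-queue",               # assumed queue name
        jobDefinition="insar-interferogram",  # assumed job definition
        parameters={"reference": ref, "secondary": sec},
    )
    ifg_job_ids.append(job["jobId"])

# The time-series job starts only after every interferogram job succeeds.
batch.submit_job(
    jobName="sbas-time-series",
    jobQueue="insar-queue",
    jobDefinition="insar-time-series",        # assumed job definition
    dependsOn=[{"jobId": jid} for jid in ifg_job_ids],
)
```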
The future of Stardust science
NASA Astrophysics Data System (ADS)
Westphal, A. J.; Bridges, J. C.; Brownlee, D. E.; Butterworth, A. L.; de Gregorio, B. T.; Dominguez, G.; Flynn, G. J.; Gainsforth, Z.; Ishii, H. A.; Joswiak, D.; Nittler, L. R.; Ogliore, R. C.; Palma, R.; Pepin, R. O.; Stephan, T.; Zolensky, M. E.
2017-09-01
Recent observations indicate that >99% of the small bodies in the solar system reside in its outer reaches, in the Kuiper Belt and Oort Cloud. Kuiper Belt bodies are probably the best-preserved representatives of the icy planetesimals that dominated the bulk of the solid mass in the early solar system. They likely contain preserved materials inherited from the protosolar cloud, held in cryogenic storage since the formation of the solar system. Despite their importance, they are relatively underrepresented in our extraterrestrial sample collections by many orders of magnitude (~10^13 by mass) as compared with the asteroids, represented by meteorites, which are composed of materials that have generally been strongly altered by thermal and aqueous processes. We have only begun to scratch the surface in understanding Kuiper Belt objects, but it is already clear that the very limited samples of them that we have in our laboratories hold the promise of dramatically expanding our understanding of the formation of the solar system. Stardust returned the first samples from a known small solar system body, the Jupiter-family comet 81P/Wild 2, and, in a separate collector, the first solid samples from the local interstellar medium. The first decade of Stardust research resulted in more than 142 peer-reviewed publications, including 15 papers in Science. Analyses of these amazing samples continue to yield unexpected discoveries and to raise new questions about the history of the early solar system. We identify nine high-priority scientific objectives for future Stardust analyses that address important unsolved problems in planetary science.
Processing Shotgun Proteomics Data on the Amazon Cloud with the Trans-Proteomic Pipeline*
Slagel, Joseph; Mendoza, Luis; Shteynberg, David; Deutsch, Eric W.; Moritz, Robert L.
2015-01-01
Cloud computing, where scalable, on-demand compute cycles and storage are available as a service, has the potential to accelerate mass spectrometry-based proteomics research by providing simple, expandable, and affordable large-scale computing to all laboratories regardless of location or information technology expertise. We present new cloud computing functionality for the Trans-Proteomic Pipeline, a free and open-source suite of tools for the processing and analysis of tandem mass spectrometry datasets. Enabled with Amazon Web Services cloud computing, the Trans-Proteomic Pipeline now accesses large-scale computing resources, limited only by the available Amazon Web Services infrastructure, for all users. The Trans-Proteomic Pipeline runs in an environment fully hosted on Amazon Web Services, where all software and data reside on cloud resources to tackle large search studies. In addition, it can also be run on a local computer with computationally intensive tasks launched onto the Amazon Elastic Compute Cloud service to greatly decrease analysis times. We describe the new Trans-Proteomic Pipeline cloud service components, compare the relative performance and costs of various Elastic Compute Cloud service instance types, and present on-line tutorials that enable users to learn how to deploy cloud computing technology rapidly with the Trans-Proteomic Pipeline. We provide tools for estimating the necessary computing resources and costs given the scale of a job, and demonstrate the use of the cloud-enabled Trans-Proteomic Pipeline by processing over 1100 tandem mass spectrometry files through four proteomic search engines in 9 h and at a very low cost. PMID:25418363
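A back-of-the-envelope estimate of the kind such planning tools provide can be sketched as follows; the per-file timings, instance count, and hourly rate are made-up placeholders rather than measured TPP or AWS figures.

```python
# Illustrative resource/cost estimate for a batch search job; all rates and
# timings below are assumed values for demonstration only.
def estimate_cloud_cost(num_files, minutes_per_file, instances,
                        hourly_rate_usd, files_per_instance_in_parallel=1):
    """Estimate wall-clock hours and compute cost for a batch search job."""
    effective_slots = instances * files_per_instance_in_parallel
    total_compute_hours = num_files * minutes_per_file / 60.0
    wall_clock_hours = total_compute_hours / effective_slots
    cost = instances * wall_clock_hours * hourly_rate_usd
    return wall_clock_hours, cost

# e.g. 1100 MS runs x 4 search engines, ~20 min per file, 40 instances
hours, usd = estimate_cloud_cost(num_files=1100 * 4, minutes_per_file=20,
                                 instances=40, hourly_rate_usd=0.30,
                                 files_per_instance_in_parallel=4)
print(f"~{hours:.1f} h wall clock, ~${usd:.0f} compute cost")
```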
COMBAT: mobile-Cloud-based cOmpute/coMmunications infrastructure for BATtlefield applications
NASA Astrophysics Data System (ADS)
Soyata, Tolga; Muraleedharan, Rajani; Langdon, Jonathan; Funai, Colin; Ames, Scott; Kwon, Minseok; Heinzelman, Wendi
2012-05-01
The amount of data processed annually over the Internet has crossed the zettabyte boundary, yet this Big Data cannot be efficiently processed or stored using today's mobile devices. Parallel to this explosive growth in data, a substantial increase in mobile compute capability and advances in cloud computing have brought the state of the art in mobile-cloud computing to an inflection point, where the right architecture may allow mobile devices to run applications utilizing Big Data and intensive computing. In this paper, we propose the MObile Cloud-based Hybrid Architecture (MOCHA), which formulates a solution permitting mobile-cloud computing applications such as object recognition on the battlefield by introducing a mid-stage compute and storage layer, called the cloudlet. MOCHA is built on the key observation that many mobile-cloud applications have the following characteristics: 1) they are compute-intensive, requiring the compute power of a supercomputer, and 2) they use Big Data, requiring a communications link to cloud-based database sources in near-real time. In this paper, we describe the operation of MOCHA in battlefield applications, with the mobile device housed within a soldier's vest, the cloudlet inside a military vehicle, and access to the cloud provided through high-latency satellite links. We provide simulations using the traditional mobile-cloud approach as well as MOCHA with a mid-stage cloudlet to quantify the utility of this architecture. We show that the MOCHA platform for mobile-cloud computing promises a future for critical battlefield applications that access Big Data, which is currently not possible using existing technology.
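To convey why a mid-stage cloudlet helps when the only path to the cloud is a high-latency satellite link, here is a toy latency model comparing the two offloading paths. It is not the paper's simulation; every number (link rates, round-trip times, compute times, payload size) is an illustrative assumption.

```python
# Toy end-to-end latency model: direct mobile->cloud over satellite versus
# mobile->cloudlet with only reduced traffic forwarded to the cloud.
def direct_to_cloud_ms(payload_mb, satellite_rtt_ms=600, uplink_mbps=2,
                       cloud_compute_ms=150):
    transfer_ms = payload_mb * 8 / uplink_mbps * 1000
    return satellite_rtt_ms + transfer_ms + cloud_compute_ms

def via_cloudlet_ms(payload_mb, local_rtt_ms=10, wifi_mbps=50,
                    cloudlet_compute_ms=400, reduced_fraction=0.05,
                    satellite_rtt_ms=600, uplink_mbps=2, cloud_compute_ms=150):
    # Heavy payload stays local; only e.g. extracted features cross the satellite link.
    local_ms = local_rtt_ms + payload_mb * 8 / wifi_mbps * 1000 + cloudlet_compute_ms
    remote_ms = (satellite_rtt_ms
                 + payload_mb * reduced_fraction * 8 / uplink_mbps * 1000
                 + cloud_compute_ms)
    return local_ms + remote_ms

payload = 5.0  # MB image for object recognition (assumed)
print(f"direct:   {direct_to_cloud_ms(payload):.0f} ms")
print(f"cloudlet: {via_cloudlet_ms(payload):.0f} ms")
```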
Sensitive method for the determination of different S(IV) species in cloud and fog water.
Lammel, G
1996-08-01
Suppressed ion chromatography has been applied to the determination of S(IV) species in cloud and fog water in the range 0.012-2.4 mg S(IV)-S/L. The samples have been preserved prior to storage, and S(IV) species have been determined as hydroxymethanesulfonate (HMS) together with the low-molecular-weight carboxylic acid anions formate and acetate. Samples have been divided and treated differently such that both total S(IV) and the non-oxidizable fraction of S(IV) (as given by the reactivity with H(2)O(2), added in surplus) could be determined. The difference between the two corresponds to the S(IV) fraction subject to oxidation, which is of paramount interest in cloud and fog water chemistry.
Satellite Imagery Analysis for Automated Global Food Security Forecasting
NASA Astrophysics Data System (ADS)
Moody, D.; Brumby, S. P.; Chartrand, R.; Keisler, R.; Mathis, M.; Beneke, C. M.; Nicholaeff, D.; Skillman, S.; Warren, M. S.; Poehnelt, J.
2017-12-01
The recent computing performance revolution has driven improvements in sensor, communication, and storage technology. Multi-decadal remote sensing datasets at the petabyte scale are now available in commercial clouds, with new satellite constellations generating petabytes/year of daily high-resolution global coverage imagery. Cloud computing and storage, combined with recent advances in machine learning, are enabling understanding of the world at a scale and at a level of detail never before feasible. We present results from an ongoing effort to develop satellite imagery analysis tools that aggregate temporal, spatial, and spectral information and that can scale with the high-rate and dimensionality of imagery being collected. We focus on the problem of monitoring food crop productivity across the Middle East and North Africa, and show how an analysis-ready, multi-sensor data platform enables quick prototyping of satellite imagery analysis algorithms, from land use/land cover classification and natural resource mapping, to yearly and monthly vegetative health change trends at the structural field level.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nicolae, Bogdan; Riteau, Pierre; Keahey, Kate
Storage elasticity on IaaS clouds is a crucial feature in the age of data-intensive computing, especially when considering fluctuations of I/O throughput. This paper provides a transparent solution that automatically boosts I/O bandwidth during peaks for underlying virtual disks, effectively avoiding over-provisioning without performance loss. The authors' proposal relies on the idea of leveraging short-lived virtual disks with better performance characteristics (and thus more expensive) to act during peaks as a caching layer for the persistent virtual disks where the application data is stored. Furthermore, they introduce both a performance and cost prediction methodology, which can be used independently to estimate in advance what trade-off between performance and cost is possible, and an optimization technique that enables better cache size selection to meet the desired performance level with minimal cost. The authors demonstrate the benefits of their proposal both for microbenchmarks and for two real-life applications using large-scale experiments.
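The cache-size selection step can be illustrated with a small sketch: given a predicted throughput and cost for each candidate fast-disk cache size, pick the cheapest candidate that still meets a target I/O bandwidth. The candidate table and target below are made-up placeholders, not the authors' prediction model.

```python
# Hedged sketch of cost-minimal cache-size selection under a throughput
# target; the prediction numbers are illustrative placeholders.
def select_cache_size(candidates, target_mb_per_s):
    """candidates: list of (cache_gb, predicted_mb_per_s, cost_per_hour)."""
    feasible = [c for c in candidates if c[1] >= target_mb_per_s]
    if not feasible:
        return None  # no cache size meets the target
    return min(feasible, key=lambda c: c[2])  # cheapest feasible option

predicted = [
    (10,  120.0, 0.08),   # (GB of fast cache, predicted MB/s, $/hour) -- assumed
    (25,  210.0, 0.20),
    (50,  340.0, 0.40),
    (100, 360.0, 0.80),
]

choice = select_cache_size(predicted, target_mb_per_s=300.0)
print("Selected cache:", choice)   # -> (50, 340.0, 0.4): cheapest size meeting 300 MB/s
```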
A Framework and Improvements of the Korea Cloud Services Certification System.
Jeon, Hangoo; Seo, Kwang-Kyu
2015-01-01
Cloud computing service is an evolving paradigm that affects a large part of the ICT industry and provides new opportunities for ICT service providers, such as the deployment of new business models and the realization of economies of scale by increasing the efficiency of resource utilization. However, despite the benefits of cloud services, there are obstacles to adoption, such as the lack of means for assessing and comparing the service quality of cloud services with regard to availability, security, and reliability. In order to adopt cloud services successfully and promote their uptake, it is necessary to establish a cloud service certification system that ensures the service quality and performance of cloud services. This paper proposes a framework and improvements for the Korea cloud service certification system. To develop it, the critical issues related to service quality, performance, and certification of cloud services are identified, and a systematic framework for the certification system of cloud services and service provider domains is developed. Improvements to the developed Korea certification system for cloud services are also proposed.